Skip to content

fix: WebSocket connection-manager concurrency + proxy-aware rate limiting#62

Open
williaby wants to merge 2 commits into
claude/repo-architecture-review-n7rlgfrom
claude/tier-2-concurrency
Open

fix: WebSocket connection-manager concurrency + proxy-aware rate limiting#62
williaby wants to merge 2 commits into
claude/repo-architecture-review-n7rlgfrom
claude/tier-2-concurrency

Conversation

@williaby
Copy link
Copy Markdown
Contributor

Summary

Tier 2 of the architecture review. Stacked on #61 — base is the PR #61 branch, so this diff shows only the tier-2 changes. Rebase/retarget to main once #61 merges.

C3 — WebSocket ConnectionManager concurrency safety

The previous manager had two real races when a broadcast overlapped a connect/disconnect on the same batch:

  • "Set changed size during iteration"broadcast() iterated the live connection set while await-ing each send_json; a concurrent disconnect mutating that set raised mid-iteration. Now it iterates a snapshot (list(...)).
  • Key resurrection — cleanup used defaultdict access (self._connections[key].discard(...)), which recreated a batch key that a concurrent disconnect had just emptied and removed, leaking entries. Switched to a plain dict with non-resurrecting cleanup (.get/.pop) in both broadcast and disconnect.

Rationale for not adding an asyncio.Lock: under CPython's single-threaded event loop, connect/disconnect mutations are synchronous (atomic w.r.t. other coroutines); the only interleaving point is the awaited sends in broadcast, which the snapshot + non-resurrecting cleanup fully cover. A lock would force disconnect to become async (rippling into the router and tests) for no additional correctness here.

H6 — proxy-aware rate limiting

Behind Cloudflare, request.client.host is the proxy IP, so per-IP rate limiting effectively keyed every request on one IP. RateLimitMiddleware can now resolve the client IP from a configured header:

  • New settings rate_limit_trust_proxy (default off) and rate_limit_client_ip_header (default CF-Connecting-IP).
  • Default off is deliberate: the header is only trusted when behind a proxy that overwrites it; otherwise a client could spoof it to evade limits. When off, behavior is unchanged (request.client.host).
  • Comma-separated forwarding headers (e.g. X-Forwarded-For) use the first (originating) entry.

Verification

Gate Result
Tests 456 passed, 1 skipped (+8 new)
Coverage 92.16% (gate: 80%)
ruff check + format clean
basedpyright 0 errors

New tests: broadcast tolerates concurrent disconnect; broadcast doesn't resurrect a removed batch; proxy IP resolution (trusted/untrusted/missing-header/forwarded-list/no-client) + an integration test proving distinct CF-Connecting-IP values are limited independently.

Test plan

  • uv run pytest — 456 passed, 1 skipped
  • uv run ruff check . / ruff format --check
  • uv run basedpyright src/ — 0 errors
  • Coverage ≥ 80% (actual 92.16%)

https://claude.ai/code/session_01PA6dtgMhfzSe22VVtqBfxE


Generated by Claude Code

…ting

Tier 2 of the architecture review (stacked on PR #61).

WebSocket ConnectionManager (C3): eliminate broadcast races
- broadcast() now iterates a snapshot of the connection set, so a concurrent
  connect/disconnect during an awaited send can no longer raise "Set changed
  size during iteration".
- Replace defaultdict with a plain dict and use non-resurrecting cleanup
  (.get/.pop) in broadcast and disconnect, so a batch emptied/removed by a
  concurrent disconnect is not silently recreated as an empty set (key leak).

Rate limiting (H6): proxy-aware client IP
- RateLimitMiddleware can resolve the client IP from a configured header
  (default CF-Connecting-IP) via the new rate_limit_trust_proxy /
  rate_limit_client_ip_header settings. Default is OFF so the header is only
  trusted behind a proxy that overwrites it (otherwise clients could spoof it
  to evade per-IP limits). Behind Cloudflare this makes per-IP limiting
  effective again instead of keying every request on the proxy IP.

Adds concurrency regression tests for the manager and unit/integration tests
for proxy IP resolution. 456 passed, coverage 92.16%.

https://claude.ai/code/session_01PA6dtgMhfzSe22VVtqBfxE
Copilot AI review requested due to automatic review settings May 30, 2026 01:49
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (3)
  • main
  • master
  • develop

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1c50f07b-915d-4e91-b3d1-8b73f2a0fe72

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/tier-2-concurrency

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets tier-2 reliability/security improvements: safer WebSocket batch connection bookkeeping during concurrent disconnects, and configurable proxy-aware IP selection for rate limiting.

Changes:

  • Replaces ConnectionManager’s defaultdict with a plain dict and snapshot-based broadcast iteration.
  • Adds proxy header configuration for rate-limit client IP resolution.
  • Adds regression/unit tests for WebSocket races and proxy-aware rate limiting.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/rag_processor/websocket/connection_manager.py Makes connection cleanup non-resurrecting and broadcast iteration mutation-safe.
src/rag_processor/middleware/security.py Adds trusted proxy header support to rate-limit key selection and wires it through security config.
src/rag_processor/core/config.py Adds settings for proxy-aware rate limiting.
src/rag_processor/main.py Passes new rate-limit proxy settings into middleware configuration.
tests/unit/test_websocket.py Adds regression coverage for broadcast/disconnect races and updates direct connection setup.
tests/unit/test_middleware_security.py Adds tests for proxy header IP resolution and independent forwarded-IP rate limiting.

Comment on lines +185 to +193
if forwarded:
# X-Forwarded-For-style headers may list multiple hops; the
# first is the originating client.
return forwarded.split(",")[0].strip()
logger.warning(
"trust_proxy_headers enabled but %s header missing; "
"falling back to direct peer address",
self.client_ip_header,
)
Comment on lines +551 to +552
trust_proxy_headers: bool = False
client_ip_header: str = "CF-Connecting-IP"
- _get_client_ip now ignores a blank leading entry in the forwarded header
  (e.g. ", 10.0.0.1") and falls back to the peer address instead of keying
  every malformed request on "".
- Document the trust_proxy_headers and client_ip_header fields in the
  SecurityConfig docstring.
- Add a regression test for the blank-leading-entry fallback.

https://claude.ai/code/session_01PA6dtgMhfzSe22VVtqBfxE
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants