refactor: tier-3 medium & low findings (config hardening, validation, cleanups) #63
williaby wants to merge 1 commit into
Pull request overview
Tier-3 cleanup PR that lands the first three "low" findings from the architecture review: removes a dead stub module (and its associated ruff ignore), exports all API routers from the rag_processor.api package, and adds an explanatory comment to Sentry's before_send filter.
Changes:
- Delete `src/rag_processor/utils/financial.py` and drop the now-obsolete per-file ruff ignore in `pyproject.toml`.
- Re-export `batch_router` and `user_router` from `rag_processor.api` and update `__all__`.
- Replace the terse "Example:" comment in `before_send_hook` with a rationale explaining why `KeyboardInterrupt`/`SystemExit` are dropped.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/rag_processor/utils/financial.py | Removed dead stub module (L1). |
| pyproject.toml | Removed the ruff per-file ignore tied to the deleted stub. |
| src/rag_processor/api/__init__.py | Added batch_router and user_router exports; refreshed __all__ (L2). |
| src/rag_processor/core/sentry.py | Documented why KeyboardInterrupt/SystemExit are filtered (L6). |
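
For readers unfamiliar with the filter documented in `sentry.py`, here is a minimal sketch of a `before_send` hook that drops `KeyboardInterrupt`/`SystemExit`. It assumes the standard Sentry SDK callback shape (`event`, `hint`, with the exception triple under `hint["exc_info"]`); the body is illustrative and is not the repository's actual implementation. One plausible rationale, stated here as an assumption, is that these exceptions signal a deliberate shutdown rather than an application error.

```python
from typing import Any, Dict, Optional


def before_send_hook(event: Dict[str, Any], hint: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return None to drop an event, or the event itself to send it."""
    exc_info = hint.get("exc_info")
    if exc_info is not None:
        exc_type = exc_info[0]
        # Ctrl-C and sys.exit() are treated as intentional termination here,
        # so they are filtered out instead of being reported as errors.
        if exc_type is not None and issubclass(exc_type, (KeyboardInterrupt, SystemExit)):
            return None
    return event
```

Such a function would typically be passed as the `before_send` argument when the SDK is initialized.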
Addresses the medium and low findings from the repository architecture review (stacked on the tier-2 work).

Medium:
- M3: PDF classification thresholds (scanned chars/page and the scanned/digital high-confidence cutoffs) move to Settings; FileClassifier reads them by default and accepts per-instance overrides. Removes the hardcoded 50/10/200 magic numbers.
- M4: extract ensure_batch_owned() in auth.dependencies; get_batch/get_job delegate to it, so the "missing-or-not-owned -> 404 (never 403), log opaque IDs only" behavior lives in one place.
- M5: replace the always-pass placeholder readiness checks with a real check_redis() that PINGs Redis off the event loop. The check always reports the true status; it only fails the /health/ready probe when the new readiness_require_redis setting is enabled, so CI/dev without Redis still report ready while staying truthful in the body.
- M6: load_pipeline_config gains an opt-in strict mode that raises ConfigurationError on a missing config file instead of silently falling back to the built-in localhost defaults; the lenient warning now states the defaults are not production-safe.
- M7: ingest_files resolves its FileRouter via a get_file_router dependency (shared singleton) instead of a module-level global, enabling app.dependency_overrides in tests.

Low:
- L1: remove the unused utils/financial.py stub and its dead ruff ignore.
- L2: export the batch/health/ingest/user routers from rag_processor.api and complete __all__.
- L5: the readiness 503 body now conforms to the declared ReadinessStatus schema instead of an ad-hoc dict.
- L6: document why Sentry before_send drops KeyboardInterrupt/SystemExit.

Tests: add test_routing_config (M3 thresholds + M7 DI), test_health_readiness (M5 check + L5 schema), ensure_batch_owned coverage (M4), and pipeline strict-mode tests (M6).

https://claude.ai/code/session_01PA6dtgMhfzSe22VVtqBfxE
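
The M5 behavior (always report the true Redis status, but fail the probe only when Redis is declared required) can be sketched as follows. This is a rough illustration under stated assumptions: the `ping` callable stands in for a blocking Redis PING, and `ReadinessResult` and `readiness` are hypothetical names; only `check_redis` and `readiness_require_redis` come from the commit message, and their real signatures may differ.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReadinessResult:
    redis_ok: bool  # truthful status, always reported in the body
    ready: bool     # whether the probe as a whole should pass


async def check_redis(ping: Callable[[], bool]) -> bool:
    """Run a blocking PING in a worker thread so the event loop is not stalled."""
    try:
        return bool(await asyncio.to_thread(ping))
    except Exception:
        # Any connection failure counts as "Redis is down" for readiness purposes.
        return False


async def readiness(ping: Callable[[], bool], readiness_require_redis: bool) -> ReadinessResult:
    redis_ok = await check_redis(ping)
    # The probe fails only when Redis is down AND declared required.
    ready = redis_ok or not readiness_require_redis
    return ReadinessResult(redis_ok=redis_ok, ready=ready)
```

With this shape, a CI or dev environment without Redis would report `redis_ok=False` while still passing the probe, and an HTTP handler could map `ready=False` to a 503.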
Summary
Tier 3 of the architecture review: the medium and low findings. Stacked on #62 (base is the tier-2 branch), so this diff shows only tier-3 changes. Retarget to `main` once #61/#62 merge.

Low findings
- L1: remove the unused `utils/financial.py` stub (+ its dead ruff ignore).
- L2: export the routers (`batch`, `health`, `ingest`, `user`) from `rag_processor.api` and fix `__all__`.
- L5: the readiness 503 body conforms to the declared `ReadinessStatus` model.
- L6: document why `before_send` drops `KeyboardInterrupt`/`SystemExit`.

Medium findings
- M3: PDF classification thresholds move to Settings, replacing the hardcoded magic numbers.
- M4: extract an `ensure_batch_owned` helper (dedupes the 404-instead-of-403 / opaque-logging logic across `get_batch`/`get_job`).
- M5: real readiness check (`check_redis` PINGs via a worker thread), gated by `readiness_require_redis` so it's truthful but only fails the probe when Redis is declared required.
- M6: strict config loading raises `ConfigurationError` on a missing file instead of silently falling back to localhost defaults (opt-in; default behavior unchanged + clearer warning).
- M7: resolve the `FileRouter` via a `get_file_router` dependency instead of a module-level global (enables `dependency_overrides`).

Deliberately deferred (with rationale)
- … `orjson`: adds a dependency for no measured benefit.
- … `except`: remaining ones already justified inline.

Verification (final commit `4cc8d63`)

New tests: `test_routing_config` (M3 + M7), `test_health_readiness` (M5 + L5), `ensure_batch_owned` coverage (M4), pipeline strict-mode tests (M6).

https://claude.ai/code/session_01PA6dtgMhfzSe22VVtqBfxE
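
As an illustration of the M4 pattern (one helper that answers both "missing" and "not owned" with the same 404 and logs only opaque identifiers), here is a minimal, framework-free sketch. Everything except the name `ensure_batch_owned` is an assumption: the `Batch` shape, the in-memory store, the `NotFound` exception standing in for an HTTP 404, and the choice to achieve opacity by logging truncated hashes (the PR does not say how the IDs are made opaque).

```python
import hashlib
import logging
from dataclasses import dataclass
from typing import Mapping

logger = logging.getLogger(__name__)


class NotFound(Exception):
    """Hypothetical stand-in for an HTTP 404 response."""
    status_code = 404


@dataclass(frozen=True)
class Batch:
    batch_id: str
    owner_id: str


def _opaque(value: str) -> str:
    # Assumption: a short hash is logged in place of the raw identifier.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def ensure_batch_owned(batches: Mapping[str, Batch], batch_id: str, user_id: str) -> Batch:
    """Return the batch if it exists and belongs to user_id, else raise NotFound."""
    batch = batches.get(batch_id)
    if batch is None or batch.owner_id != user_id:
        # Same error for "missing" and "not owned", so callers cannot probe
        # for the existence of other users' batches.
        logger.warning("batch access denied: batch=%s user=%s", _opaque(batch_id), _opaque(user_id))
        raise NotFound("Batch not found")
    return batch
```

Route handlers such as `get_batch` and `get_job` could then delegate to this one function instead of repeating the ownership check.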