refactor: tier-3 medium & low findings (config hardening, validation, cleanups) by williaby · Pull Request #63 · ByronWilliamsCPA/rag-processor

williaby · 2026-05-30T03:39:35Z

Summary

Tier 3 of the architecture review — the medium and low findings. Stacked on #62 (base is the tier-2 branch), so this diff shows only tier-3 changes. Retarget to main once #61/#62 merge.

Low findings

L1 Remove dead utils/financial.py stub (+ its dead ruff ignore).
L2 Export all routers (batch, health, ingest, user) from rag_processor.api and fix __all__.
L5 Readiness 503 body conforms to the declared ReadinessStatus model.
L6 Document why Sentry before_send drops KeyboardInterrupt/SystemExit.

Medium findings

M3 PDF classification thresholds moved to settings (configurable, per-instance overridable; removes 50/10/200 magic numbers).
M4 Shared ensure_batch_owned helper (dedupes the 404-as-403 / opaque-logging logic across get_batch/get_job).
M5 Real Redis readiness check (check_redis PINGs via a worker thread), gated by readiness_require_redis so it's truthful but only fails the probe when Redis is declared required.
M6 Strict pipeline-config loading: raise ConfigurationError on a missing file instead of silently falling back to localhost defaults (opt-in; default behavior unchanged + clearer warning).
M7 Inject FileRouter via a get_file_router dependency instead of a module-level singleton (enables dependency_overrides).

Deliberately deferred (with rationale)

L3 / L4 / M12 — defaults are sensible / low-value; can be a follow-up tier.
M8 JWKS backoff/circuit-breaker — the stampede-preventing lock already landed in refactor: unify Redis access, wire ingest pipeline, consolidate auth #61; a full breaker is larger.
M9 orjson — adds a dependency for no measured benefit.
M10 Broad except — remaining ones already justified inline.
M11 Event-replay ordering on trimmed history — low-value edge case.

Verification (final commit `4cc8d63`)

473 passed, 1 skipped
ruff check + format: clean
basedpyright: 0 errors
coverage: 92.83% (gate 80%)

New tests: test_routing_config (M3 + M7), test_health_readiness (M5 + L5), ensure_batch_owned coverage (M4), pipeline strict-mode tests (M6).

Process note: this branch was rebuilt + force-pushed once during development to recover from a partially-applied edit batch, and intermediate commits were briefly red (a stale non-editable venv install masked the real source, plus a couple of edits that didn't land). Both are resolved (editable reinstall + the fixes above); the final HEAD is verified green against the real src/.

https://claude.ai/code/session_01PA6dtgMhfzSe22VVtqBfxE

coderabbitai · 2026-05-30T03:39:41Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (3)

main
master
develop

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7d9d9c3d-018c-4036-98c9-eaaf9c1f8533

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch claude/tier-3-findings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

Tier-3 cleanup PR that lands the first three "low" findings from the architecture review: removes a dead stub module (and its associated ruff ignore), exports all API routers from the rag_processor.api package, and adds an explanatory comment to Sentry's before_send filter.

Changes:

Delete src/rag_processor/utils/financial.py and drop the now-obsolete per-file ruff ignore in pyproject.toml.
Re-export batch_router and user_router from rag_processor.api and update __all__.
Replace the terse "Example:" comment in before_send_hook with a rationale explaining why KeyboardInterrupt / SystemExit are dropped.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File	Description
src/rag_processor/utils/financial.py	Removed dead stub module (L1).
pyproject.toml	Removed the ruff per-file ignore tied to the deleted stub.
src/rag_processor/api/init.py	Added `batch_router` and `user_router` exports; refreshed `__all__` (L2).
src/rag_processor/core/sentry.py	Documented why `KeyboardInterrupt`/`SystemExit` are filtered (L6).

Addresses the medium and low findings from the repository architecture review (stacked on the tier-2 work). Medium: - M3: PDF classification thresholds (scanned chars/page and the scanned/ digital high-confidence cutoffs) move to Settings; FileClassifier reads them by default and accepts per-instance overrides. Removes the hardcoded 50/10/200 magic numbers. - M4: extract ensure_batch_owned() in auth.dependencies; get_batch/get_job delegate to it, so the "missing-or-not-owned -> 404 (never 403), log opaque IDs only" behavior lives in one place. - M5: replace the always-pass placeholder readiness checks with a real check_redis() that PINGs Redis off the event loop. The check always reports the true status; it only fails the /health/ready probe when the new readiness_require_redis setting is enabled, so CI/dev without Redis still report ready while staying truthful in the body. - M6: load_pipeline_config gains an opt-in strict mode that raises ConfigurationError on a missing config file instead of silently falling back to the built-in localhost defaults; the lenient warning now states the defaults are not production-safe. - M7: ingest_files resolves its FileRouter via a get_file_router dependency (shared singleton) instead of a module-level global, enabling app.dependency_overrides in tests. Low: - L1: remove the unused utils/financial.py stub and its dead ruff ignore. - L2: export the batch/health/ingest/user routers from rag_processor.api and complete __all__. - L5: the readiness 503 body now conforms to the declared ReadinessStatus schema instead of an ad-hoc dict. - L6: document why Sentry before_send drops KeyboardInterrupt/SystemExit. Tests: add test_routing_config (M3 thresholds + M7 DI), test_health_readiness (M5 check + L5 schema), ensure_batch_owned coverage (M4), and pipeline strict-mode tests (M6). https://claude.ai/code/session_01PA6dtgMhfzSe22VVtqBfxE

Copilot AI review requested due to automatic review settings May 30, 2026 03:39

Copilot started reviewing on behalf of williaby May 30, 2026 03:39 View session

Copilot AI reviewed May 30, 2026

View reviewed changes

williaby force-pushed the claude/tier-3-findings branch from 36a49d4 to b75fda3 Compare May 30, 2026 04:14

williaby force-pushed the claude/tier-3-findings branch from 5dffd65 to 3899270 Compare May 30, 2026 13:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor: tier-3 medium & low findings (config hardening, validation, cleanups)#63

refactor: tier-3 medium & low findings (config hardening, validation, cleanups)#63
williaby wants to merge 1 commit into
claude/tier-2-concurrencyfrom
claude/tier-3-findings

williaby commented May 30, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 30, 2026 •

edited

Loading

Review skipped

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

williaby commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Low findings

Medium findings

Deliberately deferred (with rationale)

Verification (final commit 4cc8d63)

Uh oh!

coderabbitai Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

williaby commented May 30, 2026 •

edited

Loading

Verification (final commit `4cc8d63`)

coderabbitai Bot commented May 30, 2026 •

edited

Loading