Skip to content

chore: quality cleanup (Q1) — dangling Java refs, clippy clean, lint required#35

Closed
ypriverol wants to merge 9 commits into
devfrom
feat/quality-perf-id-rate
Closed

chore: quality cleanup (Q1) — dangling Java refs, clippy clean, lint required#35
ypriverol wants to merge 9 commits into
devfrom
feat/quality-perf-id-rate

Conversation

@ypriverol
Copy link
Copy Markdown
Member

Summary

Post-cutover code-quality sweep. First of three sequential sub-projects (Q1 quality -> S1 speed -> I1 ID-rate +5%/dataset).

Logic-preserving: PIN/TSV output for --precursor-cal off is unchanged (sorted-row regression gate precursor_cal_bit_identical green throughout).

What changed (8 commits)

Commit Group What
a8ad6ddd (pre-Q1) Remove BUG_REVIEW.md; move CLI_MIGRATION.md to docs/
55cff3fa (pre-Q1) PR-Q1 design + finalize CLI_MIGRATION links
84f83295 1 Scrub 33 dangling Xxx.java:LINE references in non-test source
ba4c6b34 2 Neutralize "port of MS-GF+" framing (5 lib headers + CLI help)
f0831b03 3 Rename MSGFRUST_RSS_PROBE -> MSGF_RSS_PROBE (legacy compat + deprecation warn)
20da1b4e 4 Fix all clippy lib warnings (workspace) -- type aliases, justified #[allow]s for too_many_arguments, dead mut, etc.
67316e56 5 Lift clippy CI gate to required (--all-targets -D warnings) + 5 test/example fixes
2e1b6c7e 6 Remove shipped iter39 spec+plan; track PR-Q1 plan

Parity-test files (tests/*_java_parity.rs, tests/gf_bsa_parity.rs, tests/*_match_java.rs) untouched.

Default policy

  • MSGFRUST_RSS_PROBE keeps working with a one-line deprecation warning; will be removed in the next quality cleanup.
  • Rustfmt remains advisory (~11k lines of fmt churn -- separate cleanup).

Verification

  • cargo clippy --workspace --all-targets -- -D warnings clean
  • Workspace tests green under existing CI skip list (0 failed across 24+ suites)
  • precursor_cal_bit_identical regression gate green
  • MSGFRUST_RSS_PROBE=1 still works (with deprecation eprintln)
  • CodeRabbit review pass
  • CI matrix (Linux / macOS / Windows) green
  • Lint job now blocking (will be visible on first PR after merge)

Out of scope (later sub-projects)

  • Speed work (PR-S1, separate brainstorm)
  • ID-rate +5%/dataset work (PR-I1, multi-PR research project)
  • Parity test files (deliberately preserved)
  • Rustfmt cleanup (~11k lines, separate sweep)

Design + plan committed under docs/ in this branch.

ypriverol added 8 commits May 26, 2026 09:04
- BUG_REVIEW.md was a transient artifact from the post-merge bug-hunt
  review; the actual fixes shipped in PR #32 and are documented in the
  PR description / commit history.
- CLI_MIGRATION.md is an audience-specific guide (Java MS-GF+ users
  porting CLI invocations) — belongs under docs/, not at root.
- DOCS.md stays at root as the primary single-file reference (per the
  iter39 docs-rewrite design).
- Updated inbound references in README.md and DOCS.md.
Adds the design spec for PR-Q1 (post-cutover code quality sweep) and
finalizes the inbound-reference updates from the prior commit
(docs/CLI_MIGRATION.md links) that weren't staged at that point.

PR-Q1 is the first of three sequential sub-projects (quality -> speed
-> ID-rate +5% per dataset). Decomposed during the 2026-05-26
brainstorm because the three concerns differ in risk, scope, and time
profile; the ID-rate target is a multi-PR research project, not a
single ship gate.

Scope: 7 groups (6 in-PR + 1 out-of-repo memory update). Dangling
.java:LINE refs (42), stale "port of MS-GF+" framing, identifier
renames (MSGFRUST_RSS_PROBE etc.), 26 clippy warnings, lift CI lint
to required, remove shipped design specs.
The Java source tree was removed in commit b4565b8 during the
Rust-cutover; the inline citations to specific Java line numbers now
point at code that does not exist in this repo. Replace each citation
with intent-only "Java parity" comments. Preserves semantic meaning;
removes the broken hyperlinks.

Parity-test files (tests/*_java_parity.rs, tests/gf_bsa_parity.rs,
tests/*_match_java.rs) untouched — their identity is Java parity and
the citations are load-bearing documentation.

8 non-test files touched, 33 refs replaced, 0 functional changes.
The codebase is post-cutover; new contributors should read crate-lib
top-of-file doc comments as descriptions of what each crate does, not
as port-bookkeeping. CLI --help and enum doc comments that compared
behavior to Java's command-line options now describe behavior directly.

KEEPS user-facing provenance:
- README.md and DOCS.md project-lineage sections
- Legacy numeric flag values (Java MS-GF+ -X) in --help (user migration)
- (Java -precursorCal) in precursor_cal doc (exact flag name we mirror)
- docs/parity-analysis/** content
- Parity test files

Touched: 5 crate-lib headers + msgf-rust CLI framing.
The "MSGFRUST_" prefix dates from an early iter-era naming and does
not match the binary's identity (msgf-rust). Switch the primary name
to MSGF_RSS_PROBE and accept the legacy name for this release with a
one-line deprecation warning on stderr. The legacy name will be
removed in the next quality cleanup.

Side-effect-only env var; no functional change to search/scoring.
Brings the workspace to clippy-clean on stable 1.87.0 so the CI lint
job can be lifted from advisory to required in the next commit.

Changes by class:
- map_or simplifications: mechanical rewrite via clippy --fix
- complex-type aliases: SegmentPartitionCache, SegmentPartitionSlice,
  DeconvResult, and RankKeptCtx struct in
  crates/scoring/src/scoring/scored_spectrum.rs
- too_many_arguments: RankKeptCtx context struct in scored_spectrum.rs;
  #[allow] with reason for directional_node_score_inner, write_tsv,
  write_tsv_to, write_spectrum_rows, and compute_cleavage_credit
- doc-list indentation: add blank line after list / fix continuation
  indent at 15 sites in msgf-rust.rs and scored_spectrum.rs
- unused_mut, ? rewrite, manual split_once, loop-counter: via clippy --fix
- needless_range_loop: suppressed with reason (seg indexes cache AND
  serves as fallback partition_for arg)

No functional behavior change; PIN/TSV bit-identical regression gate
in tree (precursor_cal_bit_identical) is the verification.
After PR-Q1 Task 4 left the workspace clippy-clean on --lib targets,
remove continue-on-error from the lint job's clippy step and extend
the lint command to --all-targets (covers tests + examples + bin in
addition to lib).

Also addresses 5 residual warnings in test/example targets that the
--lib-only fix in Task 4 didn't reach:
- crates/scoring/examples/dump_main_ion.rs: struct field shorthand
- crates/scoring/examples/dump_prefix_cache.rs: needless_range_loop
- crates/scoring/tests/add_prob_dist_chunked_parity.rs: unnecessary parens

Rustfmt remains advisory (~11k lines of fmt churn pending; separate
cleanup).

Lint job now blocks PRs on clippy regressions.
Deletes the iter39 docs-rewrite spec and plan (shipped 2026-05-23 via
PR #30; the rewrite is in dev and being relied on, so the design docs
no longer need to be discoverable in the repo). Their lineage is in
git history.

Tracks the in-flight PR-Q1 implementation plan alongside its design
spec (committed in 55cff3f).

Future protocol: when a docs/specs design file references a feature
that has fully shipped and closed any deferred gate, remove it in the
next quality cleanup.
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 26, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aafe8a14-8338-407a-a303-47bd90aa834d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/quality-perf-id-rate

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Three non-blocking observations from the final code review:

1. DOCS.md §97 documented only the legacy MSGFRUST_RSS_PROBE name.
   Now mentions MSGF_RSS_PROBE as the canonical with the legacy noted.
2. crates/model/src/amino_acid.rs:13 inline comment referenced the
   legacy name; updated to MSGF_RSS_PROBE.
3. log_rss deprecation warning fired on every call when only the
   legacy env var was set. Guard with std::sync::Once so it prints
   exactly once per process invocation.

All non-functional; verification: deprecation warning count is now 1
under MSGFRUST_RSS_PROBE=1 + multiple log_rss checkpoints.
@ypriverol
Copy link
Copy Markdown
Member Author

Closing without merging.

After running the 3-dataset bench on the branch (LFQ/Astral/TMT, off + auto), all 6 PSM cells came in within rayon noise floor vs the pre-cleanup baseline (max delta ±56 PSMs) and wall-time deltas were entangled with VM load contention during the bench window (load average 1.19 -> 9.43 between back-to-back PR-A and PR-Q1 runs).

The cleanup commits are real (clippy clean, lint gate, dangling Java refs scrubbed, MSGFRUST_RSS_PROBE rename with legacy compat), but they don't deliver measurable PSM or speed wins on their own.

Branch feat/quality-perf-id-rate retains the commits. Next step: stack speed + ID-rate improvements on top of the same branch; open a new PR only when bench shows a clear net win.

@ypriverol ypriverol closed this May 26, 2026
ypriverol added a commit that referenced this pull request May 26, 2026
After PR #35 (PR-Q1) closed unmerged for not delivering measurable
wins, pivot strategy: stack 3 loosely-coupled sub-features on top
of the cleanup commits and ship ONE PR with bench-gated value.

Sub-features:
- S1: profile-guided Astral wall reduction (gate: -5% wall)
- S2: LFQ calibrator threshold fallback 1e-6 -> 1e-5 (gate: +50 PSMs)
- S3: additive PrecursorErrorPpmSquared PIN column (gate: +50 PSMs
  on any one dataset)

Each sub-feature ships only if its bench gate passes; failures get
dropped before merge.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant