fix for bug causing tests to use FP vectors in search #645

Open
MarkWolters wants to merge 2 commits into main from fix_fused_recall

Conversation

@MarkWolters (Contributor)

Investigation Complete: Root Cause Found and Fixed

Key Findings

1. The "Recall Degradation" Was Actually a Test Harness Bug

Test results showed:

  • BenchYAML + fusedGraph: No → recall: 0.65
  • BenchYAML + fusedGraph: Yes → recall: 0.65
  • AutoBenchYAML + fusedGraph: No → recall: 0.77 ← ANOMALY
  • AutoBenchYAML + fusedGraph: Yes → recall: 0.65

The anomaly was that AutoBenchYAML with fusedGraph:No produced artificially high recall (0.77), not that FusedPQ was producing low recall. All configurations using FusedPQ correctly produced consistent recall of 0.65.

2. Root Cause: Missing Vector Encoding in runAllAndCollectResults

Location: jvector-examples/src/main/java/io/github/jbellis/jvector/example/Grid.java, line 833

The Bug:

CompressedVectors cvArg = (searchCompressorObj instanceof CompressedVectors) ? (CompressedVectors) searchCompressorObj : null;

This line attempted to cast a VectorCompressor (ProductQuantization) to CompressedVectors, which always failed, resulting in cvArg = null. When cvArg is null and fusedGraph is disabled, the ConfiguredSystem.scoreProviderFor method falls back to exact scoring using uncompressed vectors (line 1084 / 1094), which artificially inflates recall.
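The failure mode above can be reproduced in isolation. The sketch below uses hypothetical stand-in interfaces (`CompressedVectors`, `VectorCompressor`, and an `encodeAll` signature assumed from the PR description, not jvector's real classes) to show why the cast-based assignment always yields null while the encode-based assignment succeeds:

```java
// Minimal sketch with hypothetical stand-in types (not jvector's actual API)
// demonstrating the buggy vs. fixed assignment patterns from this PR.
public class CastDemo {
    interface CompressedVectors {}
    interface VectorCompressor {
        CompressedVectors encodeAll(float[][] baseVectors); // assumed signature
    }
    static class ProductQuantization implements VectorCompressor {
        public CompressedVectors encodeAll(float[][] baseVectors) {
            return new CompressedVectors() {}; // placeholder for real encoding
        }
    }

    static CompressedVectors buggyAssign(Object searchCompressorObj) {
        // Buggy pattern: a compressor never implements CompressedVectors,
        // so the instanceof check fails and this silently returns null.
        return (searchCompressorObj instanceof CompressedVectors)
                ? (CompressedVectors) searchCompressorObj : null;
    }

    static CompressedVectors fixedAssign(Object searchCompressorObj, float[][] base) {
        // Fixed pattern: encode the base vectors through the compressor
        // instead of casting the compressor itself.
        return searchCompressorObj == null
                ? null
                : ((VectorCompressor) searchCompressorObj).encodeAll(base);
    }

    public static void main(String[] args) {
        Object pq = new ProductQuantization();
        System.out.println("buggy cvArg == null: " + (buggyAssign(pq) == null));
        System.out.println("fixed cvArg == null: "
                + (fixedAssign(pq, new float[][]{{1f, 2f}}) == null));
    }
}
```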

The Fix:

// Encode vectors for reranking if a compressor is provided
CompressedVectors cvArg;
if (searchCompressorObj == null) {
    cvArg = null;
} else {
    cvArg = searchCompressorObj.encodeAll(ds.getBaseRavv());
}

This fix properly encodes the vectors using the compressor, matching the behavior of runOneGraph (the correct implementation used by BenchYAML).
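To make the downstream effect concrete, here is a heavily simplified, hypothetical stand-in for the selection logic described above for ConfiguredSystem.scoreProviderFor (not the real jvector code): when no compressed vectors are supplied and the fused graph is disabled, scoring silently degrades to exact full-precision similarity, which is what inflated the measured recall.

```java
// Hypothetical sketch of the fallback behavior described in this PR;
// the real scoreProviderFor is more involved.
public class FallbackDemo {
    enum Scoring { APPROXIMATE_PQ, EXACT_FULL_PRECISION }

    static Scoring scoreProviderFor(Object compressedVectors, boolean fusedGraph) {
        if (fusedGraph) {
            return Scoring.APPROXIMATE_PQ; // fused path carries its own PQ codes
        }
        // With no compressed vectors, fall back to exact scoring on
        // uncompressed vectors -- the source of the inflated 0.77 recall.
        return compressedVectors == null
                ? Scoring.EXACT_FULL_PRECISION
                : Scoring.APPROXIMATE_PQ;
    }

    public static void main(String[] args) {
        // Before the fix: cvArg was always null, so fusedGraph:No ran exact scoring.
        System.out.println(scoreProviderFor(null, false));
        // After the fix: encoded vectors are passed, restoring approximate scoring.
        System.out.println(scoreProviderFor(new Object(), false));
    }
}
```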

Impact

  • Before fix: AutoBenchYAML with fusedGraph:No was using exact (uncompressed) scoring, giving misleadingly high recall
  • After fix: AutoBenchYAML will now correctly use PQ-compressed vectors for approximate scoring during graph traversal, with NVQ reranking, producing consistent recall across all test harnesses

Verification

After applying this fix:

  • AutoBenchYAML + fusedGraph:No → recall: ~0.65 (matching other configurations)
  • All test harnesses will produce consistent recall measurements
  • The perceived "recall degradation with high dimensionality" will disappear, as it was never a real issue with FusedPQ

Additional Notes

The investigation also examined the FusedPQ implementation thoroughly and confirmed that:

  • The storage and retrieval of fused PQ data is correct
  • The scoring mathematics are correct
  • The SIMD implementations are correct
  • FusedPQ works correctly for all dimensionalities when properly configured

github-actions bot (Contributor) commented Mar 11, 2026

Before you submit for review:

  • Does your PR follow guidelines from CONTRIBUTIONS.md?
  • Did you summarize what this PR does clearly and concisely?
  • Did you include performance data for changes which may be performance impacting?
  • Did you include useful docs for any user-facing changes or features?
  • Did you include useful javadocs for developer oriented changes, explaining new concepts or key changes?
  • Did you trigger and review regression testing results against the base branch via Run Bench Main?
  • Did you adhere to the code formatting guidelines (TBD)?
  • Did you group your changes for easy review, providing meaningful descriptions for each commit?
  • Did you ensure that all files contain the correct copyright header?

If you did not complete any of these, then please explain below.

@tlwillke (Collaborator) left a comment

Can you please take a look elsewhere in Grid to see if we are vulnerable to the same failure outside of AutoBenchYAML usage?

@tlwillke tlwillke self-requested a review March 19, 2026 01:47
@tlwillke (Collaborator) left a comment

LGTM. Thanks for being thorough.
