-
Notifications
You must be signed in to change notification settings - Fork 35
Enable batch iterator in IVF #247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: ib/dynamic_ivf
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR enables batch iterator functionality for IVF (Inverted File) indexes in the SVS library. The batch iterator allows incremental retrieval of nearest neighbors in configurable batch sizes, expanding the search space progressively across multiple iterations.
Key changes:
- Introduces
BatchIteratorclass for both static and dynamic IVF indexes that maintains internal state for efficient iterative searching - Adds single-query search capability to IVF indexes via
search()method with scratchspace - Updates Python bindings to reflect that
sizenow returns total vectors instead of cluster count
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| include/svs/index/ivf/iterator.h | Core BatchIterator implementation with state management and progressive search expansion |
| include/svs/index/ivf/index.h | Adds scratchspace types, single-query search, and batch iterator factory for static IVF |
| include/svs/index/ivf/dynamic_ivf.h | Adds scratchspace types, single-query search, and batch iterator factory for dynamic IVF |
| include/svs/index/ivf/extensions.h | Implements single_search customization point for single-query operations |
| include/svs/orchestrators/ivf_iterator.h | Type-erased wrapper for IVFIterator to support polymorphic usage |
| include/svs/orchestrators/ivf.h | Adds batch_iterator() method to IVF orchestrator |
| include/svs/orchestrators/dynamic_ivf.h | Adds batch_iterator() method to DynamicIVF orchestrator |
| tests/svs/index/ivf/iterator.cpp | Comprehensive unit tests for batch iterator functionality |
| tests/svs/index/ivf/index.cpp | Unit tests for static IVF single-query search and scratchspace |
| tests/svs/index/ivf/dynamic_ivf.cpp | Unit tests for dynamic IVF single-query search and scratchspace |
| tests/integration/ivf/iterator.cpp | Integration tests validating iterator behavior in realistic scenarios |
| tests/CMakeLists.txt | Registers new test files in build system |
| bindings/python/tests/test_ivf.py | Updates Python test to use correct size semantics |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| query, | ||
| [&parent]() { | ||
| // Start with a reasonable initial n_probes for iteration | ||
| // Use 10% of clusters or at least 5, whichever gives more coverage |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment states 'whichever gives more coverage' but the logic actually takes the maximum of the two values (5 or 10% of clusters), then limits it to the total number of clusters. Consider clarifying that it uses 'whichever is larger' rather than 'more coverage'.
| // Use 10% of clusters or at least 5, whichever gives more coverage | |
| // Use 10% of clusters or at least 5, whichever is larger (capped at num_clusters) |
| /// Calling this method signals the iterator to abandon its cached state. | ||
| /// | ||
| /// This can be helpful for measuring performance and verifying recall values. | ||
| void restart_next_search() const { impl_->restart_next_search(); } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method is marked const but modifies internal state through impl_->restart_next_search(). Either remove the const qualifier or document why this const method modifies state.
| void restart_next_search() const { impl_->restart_next_search(); } | |
| void restart_next_search() { impl_->restart_next_search(); } |
No description provided.