feat: optimize one segmented vector segment per run #6402

Merged
Xuanwo merged 5 commits into main from xuanwo/segmented-vector-index-optimize on Apr 3, 2026

@Xuanwo
Collaborator

@Xuanwo Xuanwo commented Apr 3, 2026

This changes segmented vector index optimize so the default rebalance path keeps segment boundaries and rewrites only the single worst segment in each run. It builds on #6400's logical vector index / IVF view work and avoids the current behavior where segmented optimize treats the logical index as one physical index.

I also added a regression test that creates a skewed two-segment IVF index and verifies that optimize replaces only the oversized segment while leaving the other segment untouched.
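
The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SegmentStats`, `split_candidate`, and `pick_worst_segment` are hypothetical names, a `u64` stands in for the segment UUID, and only the split side is shown. The `SegmentRebalanceCandidate` fields and the score expression follow the diff hunk quoted in the review below.

```rust
/// Hypothetical per-segment summary; the real metadata type is not shown here.
#[derive(Debug, Clone)]
struct SegmentStats {
    segment_id: u64, // stand-in for the segment UUID (assumption)
    partition_sizes: Vec<usize>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct SegmentRebalanceCandidate {
    segment_id: u64,
    score: usize,
}

/// A segment is a split candidate when its largest partition exceeds the threshold.
fn split_candidate(
    stats: &SegmentStats,
    split_threshold: usize,
) -> Option<SegmentRebalanceCandidate> {
    // `?` returns None for a segment with no partitions.
    let max_partition_size = stats.partition_sizes.iter().copied().max()?;
    (max_partition_size > split_threshold).then(|| SegmentRebalanceCandidate {
        segment_id: stats.segment_id,
        score: max_partition_size - split_threshold,
    })
}

/// Pick at most one segment per run: the candidate with the highest score.
fn pick_worst_segment(
    segments: &[SegmentStats],
    split_threshold: usize,
) -> Option<SegmentRebalanceCandidate> {
    segments
        .iter()
        .filter_map(|s| split_candidate(s, split_threshold))
        .max_by_key(|c| c.score)
}
```

With two segments where only one holds an oversized partition, `pick_worst_segment` returns that segment alone, which is the shape of the skewed two-segment case the regression test is described as covering.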

@github-actions github-actions bot added the enhancement (New feature or request) label Apr 3, 2026
Base automatically changed from xuanwo/logical-vector-index-ivf-view to main April 3, 2026 14:02
@Xuanwo Xuanwo force-pushed the xuanwo/segmented-vector-index-optimize branch from 745d07f to 5001720 April 3, 2026 14:03
@Xuanwo Xuanwo marked this pull request as ready for review April 3, 2026 14:04
let split_candidate =
    (max_partition_size > split_threshold).then(|| SegmentRebalanceCandidate {
        segment_id: metadata.uuid,
        score: max_partition_size - split_threshold,
Contributor

Maybe use the number of partitions that exceed split_threshold as the score?
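
To make the difference between the two scoring options concrete, here is a small sketch. Both function names are hypothetical and neither is taken from the PR; the first mirrors the expression in the hunk above, and the second is one possible reading of the count-based suggestion.

```rust
/// Score as in the quoted hunk: how far the largest partition exceeds the threshold.
fn score_by_max_excess(partition_sizes: &[usize], split_threshold: usize) -> Option<usize> {
    let max_partition_size = partition_sizes.iter().copied().max()?;
    (max_partition_size > split_threshold).then(|| max_partition_size - split_threshold)
}

/// Suggested alternative (assumed reading): how many partitions exceed the threshold.
fn score_by_oversized_count(partition_sizes: &[usize], split_threshold: usize) -> Option<usize> {
    let oversized = partition_sizes
        .iter()
        .filter(|&&size| size > split_threshold)
        .count();
    // No candidate at all when nothing is over the threshold.
    (oversized > 0).then_some(oversized)
}
```

The two scores can rank segments differently. With a threshold of 100, a segment with sizes `[1000, 10, 10]` scores 900 by excess but 1 by count, while a segment with sizes `[150, 150, 150]` scores 50 by excess but 3 by count. The excess score favors the segment with the single largest outlier; the count score favors the segment with the most partitions needing a split.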

(num_partitions > 1 && min_partition_size < join_threshold).then(|| {
    SegmentRebalanceCandidate {
        segment_id: metadata.uuid,
        score: join_threshold - min_partition_size,
Contributor

same as L346
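
Assuming "L346" refers to the split-score comment above, the same idea applied to the join side might look like the sketch below. The function names are hypothetical; the first follows the condition and score in the quoted hunk, and the second counts undersized partitions instead.

```rust
/// Score as in the quoted hunk: how far the smallest partition falls below the threshold.
/// A segment with a single partition has nothing to join with, so it is never a candidate.
fn join_score_by_min_deficit(partition_sizes: &[usize], join_threshold: usize) -> Option<usize> {
    let min_partition_size = partition_sizes.iter().copied().min()?;
    (partition_sizes.len() > 1 && min_partition_size < join_threshold)
        .then(|| join_threshold - min_partition_size)
}

/// Count-based alternative (assumed reading): how many partitions are below the threshold.
fn join_score_by_undersized_count(
    partition_sizes: &[usize],
    join_threshold: usize,
) -> Option<usize> {
    if partition_sizes.len() <= 1 {
        return None;
    }
    let undersized = partition_sizes
        .iter()
        .filter(|&&size| size < join_threshold)
        .count();
    (undersized > 0).then_some(undersized)
}
```

With a threshold of 50, sizes `[5, 500, 500]` score 45 by deficit and 1 by count, while sizes `[40, 40, 40, 500]` score 10 by deficit and 3 by count.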

@Xuanwo Xuanwo force-pushed the xuanwo/segmented-vector-index-optimize branch from 5001720 to a1824b2 April 3, 2026 14:09
@codecov

codecov bot commented Apr 3, 2026

Codecov Report

❌ Patch coverage is 89.59732% with 31 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/index/append.rs | 81.51% | 9 Missing and 13 partials ⚠️ |
| rust/lance/src/index/vector/ivf.rs | 86.95% | 7 Missing and 2 partials ⚠️ |

@Xuanwo Xuanwo merged commit 01a25e9 into main Apr 3, 2026
27 of 28 checks passed
@Xuanwo Xuanwo deleted the xuanwo/segmented-vector-index-optimize branch April 3, 2026 16:36