@@ -54,8 +54,8 @@ impl<I: VectorId> NeighborProvider<I> {

 /// Create a snapshot of the adjacency list index
 ///
-pub fn snapshot(&self) {
-    self.adjacency_list_index.snapshot();
+pub fn snapshot(&self) -> std::path::PathBuf {
+    self.adjacency_list_index.snapshot()
 }

 /// Return the maximum degree (number of neighbors per vector)
@@ -1897,8 +1897,28 @@ where
 }

 // Save vectors and neighbors
-self.full_vectors.snapshot();
-self.neighbor_provider.snapshot();
+let vectors_snapshot_path = self.full_vectors.snapshot();
+let neighbors_snapshot_path = self.neighbor_provider.snapshot();
+
+// Copy snapshot files to the target prefix location if they differ
+let target_vectors_path = BfTreePaths::vectors_bftree(&saved_params.prefix);
+if vectors_snapshot_path != target_vectors_path {
+    std::fs::copy(&vectors_snapshot_path, &target_vectors_path).map_err(|e| {
Comment on lines 1899 to +1906
Copilot AI Feb 16, 2026

This new behavior is meant to handle the case where a provider was loaded from one prefix (snapshot location) and then saved to a different prefix, but the existing tests appear to only cover save+load using the same prefix path. Adding a regression test that loads from prefix A, saves to prefix B, and then loads from prefix B would validate the copy logic and prevent future regressions.
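
A minimal sketch of what such a regression test could exercise, using only the standard library. The `vectors_path` helper is a hypothetical stand-in for `BfTreePaths::vectors_bftree` (its real naming scheme is not shown in this diff), and `copy_if_different` mirrors the copy-if-paths-differ logic added here rather than calling the actual provider API.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `BfTreePaths::vectors_bftree`; the real file
/// naming scheme is an assumption, not taken from the crate.
fn vectors_path(prefix: &Path) -> PathBuf {
    let mut name = prefix.as_os_str().to_owned();
    name.push("_vectors.bftree");
    PathBuf::from(name)
}

/// Copy the snapshot to the target only when the two paths differ.
/// Returns `true` if a copy was performed.
fn copy_if_different(snapshot: &Path, target: &Path) -> io::Result<bool> {
    if snapshot == target {
        return Ok(false);
    }
    fs::copy(snapshot, target)?;
    Ok(true)
}

/// Simulates "load from prefix A, save to prefix B, load from prefix B"
/// and returns the bytes read back from prefix B.
fn save_to_other_prefix_roundtrip(dir: &Path, payload: &[u8]) -> io::Result<Vec<u8>> {
    fs::create_dir_all(dir)?;
    let prefix_a = dir.join("index_a");
    let prefix_b = dir.join("index_b");

    // The snapshot lands next to prefix A, where the provider was loaded from.
    let snapshot = vectors_path(&prefix_a);
    fs::write(&snapshot, payload)?;

    // Saving back to the same prefix must not trigger a copy.
    assert!(!copy_if_different(&snapshot, &vectors_path(&prefix_a))?);

    // Saving to a different prefix must copy the snapshot file.
    let target = vectors_path(&prefix_b);
    assert!(copy_if_different(&snapshot, &target)?);

    // "Load from prefix B": read the copied file back.
    fs::read(&target)
}
```

A real test would replace the file writes with the provider's own load and `save_with` calls, then compare search results or vector contents between the two loaded instances.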

+        ANNError::log_index_error(format!(
+            "Failed to copy vectors from {:?} to {:?}: {}",
+            vectors_snapshot_path, target_vectors_path, e
+        ))
+    })?;
Comment on lines +1903 to +1911
Copilot AI Feb 16, 2026

save_with takes a StorageWriteProvider, but these new .bftree copies bypass it by calling std::fs::copy directly. If a non-filesystem storage provider is used (e.g., virtual/in-memory, remote, etc.), params/delete/PQ files will be written to storage but the bf-tree files will be written to the local filesystem instead, yielding an incomplete/incorrect save. Consider copying via StorageReadProvider/StorageWriteProvider streams (read snapshot file, write to storage.create_for_write) or documenting/enforcing that this SaveWith impl only supports filesystem-backed storage.
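
One way to picture the stream-based alternative is sketched below. `SketchWriteProvider` and `MemoryStorage` are hypothetical stand-ins written for this example; the real `StorageWriteProvider` trait and the signature of `create_for_write` are not shown in this diff and may differ. The point is only that `std::io::copy` can move snapshot bytes into whatever writer the storage layer hands back, so a non-filesystem backend receives the bf-tree file as well.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::io::{self, Read, Write};
use std::rc::Rc;

/// Hypothetical storage abstraction (NOT the crate's real trait).
trait SketchWriteProvider {
    type Writer: Write;
    fn create_for_write(&self, name: &str) -> io::Result<Self::Writer>;
}

/// In-memory backend, standing in for any non-filesystem storage.
#[derive(Default, Clone)]
struct MemoryStorage {
    files: Rc<RefCell<HashMap<String, Vec<u8>>>>,
}

/// Writer that appends to one named entry of the shared in-memory map.
struct MemoryWriter {
    files: Rc<RefCell<HashMap<String, Vec<u8>>>>,
    name: String,
}

impl Write for MemoryWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.files
            .borrow_mut()
            .entry(self.name.clone())
            .or_default()
            .extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl SketchWriteProvider for MemoryStorage {
    type Writer = MemoryWriter;

    fn create_for_write(&self, name: &str) -> io::Result<MemoryWriter> {
        // Start from an empty file, overwriting any previous content.
        self.files.borrow_mut().insert(name.to_string(), Vec::new());
        Ok(MemoryWriter {
            files: Rc::clone(&self.files),
            name: name.to_string(),
        })
    }
}

/// Streams a snapshot (any `Read`, e.g. an opened snapshot file) into storage
/// and returns the number of bytes written.
fn copy_snapshot_to_storage<R: Read, S: SketchWriteProvider>(
    mut snapshot: R,
    storage: &S,
    target: &str,
) -> io::Result<u64> {
    let mut writer = storage.create_for_write(target)?;
    let bytes = io::copy(&mut snapshot, &mut writer)?;
    writer.flush()?;
    Ok(bytes)
}
```

In the filesystem case the reader would be a `std::fs::File` opened on the snapshot path; the path comparison from the diff would still be needed to avoid reading and truncating the same file.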

+}
+let target_neighbors_path = BfTreePaths::neighbors_bftree(&saved_params.prefix);
+if neighbors_snapshot_path != target_neighbors_path {
+    std::fs::copy(&neighbors_snapshot_path, &target_neighbors_path).map_err(|e| {
+        ANNError::log_index_error(format!(
+            "Failed to copy neighbors from {:?} to {:?}: {}",
+            neighbors_snapshot_path, target_neighbors_path, e
+        ))
+    })?;
Comment on lines +1906 to +1920
Copilot AI Feb 16, 2026

std::fs::copy runs a potentially large, blocking file copy inside an async save_with. This can block the executor thread and adds an extra full read+write of the index file. Consider moving the copy into a blocking section (e.g., spawn_blocking) or using an async file copy implementation if save_with is expected to run on a Tokio runtime.
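
The two chained `map_err` calls in a `spawn_blocking` version exist because there are two failure layers: the offloaded task itself can fail, and the copy inside it can return an I/O error. The sketch below shows the same shape using `std::thread` so it runs without a Tokio dependency. It is an analogy only: `join()` blocks the caller, so it does not free an async executor the way `spawn_blocking(...).await` would, and `copy_on_worker_thread` is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::thread;

/// Runs a file copy on a separate thread and flattens the two failure layers
/// (the worker failing vs. the copy returning an I/O error) into a single
/// `Result<u64, String>`.
fn copy_on_worker_thread(src: PathBuf, dst: PathBuf) -> Result<u64, String> {
    // Clones move into the worker; the originals stay available for messages.
    let (worker_src, worker_dst) = (src.clone(), dst.clone());
    let handle = thread::spawn(move || fs::copy(&worker_src, &worker_dst));

    handle
        .join()
        // Outer layer: the worker panicked (roughly what a JoinError reports).
        .map_err(|_| format!("Copy task from {:?} to {:?} panicked", src, dst))?
        // Inner layer: the copy itself failed.
        .map_err(|e: io::Error| format!("Failed to copy from {:?} to {:?}: {}", src, dst, e))
}
```

Keeping the two layers distinct in the error message makes it easier to tell a runtime problem from a missing or unwritable file.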

Suggested change
-    std::fs::copy(&vectors_snapshot_path, &target_vectors_path).map_err(|e| {
-        ANNError::log_index_error(format!(
-            "Failed to copy vectors from {:?} to {:?}: {}",
-            vectors_snapshot_path, target_vectors_path, e
-        ))
-    })?;
-}
-let target_neighbors_path = BfTreePaths::neighbors_bftree(&saved_params.prefix);
-if neighbors_snapshot_path != target_neighbors_path {
-    std::fs::copy(&neighbors_snapshot_path, &target_neighbors_path).map_err(|e| {
-        ANNError::log_index_error(format!(
-            "Failed to copy neighbors from {:?} to {:?}: {}",
-            neighbors_snapshot_path, target_neighbors_path, e
-        ))
-    })?;
+    let src = vectors_snapshot_path.clone();
+    let dst = target_vectors_path.clone();
+    tokio::task::spawn_blocking(move || std::fs::copy(&src, &dst))
+        .await
+        .map_err(|e| {
+            ANNError::log_index_error(format!(
+                "Failed to execute blocking copy for vectors from {:?} to {:?}: {}",
+                vectors_snapshot_path, target_vectors_path, e
+            ))
+        })?
+        .map_err(|e| {
+            ANNError::log_index_error(format!(
+                "Failed to copy vectors from {:?} to {:?}: {}",
+                vectors_snapshot_path, target_vectors_path, e
+            ))
+        })?;
+}
+let target_neighbors_path = BfTreePaths::neighbors_bftree(&saved_params.prefix);
+if neighbors_snapshot_path != target_neighbors_path {
+    let src = neighbors_snapshot_path.clone();
+    let dst = target_neighbors_path.clone();
+    tokio::task::spawn_blocking(move || std::fs::copy(&src, &dst))
+        .await
+        .map_err(|e| {
+            ANNError::log_index_error(format!(
+                "Failed to execute blocking copy for neighbors from {:?} to {:?}: {}",
+                neighbors_snapshot_path, target_neighbors_path, e
+            ))
+        })?
+        .map_err(|e| {
+            ANNError::log_index_error(format!(
+                "Failed to copy neighbors from {:?} to {:?}: {}",
+                neighbors_snapshot_path, target_neighbors_path, e
+            ))
+        })?;

+}

 // Save delete bitmap
 {
@@ -2035,9 +2055,38 @@ where
 }

 // Save vectors, neighbors, and quant vectors
-self.full_vectors.snapshot();
-self.neighbor_provider.snapshot();
-self.quant_vectors.snapshot();
+let vectors_snapshot_path = self.full_vectors.snapshot();
+let neighbors_snapshot_path = self.neighbor_provider.snapshot();
+let quant_snapshot_path = self.quant_vectors.snapshot();
+
+// Copy snapshot files to the target prefix location if they differ
+let target_vectors_path = BfTreePaths::vectors_bftree(&saved_params.prefix);
+if vectors_snapshot_path != target_vectors_path {
+    std::fs::copy(&vectors_snapshot_path, &target_vectors_path).map_err(|e| {
+        ANNError::log_index_error(format!(
+            "Failed to copy vectors from {:?} to {:?}: {}",
+            vectors_snapshot_path, target_vectors_path, e
+        ))
+    })?;
+}
+let target_neighbors_path = BfTreePaths::neighbors_bftree(&saved_params.prefix);
+if neighbors_snapshot_path != target_neighbors_path {
+    std::fs::copy(&neighbors_snapshot_path, &target_neighbors_path).map_err(|e| {
+        ANNError::log_index_error(format!(
+            "Failed to copy neighbors from {:?} to {:?}: {}",
+            neighbors_snapshot_path, target_neighbors_path, e
+        ))
+    })?;
+}
+let target_quant_path = BfTreePaths::quant_bftree(&saved_params.prefix);
+if quant_snapshot_path != target_quant_path {
+    std::fs::copy(&quant_snapshot_path, &target_quant_path).map_err(|e| {
+        ANNError::log_index_error(format!(
+            "Failed to copy quant from {:?} to {:?}: {}",
+            quant_snapshot_path, target_quant_path, e
+        ))
+    })?;
Comment on lines +2065 to +2088
Copilot AI Feb 16, 2026

Same performance concern in the quantized save_with: these std::fs::copy calls can block the async executor and double the amount of IO for large bf-tree files. Consider offloading to a blocking task or using an async copy primitive.

Suggested change
-    std::fs::copy(&vectors_snapshot_path, &target_vectors_path).map_err(|e| {
-        ANNError::log_index_error(format!(
-            "Failed to copy vectors from {:?} to {:?}: {}",
-            vectors_snapshot_path, target_vectors_path, e
-        ))
-    })?;
-}
-let target_neighbors_path = BfTreePaths::neighbors_bftree(&saved_params.prefix);
-if neighbors_snapshot_path != target_neighbors_path {
-    std::fs::copy(&neighbors_snapshot_path, &target_neighbors_path).map_err(|e| {
-        ANNError::log_index_error(format!(
-            "Failed to copy neighbors from {:?} to {:?}: {}",
-            neighbors_snapshot_path, target_neighbors_path, e
-        ))
-    })?;
-}
-let target_quant_path = BfTreePaths::quant_bftree(&saved_params.prefix);
-if quant_snapshot_path != target_quant_path {
-    std::fs::copy(&quant_snapshot_path, &target_quant_path).map_err(|e| {
-        ANNError::log_index_error(format!(
-            "Failed to copy quant from {:?} to {:?}: {}",
-            quant_snapshot_path, target_quant_path, e
-        ))
-    })?;
+    tokio::fs::copy(&vectors_snapshot_path, &target_vectors_path)
+        .await
+        .map_err(|e| {
+            ANNError::log_index_error(format!(
+                "Failed to copy vectors from {:?} to {:?}: {}",
+                vectors_snapshot_path, target_vectors_path, e
+            ))
+        })?;
+}
+let target_neighbors_path = BfTreePaths::neighbors_bftree(&saved_params.prefix);
+if neighbors_snapshot_path != target_neighbors_path {
+    tokio::fs::copy(&neighbors_snapshot_path, &target_neighbors_path)
+        .await
+        .map_err(|e| {
+            ANNError::log_index_error(format!(
+                "Failed to copy neighbors from {:?} to {:?}: {}",
+                neighbors_snapshot_path, target_neighbors_path, e
+            ))
+        })?;
+}
+let target_quant_path = BfTreePaths::quant_bftree(&saved_params.prefix);
+if quant_snapshot_path != target_quant_path {
+    tokio::fs::copy(&quant_snapshot_path, &target_quant_path)
+        .await
+        .map_err(|e| {
+            ANNError::log_index_error(format!(
+                "Failed to copy quant from {:?} to {:?}: {}",
+                quant_snapshot_path, target_quant_path, e
+            ))
+        })?;

+}
Comment on lines +2062 to +2089
Copilot AI Feb 16, 2026

Same issue as above in the quantized save_with: the .bftree files are copied via std::fs::copy rather than through the provided StorageWriteProvider. This makes the save output dependent on the local filesystem even though other artifacts (params JSON, delete bitmap, PQ pivots) are written via storage.


 // Save PQ table metadata and data using PQStorage format
 let filename = BfTreePaths::pq_pivots_bin(&saved_params.prefix);
@@ -76,8 +76,8 @@ impl QuantVectorProvider {

 /// Create a snapshot of the quant vector index
 ///
-pub fn snapshot(&self) {
-    self.quant_vector_index.snapshot();
+pub fn snapshot(&self) -> std::path::PathBuf {
+    self.quant_vector_index.snapshot()
 }

 /// Create a new instance from an existing BfTree (for loading from snapshot)
@@ -104,8 +104,8 @@ impl<T: VectorRepr, I: VectorId> VectorProvider<T, I> {
 /// Create a snapshot of the vector index
 ///
 #[inline(always)]
-pub fn snapshot(&self) {
-    self.vector_index.snapshot();
+pub fn snapshot(&self) -> std::path::PathBuf {
+    self.vector_index.snapshot()
 }

 /// Set vector with Id, `i``, to `v`