From 4756484c036b3ee6a25f966427141fba20402f6c Mon Sep 17 00:00:00 2001 From: Simran Spiller Date: Mon, 8 Dec 2025 15:57:13 +0100 Subject: [PATCH 1/3] Vector index: storedValues and indexHint --- .../3.12/develop/http-api/indexes/vector.md | 16 ++++++ .../working-with-indexes/vector-indexes.md | 8 +++ .../version-3.12/whats-new-in-3-12.md | 56 +++++++++++++++++++ .../4.0/develop/http-api/indexes/vector.md | 16 ++++++ .../working-with-indexes/vector-indexes.md | 8 +++ .../version-3.12/whats-new-in-3-12.md | 56 +++++++++++++++++++ 6 files changed, 160 insertions(+) diff --git a/site/content/arangodb/3.12/develop/http-api/indexes/vector.md b/site/content/arangodb/3.12/develop/http-api/indexes/vector.md index 0bbd14d4de..666586eb54 100644 --- a/site/content/arangodb/3.12/develop/http-api/indexes/vector.md +++ b/site/content/arangodb/3.12/develop/http-api/indexes/vector.md @@ -65,6 +65,22 @@ paths: maxItems: 1 items: type: string + storedValues: + description: | + Store additional attributes in the index. Unlike with other index types, this + is not for covering projections with the index but for adding attributes that + you filter on. This lets you make the lookup in the vector index more efficient + because it avoids materializing documents twice, once for the filtering and + once for the matches. + + The maximum number of attributes that you can use in `storedValues` is 32. + type: array + uniqueItems: true + items: + description: | + A list of attribute paths. The `.` character denotes sub-attributes. 
+ type: string + type: string sparse: description: | Whether to create a sparse index that excludes documents with diff --git a/site/content/arangodb/3.12/indexes-and-search/indexing/working-with-indexes/vector-indexes.md b/site/content/arangodb/3.12/indexes-and-search/indexing/working-with-indexes/vector-indexes.md index 5162c982cd..681806dddc 100644 --- a/site/content/arangodb/3.12/indexes-and-search/indexing/working-with-indexes/vector-indexes.md +++ b/site/content/arangodb/3.12/indexes-and-search/indexing/working-with-indexes/vector-indexes.md @@ -74,6 +74,14 @@ centroids and the quality of vector search thus degrades. Set this option to `true` to keep the collection/shards available for write operations by not using an exclusive write lock for the duration of the index creation. Default: `false`. +- **storedValues** (array of strings): + Store additional attributes in the index. Unlike with other index types, this + is not for covering projections with the index but for adding attributes that + you filter on. This lets you make the lookup in the vector index more efficient + because it avoids materializing documents twice, once for the filtering and + once for the matches. + + The maximum number of attributes that you can use in `storedValues` is 32. - **params**: The parameters as used by the Faiss library. - **metric** (string): The measure for calculating the vector similarity: - `"cosine"`: Angular similarity. 
Vectors are automatically diff --git a/site/content/arangodb/3.12/release-notes/version-3.12/whats-new-in-3-12.md b/site/content/arangodb/3.12/release-notes/version-3.12/whats-new-in-3-12.md index d0d0c77010..d35f7ebb5b 100644 --- a/site/content/arangodb/3.12/release-notes/version-3.12/whats-new-in-3-12.md +++ b/site/content/arangodb/3.12/release-notes/version-3.12/whats-new-in-3-12.md @@ -1551,6 +1551,62 @@ The accompanying AQL function is the following: - `APPROX_NEAR_INNER_PRODUCT()` +--- + +Introduced in: v3.12.7 + +Vector indexes now support `storedValues` to store additional attributes in the +index. Unlike with other index types, this is not for covering projections with +the index but for adding attributes that you filter on. This lets you make the +lookup in the vector index more efficient because it avoids materializing +documents twice, once for the filtering and once for the matches. + +For example, if you set `storedValues` to `["val"]` in a vector index over +`["vector"]`, then the following query can utilize this index for the +filtering by `val` and the lookup using `vector`, but not for the projection of +`attr` even if you added it to `storedValues` as well: + +```aql + FOR doc IN coll + FILTER doc.val > 3 + SORT APPROX_NEAR_INNER_PRODUCT(doc.vector, @q) DESC + LIMIT 3 + RETURN doc.attr +``` + +In the query execution plan, the utilization of `storedValues` for filtering is +indicated by `/* covered by storedValues */`: + +```aql +Execution plan: + Id NodeType Par Est. Comment + 1 SingletonNode 1 * ROOT + 10 CalculationNode 1 - LET #4 = [ ... 
] /* json expression */ /* const assignment */ + 11 EnumerateNearVectorNode 3 - FOR doc OF coll IN TOP 3 NEAR #4 DISTANCE INTO #2 FILTER (doc.`val` > 3) /* early pruning */ /* covered by storedValues */ + 7 LimitNode 3 - LIMIT 0, 3 + 12 MaterializeNode 3 - MATERIALIZE doc INTO #5 /* (projections: `attr`) */ LET #6 = #5.`attr` + 9 ReturnNode 3 - RETURN #6 + +Indexes used: + By Name Type Collection Unique Sparse Cache Selectivity Fields Stored values Ranges + 11 foo vector coll false false false n/a [ `vector` ] [ `val` ] #4 +``` + +--- + +Introduced in: v3.12.7 + +The `FOR` operation now supports `indexHint` and `forceIndexHint` for vector +indexes to make the AQL optimizer prefer respectively require specific +vector indexes: + +```aql +FOR doc IN c OPTIONS { indexHint: ["vec_idx_1", "vec_idx_2"], forceIndexHint: true } + SORT APPROX_NEAR_COSINE(doc.vector, @q) DESC + LIMIT 3 + RETURN doc +``` + ## Server options ### Effective and available startup options diff --git a/site/content/arangodb/4.0/develop/http-api/indexes/vector.md b/site/content/arangodb/4.0/develop/http-api/indexes/vector.md index 0bbd14d4de..666586eb54 100644 --- a/site/content/arangodb/4.0/develop/http-api/indexes/vector.md +++ b/site/content/arangodb/4.0/develop/http-api/indexes/vector.md @@ -65,6 +65,22 @@ paths: maxItems: 1 items: type: string + storedValues: + description: | + Store additional attributes in the index. Unlike with other index types, this + is not for covering projections with the index but for adding attributes that + you filter on. This lets you make the lookup in the vector index more efficient + because it avoids materializing documents twice, once for the filtering and + once for the matches. + + The maximum number of attributes that you can use in `storedValues` is 32. + type: array + uniqueItems: true + items: + description: | + A list of attribute paths. The `.` character denotes sub-attributes. 
+ type: string + type: string sparse: description: | Whether to create a sparse index that excludes documents with diff --git a/site/content/arangodb/4.0/indexes-and-search/indexing/working-with-indexes/vector-indexes.md b/site/content/arangodb/4.0/indexes-and-search/indexing/working-with-indexes/vector-indexes.md index 5162c982cd..681806dddc 100644 --- a/site/content/arangodb/4.0/indexes-and-search/indexing/working-with-indexes/vector-indexes.md +++ b/site/content/arangodb/4.0/indexes-and-search/indexing/working-with-indexes/vector-indexes.md @@ -74,6 +74,14 @@ centroids and the quality of vector search thus degrades. Set this option to `true` to keep the collection/shards available for write operations by not using an exclusive write lock for the duration of the index creation. Default: `false`. +- **storedValues** (array of strings): + Store additional attributes in the index. Unlike with other index types, this + is not for covering projections with the index but for adding attributes that + you filter on. This lets you make the lookup in the vector index more efficient + because it avoids materializing documents twice, once for the filtering and + once for the matches. + + The maximum number of attributes that you can use in `storedValues` is 32. - **params**: The parameters as used by the Faiss library. - **metric** (string): The measure for calculating the vector similarity: - `"cosine"`: Angular similarity. 
Vectors are automatically diff --git a/site/content/arangodb/4.0/release-notes/version-3.12/whats-new-in-3-12.md b/site/content/arangodb/4.0/release-notes/version-3.12/whats-new-in-3-12.md index d0d0c77010..d35f7ebb5b 100644 --- a/site/content/arangodb/4.0/release-notes/version-3.12/whats-new-in-3-12.md +++ b/site/content/arangodb/4.0/release-notes/version-3.12/whats-new-in-3-12.md @@ -1551,6 +1551,62 @@ The accompanying AQL function is the following: - `APPROX_NEAR_INNER_PRODUCT()` +--- + +Introduced in: v3.12.7 + +Vector indexes now support `storedValues` to store additional attributes in the +index. Unlike with other index types, this is not for covering projections with +the index but for adding attributes that you filter on. This lets you make the +lookup in the vector index more efficient because it avoids materializing +documents twice, once for the filtering and once for the matches. + +For example, if you set `storedValues` to `["val"]` in a vector index over +`["vector"]`, then the following query can utilize this index for the +filtering by `val` and the lookup using `vector`, but not for the projection of +`attr` even if you added it to `storedValues` as well: + +```aql + FOR doc IN coll + FILTER doc.val > 3 + SORT APPROX_NEAR_INNER_PRODUCT(doc.vector, @q) DESC + LIMIT 3 + RETURN doc.attr +``` + +In the query execution plan, the utilization of `storedValues` for filtering is +indicated by `/* covered by storedValues */`: + +```aql +Execution plan: + Id NodeType Par Est. Comment + 1 SingletonNode 1 * ROOT + 10 CalculationNode 1 - LET #4 = [ ... 
] /* json expression */ /* const assignment */ + 11 EnumerateNearVectorNode 3 - FOR doc OF coll IN TOP 3 NEAR #4 DISTANCE INTO #2 FILTER (doc.`val` > 3) /* early pruning */ /* covered by storedValues */ + 7 LimitNode 3 - LIMIT 0, 3 + 12 MaterializeNode 3 - MATERIALIZE doc INTO #5 /* (projections: `attr`) */ LET #6 = #5.`attr` + 9 ReturnNode 3 - RETURN #6 + +Indexes used: + By Name Type Collection Unique Sparse Cache Selectivity Fields Stored values Ranges + 11 foo vector coll false false false n/a [ `vector` ] [ `val` ] #4 +``` + +--- + +Introduced in: v3.12.7 + +The `FOR` operation now supports `indexHint` and `forceIndexHint` for vector +indexes to make the AQL optimizer prefer respectively require specific +vector indexes: + +```aql +FOR doc IN c OPTIONS { indexHint: ["vec_idx_1", "vec_idx_2"], forceIndexHint: true } + SORT APPROX_NEAR_COSINE(doc.vector, @q) DESC + LIMIT 3 + RETURN doc +``` + ## Server options ### Effective and available startup options From 8836ce24494272d49884188d7fd7ec7d08c2c405 Mon Sep 17 00:00:00 2001 From: Simran Spiller Date: Tue, 9 Dec 2025 16:32:08 +0100 Subject: [PATCH 2/3] Add introduced in remarks --- .../arangodb/3.12/develop/http-api/indexes/vector.md | 11 ++++++----- .../indexing/working-with-indexes/vector-indexes.md | 2 +- .../arangodb/4.0/develop/http-api/indexes/vector.md | 11 ++++++----- .../indexing/working-with-indexes/vector-indexes.md | 2 +- 4 files changed, 14 insertions(+), 12 deletions(-) diff --git a/site/content/arangodb/3.12/develop/http-api/indexes/vector.md b/site/content/arangodb/3.12/develop/http-api/indexes/vector.md index 666586eb54..fb7dcaf17d 100644 --- a/site/content/arangodb/3.12/develop/http-api/indexes/vector.md +++ b/site/content/arangodb/3.12/develop/http-api/indexes/vector.md @@ -67,11 +67,12 @@ paths: type: string storedValues: description: | - Store additional attributes in the index. 
Unlike with other index types, this - is not for covering projections with the index but for adding attributes that - you filter on. This lets you make the lookup in the vector index more efficient - because it avoids materializing documents twice, once for the filtering and - once for the matches. + Store additional attributes in the index (introduced in v3.12.7). + Unlike with other index types, this is not for covering projections + with the index but for adding attributes that you filter on. + This lets you make the lookup in the vector index more efficient + because it avoids materializing documents twice, once for the + filtering and once for the matches. The maximum number of attributes that you can use in `storedValues` is 32. type: array diff --git a/site/content/arangodb/3.12/indexes-and-search/indexing/working-with-indexes/vector-indexes.md b/site/content/arangodb/3.12/indexes-and-search/indexing/working-with-indexes/vector-indexes.md index 681806dddc..80196b8553 100644 --- a/site/content/arangodb/3.12/indexes-and-search/indexing/working-with-indexes/vector-indexes.md +++ b/site/content/arangodb/3.12/indexes-and-search/indexing/working-with-indexes/vector-indexes.md @@ -74,7 +74,7 @@ centroids and the quality of vector search thus degrades. Set this option to `true` to keep the collection/shards available for write operations by not using an exclusive write lock for the duration of the index creation. Default: `false`. -- **storedValues** (array of strings): +- **storedValues** (array of strings, introduced in v3.12.7): Store additional attributes in the index. Unlike with other index types, this is not for covering projections with the index but for adding attributes that you filter on. 
This lets you make the lookup in the vector index more efficient diff --git a/site/content/arangodb/4.0/develop/http-api/indexes/vector.md b/site/content/arangodb/4.0/develop/http-api/indexes/vector.md index 666586eb54..fb7dcaf17d 100644 --- a/site/content/arangodb/4.0/develop/http-api/indexes/vector.md +++ b/site/content/arangodb/4.0/develop/http-api/indexes/vector.md @@ -67,11 +67,12 @@ paths: type: string storedValues: description: | - Store additional attributes in the index. Unlike with other index types, this - is not for covering projections with the index but for adding attributes that - you filter on. This lets you make the lookup in the vector index more efficient - because it avoids materializing documents twice, once for the filtering and - once for the matches. + Store additional attributes in the index (introduced in v3.12.7). + Unlike with other index types, this is not for covering projections + with the index but for adding attributes that you filter on. + This lets you make the lookup in the vector index more efficient + because it avoids materializing documents twice, once for the + filtering and once for the matches. The maximum number of attributes that you can use in `storedValues` is 32. type: array diff --git a/site/content/arangodb/4.0/indexes-and-search/indexing/working-with-indexes/vector-indexes.md b/site/content/arangodb/4.0/indexes-and-search/indexing/working-with-indexes/vector-indexes.md index 681806dddc..80196b8553 100644 --- a/site/content/arangodb/4.0/indexes-and-search/indexing/working-with-indexes/vector-indexes.md +++ b/site/content/arangodb/4.0/indexes-and-search/indexing/working-with-indexes/vector-indexes.md @@ -74,7 +74,7 @@ centroids and the quality of vector search thus degrades. Set this option to `true` to keep the collection/shards available for write operations by not using an exclusive write lock for the duration of the index creation. Default: `false`. 
-- **storedValues** (array of strings): +- **storedValues** (array of strings, introduced in v3.12.7): Store additional attributes in the index. Unlike with other index types, this is not for covering projections with the index but for adding attributes that you filter on. This lets you make the lookup in the vector index more efficient From eddb83d0f2d450a5e61e397c511f20fcf3b16428 Mon Sep 17 00:00:00 2001 From: Simran Spiller Date: Tue, 9 Dec 2025 16:41:54 +0100 Subject: [PATCH 3/3] Add push-filter-into-enumerate-near optimizer rule to release notes --- .../3.12/release-notes/version-3.12/api-changes-in-3-12.md | 2 ++ .../3.12/release-notes/version-3.12/whats-new-in-3-12.md | 7 ++++--- .../4.0/release-notes/version-3.12/api-changes-in-3-12.md | 2 ++ .../4.0/release-notes/version-3.12/whats-new-in-3-12.md | 7 ++++--- 4 files changed, 12 insertions(+), 6 deletions(-) diff --git a/site/content/arangodb/3.12/release-notes/version-3.12/api-changes-in-3-12.md b/site/content/arangodb/3.12/release-notes/version-3.12/api-changes-in-3-12.md index 8cc010cc0a..57518506c5 100644 --- a/site/content/arangodb/3.12/release-notes/version-3.12/api-changes-in-3-12.md +++ b/site/content/arangodb/3.12/release-notes/version-3.12/api-changes-in-3-12.md @@ -101,6 +101,8 @@ A `replace-entries-with-object-iteration` rule has been added in v3.12.3. A `use-index-for-collect` and a `use-vector-index` rule have been added in v3.12.4. +A `push-filter-into-enumerate-near` rule has been added in v3.12.7. + The affected endpoints are `POST /_api/cursor`, `POST /_api/explain`, and `GET /_api/query/rules`. 
diff --git a/site/content/arangodb/3.12/release-notes/version-3.12/whats-new-in-3-12.md b/site/content/arangodb/3.12/release-notes/version-3.12/whats-new-in-3-12.md index d35f7ebb5b..1c0c561f48 100644 --- a/site/content/arangodb/3.12/release-notes/version-3.12/whats-new-in-3-12.md +++ b/site/content/arangodb/3.12/release-notes/version-3.12/whats-new-in-3-12.md @@ -1541,6 +1541,8 @@ FOR doc IN coll RETURN doc ``` +The filtering is handled by the `use-vector-index` optimizer rule in v3.12.6. + Vector indexes can now be sparse to exclude documents with the embedding attribute for indexing missing or set to `null`. @@ -1592,9 +1594,8 @@ Indexes used: 11 foo vector coll false false false n/a [ `vector` ] [ `val` ] #4 ``` ---- - -Introduced in: v3.12.7 +The new `push-filter-into-enumerate-near` optimizer rule now handles everything +related to vector index filtering (with and without `storedValues`). The `FOR` operation now supports `indexHint` and `forceIndexHint` for vector indexes to make the AQL optimizer prefer respectively require specific diff --git a/site/content/arangodb/4.0/release-notes/version-3.12/api-changes-in-3-12.md b/site/content/arangodb/4.0/release-notes/version-3.12/api-changes-in-3-12.md index 8cc010cc0a..57518506c5 100644 --- a/site/content/arangodb/4.0/release-notes/version-3.12/api-changes-in-3-12.md +++ b/site/content/arangodb/4.0/release-notes/version-3.12/api-changes-in-3-12.md @@ -101,6 +101,8 @@ A `replace-entries-with-object-iteration` rule has been added in v3.12.3. A `use-index-for-collect` and a `use-vector-index` rule have been added in v3.12.4. +A `push-filter-into-enumerate-near` rule has been added in v3.12.7. + The affected endpoints are `POST /_api/cursor`, `POST /_api/explain`, and `GET /_api/query/rules`. 
diff --git a/site/content/arangodb/4.0/release-notes/version-3.12/whats-new-in-3-12.md b/site/content/arangodb/4.0/release-notes/version-3.12/whats-new-in-3-12.md index d35f7ebb5b..1c0c561f48 100644 --- a/site/content/arangodb/4.0/release-notes/version-3.12/whats-new-in-3-12.md +++ b/site/content/arangodb/4.0/release-notes/version-3.12/whats-new-in-3-12.md @@ -1541,6 +1541,8 @@ FOR doc IN coll RETURN doc ``` +The filtering is handled by the `use-vector-index` optimizer rule in v3.12.6. + Vector indexes can now be sparse to exclude documents with the embedding attribute for indexing missing or set to `null`. @@ -1592,9 +1594,8 @@ Indexes used: 11 foo vector coll false false false n/a [ `vector` ] [ `val` ] #4 ``` ---- - -Introduced in: v3.12.7 +The new `push-filter-into-enumerate-near` optimizer rule now handles everything +related to vector index filtering (with and without `storedValues`). The `FOR` operation now supports `indexHint` and `forceIndexHint` for vector indexes to make the AQL optimizer prefer respectively require specific