Push down filters on nested fields as nested queries #4825

yuancu · 2025-11-19T07:38:11Z

Description

Because nested fields are indexed as hidden documents, we cannot query them directly. Instead, we have to wrap the query in a nested query to reach them.

For example, for the following mapping:

{
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "items": {
        "type": "nested",
        "properties": {
          "name": {"type": "keyword"}
        }
      }
    }
  }
}

if we want to use a term query to match items whose name is banana, then instead of using:

{
  "query": {
    "term": {
      "items.name": {
        "value": "banana",
        "boost": 1
      }
    }
  }
}

we should use:

{
  "query": {
    "nested": {
      "path": "items",
      "query": {
        "term": {
          "items.name": {
            "value": "banana",
            "boost": 1
          }
        }
      }
    }
  }
}

Here, the nested clause steps down into the nested items field. It no longer has access to fields in the root document, nor fields in any other nested document.
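
As a rough illustration of the wrapping described above, the sketch below builds the same request body with plain Java maps. It is not the code in this PR: the class and method names (NestedQuerySketch, term, nested) are hypothetical, and the boost field is omitted for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of PredicateAnalyzer.
class NestedQuerySketch {
  /** Builds {"term": {field: {"value": value}}}. */
  static Map<String, Object> term(String field, Object value) {
    Map<String, Object> body = new LinkedHashMap<>();
    body.put("value", value);
    Map<String, Object> byField = new LinkedHashMap<>();
    byField.put(field, body);
    Map<String, Object> term = new LinkedHashMap<>();
    term.put("term", byField);
    return term;
  }

  /** Wraps an inner query as {"nested": {"path": path, "query": inner}}. */
  static Map<String, Object> nested(String path, Map<String, Object> inner) {
    Map<String, Object> body = new LinkedHashMap<>();
    body.put("path", path);
    body.put("query", inner);
    Map<String, Object> nested = new LinkedHashMap<>();
    nested.put("nested", body);
    return nested;
  }

  public static void main(String[] args) {
    // Prints {nested={path=items, query={term={items.name={value=banana}}}}}
    System.out.println(nested("items", term("items.name", "banana")));
  }
}
```

Note that the inner term query still uses the full field name items.name; only the path is shortened to the nested object itself.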

Work items

Fix the case in the issue [BUG] PPL filter on computed fields from nested paths returns zero results #4508: script's access to a single nested field.
Support querying deeply nested objects: if a field of a nested object is itself a nested object, we should construct the nested query using the deepest nested path.
Support nested queries in other types of filters:
- Term query
- Range query
- Multi-terms query
Support filters on both nested and root objects where the two can be separated. E.g. | where items.name = 'banana' and id = 2 can be split into two filters: items.name = 'banana' goes under a nested query, while id = 2 stays a normal term query, and the two are combined with a bool query. Already implemented.
Support filters on nested and root objects where they cannot be separated.
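
To make the "deepest level" and "separable filters" items above more concrete, here is a minimal sketch of one way to find the deepest nested ancestor of a field and to group filter fields by that path. It is an assumption-laden illustration, not the PR's implementation: field types are modeled as plain strings instead of ExprType, the class name NestedPathSketch and the method groupByNestedPath are hypothetical, and author.books.title is an invented field used only to show cascaded nesting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; field types are simplified to strings.
class NestedPathSketch {
  /** Returns the deepest ancestor of name whose type is "nested", or "" if there is none. */
  static String resolveNestedPath(String name, Map<String, String> fieldTypes) {
    if (name == null || fieldTypes == null || fieldTypes.isEmpty()) {
      return "";
    }
    // Walk from the longest prefix to the shortest so the deepest nested path wins.
    int end = name.lastIndexOf('.');
    while (end > 0) {
      String prefix = name.substring(0, end);
      if ("nested".equals(fieldTypes.get(prefix))) {
        return prefix;
      }
      end = name.lastIndexOf('.', end - 1);
    }
    return "";
  }

  /** Groups field names by nested path; root-level fields land under the key "". */
  static Map<String, List<String>> groupByNestedPath(
      List<String> fields, Map<String, String> fieldTypes) {
    Map<String, List<String>> groups = new LinkedHashMap<>();
    for (String field : fields) {
      groups
          .computeIfAbsent(resolveNestedPath(field, fieldTypes), k -> new ArrayList<>())
          .add(field);
    }
    return groups;
  }

  public static void main(String[] args) {
    Map<String, String> types =
        Map.of(
            "id", "keyword",
            "items", "nested",
            "items.name", "keyword",
            "author", "nested",
            "author.books", "nested", // invented field for the cascaded case
            "author.books.title", "keyword");
    System.out.println(resolveNestedPath("author.books.title", types)); // author.books
    System.out.println(groupByNestedPath(List.of("items.name", "id"), types)); // {items=[items.name], =[id]}
  }
}
```

With such a grouping, each non-empty key could become one nested query and the "" group could stay as ordinary root-level queries, all combined under a bool query. Conditions that mix a nested field and a root field in a single expression do not fit this scheme, which corresponds to the last work item.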

Related Issues

Resolves #4508

Check List

New functionality includes testing.
New functionality has been documented.
New functionality has javadoc added.
New functionality has a user manual doc added.
New PPL command checklist all confirmed.
API changes companion pull request created.
Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

qianheng-aws · 2025-11-21T09:36:11Z

opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java

+                nestedPaths));
+      }
+      if (!nestedPaths.isEmpty()) {
+        return nestedQuery(nestedPaths.get(0), scriptQuery, ScoreMode.None);


Since we now support agg scripts and sort scripts, I think we need to check whether a script is a filter script before wrapping it in a nested query. Please also add a test for an agg script with nested fields.

Script sort does not use PredicateAnalyzer.QueryExpression, so it is not affected by the current change:

sql/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java

Line 499 in c31227c

() -> SortBuilders.scriptSort(scriptExpr.getScript(), sortType).order(order));

The same applies to field sort:

sql/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java

Line 476 in c31227c

() -> SortBuilders.fieldSort(digest.getFieldName()).order(order).missing(missing));

However, they still need nested queries to access nested fields. As a result, the following queries do not work (author is a nested field):

script sort: source=opensearch-sql_test_index_cascaded_nested | eval lower_name = lower(author.name) | sort lower_name

field sort: source=opensearch-sql_test_index_cascaded_nested | sort author.name

Should I fix them in this PR or another one?

For agg script, I added a test case. However, I suspect I am still missing many cases. Originally I thought they were only handled in AggregateAnalyzer, so I could raise another PR for nested fields in aggregations.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/4508.yml (1)

28-42: Consider adding edge case tests for nested field filtering completeness.

While the core scenarios are well covered, the test suite could benefit from additional edge cases to improve robustness:

Empty or null nested fields: What happens when items is missing, empty, or contains null values?

Multiple nested values per document: Documents with multiple items (e.g., items: [{"name": "apple"}, {"name": "banana"}]) to validate flattening and filtering behavior.

Deep nesting levels: Test cascaded nested objects (if the PR supports this per the objectives).

These tests would validate boundary conditions and ensure the fix handles all nested field scenarios correctly.

Also applies to: 44-58

integ-test/src/test/resources/expectedOutput/calcite/filter_multiple_nested_cascaded_range.yaml (1)

8-8: Consider formatting for long-line readability in YAML test fixtures.

Line 8 contains the entire physical plan and OpenSearchRequestBuilder on a single line, making it difficult to inspect and maintain. For complex nested structures, consider formatting the embedded JSON with newlines or splitting across multiple lines if the YAML parser supports it.

This is a minor maintainability concern and does not affect test correctness.

opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (1)

1848-1867: Consider adding null check for name parameter.

The method checks for null/empty fieldTypes but not for null name. If name is null, line 1852 (name.contains(".")) will throw a NullPointerException.

Looking at the callers, name could potentially be null when derived from schema lookups that return null for out-of-bounds indices (e.g., line 1614). Consider adding a null check:
 private static String resolveNestedPath(String name, Map<String, ExprType> fieldTypes) {
-  if (fieldTypes == null || fieldTypes.isEmpty()) {
+  if (name == null || fieldTypes == null || fieldTypes.isEmpty()) {
     return "";
   }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b88bf56 and d2b94ea.

📒 Files selected for processing (21)

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (4 hunks)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteWhereCommandIT.java (1 hunks)
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java (1 hunks)
integ-test/src/test/java/org/opensearch/sql/legacy/TestsConstants.java (1 hunks)
integ-test/src/test/resources/cascaded_nested.json (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite/agg_filter_nested.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite/filter_computed_nested.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite/filter_multiple_nested_cascaded_range.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite/filter_nested_term.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite/filter_nested_terms.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite/filter_root_and_nested.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/agg_filter_nested.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/filter_computed_nested.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/filter_multiple_nested_cascaded_range.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/filter_nested_term.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/filter_nested_terms.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/filter_root_and_nested.yaml (1 hunks)
integ-test/src/test/resources/indexDefinitions/cascaded_nested_index_mapping.json (1 hunks)
integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/4508.yml (1 hunks)
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (10 hunks)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java (1)

integ-test/src/test/java/org/opensearch/sql/legacy/TestsConstants.java (1)

TestsConstants (9-102)

opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (2)

core/src/main/java/org/opensearch/sql/calcite/type/ExprIPType.java (1)

ExprIPType (12-16)

core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1)

OpenSearchTypeFactory (63-409)

🪛 Biome (2.1.2)

integ-test/src/test/resources/cascaded_nested.json

[error] 1-2: End of file expected