From b7402399a565dc0a3e621ecb4d61cfd48a4eb2fc Mon Sep 17 00:00:00 2001
From: lihangyu
Date: Wed, 13 May 2026 03:23:15 +0800
Subject: [PATCH] [feature](iceberg) Support reading Iceberg variant from Parquet

### What problem does this PR solve?

Issue Number: N/A

Related PR: #63192

Problem Summary:

Doris could not read Iceberg v3 VARIANT columns from Parquet files. This change maps Iceberg VARIANT to Doris VARIANT, validates the Parquet VariantShredding wrapper shape, decodes metadata/value residual data, reads shredded typed_value columns, and prunes shredded Parquet leaves for accessed variant paths. The VARIANT reader and planner changes stay scoped to the Iceberg/Parquet VARIANT path instead of coupling generic nested-column code to Iceberg-only behavior.

Typed-only shredded projections stay on native Parquet typed columns when residual value columns are not selected, with counter coverage to catch row-wise performance regressions. Selected residual or complex layouts still fall back to row-wise reconstruction.

This also preserves VARIANT subpaths through casts, validates the actual Iceberg data-file format for VARIANT reads, rejects duplicate VariantShredding structural children, preserves null temporal typed leaves without reading their physical value, and keeps delete-only Iceberg MERGE projections from reading unused visible target data columns.

### Release note

Support reading Iceberg v3 VARIANT Parquet columns, including shredded typed_value column pruning and binary/UUID/primitive residual VARIANT values. Writing Iceberg VARIANT columns is rejected with an explicit unsupported error.
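The VariantShredding wrapper-shape validation described above can be sketched roughly as follows. This is a hypothetical standalone check, not the actual Doris `schema_desc.cpp` code: the real validator inspects physical types and logical annotations as well, while this sketch only models the structural-child rules (a root variant group must carry `metadata`, may carry an unannotated residual `value` and/or a `typed_value`, and duplicate structural children are rejected).

```cpp
// Hypothetical sketch of the VariantShredding wrapper check, NOT the actual
// Doris implementation. A variant root group holds a binary "metadata" field
// plus an optional unannotated binary "value" (the residual) and/or an
// optional "typed_value" child; any other structural child, a duplicate
// child, or a missing "metadata" makes the wrapper invalid.
#include <cassert>
#include <set>
#include <string>
#include <vector>

bool is_valid_variant_group(const std::vector<std::string>& children) {
    std::set<std::string> seen;
    bool has_metadata = false;
    for (const auto& name : children) {
        if (name != "metadata" && name != "value" && name != "typed_value") {
            return false; // unknown structural child
        }
        if (!seen.insert(name).second) {
            return false; // duplicate structural children are rejected
        }
        has_metadata |= (name == "metadata");
    }
    return has_metadata; // the root variant group must carry metadata
}
```

Nested shredded fields inside `typed_value` follow a looser rule in the spec (no `metadata` of their own), which this root-level sketch does not model.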
### Check List (For Author)

- Test: Regression test / Unit Test / Manual test
- Unit Test: ./run-be-ut.sh --run --filter='ParquetVariantReaderTest.DirectTypedOnlyReaderCountersUseNativePath:ParquetVariantReaderTest.VariantReaderCountersUseRowWiseWhenResidualValueSelected:ParquetVariantReaderTest.RowWisePreservesExplicitVariantNullShreddedArrayElement:ParquetVariantReaderTest.RowWiseRejectsMissingShreddedArrayElement' (4 tests passed)
- Unit Test: ./run-be-ut.sh --run -f 'ParquetVariantReaderTest.RowWisePreservesNullComplexTypedArrayElement:ParquetVariantReaderTest.RowWiseRejectsMissingShreddedArrayElement' (2 tests passed)
- Unit Test: ./run-be-ut.sh --run --filter='ParquetVariantReaderTest.*' (85 tests passed on rerun; the first attempt failed before tests in the OpenBLAS CMake getarch bootstrap)
- Unit Test: ./run-be-ut.sh --run --filter='ParquetVariantReaderTest.*:NestedColumnAccessHelperTest.*' (127 tests passed)
- Unit Test: ./run-be-ut.sh --run --filter='IcebergReaderCreateColumnIdsTest.*' (9 tests passed)
- Unit Test: ./run-be-ut.sh --run --filter=ParquetVariantReaderTest.RejectVariantSchemaWithDuplicateStructuralChild:ParquetVariantReaderTest.DirectTypedOnlyPreservesTemporalLeafNull (2 tests passed; rerun after clang-format also passed)
- Unit Test: ./run-be-ut.sh --run --filter=ParquetVariantReaderTest.DirectTypedOnlyReaderCountersUseNativePath (1 test passed after latest changes)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest#testVariantComparisonPredicateCollectsWholeVariantOperand (1 test passed; Maven reactor succeeded)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest#testVariantCastProjectionKeepsSubPathWithSiblingPredicate (1 test passed; Maven reactor succeeded)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest (70 tests passed; Maven reactor succeeded)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.VariantPruningLogicTest#testExplodeSubqueryJoinAggAccessPaths (1 test passed; Maven reactor succeeded)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.datasource.iceberg.source.IcebergScanNodeTest#testValidateVariantDataFileFormatRejectsOrcSplit (1 test passed; Maven reactor succeeded)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.datasource.iceberg.source.IcebergScanNodeTest (6 tests passed; Maven reactor succeeded)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.trees.plans.commands.IcebergMergeCommandTest#testDeleteProjectionDoesNotReadVisibleTargetColumns (1 test passed; Maven reactor succeeded)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.VariantPruningLogicTest (11 tests passed; Maven reactor succeeded)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.datasource.iceberg.IcebergUtilsTest (passed)
- Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.SlotTypeReplacerTest (5 tests passed)
- Regression test: performance regression coverage is included in regression-test/suites/external_table_p0/tvf/test_local_tvf_iceberg_variant.groovy, including profile assertions that typed-only projections increment VariantDirectTypedValueReadRows and keep VariantRowWiseReadRows at 0. Not run locally in this worktree because no local Doris cluster/output BE+FE runtime is available.
- Regression test: added regression-test/suites/external_table_p0/iceberg/test_iceberg_variant_table_path.groovy to exercise the Iceberg REST catalog table path with nested VARIANT access and profile read-column assertions. Not run locally because Docker access to spark-iceberg is unavailable in this worktree.
- Manual test: PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/clang-format.sh
- Manual test: PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh
- Manual test: git diff --check
- Manual test: cd fe && mvn -pl fe-core checkstyle:check -DskipTests
- Static analysis: CLANG_TIDY_BINARY=/tmp/clang-tidy-resource-filter build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (passed for changed lines after adding the clang-tidy resource-dir and filtering a pre-existing be/src/core/types.h clang-tidy-nolint diagnostic; the unwrapped script was blocked by that existing header diagnostic)
- Behavior changed: Yes.
  - Doris can read Iceberg v3 VARIANT Parquet columns; typed-only shredded projections are pruned onto native typed columns, while selected residual or complex layouts are reconstructed row-wise.
  - Malformed VariantShredding schemas and missing present shredded array payloads are rejected.
  - Null complex/temporal typed values and explicit Variant null array elements are preserved.
  - Whole-VARIANT scalar/comparison consumers force root access, while literal subpath pruning is preserved for typed reads.
  - Iceberg VARIANT reads from non-Parquet data files are recursively rejected during scan planning.
  - Delete-only Iceberg MERGE no longer reads unused visible target data columns.
  - Iceberg VARIANT data-file writes are rejected explicitly.
- Does this need documentation: No

---
 .../format/parquet/delta_bit_pack_decoder.h   |    4 +-
 .../format/parquet/parquet_column_convert.cpp |   76 +
 .../parquet/parquet_nested_column_utils.cpp   |  533 +++
 .../parquet/parquet_nested_column_utils.h     |   40 +
 .../format/parquet/parquet_variant_reader.cpp | 1161 +++++++
 .../format/parquet/parquet_variant_reader.h   |   38 +
 be/src/format/parquet/schema_desc.cpp         |  127 +-
 be/src/format/parquet/schema_desc.h           |   25 +
 .../parquet/vparquet_column_chunk_reader.cpp  |    8 +-
 .../parquet/vparquet_column_chunk_reader.h    |    3 +-
 .../format/parquet/vparquet_column_reader.cpp | 2078 +++++++++++-
 .../format/parquet/vparquet_column_reader.h   |  121 +-
 be/src/format/parquet/vparquet_reader.cpp     |  244 +-
 be/src/format/parquet/vparquet_reader.h       |   15 +-
 .../hive/hive_parquet_nested_column_utils.cpp |  144 +-
 .../hive/hive_parquet_nested_column_utils.h   |    5 +-
 be/src/format/table/hive_reader.cpp           |   54 +-
 be/src/format/table/hive_reader.h             |    4 +-
 .../table/iceberg/arrow_schema_util.cpp       |    3 +
 .../iceberg_parquet_nested_column_utils.cpp   |  146 +-
 .../iceberg_parquet_nested_column_utils.h     |    6 +-
 be/src/format/table/iceberg/types.cpp         |    2 +
 be/src/format/table/iceberg/types.h           |   10 +
 be/src/format/table/iceberg_reader.cpp        |    8 +-
 .../parquet/delta_byte_array_decoder_test.cpp |   90 +-
 be/test/format/parquet/parquet_expr_test.cpp  |   62 +
 .../parquet/parquet_variant_reader_test.cpp   | 2994 +++++++++++++++++
 .../hive_reader_create_column_ids_test.cpp    |   21 +-
 .../iceberg_reader_create_column_ids_test.cpp |  155 +-
 .../nested_column_access_helper_test.cpp      | 1113 ++++++
 .../connector/iceberg/IcebergTypeMapping.java |    4 +
 .../datasource/iceberg/IcebergUtils.java      |   50 +
 .../iceberg/source/IcebergScanNode.java       |   59 +
 .../translator/PhysicalPlanTranslator.java    |    3 +-
 ...rgMergeSinkToPhysicalIcebergMergeSink.java |    1 +
 .../AccessPathExpressionCollector.java        |  213 +-
 .../rewrite/AccessPathPlanCollector.java      |  108 +-
 .../rules/rewrite/NestedColumnPruning.java    |   32 +
 .../rules/rewrite/SlotTypeReplacer.java       |   19 +-
 .../plans/commands/IcebergMergeCommand.java   |   26 +-
 .../insert/IcebergInsertExecutor.java         |   14 +-
 .../logical/LogicalIcebergMergeSink.java      |   35 +-
 .../physical/PhysicalIcebergMergeSink.java    |   35 +-
 .../doris/planner/IcebergMergeSink.java       |   10 +
 .../doris/planner/IcebergTableSink.java       |    1 +
 .../datasource/iceberg/IcebergUtilsTest.java  |    6 +
 .../iceberg/source/IcebergScanNodeTest.java   |   84 +
 .../rules/rewrite/PruneNestedColumnTest.java  |  325 ++
 .../rules/rewrite/SlotTypeReplacerTest.java   |  210 ++
 .../rewrite/VariantPruningLogicTest.java      |   53 +-
 .../commands/IcebergMergeCommandTest.java     |   60 +
 .../doris/planner/IcebergMergeSinkTest.java   |   33 +
 .../doris/planner/IcebergTableSinkTest.java   |   89 +
 .../tvf/iceberg_variant_binary_typed.parquet  |  Bin 0 -> 743 bytes
 .../iceberg_variant_binary_unshredded.parquet |  Bin 0 -> 764 bytes
 .../tvf/iceberg_variant_shredded.parquet      |  Bin 0 -> 1865 bytes
 .../iceberg_variant_temporal_typed.parquet    |  Bin 0 -> 1348 bytes
 ...ceberg_variant_temporal_unshredded.parquet |  Bin 0 -> 917 bytes
 .../tvf/iceberg_variant_typed_only.parquet    |  Bin 0 -> 1724 bytes
 .../tvf/iceberg_variant_unshredded.parquet    |  Bin 0 -> 1561 bytes
 .../tvf/test_local_tvf_iceberg_variant.out    |   51 +
 .../test_iceberg_variant_table_path.groovy    |  137 +
 .../tvf/test_local_tvf_iceberg_variant.groovy |  448 +++
 63 files changed, 10874 insertions(+), 522 deletions(-)
 create mode 100644 be/src/format/parquet/parquet_nested_column_utils.cpp
 create mode 100644 be/src/format/parquet/parquet_nested_column_utils.h
 create mode 100644 be/src/format/parquet/parquet_variant_reader.cpp
 create mode 100644 be/src/format/parquet/parquet_variant_reader.h
 create mode 100644 be/test/format/parquet/parquet_variant_reader_test.cpp
 create mode 100644 be/test/format/table/nested_column_access_helper_test.cpp
 create mode 100644 fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/SlotTypeReplacerTest.java
 create mode 100644
fe/fe-core/src/test/java/org/apache/doris/planner/IcebergTableSinkTest.java create mode 100644 regression-test/data/external_table_p0/tvf/iceberg_variant_binary_typed.parquet create mode 100644 regression-test/data/external_table_p0/tvf/iceberg_variant_binary_unshredded.parquet create mode 100644 regression-test/data/external_table_p0/tvf/iceberg_variant_shredded.parquet create mode 100644 regression-test/data/external_table_p0/tvf/iceberg_variant_temporal_typed.parquet create mode 100644 regression-test/data/external_table_p0/tvf/iceberg_variant_temporal_unshredded.parquet create mode 100644 regression-test/data/external_table_p0/tvf/iceberg_variant_typed_only.parquet create mode 100644 regression-test/data/external_table_p0/tvf/iceberg_variant_unshredded.parquet create mode 100644 regression-test/data/external_table_p0/tvf/test_local_tvf_iceberg_variant.out create mode 100644 regression-test/suites/external_table_p0/iceberg/test_iceberg_variant_table_path.groovy create mode 100644 regression-test/suites/external_table_p0/tvf/test_local_tvf_iceberg_variant.groovy diff --git a/be/src/format/parquet/delta_bit_pack_decoder.h b/be/src/format/parquet/delta_bit_pack_decoder.h index 52d45ea2297b33..d547909fafd7dc 100644 --- a/be/src/format/parquet/delta_bit_pack_decoder.h +++ b/be/src/format/parquet/delta_bit_pack_decoder.h @@ -31,6 +31,7 @@ #include "common/status.h" #include "core/data_type/data_type.h" +#include "core/data_type/data_type_nullable.h" #include "format/parquet/decoder.h" #include "format/parquet/fix_length_plain_decoder.h" #include "format/parquet/parquet_common.h" @@ -329,7 +330,8 @@ class DeltaByteArrayDecoder : public DeltaDecoder { RETURN_IF_ERROR(_get_internal(_values.data(), cast_set(num_values - null_count), &num_valid_values)); DCHECK_EQ(num_values - null_count, num_valid_values); - if (doris_column->is_column_string()) { + if (doris_column->is_column_string() || + remove_nullable(data_type)->get_primitive_type() == TYPE_VARBINARY) { return 
decode_byte_array(_values, doris_column, data_type, select_vector); } else { return decode_fixed_byte_array(_values, doris_column, data_type, diff --git a/be/src/format/parquet/parquet_column_convert.cpp b/be/src/format/parquet/parquet_column_convert.cpp index 940e95bd973306..981bd5b461acb0 100644 --- a/be/src/format/parquet/parquet_column_convert.cpp +++ b/be/src/format/parquet/parquet_column_convert.cpp @@ -29,6 +29,68 @@ namespace doris::parquet { const cctz::time_zone ConvertParams::utc0 = cctz::utc_time_zone(); +namespace { + +struct TimeToMicroScale { + int64_t numerator; + int64_t denominator; +}; + +TimeToMicroScale time_unit_to_micro_scale(const tparquet::TimeUnit& time_unit) { + if (time_unit.__isset.MILLIS) { + return {1000, 1}; + } + if (time_unit.__isset.MICROS) { + return {1, 1}; + } + DCHECK(time_unit.__isset.NANOS); + return {1, 1000}; +} + +TimeToMicroScale parquet_time_to_micro_scale(const tparquet::SchemaElement& schema) { + if (schema.__isset.logicalType && schema.logicalType.__isset.TIME) { + return time_unit_to_micro_scale(schema.logicalType.TIME.unit); + } + DCHECK(schema.__isset.converted_type); + if (schema.converted_type == tparquet::ConvertedType::TIME_MILLIS) { + return {1000, 1}; + } + DCHECK(schema.converted_type == tparquet::ConvertedType::TIME_MICROS); + return {1, 1}; +} + +template +class VariantIntToTimeV2 final : public PhysicalToLogicalConverter { +public: + explicit VariantIntToTimeV2(TimeToMicroScale scale) : _scale(scale) {} + + Status physical_convert(ColumnPtr& src_physical_col, ColumnPtr& src_logical_column) override { + using SrcColumnType = typename PrimitiveTypeTraits::ColumnType; + using TimeType = typename PrimitiveTypeTraits::CppType; + + ColumnPtr src_col = remove_nullable(src_physical_col); + MutableColumnPtr dst_col = remove_nullable(src_logical_column)->assume_mutable(); + + size_t rows = src_col->size(); + size_t start_idx = dst_col->size(); + dst_col->resize(start_idx + rows); + + const auto& src_data = 
static_cast(src_col.get())->get_data(); + auto& data = static_cast(dst_col.get())->get_data(); + + for (int i = 0; i < rows; i++) { + data[start_idx + i] = + static_cast(src_data[i] * _scale.numerator / _scale.denominator); + } + return Status::OK(); + } + +private: + TimeToMicroScale _scale; +}; + +} // namespace + #define FOR_LOGICAL_DECIMAL_TYPES(M) \ M(TYPE_DECIMAL32) \ M(TYPE_DECIMAL64) \ @@ -246,6 +308,20 @@ std::unique_ptr PhysicalToLogicalConverter::get_conv convert_params.get(), physical_converter); } else if (src_logical_primitive == TYPE_DATEV2) { physical_converter = std::make_unique(); + } else if (src_logical_primitive == TYPE_TIMEV2) { + if (!field_schema->is_in_variant) { + physical_converter = + std::make_unique(src_physical_type, src_logical_type); + } else if (src_physical_type == tparquet::Type::INT32) { + physical_converter = std::make_unique>( + parquet_time_to_micro_scale(parquet_schema)); + } else if (src_physical_type == tparquet::Type::INT64) { + physical_converter = std::make_unique>( + parquet_time_to_micro_scale(parquet_schema)); + } else { + physical_converter = + std::make_unique(src_physical_type, src_logical_type); + } } else if (src_logical_primitive == TYPE_DATETIMEV2) { if (src_physical_type == tparquet::Type::INT96) { // int96 only stores nanoseconds in standard parquet file diff --git a/be/src/format/parquet/parquet_nested_column_utils.cpp b/be/src/format/parquet/parquet_nested_column_utils.cpp new file mode 100644 index 00000000000000..d43767da4bb1ef --- /dev/null +++ b/be/src/format/parquet/parquet_nested_column_utils.cpp @@ -0,0 +1,533 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "format/parquet/parquet_nested_column_utils.h" + +#include +#include +#include +#include +#include + +#include "core/data_type/data_type_nullable.h" +#include "format/parquet/schema_desc.h" + +namespace doris { +namespace { + +enum class NestedPathMode { + NAME, + FIELD_ID, +}; + +void add_column_id_range(const FieldSchema& field_schema, std::set& column_ids) { + const uint64_t start_id = field_schema.get_column_id(); + const uint64_t max_column_id = field_schema.get_max_column_id(); + for (uint64_t id = start_id; id <= max_column_id; ++id) { + column_ids.insert(id); + } +} + +const FieldSchema* find_child_by_structural_name(const FieldSchema& field_schema, + std::string_view name) { + std::string lower_name(name); + std::transform(lower_name.begin(), lower_name.end(), lower_name.begin(), + [](unsigned char c) { return static_cast(std::tolower(c)); }); + for (const auto& child : field_schema.children) { + if (child.name == name || child.lower_case_name == lower_name) { + return &child; + } + } + return nullptr; +} + +const FieldSchema* find_child_by_exact_name(const FieldSchema& field_schema, + std::string_view name) { + for (const auto& child : field_schema.children) { + if (child.name == name) { + return &child; + } + } + return nullptr; +} + +const FieldSchema* find_variant_typed_child_by_key(const FieldSchema& field_schema, + std::string_view key) { + return find_child_by_exact_name(field_schema, key); +} + +void add_variant_metadata(const FieldSchema& variant_field, std::set& column_ids) { + if (const auto* metadata = 
find_child_by_structural_name(variant_field, "metadata")) { + add_column_id_range(*metadata, column_ids); + } +} + +bool is_unannotated_variant_value_field(const FieldSchema& field) { + // VARIANT residual value is raw binary; annotated strings named value are user fields. + return field.lower_case_name == "value" && field.physical_type == tparquet::Type::BYTE_ARRAY && + !field.parquet_schema.__isset.logicalType && + !field.parquet_schema.__isset.converted_type; +} + +const FieldSchema* find_variant_value_field(const FieldSchema& field_schema) { + for (const auto& child : field_schema.children) { + if (is_unannotated_variant_value_field(child)) { + return &child; + } + } + return nullptr; +} + +void add_variant_value(const FieldSchema& variant_field, std::set& column_ids) { + add_variant_metadata(variant_field, column_ids); + if (const auto* value = find_variant_value_field(variant_field)) { + add_column_id_range(*value, column_ids); + } +} + +struct VariantColumnIdExtractionResult { + bool has_child_columns = false; + bool needs_metadata = false; +}; + +using VariantPathMap = std::unordered_map>>; + +bool is_shredded_variant_field(const FieldSchema& field_schema) { + bool has_value = false; + const FieldSchema* typed_value = nullptr; + for (const auto& child : field_schema.children) { + if (child.lower_case_name == "value") { + if (!is_unannotated_variant_value_field(child)) { + return false; + } + has_value = true; + continue; + } + if (child.lower_case_name == "typed_value") { + typed_value = &child; + continue; + } + return false; + } + if (has_value) { + return true; + } + if (typed_value == nullptr) { + return false; + } + const auto type = remove_nullable(typed_value->data_type); + return type->get_primitive_type() == TYPE_STRUCT || type->get_primitive_type() == TYPE_ARRAY; +} + +bool add_shredded_variant_field_value(const FieldSchema& shredded_field, + std::set& column_ids) { + if (const auto* value = find_variant_value_field(shredded_field)) { + 
add_column_id_range(*value, column_ids); + return true; + } + return false; +} + +bool is_variant_array_subscript(std::string_view path) { + return !path.empty() && + std::all_of(path.begin(), path.end(), [](unsigned char c) { return std::isdigit(c); }); +} + +bool is_terminal_variant_meta_component(std::string_view path) { + return path == "NULL" || path == "OFFSET"; +} + +const std::vector& effective_variant_path(const std::vector& raw_path, + std::vector& stripped_path) { + if (!raw_path.empty() && is_terminal_variant_meta_component(raw_path.back())) { + stripped_path.assign(raw_path.begin(), raw_path.end() - 1); + return stripped_path; + } + return raw_path; +} + +bool contains_inherited_metadata_value(const FieldSchema& field_schema) { + if (is_shredded_variant_field(field_schema) && + find_variant_value_field(field_schema) != nullptr) { + return true; + } + return std::any_of( + field_schema.children.begin(), field_schema.children.end(), + [](const FieldSchema& child) { return contains_inherited_metadata_value(child); }); +} + +VariantColumnIdExtractionResult extract_variant_typed_nested_column_ids( + const FieldSchema& field_schema, const std::vector>& paths, + std::set& column_ids, NestedPathMode mode); + +VariantColumnIdExtractionResult extract_typed_value_path(const FieldSchema& typed_value, + const std::vector& path, + std::set& column_ids, + NestedPathMode mode) { + VariantColumnIdExtractionResult result; + const auto typed_value_type = remove_nullable(typed_value.data_type); + if (typed_value_type->get_primitive_type() != TYPE_STRUCT) { + result = extract_variant_typed_nested_column_ids(typed_value, {path}, column_ids, mode); + } else if (const auto* typed_child = find_variant_typed_child_by_key(typed_value, path[0])) { + if (path.size() == 1) { + add_column_id_range(*typed_child, column_ids); + result.has_child_columns = true; + result.needs_metadata = contains_inherited_metadata_value(*typed_child); + } else { + std::vector> child_paths { + 
std::vector(path.begin() + 1, path.end())}; + result = extract_variant_typed_nested_column_ids(*typed_child, child_paths, column_ids, + mode); + } + } + + if (result.has_child_columns) { + column_ids.insert(typed_value.get_column_id()); + } + return result; +} + +void add_variant_typed_path(PrimitiveType field_type, const FieldSchema& field_schema, + const std::vector& path, + VariantColumnIdExtractionResult* result, std::set& column_ids, + VariantPathMap* child_paths) { + if (path.empty()) { + add_column_id_range(field_schema, column_ids); + result->has_child_columns = true; + result->needs_metadata |= contains_inherited_metadata_value(field_schema); + return; + } + + const bool is_list = field_type == PrimitiveType::TYPE_ARRAY; + const bool is_map = field_type == PrimitiveType::TYPE_MAP; + std::vector remaining; + std::string child_key; + if (is_list) { + child_key = "*"; + if (!is_variant_array_subscript(path[0])) { + remaining.assign(path.begin(), path.end()); + } else if (path.size() > 1) { + remaining.assign(path.begin() + 1, path.end()); + } + } else if (is_map) { + (*child_paths)["KEYS"].emplace_back(); + child_key = "VALUES"; + if (path.size() > 1) { + remaining.assign(path.begin() + 1, path.end()); + } + } else { + child_key = path[0]; + if (path.size() > 1) { + remaining.assign(path.begin() + 1, path.end()); + } + } + (*child_paths)[child_key].push_back(std::move(remaining)); +} + +std::string variant_typed_child_key(PrimitiveType field_type, const FieldSchema& field_schema, + uint64_t child_index) { + if (field_type == PrimitiveType::TYPE_ARRAY) { + return "*"; + } + if (field_type == PrimitiveType::TYPE_MAP) { + if (child_index == 0) { + return "KEYS"; + } + return child_index == 1 ? 
"VALUES" : ""; + } + return field_schema.children[child_index].name; +} + +void append_variant_child_paths(const VariantPathMap& paths_by_name, const std::string& key, + std::vector>& child_paths) { + auto child_paths_it = paths_by_name.find(key); + if (child_paths_it != paths_by_name.end()) { + child_paths.insert(child_paths.end(), child_paths_it->second.begin(), + child_paths_it->second.end()); + } +} + +std::vector> collect_variant_typed_child_paths( + const VariantPathMap& paths_by_name, const std::string& child_key) { + std::vector> child_paths; + append_variant_child_paths(paths_by_name, child_key, child_paths); + return child_paths; +} + +void extract_variant_typed_child_column_ids( + const FieldSchema& child, const std::vector>& child_paths, + std::set& column_ids, NestedPathMode mode, + VariantColumnIdExtractionResult* result) { + const bool needs_full_child = + std::any_of(child_paths.begin(), child_paths.end(), + [](const std::vector& path) { return path.empty(); }); + if (needs_full_child) { + add_column_id_range(child, column_ids); + result->has_child_columns = true; + result->needs_metadata |= contains_inherited_metadata_value(child); + return; + } + + auto child_result = + extract_variant_typed_nested_column_ids(child, child_paths, column_ids, mode); + result->has_child_columns |= child_result.has_child_columns; + result->needs_metadata |= child_result.needs_metadata; +} + +VariantColumnIdExtractionResult extract_shredded_variant_field_ids( + const FieldSchema& shredded_field, const std::vector>& paths, + std::set& column_ids, NestedPathMode mode) { + const auto* typed_value = find_child_by_structural_name(shredded_field, "typed_value"); + VariantColumnIdExtractionResult result; + + for (const auto& raw_path : paths) { + std::vector stripped_path; + const auto& path = effective_variant_path(raw_path, stripped_path); + if (path.empty()) { + add_column_id_range(shredded_field, column_ids); + result.has_child_columns = true; + result.needs_metadata |= 
contains_inherited_metadata_value(shredded_field); + continue; + } + + VariantColumnIdExtractionResult typed_result; + if (typed_value != nullptr) { + typed_result = extract_typed_value_path(*typed_value, path, column_ids, mode); + result.needs_metadata |= typed_result.needs_metadata; + } + const bool has_residual_value = + add_shredded_variant_field_value(shredded_field, column_ids); + if (has_residual_value) { + result.needs_metadata = true; + } + if (!typed_result.has_child_columns) { + result.has_child_columns |= has_residual_value; + continue; + } + result.has_child_columns = true; + } + + if (result.has_child_columns) { + column_ids.insert(shredded_field.get_column_id()); + } + return result; +} + +VariantColumnIdExtractionResult extract_variant_nested_column_ids( + const FieldSchema& variant_field, const std::vector>& paths, + std::set& column_ids, NestedPathMode mode) { + const auto* typed_value = find_child_by_structural_name(variant_field, "typed_value"); + VariantColumnIdExtractionResult result; + + for (const auto& raw_path : paths) { + std::vector stripped_path; + const auto& path = effective_variant_path(raw_path, stripped_path); + if (path.empty()) { + add_column_id_range(variant_field, column_ids); + result.has_child_columns = true; + continue; + } + + VariantColumnIdExtractionResult typed_result; + if (typed_value != nullptr) { + typed_result = extract_typed_value_path(*typed_value, path, column_ids, mode); + if (typed_result.needs_metadata) { + add_variant_metadata(variant_field, column_ids); + } + } + + if (!typed_result.has_child_columns) { + add_variant_value(variant_field, column_ids); + } + result.has_child_columns = true; + } + + if (result.has_child_columns) { + column_ids.insert(variant_field.get_column_id()); + } + return result; +} + +VariantColumnIdExtractionResult extract_variant_typed_nested_column_ids( + const FieldSchema& field_schema, const std::vector>& paths, + std::set& column_ids, NestedPathMode mode) { + if 
(remove_nullable(field_schema.data_type)->get_primitive_type() == + PrimitiveType::TYPE_VARIANT) { + return extract_variant_nested_column_ids(field_schema, paths, column_ids, mode); + } + if (is_shredded_variant_field(field_schema)) { + return extract_shredded_variant_field_ids(field_schema, paths, column_ids, mode); + } + + VariantColumnIdExtractionResult result; + VariantPathMap child_paths_by_name; + const auto field_type = remove_nullable(field_schema.data_type)->get_primitive_type(); + for (const auto& path : paths) { + add_variant_typed_path(field_type, field_schema, path, &result, column_ids, + &child_paths_by_name); + } + + for (uint64_t i = 0; i < field_schema.children.size(); ++i) { + const auto& child = field_schema.children[i]; + const std::string child_key = variant_typed_child_key(field_type, field_schema, i); + auto child_paths = collect_variant_typed_child_paths(child_paths_by_name, child_key); + if (child_paths.empty()) { + continue; + } + extract_variant_typed_child_column_ids(child, child_paths, column_ids, mode, &result); + } + + if (result.has_child_columns) { + column_ids.insert(field_schema.get_column_id()); + } + return result; +} + +void normalize_map_wildcard( + std::unordered_map>>& child_paths) { + auto wildcard_it = child_paths.find("*"); + if (wildcard_it == child_paths.end()) { + return; + } + + auto wildcard_paths = std::move(wildcard_it->second); + child_paths.erase(wildcard_it); + auto& values_paths = child_paths["VALUES"]; + values_paths.insert(values_paths.end(), wildcard_paths.begin(), wildcard_paths.end()); + child_paths["KEYS"].emplace_back(); +} + +std::string get_nested_child_key(const FieldSchema& field_schema, uint64_t child_index, + NestedPathMode mode) { + const auto field_type = remove_nullable(field_schema.data_type)->get_primitive_type(); + if (field_type == PrimitiveType::TYPE_ARRAY) { + return "*"; + } + if (field_type == PrimitiveType::TYPE_MAP) { + if (child_index == 0) { + return "KEYS"; + } + return child_index 
== 1 ? "VALUES" : ""; + } + + const auto& child = field_schema.children[child_index]; + if (mode == NestedPathMode::NAME) { + return child.lower_case_name; + } + return std::to_string(child.field_id); +} + +bool should_skip_nested_child_key(std::string_view child_key, NestedPathMode mode) { + return child_key.empty() || (mode == NestedPathMode::FIELD_ID && child_key == "-1"); +} + +void extract_nested_column_ids_impl(const FieldSchema& field_schema, + const std::vector>& paths, + std::set& column_ids, NestedPathMode mode) { + const auto field_type = remove_nullable(field_schema.data_type)->get_primitive_type(); + if (field_type == PrimitiveType::TYPE_VARIANT) { + static_cast(extract_variant_nested_column_ids(field_schema, paths, column_ids, mode)); + return; + } + + std::unordered_map>> child_paths_by_key; + for (const auto& path : paths) { + if (path.empty()) { + continue; + } + std::vector remaining; + if (path.size() > 1) { + remaining.assign(path.begin() + 1, path.end()); + } + child_paths_by_key[path[0]].push_back(std::move(remaining)); + } + + if (field_type == PrimitiveType::TYPE_MAP) { + normalize_map_wildcard(child_paths_by_key); + } + + bool has_child_columns = false; + if (field_type == PrimitiveType::TYPE_ARRAY && + child_paths_by_key.find("OFFSET") != child_paths_by_key.end()) { + has_child_columns = true; + } + for (uint64_t i = 0; i < field_schema.children.size(); ++i) { + const auto& child = field_schema.children[i]; + const std::string child_key = get_nested_child_key(field_schema, i, mode); + if (should_skip_nested_child_key(child_key, mode)) { + continue; + } + + if (field_type == PrimitiveType::TYPE_MAP && i == 0) { + const bool has_keys_access = + child_paths_by_key.find("KEYS") != child_paths_by_key.end(); + const bool has_values_access = + child_paths_by_key.find("VALUES") != child_paths_by_key.end(); + const bool has_offset_access = + child_paths_by_key.find("OFFSET") != child_paths_by_key.end(); + const bool has_null_access = + 
child_paths_by_key.find("NULL") != child_paths_by_key.end(); + if (!has_keys_access && (has_values_access || has_offset_access || has_null_access)) { + add_column_id_range(child, column_ids); + has_child_columns = true; + continue; + } + } + + auto child_paths_it = child_paths_by_key.find(child_key); + if (child_paths_it == child_paths_by_key.end()) { + continue; + } + + const auto& child_paths = child_paths_it->second; + const bool needs_full_child = + std::any_of(child_paths.begin(), child_paths.end(), + [](const std::vector<std::string>& path) { return path.empty(); }); + + if (needs_full_child) { + add_column_id_range(child, column_ids); + has_child_columns = true; + continue; + } + + const size_t before_size = column_ids.size(); + extract_nested_column_ids_impl(child, child_paths, column_ids, mode); + if (column_ids.size() > before_size) { + has_child_columns = true; + } + } + + if (has_child_columns) { + column_ids.insert(field_schema.get_column_id()); + } +} + +} // namespace + +void ParquetNestedColumnUtils::extract_nested_column_ids_by_name( + const FieldSchema& field_schema, const std::vector<std::vector<std::string>>& paths, + std::set<int32_t>& column_ids) { + extract_nested_column_ids_impl(field_schema, paths, column_ids, NestedPathMode::NAME); +} + +void ParquetNestedColumnUtils::extract_nested_column_ids_by_field_id( + const FieldSchema& field_schema, const std::vector<std::vector<std::string>>& paths, + std::set<int32_t>& column_ids) { + extract_nested_column_ids_impl(field_schema, paths, column_ids, NestedPathMode::FIELD_ID); +} + +} // namespace doris diff --git a/be/src/format/parquet/parquet_nested_column_utils.h b/be/src/format/parquet/parquet_nested_column_utils.h new file mode 100644 index 00000000000000..181fba58faee7b --- /dev/null +++ b/be/src/format/parquet/parquet_nested_column_utils.h @@ -0,0 +1,40 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.
See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#pragma once + +#include <cstdint> +#include <set> +#include <string> +#include <vector> + +namespace doris { + +struct FieldSchema; + +class ParquetNestedColumnUtils { +public: + static void extract_nested_column_ids_by_name( + const FieldSchema& field_schema, const std::vector<std::vector<std::string>>& paths, + std::set<int32_t>& column_ids); + + static void extract_nested_column_ids_by_field_id( + const FieldSchema& field_schema, const std::vector<std::vector<std::string>>& paths, + std::set<int32_t>& column_ids); +}; + +} // namespace doris diff --git a/be/src/format/parquet/parquet_variant_reader.cpp b/be/src/format/parquet/parquet_variant_reader.cpp new file mode 100644 index 00000000000000..8d63065a13a92e --- /dev/null +++ b/be/src/format/parquet/parquet_variant_reader.cpp @@ -0,0 +1,1161 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License.
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "format/parquet/parquet_variant_reader.h" + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "core/column/column_variant.h" +#include "core/data_type/data_type_decimal.h" +#include "core/value/jsonb_value.h" +#include "exec/common/variant_util.h" + +namespace doris::parquet { + +std::string format_variant_uuid(const uint8_t* ptr) { + static constexpr char hex[] = "0123456789abcdef"; + std::string uuid; + uuid.reserve(36); + for (int i = 0; i < 16; ++i) { + if (i == 4 || i == 6 || i == 8 || i == 10) { + uuid.push_back('-'); + } + uuid.push_back(hex[ptr[i] >> 4]); + uuid.push_back(hex[ptr[i] & 0x0f]); + } + return uuid; +} + +namespace { + +struct VariantMetadata { + std::vector<std::string> dictionary; +}; + +struct VariantObjectLayout { + std::vector<uint64_t> field_ids; + std::vector<uint64_t> field_offsets; + std::vector<uint64_t> field_ends; + const uint8_t* fields = nullptr; + uint64_t total_size = 0; +}; + +struct VariantArrayLayout { + std::vector<uint64_t> field_offsets; + const uint8_t* fields = nullptr; + uint64_t total_size = 0; +}; + +uint64_t read_unsigned_le(const uint8_t* ptr, int size) { + uint64_t value = 0; + for (int i = 0; i < size; ++i) { + value |= static_cast<uint64_t>(ptr[i]) << (i * 8); + } + return value; +} + +int64_t read_signed_le(const uint8_t* ptr, int size) { + uint64_t value = read_unsigned_le(ptr, size); + if (size < 8) { + uint64_t sign_bit = uint64_t {1} << (size * 8 - 1); + if ((value & sign_bit) != 0) { + uint64_t mask = ~((uint64_t {1} << (size * 8)) - 1); + value |= mask; + } + } + return static_cast<int64_t>(value); +} +
+__int128 read_signed_int128_le(const uint8_t* ptr) { + unsigned __int128 unsigned_value = 0; + for (int i = 15; i >= 0; --i) { + unsigned_value <<= 8; + unsigned_value |= ptr[i]; + } + static constexpr unsigned __int128 sign_bit = static_cast<unsigned __int128>(1) << 127; + if ((unsigned_value & sign_bit) == 0) { + return static_cast<__int128>(unsigned_value); + } + static constexpr __int128 signed_half_range = static_cast<__int128>(1) << 126; + return (static_cast<__int128>(unsigned_value & (sign_bit - 1)) - signed_half_range) - + signed_half_range; +} + +Status require_available(const uint8_t* ptr, const uint8_t* end, size_t size, + std::string_view context) { + if (ptr > end) { + return Status::Corruption("Invalid Parquet VARIANT {} encoding", context); + } + if (size > static_cast<size_t>(end - ptr)) { + return Status::Corruption("Invalid Parquet VARIANT {} encoding", context); + } + return Status::OK(); +} + +Status require_available_entries(const uint8_t* ptr, const uint8_t* end, uint64_t entries, + size_t entry_size, std::string_view context) { + if (entries > std::numeric_limits<size_t>::max() / entry_size) { + return Status::Corruption("Invalid Parquet VARIANT {} encoding", context); + } + return require_available(ptr, end, static_cast<size_t>(entries) * entry_size, context); +} + +bool variant_string_less(std::string_view lhs, std::string_view rhs) { + return std::lexicographical_compare( + lhs.begin(), lhs.end(), rhs.begin(), rhs.end(), [](char left, char right) { + return static_cast<uint8_t>(left) < static_cast<uint8_t>(right); + }); +} + +bool is_valid_utf8(std::string_view value) { + const auto* data = reinterpret_cast<const uint8_t*>(value.data()); + const auto* end = data + value.size(); + while (data < end) { + const uint8_t first = *data++; + if (first <= 0x7f) { + continue; + } + + uint32_t code_point = 0; + size_t continuation_bytes = 0; + if (first >= 0xc2 && first <= 0xdf) { + code_point = first & 0x1f; + continuation_bytes = 1; + } else if (first >= 0xe0 && first <= 0xef) { + code_point = first & 0x0f; +
continuation_bytes = 2; + } else if (first >= 0xf0 && first <= 0xf4) { + code_point = first & 0x07; + continuation_bytes = 3; + } else { + return false; + } + + if (static_cast<size_t>(end - data) < continuation_bytes) { + return false; + } + for (size_t i = 0; i < continuation_bytes; ++i) { + const uint8_t byte = *data++; + if ((byte & 0xc0) != 0x80) { + return false; + } + code_point = (code_point << 6) | (byte & 0x3f); + } + + if ((continuation_bytes == 2 && code_point < 0x800) || + (continuation_bytes == 3 && code_point < 0x10000) || + (code_point >= 0xd800 && code_point <= 0xdfff) || code_point > 0x10ffff) { + return false; + } + } + return true; +} + +Status require_valid_utf8(std::string_view value, std::string_view context) { + if (!is_valid_utf8(value)) { + return Status::Corruption("Invalid Parquet VARIANT {} UTF-8 string", context); + } + return Status::OK(); +} + +Status validate_array_field_offsets(const std::vector<uint64_t>& field_offsets, uint64_t total_size, + std::string_view context) { + if (field_offsets.empty() || field_offsets.front() != 0) { + return Status::Corruption("Invalid Parquet VARIANT {} field offsets", context); + } + for (size_t i = 0; i < field_offsets.size(); ++i) { + if (field_offsets[i] > total_size) { + return Status::Corruption("Invalid Parquet VARIANT {} field offset {}", context, + field_offsets[i]); + } + if (i > 0 && field_offsets[i] < field_offsets[i - 1]) { + return Status::Corruption("Invalid Parquet VARIANT {} field offsets", context); + } + } + return Status::OK(); +} + +Status compute_object_field_ends(const std::vector<uint64_t>& field_offsets, uint64_t total_size, + std::vector<uint64_t>* field_ends) { + if (field_offsets.empty()) { + return Status::Corruption("Invalid Parquet VARIANT object field offsets"); + } + size_t num_elements = field_offsets.size() - 1; + if (num_elements == 0) { + if (total_size != 0) { + return Status::Corruption("Invalid Parquet VARIANT object field offsets"); + } + return Status::OK(); + } + + std::vector<std::pair<uint64_t, size_t>>
physical_offsets; + physical_offsets.reserve(num_elements); + for (size_t i = 0; i < num_elements; ++i) { + if (field_offsets[i] >= total_size) { + return Status::Corruption("Invalid Parquet VARIANT object field offset {}", + field_offsets[i]); + } + physical_offsets.emplace_back(field_offsets[i], i); + } + std::sort(physical_offsets.begin(), physical_offsets.end()); + if (physical_offsets.front().first != 0) { + return Status::Corruption("Invalid Parquet VARIANT object field offsets"); + } + + field_ends->assign(num_elements, 0); + for (size_t i = 0; i < physical_offsets.size(); ++i) { + if (i > 0 && physical_offsets[i].first == physical_offsets[i - 1].first) { + return Status::Corruption("Invalid Parquet VARIANT object field offsets"); + } + uint64_t child_end = + i + 1 < physical_offsets.size() ? physical_offsets[i + 1].first : total_size; + (*field_ends)[physical_offsets[i].second] = child_end; + } + return Status::OK(); +} + +void append_json_string(std::string_view value, std::string* json, bool escape_non_ascii = false) { + json->push_back('"'); + static constexpr char hex[] = "0123456789abcdef"; + for (unsigned char c : value) { + switch (c) { + case '"': + json->append("\\\""); + break; + case '\\': + json->append("\\\\"); + break; + case '\b': + json->append("\\b"); + break; + case '\f': + json->append("\\f"); + break; + case '\n': + json->append("\\n"); + break; + case '\r': + json->append("\\r"); + break; + case '\t': + json->append("\\t"); + break; + default: + if (c < 0x20 || (escape_non_ascii && c >= 0x80)) { + json->append("\\u00"); + json->push_back(hex[c >> 4]); + json->push_back(hex[c & 0x0f]); + } else { + json->push_back(static_cast<char>(c)); + } + break; + } + } + json->push_back('"'); +} + +template <typename T> +Status append_floating_json(T value, std::string* json) { + std::ostringstream oss; + oss << std::setprecision(std::numeric_limits<T>::max_digits10) << value; + json->append(oss.str()); + return Status::OK(); +} + +std::string int128_to_string(__int128
value) { + if (value == 0) { + return "0"; + } + bool negative = value < 0; + unsigned __int128 unsigned_value = negative ? static_cast<unsigned __int128>(-(value + 1)) + 1 + : static_cast<unsigned __int128>(value); + std::string digits; + while (unsigned_value > 0) { + digits.push_back(static_cast<char>('0' + unsigned_value % 10)); + unsigned_value /= 10; + } + if (negative) { + digits.push_back('-'); + } + std::reverse(digits.begin(), digits.end()); + return digits; +} + +void append_decimal_json(__int128 unscaled, int scale, std::string* json) { + std::string value = int128_to_string(unscaled); + bool negative = !value.empty() && value[0] == '-'; + std::string digits = negative ? value.substr(1) : value; + if (scale == 0) { + json->append(value); + return; + } + if (scale > 0) { + if (digits.size() <= static_cast<size_t>(scale)) { + digits.insert(0, static_cast<size_t>(scale) + 1 - digits.size(), '0'); + } + digits.insert(digits.end() - scale, '.'); + if (negative) { + json->push_back('-'); + } + json->append(digits); + return; + } + if (negative) { + json->push_back('-'); + } + json->append(digits); + json->append(static_cast<size_t>(-scale), '0'); +} + +Status decode_primitive(uint8_t primitive_header, const uint8_t* ptr, const uint8_t* end, + std::string* json, const uint8_t** next); +Status decode_value(const uint8_t* ptr, const uint8_t* end, const VariantMetadata& metadata, + std::string* json, const uint8_t** next); + +void append_uuid_json(const uint8_t* ptr, std::string* json) { + json->push_back('"'); + json->append(format_variant_uuid(ptr)); + json->push_back('"'); +} + +Status make_jsonb_field(std::string_view json, FieldWithDataType* value) { + JsonBinaryValue jsonb_value; + RETURN_IF_ERROR(jsonb_value.from_json_string(json.data(), json.size())); + value->field = + Field::create_field<TYPE_JSONB>(JsonbField(jsonb_value.value(), jsonb_value.size())); + value->base_scalar_type_id = TYPE_JSONB; + value->num_dimensions = 0; + value->precision = 0; + value->scale = 0; + return Status::OK(); +} + +std::string
make_null_array_json(size_t elements) { + std::string json = "["; + for (size_t i = 0; i < elements; ++i) { + if (i != 0) { + json.push_back(','); + } + json.append("null"); + } + json.push_back(']'); + return json; +} + +Status insert_empty_object_marker(const PathInData& path, VariantMap* values) { + FieldWithDataType value; + RETURN_IF_ERROR(make_jsonb_field("{}", &value)); + (*values)[path] = std::move(value); + return Status::OK(); +} + +Status parse_json_to_variant_map(std::string_view json, const PathInData& prefix, + VariantMap* values) { + auto parsed_column = ColumnVariant::create(0, false); + ParseConfig parse_config; + StringRef json_ref(json.data(), json.size()); + RETURN_IF_CATCH_EXCEPTION( + variant_util::parse_json_to_variant(*parsed_column, json_ref, nullptr, parse_config)); + Field parsed = (*parsed_column)[0]; + if (parsed.is_null()) { + (*values)[prefix] = FieldWithDataType {.field = Field()}; + return Status::OK(); + } + + PathInDataBuilder path; + path.append(prefix.get_parts(), false); + for (auto& [parsed_path, value] : parsed.get<VariantMap>()) { + path.append(parsed_path.get_parts(), false); + (*values)[path.build()] = std::move(value); + for (size_t i = 0; i < parsed_path.get_parts().size(); ++i) { + path.pop_back(); + } + } + return Status::OK(); +} + +void fill_field_type_info(FieldWithDataType* value) { + FieldInfo info; + variant_util::get_field_info(value->field, &info); + value->base_scalar_type_id = info.scalar_type_id; + value->num_dimensions = static_cast(info.num_dimensions); + value->precision = info.precision; + value->scale = info.scale; +} + +template <PrimitiveType T> +void set_primitive_variant_field(const typename PrimitiveTypeTraits<T>::CppType& data, + FieldWithDataType* value) { + value->field = Field::create_field<T>(data); + fill_field_type_info(value); +} + +Status read_decimal_primitive_field(uint8_t primitive_header, const uint8_t* ptr, + const uint8_t* end, FieldWithDataType* value, + const uint8_t** next) { + int value_size = 16; + if
(primitive_header == 8) { + value_size = 4; + } else if (primitive_header == 9) { + value_size = 8; + } + RETURN_IF_ERROR(require_available(ptr, end, 1 + value_size, "decimal value")); + int scale = static_cast<int>(*ptr++); + if (scale < 0 || scale > BeConsts::MAX_DECIMAL128_PRECISION) { + return Status::Corruption("Invalid Parquet VARIANT decimal scale {}", scale); + } + + if (primitive_header == 8) { + set_primitive_variant_field<TYPE_DECIMAL32>( + Decimal32(static_cast<int32_t>(read_signed_le(ptr, value_size))), value); + value->precision = BeConsts::MAX_DECIMAL32_PRECISION; + } else if (primitive_header == 9) { + set_primitive_variant_field<TYPE_DECIMAL64>( + Decimal64(static_cast<int64_t>(read_signed_le(ptr, value_size))), value); + value->precision = BeConsts::MAX_DECIMAL64_PRECISION; + } else { + set_primitive_variant_field<TYPE_DECIMAL128I>(Decimal128V3(read_signed_int128_le(ptr)), + value); + value->precision = BeConsts::MAX_DECIMAL128_PRECISION; + } + value->scale = scale; + *next = ptr + value_size; + return Status::OK(); +} + +Status read_integral_primitive_field(uint8_t primitive_header, const uint8_t* ptr, + const uint8_t* end, FieldWithDataType* value, + const uint8_t** next) { + int value_size = 8; + if (primitive_header == 3) { + value_size = 1; + } else if (primitive_header == 4) { + value_size = 2; + } else if (primitive_header == 5 || primitive_header == 11) { + value_size = 4; + } + RETURN_IF_ERROR(require_available(ptr, end, value_size, "integer value")); + const auto data = static_cast<int64_t>(read_signed_le(ptr, value_size)); + + switch (primitive_header) { + case 3: + set_primitive_variant_field<TYPE_TINYINT>(static_cast<int8_t>(data), value); + break; + case 4: + set_primitive_variant_field<TYPE_SMALLINT>(static_cast<int16_t>(data), value); + break; + case 5: + set_primitive_variant_field<TYPE_INT>(static_cast<int32_t>(data), value); + break; + case 6: + case 11: + case 12: + case 13: + case 17: + set_primitive_variant_field<TYPE_BIGINT>(data, value); + break; + case 18: + case 19: + set_primitive_variant_field<TYPE_BIGINT>(data / 1000, value); + break; + default: + return Status::Corruption("Unsupported
Parquet VARIANT primitive header {}", + primitive_header); + } + *next = ptr + value_size; + return Status::OK(); +} + +Status read_floating_primitive_field(uint8_t primitive_header, const uint8_t* ptr, + const uint8_t* end, FieldWithDataType* value, + const uint8_t** next) { + if (primitive_header == 14) { + RETURN_IF_ERROR(require_available(ptr, end, 4, "float value")); + auto bits = static_cast<uint32_t>(read_unsigned_le(ptr, 4)); + float data; + std::memcpy(&data, &bits, sizeof(data)); + set_primitive_variant_field<TYPE_FLOAT>(data, value); + *next = ptr + 4; + return Status::OK(); + } + + DCHECK_EQ(primitive_header, 7); + RETURN_IF_ERROR(require_available(ptr, end, 8, "double value")); + uint64_t bits = read_unsigned_le(ptr, 8); + double data; + std::memcpy(&data, &bits, sizeof(data)); + set_primitive_variant_field<TYPE_DOUBLE>(data, value); + *next = ptr + 8; + return Status::OK(); +} + +Status read_binary_primitive_field(const uint8_t* ptr, const uint8_t* end, FieldWithDataType* value, + std::deque<std::string>* string_values, const uint8_t** next) { + RETURN_IF_ERROR(require_available(ptr, end, 4, "binary length")); + uint64_t size = read_unsigned_le(ptr, 4); + ptr += 4; + RETURN_IF_ERROR(require_available(ptr, end, size, "binary value")); + string_values->emplace_back(reinterpret_cast<const char*>(ptr), static_cast<size_t>(size)); + value->field = Field::create_field<TYPE_STRING>(StringView(string_values->back())); + fill_field_type_info(value); + *next = ptr + size; + return Status::OK(); +} + +Status read_string_primitive_field(const uint8_t* ptr, const uint8_t* end, FieldWithDataType* value, + const uint8_t** next) { + RETURN_IF_ERROR(require_available(ptr, end, 4, "binary or string length")); + uint64_t size = read_unsigned_le(ptr, 4); + ptr += 4; + RETURN_IF_ERROR(require_available(ptr, end, size, "string value")); + std::string_view data(reinterpret_cast<const char*>(ptr), static_cast<size_t>(size)); + RETURN_IF_ERROR(require_valid_utf8(data, "string value")); + value->field = Field::create_field<TYPE_STRING>(String(data)); + fill_field_type_info(value); + *next =
ptr + size; + return Status::OK(); +} + +Status read_uuid_primitive_field(const uint8_t* ptr, const uint8_t* end, FieldWithDataType* value, + const uint8_t** next) { + RETURN_IF_ERROR(require_available(ptr, end, 16, "uuid value")); + value->field = Field::create_field<TYPE_STRING>(format_variant_uuid(ptr)); + fill_field_type_info(value); + *next = ptr + 16; + return Status::OK(); +} + +Status read_array_layout(uint8_t value_header, const uint8_t* ptr, const uint8_t* end, + VariantArrayLayout* layout) { + int field_offset_size = (value_header & 0x03) + 1; + int num_elements_size = (value_header & 0x04) != 0 ? 4 : 1; + + RETURN_IF_ERROR(require_available(ptr, end, num_elements_size, "array element count")); + uint64_t num_elements = read_unsigned_le(ptr, num_elements_size); + ptr += num_elements_size; + + RETURN_IF_ERROR(require_available_entries(ptr, end, num_elements + 1, field_offset_size, + "array field offsets")); + layout->field_offsets.resize(num_elements + 1); + for (uint64_t i = 0; i <= num_elements; ++i) { + layout->field_offsets[i] = read_unsigned_le(ptr, field_offset_size); + ptr += field_offset_size; + } + + layout->total_size = layout->field_offsets.back(); + layout->fields = ptr; + RETURN_IF_ERROR( + require_available(layout->fields, end, layout->total_size, "array field values")); + RETURN_IF_ERROR( + validate_array_field_offsets(layout->field_offsets, layout->total_size, "array")); + return Status::OK(); +} + +Status read_object_layout(uint8_t value_header, const uint8_t* ptr, const uint8_t* end, + const VariantMetadata& metadata, VariantObjectLayout* layout) { + int field_offset_size = (value_header & 0x03) + 1; + int field_id_size = ((value_header >> 2) & 0x03) + 1; + int num_elements_size = (value_header & 0x10) != 0 ?
4 : 1; + + RETURN_IF_ERROR(require_available(ptr, end, num_elements_size, "object element count")); + uint64_t num_elements = read_unsigned_le(ptr, num_elements_size); + ptr += num_elements_size; + + RETURN_IF_ERROR( + require_available_entries(ptr, end, num_elements, field_id_size, "object field ids")); + layout->field_ids.resize(num_elements); + for (uint64_t i = 0; i < num_elements; ++i) { + layout->field_ids[i] = read_unsigned_le(ptr, field_id_size); + ptr += field_id_size; + if (layout->field_ids[i] >= metadata.dictionary.size()) { + return Status::Corruption("Invalid Parquet VARIANT object field id {}", + layout->field_ids[i]); + } + if (i > 0 && !variant_string_less(metadata.dictionary[layout->field_ids[i - 1]], + metadata.dictionary[layout->field_ids[i]])) { + return Status::Corruption("Invalid Parquet VARIANT object field names"); + } + } + + RETURN_IF_ERROR(require_available_entries(ptr, end, num_elements + 1, field_offset_size, + "object field offsets")); + layout->field_offsets.resize(num_elements + 1); + for (uint64_t i = 0; i <= num_elements; ++i) { + layout->field_offsets[i] = read_unsigned_le(ptr, field_offset_size); + ptr += field_offset_size; + } + + layout->total_size = layout->field_offsets.back(); + layout->fields = ptr; + RETURN_IF_ERROR( + require_available(layout->fields, end, layout->total_size, "object field values")); + RETURN_IF_ERROR(compute_object_field_ends(layout->field_offsets, layout->total_size, + &layout->field_ends)); + return Status::OK(); +} + +Status decode_value_to_variant_map(const uint8_t* ptr, const uint8_t* end, + const VariantMetadata& metadata, PathInDataBuilder* path, + VariantMap* values, std::deque<std::string>* string_values, + const uint8_t** next); + +Status decode_primitive_to_variant_map(uint8_t primitive_header, const uint8_t* ptr, + const uint8_t* end, const VariantMetadata&, + PathInDataBuilder* path, VariantMap* values, + std::deque<std::string>* string_values, + const uint8_t** next) { + FieldWithDataType value; + switch
(primitive_header) { + case 0: + value.field = Field(); + value.base_scalar_type_id = INVALID_TYPE; + *next = ptr; + break; + case 1: + set_primitive_variant_field<TYPE_BOOLEAN>(true, &value); + *next = ptr; + break; + case 2: + set_primitive_variant_field<TYPE_BOOLEAN>(false, &value); + *next = ptr; + break; + case 3: + case 4: + case 5: + case 6: + case 11: + case 12: + case 13: + case 17: + case 18: + case 19: + RETURN_IF_ERROR(read_integral_primitive_field(primitive_header, ptr, end, &value, next)); + break; + case 7: + case 14: + RETURN_IF_ERROR(read_floating_primitive_field(primitive_header, ptr, end, &value, next)); + break; + case 8: + case 9: + case 10: + RETURN_IF_ERROR(read_decimal_primitive_field(primitive_header, ptr, end, &value, next)); + break; + case 15: + RETURN_IF_ERROR(read_binary_primitive_field(ptr, end, &value, string_values, next)); + break; + case 16: + RETURN_IF_ERROR(read_string_primitive_field(ptr, end, &value, next)); + break; + case 20: + RETURN_IF_ERROR(read_uuid_primitive_field(ptr, end, &value, next)); + break; + default: + return Status::Corruption("Unsupported Parquet VARIANT primitive header {}", + primitive_header); + } + (*values)[path->build()] = std::move(value); + return Status::OK(); +} + +Status decode_object_to_variant_map(uint8_t value_header, const uint8_t* ptr, const uint8_t* end, + const VariantMetadata& metadata, PathInDataBuilder* path, + VariantMap* values, std::deque<std::string>* string_values, + const uint8_t** next) { + VariantObjectLayout layout; + RETURN_IF_ERROR(read_object_layout(value_header, ptr, end, metadata, &layout)); + + if (layout.field_ids.empty()) { + RETURN_IF_ERROR(insert_empty_object_marker(path->build(), values)); + } + + for (uint64_t i = 0; i < layout.field_ids.size(); ++i) { + const uint8_t* child_begin = layout.fields + layout.field_offsets[i]; + const uint8_t* child_end = layout.fields + layout.field_ends[i]; + const uint8_t* child_next = nullptr; + path->append(metadata.dictionary[layout.field_ids[i]], false); +
RETURN_IF_ERROR(decode_value_to_variant_map(child_begin, child_end, metadata, path, values, + string_values, &child_next)); + path->pop_back(); + if (child_next != child_end) { + return Status::Corruption("Invalid Parquet VARIANT object child value length"); + } + } + *next = layout.fields + layout.total_size; + return Status::OK(); +} + +void move_variant_map_to_field(VariantMap&& element_values, FieldWithDataType* value) { + if (element_values.size() == 1 && element_values.begin()->first.empty()) { + *value = std::move(element_values.begin()->second); + return; + } + value->field = Field::create_field<TYPE_VARIANT>(std::move(element_values)); + fill_field_type_info(value); +} + +Status decode_array_element_to_field(const uint8_t* ptr, const uint8_t* end, + const VariantMetadata& metadata, FieldWithDataType* value, + std::deque<std::string>* string_values, const uint8_t** next) { + RETURN_IF_ERROR(require_available(ptr, end, 1, "array child value")); + const uint8_t value_metadata = *ptr++; + const uint8_t basic_type = value_metadata & 0x03; + const uint8_t value_header = value_metadata >> 2; + + if (basic_type == 0) { + VariantMap element_values; + PathInDataBuilder element_path; + RETURN_IF_ERROR(decode_primitive_to_variant_map(value_header, ptr, end, metadata, + &element_path, &element_values, + string_values, next)); + move_variant_map_to_field(std::move(element_values), value); + return Status::OK(); + } + + if (basic_type == 1) { + const size_t size = value_header; + RETURN_IF_ERROR(require_available(ptr, end, size, "short string value")); + std::string_view data(reinterpret_cast<const char*>(ptr), size); + RETURN_IF_ERROR(require_valid_utf8(data, "short string value")); + value->field = Field::create_field<TYPE_STRING>(String(data)); + fill_field_type_info(value); + *next = ptr + size; + return Status::OK(); + } + + if (basic_type == 2 || basic_type == 3) { + VariantMap element_values; + PathInDataBuilder element_path; + RETURN_IF_ERROR(decode_value_to_variant_map(ptr - 1, end, metadata, &element_path, +
&element_values, string_values, next)); + move_variant_map_to_field(std::move(element_values), value); + return Status::OK(); + } + + std::string json; + RETURN_IF_ERROR(decode_value(ptr - 1, end, metadata, &json, next)); + VariantMap element_values; + RETURN_IF_ERROR(parse_json_to_variant_map(json, PathInData(), &element_values)); + move_variant_map_to_field(std::move(element_values), value); + return Status::OK(); +} + +Status decode_array_to_variant_map(uint8_t value_header, const uint8_t* ptr, const uint8_t* end, + const VariantMetadata& metadata, PathInDataBuilder* path, + VariantMap* values, std::deque<std::string>* string_values, + const uint8_t** next) { + VariantArrayLayout layout; + RETURN_IF_ERROR(read_array_layout(value_header, ptr, end, &layout)); + + Array array; + array.reserve(layout.field_offsets.size() - 1); + for (uint64_t i = 0; i + 1 < layout.field_offsets.size(); ++i) { + const uint8_t* child_begin = layout.fields + layout.field_offsets[i]; + const uint8_t* child_end = layout.fields + layout.field_offsets[i + 1]; + const uint8_t* child_next = nullptr; + FieldWithDataType child; + RETURN_IF_ERROR(decode_array_element_to_field(child_begin, child_end, metadata, &child, + string_values, &child_next)); + if (child_next != child_end) { + return Status::Corruption("Invalid Parquet VARIANT array child value length"); + } + array.push_back(std::move(child.field)); + } + + FieldWithDataType value; + const size_t elements = array.size(); + value.field = Field::create_field<TYPE_ARRAY>(std::move(array)); + fill_field_type_info(&value); + if (value.base_scalar_type_id == INVALID_TYPE) { + RETURN_IF_ERROR(make_jsonb_field(make_null_array_json(elements), &value)); + } + (*values)[path->build()] = std::move(value); + *next = layout.fields + layout.total_size; + return Status::OK(); +} + +Status decode_value_to_variant_map(const uint8_t* ptr, const uint8_t* end, + const VariantMetadata& metadata, PathInDataBuilder* path, + VariantMap* values, std::deque<std::string>* string_values, + const
uint8_t** next) { + RETURN_IF_ERROR(require_available(ptr, end, 1, "value")); + uint8_t value_metadata = *ptr++; + uint8_t basic_type = value_metadata & 0x03; + uint8_t value_header = value_metadata >> 2; + + switch (basic_type) { + case 0: + return decode_primitive_to_variant_map(value_header, ptr, end, metadata, path, values, + string_values, next); + case 2: + return decode_object_to_variant_map(value_header, ptr, end, metadata, path, values, + string_values, next); + case 1: + [[fallthrough]]; + case 3: { + if (basic_type == 3) { + Status array_st = decode_array_to_variant_map(value_header, ptr, end, metadata, path, + values, string_values, next); + if (array_st.ok()) { + return array_st; + } + if (!array_st.is()) { + return array_st; + } + } + std::string json; + RETURN_IF_ERROR(decode_value(ptr - 1, end, metadata, &json, next)); + return parse_json_to_variant_map(json, path->build(), values); + } + default: + return Status::Corruption("Unsupported Parquet VARIANT basic type {}", basic_type); + } +} + +Status decode_metadata(const StringRef& metadata, VariantMetadata* result) { + const auto* ptr = reinterpret_cast<const uint8_t*>(metadata.data); + const auto* end = ptr + metadata.size; + RETURN_IF_ERROR(require_available(ptr, end, 1, "metadata")); + uint8_t header = *ptr++; + uint8_t version = header & 0x0f; + if (version != 1) { + return Status::Corruption("Unsupported Parquet VARIANT metadata version {}", version); + } + if ((header & 0x20) != 0) { + return Status::Corruption("Invalid Parquet VARIANT metadata header {}", header); + } + const bool sorted_strings = (header & 0x10) != 0; + int offset_size = ((header >> 6) & 0x03) + 1; + RETURN_IF_ERROR(require_available(ptr, end, offset_size, "metadata dictionary size")); + uint64_t dictionary_size = read_unsigned_le(ptr, offset_size); + ptr += offset_size; + + RETURN_IF_ERROR(require_available_entries(ptr, end, dictionary_size + 1, offset_size, + "metadata dictionary offsets")); + std::vector<uint64_t> offsets(dictionary_size + 1); +
for (uint64_t i = 0; i <= dictionary_size; ++i) { + offsets[i] = read_unsigned_le(ptr, offset_size); + ptr += offset_size; + if (i > 0 && offsets[i] < offsets[i - 1]) { + return Status::Corruption("Invalid Parquet VARIANT metadata dictionary offsets"); + } + } + if (offsets.front() != 0) { + return Status::Corruption("Invalid Parquet VARIANT metadata dictionary offsets"); + } + + RETURN_IF_ERROR(require_available(ptr, end, offsets.back(), "metadata dictionary bytes")); + if (ptr + offsets.back() != end) { + return Status::Corruption("Invalid Parquet VARIANT metadata dictionary bytes"); + } + result->dictionary.clear(); + result->dictionary.reserve(dictionary_size); + for (uint64_t i = 0; i < dictionary_size; ++i) { + std::string entry(reinterpret_cast<const char*>(ptr + offsets[i]), + offsets[i + 1] - offsets[i]); + RETURN_IF_ERROR(require_valid_utf8(entry, "metadata dictionary")); + if (sorted_strings && !result->dictionary.empty() && + !variant_string_less(result->dictionary.back(), entry)) { + return Status::Corruption("Invalid Parquet VARIANT sorted metadata dictionary key"); + } + result->dictionary.emplace_back(std::move(entry)); + } + return Status::OK(); +} + +// NOLINTNEXTLINE(readability-function-cognitive-complexity, readability-function-size): VARIANT primitive tags are a compact spec switch.
+Status decode_primitive(uint8_t primitive_header, const uint8_t* ptr, const uint8_t* end,
+                        std::string* json, const uint8_t** next) {
+    switch (primitive_header) {
+    case 0:
+        json->append("null");
+        *next = ptr;
+        return Status::OK();
+    case 1:
+        json->append("true");
+        *next = ptr;
+        return Status::OK();
+    case 2:
+        json->append("false");
+        *next = ptr;
+        return Status::OK();
+    case 3:
+        RETURN_IF_ERROR(require_available(ptr, end, 1, "int8 value"));
+        json->append(std::to_string(static_cast<int8_t>(*ptr)));
+        *next = ptr + 1;
+        return Status::OK();
+    case 4:
+        RETURN_IF_ERROR(require_available(ptr, end, 2, "int16 value"));
+        json->append(std::to_string(read_signed_le(ptr, 2)));
+        *next = ptr + 2;
+        return Status::OK();
+    case 5:
+        RETURN_IF_ERROR(require_available(ptr, end, 4, "int32 value"));
+        json->append(std::to_string(read_signed_le(ptr, 4)));
+        *next = ptr + 4;
+        return Status::OK();
+    case 6:
+        RETURN_IF_ERROR(require_available(ptr, end, 8, "int64 value"));
+        json->append(std::to_string(read_signed_le(ptr, 8)));
+        *next = ptr + 8;
+        return Status::OK();
+    case 7: {
+        RETURN_IF_ERROR(require_available(ptr, end, 8, "double value"));
+        uint64_t bits = read_unsigned_le(ptr, 8);
+        double value;
+        std::memcpy(&value, &bits, sizeof(value));
+        RETURN_IF_ERROR(append_floating_json(value, json));
+        *next = ptr + 8;
+        return Status::OK();
+    }
+    case 8:
+    case 9:
+    case 10: {
+        int value_size = 16;
+        if (primitive_header == 8) {
+            value_size = 4;
+        } else if (primitive_header == 9) {
+            value_size = 8;
+        }
+        RETURN_IF_ERROR(require_available(ptr, end, 1 + value_size, "decimal value"));
+        int scale = static_cast<int8_t>(*ptr++);
+        if (scale < 0 || scale > 38) {
+            return Status::Corruption("Invalid Parquet VARIANT decimal scale {}", scale);
+        }
+        __int128 unscaled = 0;
+        if (value_size == 16) {
+            unscaled = read_signed_int128_le(ptr);
+        } else {
+            unscaled = read_signed_le(ptr, value_size);
+        }
+        append_decimal_json(unscaled, scale, json);
+        *next = ptr + value_size;
+        return Status::OK();
+    }
+    case 11:
+        RETURN_IF_ERROR(require_available(ptr, end, 4, "date value"));
+        json->append(std::to_string(read_signed_le(ptr, 4)));
+        *next = ptr + 4;
+        return Status::OK();
+    case 12:
+    case 13:
+    case 17:
+        RETURN_IF_ERROR(require_available(ptr, end, 8, "time or timestamp value"));
+        json->append(std::to_string(read_signed_le(ptr, 8)));
+        *next = ptr + 8;
+        return Status::OK();
+    case 18:
+    case 19:
+        RETURN_IF_ERROR(require_available(ptr, end, 8, "nanosecond timestamp value"));
+        json->append(std::to_string(read_signed_le(ptr, 8) / 1000));
+        *next = ptr + 8;
+        return Status::OK();
+    case 14: {
+        RETURN_IF_ERROR(require_available(ptr, end, 4, "float value"));
+        auto bits = static_cast<uint32_t>(read_unsigned_le(ptr, 4));
+        float value;
+        std::memcpy(&value, &bits, sizeof(value));
+        RETURN_IF_ERROR(append_floating_json(value, json));
+        *next = ptr + 4;
+        return Status::OK();
+    }
+    case 15: {
+        RETURN_IF_ERROR(require_available(ptr, end, 4, "binary length"));
+        uint64_t size = read_unsigned_le(ptr, 4);
+        ptr += 4;
+        RETURN_IF_ERROR(require_available(ptr, end, size, "binary value"));
+        std::string_view value(reinterpret_cast<const char*>(ptr), static_cast<size_t>(size));
+        append_json_string(value, json, true);
+        *next = ptr + size;
+        return Status::OK();
+    }
+    case 16: {
+        RETURN_IF_ERROR(require_available(ptr, end, 4, "binary or string length"));
+        uint64_t size = read_unsigned_le(ptr, 4);
+        ptr += 4;
+        RETURN_IF_ERROR(require_available(ptr, end, size, "string value"));
+        std::string_view value(reinterpret_cast<const char*>(ptr), static_cast<size_t>(size));
+        RETURN_IF_ERROR(require_valid_utf8(value, "string value"));
+        append_json_string(value, json);
+        *next = ptr + size;
+        return Status::OK();
+    }
+    case 20:
+        RETURN_IF_ERROR(require_available(ptr, end, 16, "uuid value"));
+        append_uuid_json(ptr, json);
+        *next = ptr + 16;
+        return Status::OK();
+    default:
+        return Status::Corruption("Unsupported Parquet VARIANT primitive header {}",
+                                  primitive_header);
+    }
+}
+
+Status decode_object(uint8_t value_header, const uint8_t* ptr, const uint8_t* end,
+                     const VariantMetadata& metadata, std::string* json, const uint8_t** next) {
+    int field_offset_size = (value_header & 0x03) + 1;
+    int field_id_size = ((value_header >> 2) & 0x03) + 1;
+    int num_elements_size = (value_header & 0x10) != 0 ? 4 : 1;
+
+    RETURN_IF_ERROR(require_available(ptr, end, num_elements_size, "object element count"));
+    uint64_t num_elements = read_unsigned_le(ptr, num_elements_size);
+    ptr += num_elements_size;
+
+    RETURN_IF_ERROR(
+            require_available_entries(ptr, end, num_elements, field_id_size, "object field ids"));
+    std::vector<uint64_t> field_ids(num_elements);
+    for (uint64_t i = 0; i < num_elements; ++i) {
+        field_ids[i] = read_unsigned_le(ptr, field_id_size);
+        ptr += field_id_size;
+        if (field_ids[i] >= metadata.dictionary.size()) {
+            return Status::Corruption("Invalid Parquet VARIANT object field id {}", field_ids[i]);
+        }
+        if (i > 0 && !variant_string_less(metadata.dictionary[field_ids[i - 1]],
+                                          metadata.dictionary[field_ids[i]])) {
+            return Status::Corruption("Invalid Parquet VARIANT object field names");
+        }
+    }
+
+    RETURN_IF_ERROR(require_available_entries(ptr, end, num_elements + 1, field_offset_size,
+                                              "object field offsets"));
+    std::vector<uint64_t> field_offsets(num_elements + 1);
+    for (uint64_t i = 0; i <= num_elements; ++i) {
+        field_offsets[i] = read_unsigned_le(ptr, field_offset_size);
+        ptr += field_offset_size;
+    }
+
+    uint64_t total_size = field_offsets.back();
+    const uint8_t* fields = ptr;
+    RETURN_IF_ERROR(require_available(fields, end, total_size, "object field values"));
+    std::vector<uint64_t> field_ends;
+    RETURN_IF_ERROR(compute_object_field_ends(field_offsets, total_size, &field_ends));
+
+    json->push_back('{');
+    for (uint64_t i = 0; i < num_elements; ++i) {
+        if (i != 0) {
+            json->push_back(',');
+        }
+        append_json_string(metadata.dictionary[field_ids[i]], json);
+        json->push_back(':');
+        const uint8_t* child_begin = fields + field_offsets[i];
+        const uint8_t* child_end = fields + field_ends[i];
+        const uint8_t* child_next = nullptr;
+        RETURN_IF_ERROR(decode_value(child_begin, child_end, metadata, json, &child_next));
+        if (child_next != child_end) {
+            return Status::Corruption("Invalid Parquet VARIANT object child value length");
+        }
+    }
+    json->push_back('}');
+    *next = fields + total_size;
+    return Status::OK();
+}
+
+Status decode_array(uint8_t value_header, const uint8_t* ptr, const uint8_t* end,
+                    const VariantMetadata& metadata, std::string* json, const uint8_t** next) {
+    VariantArrayLayout layout;
+    RETURN_IF_ERROR(read_array_layout(value_header, ptr, end, &layout));
+
+    json->push_back('[');
+    for (uint64_t i = 0; i + 1 < layout.field_offsets.size(); ++i) {
+        if (i != 0) {
+            json->push_back(',');
+        }
+        const uint8_t* child_begin = layout.fields + layout.field_offsets[i];
+        const uint8_t* child_end = layout.fields + layout.field_offsets[i + 1];
+        const uint8_t* child_next = nullptr;
+        RETURN_IF_ERROR(decode_value(child_begin, child_end, metadata, json, &child_next));
+        if (child_next != child_end) {
+            return Status::Corruption("Invalid Parquet VARIANT array child value length");
+        }
+    }
+    json->push_back(']');
+    *next = layout.fields + layout.total_size;
+    return Status::OK();
+}
+
+Status decode_value(const uint8_t* ptr, const uint8_t* end, const VariantMetadata& metadata,
+                    std::string* json, const uint8_t** next) {
+    RETURN_IF_ERROR(require_available(ptr, end, 1, "value"));
+    uint8_t value_metadata = *ptr++;
+    uint8_t basic_type = value_metadata & 0x03;
+    uint8_t value_header = value_metadata >> 2;
+
+    switch (basic_type) {
+    case 0:
+        return decode_primitive(value_header, ptr, end, json, next);
+    case 1: {
+        size_t size = value_header;
+        RETURN_IF_ERROR(require_available(ptr, end, size, "short string value"));
+        std::string_view value(reinterpret_cast<const char*>(ptr), static_cast<size_t>(size));
+        RETURN_IF_ERROR(require_valid_utf8(value, "short string value"));
+        append_json_string(value, json);
+        *next = ptr + size;
+        return Status::OK();
+    }
+    case 2:
+        return decode_object(value_header, ptr, end, metadata, json, next);
+    case 3:
+        return decode_array(value_header, ptr, end, metadata, json, next);
+    default:
+        return Status::Corruption("Unsupported Parquet VARIANT basic type {}", basic_type);
+    }
+}
+
+} // namespace
+
+Status decode_variant_to_json(const StringRef& metadata, const StringRef& value,
+                              std::string* json) {
+    VariantMetadata decoded_metadata;
+    RETURN_IF_ERROR(decode_metadata(metadata, &decoded_metadata));
+    json->clear();
+    const auto* ptr = reinterpret_cast<const uint8_t*>(value.data);
+    const auto* end = ptr + value.size;
+    const uint8_t* next = nullptr;
+    RETURN_IF_ERROR(decode_value(ptr, end, decoded_metadata, json, &next));
+    if (next != end) {
+        return Status::Corruption("Invalid Parquet VARIANT value has {} trailing bytes",
+                                  end - next);
+    }
+    return Status::OK();
+}
+
+Status decode_variant_to_variant_map(const StringRef& metadata, const StringRef& value,
+                                     const PathInData& prefix, VariantMap* values,
+                                     std::deque* string_values) {
+    VariantMetadata decoded_metadata;
+    RETURN_IF_ERROR(decode_metadata(metadata, &decoded_metadata));
+    const auto* ptr = reinterpret_cast<const uint8_t*>(value.data);
+    const auto* end = ptr + value.size;
+    const uint8_t* next = nullptr;
+    PathInDataBuilder path;
+    path.append(prefix.get_parts(), false);
+    RETURN_IF_ERROR(decode_value_to_variant_map(ptr, end, decoded_metadata, &path, values,
+                                                string_values, &next));
+    if (next != end) {
+        return Status::Corruption("Invalid Parquet VARIANT value has {} trailing bytes",
+                                  end - next);
+    }
+    return Status::OK();
+}
+
+} // namespace doris::parquet
diff --git a/be/src/format/parquet/parquet_variant_reader.h b/be/src/format/parquet/parquet_variant_reader.h
new file mode 100644
index 00000000000000..8289113f5fc963
--- /dev/null
+++ b/be/src/format/parquet/parquet_variant_reader.h
@@ -0,0 +1,38 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.
See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#pragma once + +#include +#include +#include + +#include "common/status.h" +#include "core/field.h" +#include "core/string_ref.h" + +namespace doris::parquet { + +std::string format_variant_uuid(const uint8_t* ptr); + +Status decode_variant_to_json(const StringRef& metadata, const StringRef& value, std::string* json); + +Status decode_variant_to_variant_map(const StringRef& metadata, const StringRef& value, + const PathInData& prefix, VariantMap* values, + std::deque* string_values); + +} // namespace doris::parquet diff --git a/be/src/format/parquet/schema_desc.cpp b/be/src/format/parquet/schema_desc.cpp index 972ce6f969b74c..61684ceb13d12c 100644 --- a/be/src/format/parquet/schema_desc.cpp +++ b/be/src/format/parquet/schema_desc.cpp @@ -17,8 +17,6 @@ #include "format/parquet/schema_desc.h" -#include - #include #include #include @@ -29,6 +27,7 @@ #include "core/data_type/data_type_factory.hpp" #include "core/data_type/data_type_map.h" #include "core/data_type/data_type_struct.h" +#include "core/data_type/data_type_variant.h" #include "core/data_type/define_primitive_type.h" #include "format/generic_reader.h" #include "format/table/table_schema_change_helper.h" @@ -66,6 +65,23 @@ static bool is_optional_node(const tparquet::SchemaElement& schema) { schema.repetition_type 
== tparquet::FieldRepetitionType::OPTIONAL; } +static bool is_variant_node(const tparquet::SchemaElement& schema) { + return schema.__isset.logicalType && schema.logicalType.__isset.VARIANT; +} + +static void mark_variant_subfields(FieldSchema* field) { + field->is_in_variant = true; + for (auto& child : field->children) { + mark_variant_subfields(&child); + } +} + +static bool is_unannotated_binary_field(const FieldSchema& field) { + return field.physical_type == tparquet::Type::BYTE_ARRAY && + !field.parquet_schema.__isset.logicalType && + !field.parquet_schema.__isset.converted_type; +} + static int num_children_node(const tparquet::SchemaElement& schema) { return schema.__isset.num_children ? schema.num_children : 0; } @@ -305,7 +321,8 @@ std::pair FieldDescriptor::convert_to_doris_type( } } } else if (logicalType.__isset.TIME) { - ans.first = DataTypeFactory::instance().create_data_type(TYPE_TIMEV2, nullable); + ans.first = DataTypeFactory::instance().create_data_type( + TYPE_TIMEV2, nullable, 0, logicalType.TIME.unit.__isset.MILLIS ? 
3 : 6); } else if (logicalType.__isset.TIMESTAMP) { if (_enable_mapping_timestamp_tz) { if (logicalType.TIMESTAMP.isAdjustedToUTC) { @@ -351,9 +368,10 @@ std::pair FieldDescriptor::convert_to_doris_type( ans.first = DataTypeFactory::instance().create_data_type(TYPE_DATEV2, nullable); break; case tparquet::ConvertedType::type::TIME_MILLIS: - [[fallthrough]]; + ans.first = DataTypeFactory::instance().create_data_type(TYPE_TIMEV2, nullable, 0, 3); + break; case tparquet::ConvertedType::type::TIME_MICROS: - ans.first = DataTypeFactory::instance().create_data_type(TYPE_TIMEV2, nullable); + ans.first = DataTypeFactory::instance().create_data_type(TYPE_TIMEV2, nullable, 0, 6); break; case tparquet::ConvertedType::type::TIMESTAMP_MILLIS: ans.first = DataTypeFactory::instance().create_data_type(TYPE_DATETIMEV2, nullable, 0, 3); @@ -398,7 +416,10 @@ std::pair FieldDescriptor::convert_to_doris_type( Status FieldDescriptor::parse_group_field(const std::vector& t_schemas, size_t curr_pos, FieldSchema* group_field) { - auto& group_schema = t_schemas[curr_pos]; + const auto& group_schema = t_schemas[curr_pos]; + if (is_variant_node(group_schema)) { + return parse_variant_field(t_schemas, curr_pos, group_field); + } if (is_map_node(group_schema)) { // the map definition: // optional group (MAP) { @@ -446,6 +467,67 @@ Status FieldDescriptor::parse_group_field(const std::vector& t_schemas, + size_t curr_pos, FieldSchema* variant_field) { + RETURN_IF_ERROR(parse_struct_field(t_schemas, curr_pos, variant_field)); + + bool has_metadata = false; + bool metadata_required = false; + bool has_value = false; + bool has_typed_value = false; + for (const auto& child : variant_field->children) { + if (child.lower_case_name == "metadata") { + if (has_metadata) { + return Status::InvalidArgument( + "Parquet VARIANT field '{}' has duplicate metadata child", + variant_field->name); + } + if (!is_unannotated_binary_field(child)) { + return Status::InvalidArgument( + "Parquet VARIANT field '{}' 
metadata child must be unannotated binary", + variant_field->name); + } + has_metadata = true; + metadata_required = !child.data_type->is_nullable(); + } else if (child.lower_case_name == "value") { + if (has_value) { + return Status::InvalidArgument( + "Parquet VARIANT field '{}' has duplicate value child", + variant_field->name); + } + if (!is_unannotated_binary_field(child)) { + return Status::InvalidArgument( + "Parquet VARIANT field '{}' value child must be unannotated binary", + variant_field->name); + } + has_value = true; + } else if (child.lower_case_name == "typed_value") { + if (has_typed_value) { + return Status::InvalidArgument( + "Parquet VARIANT field '{}' has duplicate typed_value child", + variant_field->name); + } + has_typed_value = true; + } else { + return Status::InvalidArgument("Parquet VARIANT field '{}' has unexpected child '{}'", + variant_field->name, child.name); + } + } + if (!has_metadata || !metadata_required || (!has_value && !has_typed_value)) { + return Status::InvalidArgument( + "Parquet VARIANT field '{}' must contain required binary metadata and at least one " + "binary value or typed_value field", + variant_field->name); + } + + variant_field->data_type = std::make_shared(0, false); + if (is_optional_node(t_schemas[curr_pos])) { + variant_field->data_type = make_nullable(variant_field->data_type); + } + mark_variant_subfields(variant_field); + return Status::OK(); +} + Status FieldDescriptor::parse_list_field(const std::vector& t_schemas, size_t curr_pos, FieldSchema* list_field) { // the list definition: @@ -641,6 +723,32 @@ FieldSchema* FieldDescriptor::get_column(const std::string& name) const { return nullptr; } +namespace { + +void collect_physical_fields(FieldSchema* field, std::vector* physical_fields) { + if (field->children.empty()) { + if (field->physical_column_index >= 0) { + field->physical_column_index = cast_set(physical_fields->size()); + physical_fields->push_back(field); + } + return; + } + for (auto& child : 
field->children) { + collect_physical_fields(&child, physical_fields); + } +} + +} // namespace + +void FieldDescriptor::rebuild_indexes() { + _physical_fields.clear(); + _name_to_field.clear(); + for (auto& field : _fields) { + _name_to_field.emplace(field.name, &field); + collect_physical_fields(&field, &_physical_fields); + } +} + void FieldDescriptor::get_column_names(std::unordered_set* names) const { names->clear(); for (const FieldSchema& f : _fields) { @@ -668,6 +776,13 @@ void FieldDescriptor::assign_ids() { } } +FieldDescriptor FieldDescriptor::copy_with_assigned_ids() const { + FieldDescriptor copy = *this; + copy.rebuild_indexes(); + copy.assign_ids(); + return copy; +} + const FieldSchema* FieldDescriptor::find_column_by_id(uint64_t column_id) const { for (const auto& field : _fields) { if (auto result = field.find_column_by_id(column_id)) { diff --git a/be/src/format/parquet/schema_desc.h b/be/src/format/parquet/schema_desc.h index 544050a8516ea6..17027e5016cb09 100644 --- a/be/src/format/parquet/schema_desc.h +++ b/be/src/format/parquet/schema_desc.h @@ -58,6 +58,7 @@ struct FieldSchema { //For UInt8 -> Int16,UInt16 -> Int32,UInt32 -> Int64,UInt64 -> Int128. 
bool is_type_compatibility = false; + bool is_in_variant = false; FieldSchema() : data_type(std::make_shared()), column_id(UNASSIGNED_COLUMN_ID) {} @@ -101,6 +102,9 @@ class FieldDescriptor { Status parse_map_field(const std::vector& t_schemas, size_t curr_pos, FieldSchema* map_field); + Status parse_variant_field(const std::vector& t_schemas, + size_t curr_pos, FieldSchema* variant_field); + Status parse_struct_field(const std::vector& t_schemas, size_t curr_pos, FieldSchema* struct_field); @@ -110,6 +114,8 @@ class FieldDescriptor { Status parse_node_field(const std::vector& t_schemas, size_t curr_pos, FieldSchema* node_field); + void rebuild_indexes(); + std::pair convert_to_doris_type(tparquet::LogicalType logicalType, bool nullable); std::pair convert_to_doris_type( @@ -119,6 +125,23 @@ class FieldDescriptor { public: FieldDescriptor() = default; + FieldDescriptor(const FieldDescriptor& other) + : _fields(other._fields), + _next_schema_pos(other._next_schema_pos), + _enable_mapping_varbinary(other._enable_mapping_varbinary), + _enable_mapping_timestamp_tz(other._enable_mapping_timestamp_tz) { + rebuild_indexes(); + } + FieldDescriptor& operator=(const FieldDescriptor& other) { + if (this != &other) { + _fields = other._fields; + _next_schema_pos = other._next_schema_pos; + _enable_mapping_varbinary = other._enable_mapping_varbinary; + _enable_mapping_timestamp_tz = other._enable_mapping_timestamp_tz; + rebuild_indexes(); + } + return *this; + } ~FieldDescriptor() = default; /** @@ -161,6 +184,8 @@ class FieldDescriptor { */ void assign_ids(); + FieldDescriptor copy_with_assigned_ids() const; + const FieldSchema* find_column_by_id(uint64_t column_id) const; void set_enable_mapping_varbinary(bool enable) { _enable_mapping_varbinary = enable; } void set_enable_mapping_timestamp_tz(bool enable) { _enable_mapping_timestamp_tz = enable; } diff --git a/be/src/format/parquet/vparquet_column_chunk_reader.cpp b/be/src/format/parquet/vparquet_column_chunk_reader.cpp 
index b4b919f187073c..b03f7335e2bf5c 100644 --- a/be/src/format/parquet/vparquet_column_chunk_reader.cpp +++ b/be/src/format/parquet/vparquet_column_chunk_reader.cpp @@ -83,10 +83,16 @@ Status ColumnChunkReader::init() { template Status ColumnChunkReader::skip_nested_values( const std::vector& def_levels) { + return skip_nested_values(def_levels, 0, def_levels.size()); +} + +template +Status ColumnChunkReader::skip_nested_values( + const std::vector& def_levels, size_t begin, size_t end) { size_t no_value_cnt = 0; size_t value_cnt = 0; - for (size_t idx = 0; idx < def_levels.size(); idx++) { + for (size_t idx = begin; idx < end; idx++) { level_t def_level = def_levels[idx]; if (IN_COLLECTION && def_level < _field_schema->repeated_parent_def_level) { no_value_cnt++; diff --git a/be/src/format/parquet/vparquet_column_chunk_reader.h b/be/src/format/parquet/vparquet_column_chunk_reader.h index b117f6c6652e7e..bfa0ad73174d4a 100644 --- a/be/src/format/parquet/vparquet_column_chunk_reader.h +++ b/be/src/format/parquet/vparquet_column_chunk_reader.h @@ -191,12 +191,13 @@ class ColumnChunkReader { Status seek_to_nested_row(size_t left_row); Status skip_nested_values(const std::vector& def_levels); + Status skip_nested_values(const std::vector& def_levels, size_t begin, size_t end); Status fill_def(std::vector& def_values) { auto before_sz = def_values.size(); auto append_sz = _remaining_def_nums - _remaining_rep_nums; def_values.resize(before_sz + append_sz, 0); if (max_def_level() != 0) { - auto ptr = def_values.data() + before_sz; + auto* ptr = def_values.data() + before_sz; _def_level_decoder.get_levels(ptr, append_sz); } _remaining_def_nums -= append_sz; diff --git a/be/src/format/parquet/vparquet_column_reader.cpp b/be/src/format/parquet/vparquet_column_reader.cpp index ba7d42a5aed84e..ac096af06e733e 100644 --- a/be/src/format/parquet/vparquet_column_reader.cpp +++ b/be/src/format/parquet/vparquet_column_reader.cpp @@ -17,29 +17,53 @@ #include 
"format/parquet/vparquet_column_reader.h" +#include #include -#include +#include #include #include +#include +#include +#include +#include +#include #include +#include +#include "common/exception.h" #include "common/status.h" #include "core/column/column.h" #include "core/column/column_array.h" #include "core/column/column_map.h" #include "core/column/column_nullable.h" +#include "core/column/column_string.h" #include "core/column/column_struct.h" +#include "core/column/column_varbinary.h" +#include "core/column/column_variant.h" #include "core/data_type/data_type_array.h" +#include "core/data_type/data_type_factory.hpp" +#include "core/data_type/data_type_jsonb.h" #include "core/data_type/data_type_map.h" #include "core/data_type/data_type_nullable.h" +#include "core/data_type/data_type_number.h" +#include "core/data_type/data_type_string.h" #include "core/data_type/data_type_struct.h" +#include "core/data_type/data_type_variant.h" #include "core/data_type/define_primitive_type.h" +#include "core/data_type_serde/data_type_serde.h" +#include "core/string_buffer.hpp" +#include "core/value/jsonb_value.h" +#include "core/value/timestamptz_value.h" +#include "core/value/vdatetime_value.h" +#include "exec/common/variant_util.h" #include "format/parquet/level_decoder.h" +#include "format/parquet/parquet_variant_reader.h" #include "format/parquet/schema_desc.h" #include "format/parquet/vparquet_column_chunk_reader.h" #include "io/fs/tracing_file_reader.h" #include "runtime/runtime_profile.h" +#include "util/jsonb_document.h" namespace doris { static void fill_struct_null_map(FieldSchema* field, NullMap& null_map, @@ -103,6 +127,1837 @@ static void fill_array_offset(FieldSchema* field, ColumnArray::Offsets64& offset } } +static constexpr int64_t UNIX_EPOCH_DAYNR = 719528; +static constexpr int64_t MICROS_PER_SECOND = 1000000; + +static int64_t variant_date_value(const VecDateTimeValue& value) { + return value.daynr() - UNIX_EPOCH_DAYNR; +} + +static int64_t 
variant_date_value(const DateV2Value& value) { + return value.daynr() - UNIX_EPOCH_DAYNR; +} + +static int64_t variant_datetime_value(const VecDateTimeValue& value) { + int64_t timestamp = 0; + value.unix_timestamp(×tamp, cctz::utc_time_zone()); + return timestamp * MICROS_PER_SECOND; +} + +static int64_t variant_datetime_value(const DateV2Value& value) { + int64_t timestamp = 0; + value.unix_timestamp(×tamp, cctz::utc_time_zone()); + return timestamp * MICROS_PER_SECOND + value.microsecond(); +} + +static int64_t variant_datetime_value(const TimestampTzValue& value) { + int64_t timestamp = 0; + value.unix_timestamp(×tamp, cctz::utc_time_zone()); + return timestamp * MICROS_PER_SECOND + value.microsecond(); +} + +static int find_child_idx(const FieldSchema& field, std::string_view name) { + for (int i = 0; i < field.children.size(); ++i) { + if (field.children[i].lower_case_name == name) { + return i; + } + } + return -1; +} + +static bool is_variant_wrapper_typed_value_child(const FieldSchema& field) { + auto type = remove_nullable(field.data_type); + return type->get_primitive_type() == TYPE_STRUCT || type->get_primitive_type() == TYPE_ARRAY; +} + +static bool is_unannotated_variant_value_field(const FieldSchema& field) { + // VARIANT residual value is raw binary; annotated strings named value are user fields. 
+ return field.lower_case_name == "value" && field.physical_type == tparquet::Type::BYTE_ARRAY && + !field.parquet_schema.__isset.logicalType && + !field.parquet_schema.__isset.converted_type; +} + +static bool is_unannotated_variant_metadata_field(const FieldSchema& field) { + return field.lower_case_name == "metadata" && + field.physical_type == tparquet::Type::BYTE_ARRAY && + !field.parquet_schema.__isset.logicalType && + !field.parquet_schema.__isset.converted_type; +} + +static bool is_variant_wrapper_field(const FieldSchema& field, + bool allow_scalar_typed_value_only_wrapper) { + auto type = remove_nullable(field.data_type); + if (type->get_primitive_type() != TYPE_STRUCT && type->get_primitive_type() != TYPE_VARIANT) { + return false; + } + + bool has_metadata = false; + bool has_value = false; + const FieldSchema* typed_value = nullptr; + for (const auto& child : field.children) { + if (child.lower_case_name == "metadata") { + if (!is_unannotated_variant_metadata_field(child)) { + return false; + } + has_metadata = true; + continue; + } + if (child.lower_case_name == "value") { + if (!is_unannotated_variant_value_field(child)) { + return false; + } + has_value = true; + continue; + } + if (child.lower_case_name == "typed_value") { + typed_value = &child; + continue; + } + return false; + } + if (has_metadata) { + return type->get_primitive_type() == TYPE_VARIANT && (has_value || typed_value != nullptr); + } + if (has_value) { + return typed_value != nullptr; + } + return typed_value != nullptr && (allow_scalar_typed_value_only_wrapper || + is_variant_wrapper_typed_value_child(*typed_value)); +} + +static bool is_value_only_variant_wrapper_candidate(const FieldSchema& field) { + auto type = remove_nullable(field.data_type); + if (type->get_primitive_type() != TYPE_STRUCT && type->get_primitive_type() != TYPE_VARIANT) { + return false; + } + + bool has_value = false; + for (const auto& child : field.children) { + if 
(is_unannotated_variant_value_field(child)) { + has_value = true; + continue; + } + return false; + } + return has_value; +} + +static Status get_binary_field(const Field& field, std::string* value, bool* present) { + if (field.is_null()) { + *present = false; + return Status::OK(); + } + *present = true; + switch (field.get_type()) { + case TYPE_STRING: + *value = field.get(); + return Status::OK(); + case TYPE_CHAR: + *value = field.get(); + return Status::OK(); + case TYPE_VARCHAR: + *value = field.get(); + return Status::OK(); + case TYPE_VARBINARY: { + auto ref = field.get().to_string_ref(); + value->assign(ref.data, ref.size); + return Status::OK(); + } + default: + return Status::Corruption("Parquet VARIANT binary field has unexpected Doris type {}", + field.get_type_name()); + } +} + +static PathInData append_path(const PathInData& prefix, const PathInData& suffix) { + if (prefix.empty()) { + return suffix; + } + if (suffix.empty()) { + return prefix; + } + PathInDataBuilder builder; + builder.append(prefix.get_parts(), false); + builder.append(suffix.get_parts(), false); + return builder.build(); +} + +static Status make_jsonb_field(std::string_view json, FieldWithDataType* value) { + JsonBinaryValue jsonb_value; + RETURN_IF_ERROR(jsonb_value.from_json_string(json.data(), json.size())); + value->field = + Field::create_field(JsonbField(jsonb_value.value(), jsonb_value.size())); + value->base_scalar_type_id = TYPE_JSONB; + value->num_dimensions = 0; + value->precision = 0; + value->scale = 0; + return Status::OK(); +} + +static std::string make_null_array_json(size_t elements) { + std::string json = "["; + for (size_t i = 0; i < elements; ++i) { + if (i != 0) { + json.push_back(','); + } + json.append("null"); + } + json.push_back(']'); + return json; +} + +static Status make_empty_object_field(Field* field) { + FieldWithDataType value; + RETURN_IF_ERROR(make_jsonb_field("{}", &value)); + *field = std::move(value.field); + return Status::OK(); +} + +static 
Status insert_jsonb_value(const PathInData& path, std::string_view json, + VariantMap* values) { + FieldWithDataType value; + RETURN_IF_ERROR(make_jsonb_field(json, &value)); + (*values)[path] = std::move(value); + return Status::OK(); +} + +static Status insert_empty_object_marker(const PathInData& path, VariantMap* values) { + return insert_jsonb_value(path, "{}", values); +} + +static bool is_empty_object_marker(const FieldWithDataType& value) { + if (value.field.get_type() != TYPE_JSONB) { + return false; + } + const auto& jsonb = value.field.get(); + const JsonbDocument* document = nullptr; + Status st = + JsonbDocument::checkAndCreateDocument(jsonb.get_value(), jsonb.get_size(), &document); + if (!st.ok() || document == nullptr || document->getValue() == nullptr || + !document->getValue()->isObject()) { + return false; + } + return document->getValue()->unpack()->numElem() == 0; +} + +static Status collect_empty_object_markers(const rapidjson::Value& value, PathInDataBuilder* path, + VariantMap* values) { + if (!value.IsObject()) { + return Status::OK(); + } + if (value.MemberCount() == 0) { + return insert_empty_object_marker(path->build(), values); + } + for (auto it = value.MemberBegin(); it != value.MemberEnd(); ++it) { + if (it->value.IsObject()) { + path->append(std::string_view(it->name.GetString(), it->name.GetStringLength()), false); + RETURN_IF_ERROR(collect_empty_object_markers(it->value, path, values)); + path->pop_back(); + } + } + return Status::OK(); +} + +static Status add_empty_object_markers_from_json(const std::string& json, const PathInData& prefix, + VariantMap* values) { + if (json.find("{}") == std::string::npos) { + return Status::OK(); + } + rapidjson::Document document; + document.Parse(json.data(), json.size()); + if (document.HasParseError()) { + return Status::Corruption("Invalid Parquet VARIANT decoded JSON"); + } + PathInDataBuilder path; + path.append(prefix.get_parts(), false); + return collect_empty_object_markers(document, 
&path, values); +} + +static Status parse_json_to_variant_map(const std::string& json, const PathInData& prefix, + VariantMap* values) { + auto parsed_column = ColumnVariant::create(0, false); + ParseConfig parse_config; + StringRef json_ref(json.data(), json.size()); + RETURN_IF_CATCH_EXCEPTION( + variant_util::parse_json_to_variant(*parsed_column, json_ref, nullptr, parse_config)); + Field parsed = (*parsed_column)[0]; + if (!parsed.is_null()) { + auto& parsed_values = parsed.get(); + for (auto& [path, value] : parsed_values) { + (*values)[append_path(prefix, path)] = std::move(value); + } + } + RETURN_IF_ERROR(add_empty_object_markers_from_json(json, prefix, values)); + return Status::OK(); +} + +static Status variant_map_to_json(VariantMap values, std::string* json) { + auto variant_column = ColumnVariant::create(0, false); + RETURN_IF_CATCH_EXCEPTION( + variant_column->insert(Field::create_field(std::move(values)))); + DataTypeSerDe::FormatOptions options; + variant_column->serialize_one_row_to_string(0, json, options); + return Status::OK(); +} + +static bool path_has_prefix(const PathInData& path, const PathInData& prefix) { + const auto& parts = path.get_parts(); + const auto& prefix_parts = prefix.get_parts(); + if (parts.size() < prefix_parts.size()) { + return false; + } + for (size_t i = 0; i < prefix_parts.size(); ++i) { + if (parts[i] != prefix_parts[i]) { + return false; + } + } + return true; +} + +static bool has_descendant_path(const VariantMap& values, const PathInData& prefix) { + const size_t prefix_size = prefix.get_parts().size(); + return std::ranges::any_of(values, [&](const auto& entry) { + const auto& path = entry.first; + return path.get_parts().size() > prefix_size && path_has_prefix(path, prefix); + }); +} + +static void erase_shadowed_empty_object_markers(VariantMap* values, + const VariantMap& shadowing_values) { + for (auto it = values->begin(); it != values->end();) { + if (is_empty_object_marker(it->second) && + 
(has_descendant_path(*values, it->first) || + has_descendant_path(shadowing_values, it->first))) { + it = values->erase(it); + continue; + } + ++it; + } +} + +static void erase_shadowed_empty_object_markers(VariantMap* value_values, + VariantMap* typed_values) { + erase_shadowed_empty_object_markers(value_values, *typed_values); + erase_shadowed_empty_object_markers(typed_values, *value_values); +} + +static Status check_no_shredded_value_typed_duplicates(const VariantMap& value_values, + const VariantMap& typed_values, + const PathInData& prefix) { + const size_t prefix_size = prefix.get_parts().size(); + for (const auto& value_entry : value_values) { + const auto& value_path = value_entry.first; + if (!path_has_prefix(value_path, prefix)) { + continue; + } + if (value_path.get_parts().size() == prefix_size) { + if (is_empty_object_marker(value_entry.second) && + !has_descendant_path(typed_values, value_path)) { + continue; + } + if (!typed_values.empty()) { + return Status::Corruption( + "Parquet VARIANT residual value conflicts with typed_value at path {}", + value_path.get_path()); + } + continue; + } + for (const auto& typed_entry : typed_values) { + const auto& typed_path = typed_entry.first; + if (!path_has_prefix(typed_path, prefix)) { + continue; + } + if (typed_path.get_parts().size() == prefix_size) { + if (is_empty_object_marker(typed_entry.second) && + !has_descendant_path(value_values, typed_path)) { + continue; + } + return Status::Corruption( + "Parquet VARIANT residual value and typed_value contain duplicate field {}", + value_path.get_parts()[prefix_size].key); + } + if (value_path.get_parts()[prefix_size] == typed_path.get_parts()[prefix_size]) { + if (value_path == typed_path && is_empty_object_marker(value_entry.second) && + is_empty_object_marker(typed_entry.second)) { + continue; + } + return Status::Corruption( + "Parquet VARIANT residual value and typed_value contain duplicate field {}", + value_path.get_parts()[prefix_size].key); + } + } + 
} + return Status::OK(); +} + +static bool has_direct_typed_parent_null(const std::vector& null_maps, size_t row) { + return std::ranges::any_of(null_maps, [&](const NullMap* null_map) { + DCHECK_LT(row, null_map->size()); + return (*null_map)[row]; + }); +} + +static void insert_direct_typed_leaf_range(const IColumn& column, size_t start, size_t rows, + const std::vector& parent_null_maps, + IColumn* variant_leaf) { + auto& nullable_leaf = assert_cast(*variant_leaf); + const IColumn* value_column = &column; + const NullMap* leaf_null_map = nullptr; + if (const auto* nullable_column = check_and_get_column(&column)) { + value_column = &nullable_column->get_nested_column(); + leaf_null_map = &nullable_column->get_null_map_data(); + } + + nullable_leaf.get_nested_column().insert_range_from(*value_column, start, rows); + auto& null_map = nullable_leaf.get_null_map_data(); + null_map.reserve(null_map.size() + rows); + for (size_t i = 0; i < rows; ++i) { + const size_t row = start + i; + const bool leaf_is_null = leaf_null_map != nullptr && (*leaf_null_map)[row]; + null_map.push_back(leaf_is_null || has_direct_typed_parent_null(parent_null_maps, row)); + } +} + +static bool is_temporal_variant_leaf_type(PrimitiveType type) { + switch (type) { + case TYPE_TIMEV2: + case TYPE_DATE: + case TYPE_DATETIME: + case TYPE_DATEV2: + case TYPE_DATETIMEV2: + case TYPE_TIMESTAMPTZ: + return true; + default: + return false; + } +} + +static bool is_floating_point_variant_leaf_type(PrimitiveType type) { + switch (type) { + case TYPE_FLOAT: + case TYPE_DOUBLE: + return true; + default: + return false; + } +} + +static bool is_uuid_typed_value_field(const FieldSchema& field_schema); +static bool contains_uuid_typed_value_field(const FieldSchema& field_schema); + +static DataTypePtr direct_variant_leaf_type(const DataTypePtr& data_type) { + const auto& type = remove_nullable(data_type); + if (is_temporal_variant_leaf_type(type->get_primitive_type())) { + return std::make_shared(); + } + 
return type; +} + +static DataTypePtr direct_variant_leaf_type(const FieldSchema& field_schema) { + const auto& type = remove_nullable(field_schema.data_type); + if (is_uuid_typed_value_field(field_schema)) { + return std::make_shared(); + } + if (type->get_primitive_type() == TYPE_ARRAY) { + DORIS_CHECK(!field_schema.children.empty()); + DataTypePtr nested_type = direct_variant_leaf_type(field_schema.children[0]); + if (field_schema.children[0].data_type->is_nullable()) { + nested_type = make_nullable(nested_type); + } + return std::make_shared(nested_type); + } + return direct_variant_leaf_type(field_schema.data_type); +} + +static bool contains_temporal_variant_leaf_type(const DataTypePtr& data_type) { + const auto& type = remove_nullable(data_type); + if (is_temporal_variant_leaf_type(type->get_primitive_type())) { + return true; + } + if (type->get_primitive_type() == TYPE_ARRAY) { + return contains_temporal_variant_leaf_type( + assert_cast(type.get())->get_nested_type()); + } + return false; +} + +static bool contains_floating_point_variant_leaf_type(const DataTypePtr& data_type) { + const auto& type = remove_nullable(data_type); + if (is_floating_point_variant_leaf_type(type->get_primitive_type())) { + return true; + } + if (type->get_primitive_type() == TYPE_ARRAY) { + return contains_floating_point_variant_leaf_type( + assert_cast(type.get())->get_nested_type()); + } + return false; +} + +static int64_t direct_temporal_variant_value(PrimitiveType type, const IColumn& column, + size_t row) { + switch (type) { + case TYPE_TIMEV2: + return static_cast( + std::llround(assert_cast(column).get_data()[row])); + case TYPE_DATE: + return variant_date_value(assert_cast(column).get_data()[row]); + case TYPE_DATETIME: + return variant_datetime_value(assert_cast(column).get_data()[row]); + case TYPE_DATEV2: + return variant_date_value(assert_cast(column).get_data()[row]); + case TYPE_DATETIMEV2: + return variant_datetime_value(assert_cast(column).get_data()[row]); + 
case TYPE_TIMESTAMPTZ: + return variant_datetime_value( + assert_cast(column).get_data()[row]); + default: + DORIS_CHECK(false); + return 0; + } +} + +static void insert_direct_typed_temporal_leaf_range( + PrimitiveType type, const IColumn& column, size_t start, size_t rows, + const std::vector& parent_null_maps, IColumn* variant_leaf) { + auto& nullable_leaf = assert_cast(*variant_leaf); + const IColumn* value_column = &column; + const NullMap* leaf_null_map = nullptr; + if (const auto* nullable_column = check_and_get_column(&column)) { + value_column = &nullable_column->get_nested_column(); + leaf_null_map = &nullable_column->get_null_map_data(); + } + + auto& data = assert_cast(nullable_leaf.get_nested_column()).get_data(); + data.reserve(data.size() + rows); + auto& null_map = nullable_leaf.get_null_map_data(); + null_map.reserve(null_map.size() + rows); + for (size_t i = 0; i < rows; ++i) { + const size_t row = start + i; + const bool leaf_is_null = leaf_null_map != nullptr && (*leaf_null_map)[row]; + const bool is_null = leaf_is_null || has_direct_typed_parent_null(parent_null_maps, row); + if (is_null) { + data.push_back(0); + null_map.push_back(1); + continue; + } + data.push_back(direct_temporal_variant_value(type, *value_column, row)); + null_map.push_back(0); + } +} + +static Status insert_direct_typed_uuid_leaf_range( + const IColumn& column, size_t start, size_t rows, + const std::vector& parent_null_maps, IColumn* variant_leaf) { + auto& nullable_leaf = assert_cast(*variant_leaf); + const IColumn* value_column = &column; + const NullMap* leaf_null_map = nullptr; + if (const auto* nullable_column = check_and_get_column(&column)) { + value_column = &nullable_column->get_nested_column(); + leaf_null_map = &nullable_column->get_null_map_data(); + } + + auto& data = assert_cast(nullable_leaf.get_nested_column()); + auto& null_map = nullable_leaf.get_null_map_data(); + null_map.reserve(null_map.size() + rows); + for (size_t i = 0; i < rows; ++i) { + const 
size_t row = start + i; + const bool leaf_is_null = leaf_null_map != nullptr && (*leaf_null_map)[row]; + const bool is_null = leaf_is_null || has_direct_typed_parent_null(parent_null_maps, row); + if (is_null) { + data.insert_default(); + null_map.push_back(1); + continue; + } + StringRef bytes = value_column->get_data_at(row); + if (bytes.size != 16) { + return Status::Corruption("Parquet VARIANT UUID typed_value has invalid length {}", + bytes.size); + } + std::string uuid = + parquet::format_variant_uuid(reinterpret_cast(bytes.data)); + data.insert_data(uuid.data(), uuid.size()); + null_map.push_back(0); + } + return Status::OK(); +} + +static void append_json_string(std::string_view value, std::string* json) { + auto column = ColumnString::create(); + VectorBufferWriter writer(*column); + writer.write_json_string(value); + writer.commit(); + json->append(column->get_data_at(0).data, column->get_data_at(0).size); +} + +static bool is_column_selected(const FieldSchema& field_schema, + const std::set& column_ids) { + return column_ids.empty() || column_ids.find(field_schema.get_column_id()) != column_ids.end(); +} + +static bool has_selected_column(const FieldSchema& field_schema, + const std::set& column_ids) { + if (is_column_selected(field_schema, column_ids)) { + return true; + } + return std::any_of(field_schema.children.begin(), field_schema.children.end(), + [&column_ids](const FieldSchema& child) { + return has_selected_column(child, column_ids); + }); +} + +static bool is_direct_variant_leaf_type(const DataTypePtr& data_type) { + const auto& type = remove_nullable(data_type); + switch (type->get_primitive_type()) { + case TYPE_BOOLEAN: + case TYPE_TINYINT: + case TYPE_SMALLINT: + case TYPE_INT: + case TYPE_BIGINT: + case TYPE_LARGEINT: + case TYPE_DECIMALV2: + case TYPE_DECIMAL32: + case TYPE_DECIMAL64: + case TYPE_DECIMAL128I: + case TYPE_DECIMAL256: + case TYPE_FLOAT: + case TYPE_DOUBLE: + case TYPE_STRING: + case TYPE_CHAR: + case TYPE_VARCHAR: + case 
TYPE_VARBINARY: + return true; + case TYPE_TIMEV2: + case TYPE_DATE: + case TYPE_DATETIME: + case TYPE_DATEV2: + case TYPE_DATETIMEV2: + case TYPE_TIMESTAMPTZ: + return true; + case TYPE_ARRAY: { + const auto* array_type = assert_cast(type.get()); + return is_direct_variant_leaf_type(array_type->get_nested_type()); + } + default: + return false; + } +} + +static bool can_direct_read_typed_value(const FieldSchema& field_schema, bool allow_variant_wrapper, + const std::set& column_ids) { + if (!has_selected_column(field_schema, column_ids)) { + return true; + } + if (allow_variant_wrapper && is_variant_wrapper_field(field_schema, false)) { + const int value_idx = find_child_idx(field_schema, "value"); + const int typed_value_idx = find_child_idx(field_schema, "typed_value"); + return (value_idx < 0 || + !has_selected_column(field_schema.children[value_idx], column_ids)) && + typed_value_idx >= 0 && + can_direct_read_typed_value(field_schema.children[typed_value_idx], false, + column_ids); + } + + const auto& type = remove_nullable(field_schema.data_type); + if (type->get_primitive_type() == TYPE_STRUCT) { + return std::all_of(field_schema.children.begin(), field_schema.children.end(), + [&column_ids](const FieldSchema& child) { + return can_direct_read_typed_value(child, true, column_ids); + }); + } + return is_direct_variant_leaf_type(field_schema.data_type); +} + +static bool has_selected_direct_typed_leaf(const FieldSchema& field_schema, + bool allow_variant_wrapper, + const std::set& column_ids) { + if (!has_selected_column(field_schema, column_ids)) { + return false; + } + if (allow_variant_wrapper && is_variant_wrapper_field(field_schema, false)) { + const int typed_value_idx = find_child_idx(field_schema, "typed_value"); + DCHECK_GE(typed_value_idx, 0); + return has_selected_direct_typed_leaf(field_schema.children[typed_value_idx], false, + column_ids); + } + + const auto& type = remove_nullable(field_schema.data_type); + if (type->get_primitive_type() == 
TYPE_STRUCT) { + return std::any_of(field_schema.children.begin(), field_schema.children.end(), + [&column_ids](const FieldSchema& child) { + return has_selected_direct_typed_leaf(child, true, column_ids); + }); + } + return is_direct_variant_leaf_type(field_schema.data_type); +} + +static bool can_use_direct_typed_only_value(const FieldSchema& variant_field, + const std::set& column_ids) { + const int value_idx = find_child_idx(variant_field, "value"); + const int typed_value_idx = find_child_idx(variant_field, "typed_value"); + return (value_idx < 0 || !has_selected_column(variant_field.children[value_idx], column_ids)) && + typed_value_idx >= 0 && + has_selected_direct_typed_leaf(variant_field.children[typed_value_idx], false, + column_ids) && + can_direct_read_typed_value(variant_field.children[typed_value_idx], false, column_ids); +} + +static DataTypePtr make_variant_struct_reader_type(const FieldSchema& field) { + DataTypes child_types; + Strings child_names; + child_types.reserve(field.children.size()); + child_names.reserve(field.children.size()); + for (const auto& child : field.children) { + child_types.push_back(make_nullable(child.data_type)); + child_names.push_back(child.name); + } + return std::make_shared(child_types, child_names); +} + +static ColumnPtr make_variant_struct_read_column(const FieldSchema& field, + const DataTypePtr& variant_struct_type) { + if (field.data_type->is_nullable()) { + return make_nullable(variant_struct_type)->create_column(); + } + return variant_struct_type->create_column(); +} + +static void fill_variant_field_info(FieldWithDataType* value) { + FieldInfo info; + variant_util::get_field_info(value->field, &info); + DCHECK_LE(info.num_dimensions, std::numeric_limits::max()); + value->base_scalar_type_id = info.scalar_type_id; + value->num_dimensions = static_cast(info.num_dimensions); +} + +static void fill_variant_leaf_type_info(const DataTypePtr& data_type, FieldWithDataType* value) { + auto leaf_type = 
remove_nullable(data_type); + size_t num_dimensions = 0; + while (leaf_type->get_primitive_type() == TYPE_ARRAY) { + ++num_dimensions; + leaf_type = remove_nullable( + assert_cast(leaf_type.get())->get_nested_type()); + } + DCHECK_LE(num_dimensions, std::numeric_limits::max()); + if (value->base_scalar_type_id == INVALID_TYPE) { + value->base_scalar_type_id = leaf_type->get_primitive_type(); + } + if (value->num_dimensions == 0 && num_dimensions > 0) { + value->num_dimensions = static_cast(num_dimensions); + } + if (is_decimal(leaf_type->get_primitive_type())) { + value->precision = leaf_type->get_precision(); + value->scale = leaf_type->get_scale(); + } +} + +static Status fill_floating_point_variant_field(const Field& field, FieldWithDataType* value) { + value->field = field; + fill_variant_field_info(value); + return Status::OK(); +} + +static Status fill_floating_point_variant_field(PrimitiveType type, const Field& field, + FieldWithDataType* value) { + DORIS_CHECK(type == TYPE_FLOAT || type == TYPE_DOUBLE); + return fill_floating_point_variant_field(field, value); +} + +static bool is_uuid_typed_value_field(const FieldSchema& field_schema) { + return field_schema.parquet_schema.__isset.logicalType && + field_schema.parquet_schema.logicalType.__isset.UUID; +} + +static bool contains_uuid_typed_value_field(const FieldSchema& field_schema) { + return is_uuid_typed_value_field(field_schema) || + std::any_of( + field_schema.children.begin(), field_schema.children.end(), + [](const FieldSchema& child) { return contains_uuid_typed_value_field(child); }); +} + +static Status uuid_field_to_string(const Field& field, std::string* uuid) { + StringRef bytes; + switch (field.get_type()) { + case TYPE_STRING: + bytes = StringRef(field.get()); + break; + case TYPE_CHAR: + bytes = StringRef(field.get()); + break; + case TYPE_VARCHAR: + bytes = StringRef(field.get()); + break; + case TYPE_VARBINARY: + bytes = field.get().to_string_ref(); + break; + default: + return 
Status::Corruption("Parquet VARIANT UUID typed_value has unexpected Doris type {}", + field.get_type_name()); + } + if (bytes.size != 16) { + return Status::Corruption("Parquet VARIANT UUID typed_value has invalid length {}", + bytes.size); + } + *uuid = parquet::format_variant_uuid(reinterpret_cast(bytes.data)); + return Status::OK(); +} + +static Status fill_uuid_variant_field(const Field& field, FieldWithDataType* value) { + std::string uuid; + RETURN_IF_ERROR(uuid_field_to_string(field, &uuid)); + value->field = Field::create_field(std::move(uuid)); + value->base_scalar_type_id = TYPE_STRING; + return Status::OK(); +} + +static Status fill_temporal_variant_field(PrimitiveType type, const Field& field, + FieldWithDataType* value) { + switch (type) { + case TYPE_TIMEV2: + value->field = Field::create_field( + static_cast(std::llround(field.get()))); + value->base_scalar_type_id = TYPE_BIGINT; + return Status::OK(); + case TYPE_DATE: + value->field = Field::create_field(variant_date_value(field.get())); + value->base_scalar_type_id = TYPE_BIGINT; + return Status::OK(); + case TYPE_DATETIME: + value->field = Field::create_field( + variant_datetime_value(field.get())); + value->base_scalar_type_id = TYPE_BIGINT; + return Status::OK(); + case TYPE_DATEV2: + value->field = + Field::create_field(variant_date_value(field.get())); + value->base_scalar_type_id = TYPE_BIGINT; + return Status::OK(); + case TYPE_DATETIMEV2: + value->field = Field::create_field( + variant_datetime_value(field.get())); + value->base_scalar_type_id = TYPE_BIGINT; + return Status::OK(); + case TYPE_TIMESTAMPTZ: + value->field = Field::create_field( + variant_datetime_value(field.get())); + value->base_scalar_type_id = TYPE_BIGINT; + return Status::OK(); + default: + DORIS_CHECK(false); + return Status::OK(); + } +} + +static uint8_t direct_array_dimensions(const DataTypePtr& data_type) { + uint8_t num_dimensions = 0; + auto type = remove_nullable(data_type); + while (type->get_primitive_type() 
== TYPE_ARRAY) { + ++num_dimensions; + type = remove_nullable(assert_cast(type.get())->get_nested_type()); + } + return num_dimensions; +} + +static PrimitiveType direct_array_base_scalar_type(const FieldSchema& field_schema) { + auto leaf_type = remove_nullable(direct_variant_leaf_type(field_schema)); + while (leaf_type->get_primitive_type() == TYPE_ARRAY) { + leaf_type = remove_nullable( + assert_cast(leaf_type.get())->get_nested_type()); + } + return leaf_type->get_primitive_type(); +} + +static Status convert_direct_array_value(const FieldSchema& field_schema, const Field& field, + Field* converted) { + if (field.is_null()) { + *converted = Field(); + return Status::OK(); + } + + const auto& type = remove_nullable(field_schema.data_type); + if (type->get_primitive_type() == TYPE_ARRAY) { + if (field_schema.children.empty()) { + return Status::Corruption("Parquet VARIANT array typed_value has no element schema"); + } + Array converted_elements; + const auto& elements = field.get(); + converted_elements.reserve(elements.size()); + for (const auto& element : elements) { + Field converted_element; + RETURN_IF_ERROR(convert_direct_array_value(field_schema.children[0], element, + &converted_element)); + converted_elements.push_back(std::move(converted_element)); + } + *converted = Field::create_field(std::move(converted_elements)); + return Status::OK(); + } + + if (is_uuid_typed_value_field(field_schema)) { + FieldWithDataType value; + RETURN_IF_ERROR(fill_uuid_variant_field(field, &value)); + *converted = std::move(value.field); + return Status::OK(); + } + if (is_temporal_variant_leaf_type(type->get_primitive_type())) { + FieldWithDataType value; + RETURN_IF_ERROR(fill_temporal_variant_field(type->get_primitive_type(), field, &value)); + *converted = std::move(value.field); + return Status::OK(); + } + if (is_floating_point_variant_leaf_type(type->get_primitive_type())) { + FieldWithDataType value; + RETURN_IF_ERROR( + 
fill_floating_point_variant_field(type->get_primitive_type(), field, &value)); + *converted = std::move(value.field); + return Status::OK(); + } + + *converted = field; + return Status::OK(); +} + +static Status insert_direct_typed_array_leaf_range( + const FieldSchema& field_schema, const IColumn& column, size_t start, size_t rows, + const std::vector& parent_null_maps, IColumn* variant_leaf) { + auto& nullable_leaf = assert_cast(*variant_leaf); + const IColumn* value_column = &column; + const NullMap* leaf_null_map = nullptr; + if (const auto* nullable_column = check_and_get_column(&column)) { + value_column = &nullable_column->get_nested_column(); + leaf_null_map = &nullable_column->get_null_map_data(); + } + + auto& data = nullable_leaf.get_nested_column(); + auto& null_map = nullable_leaf.get_null_map_data(); + null_map.reserve(null_map.size() + rows); + for (size_t i = 0; i < rows; ++i) { + const size_t row = start + i; + const bool leaf_is_null = leaf_null_map != nullptr && (*leaf_null_map)[row]; + const bool is_null = leaf_is_null || has_direct_typed_parent_null(parent_null_maps, row); + if (is_null) { + data.insert_default(); + null_map.push_back(1); + continue; + } + + Field field; + value_column->get(row, field); + Field converted; + RETURN_IF_ERROR(convert_direct_array_value(field_schema, field, &converted)); + data.insert(converted); + null_map.push_back(0); + } + return Status::OK(); +} + +static Status fill_direct_array_variant_field(const FieldSchema& field_schema, const Field& field, + FieldWithDataType* value, bool* present) { + if (field.is_null()) { + *present = false; + return Status::OK(); + } + *present = true; + RETURN_IF_ERROR(convert_direct_array_value(field_schema, field, &value->field)); + value->base_scalar_type_id = direct_array_base_scalar_type(field_schema); + value->num_dimensions = direct_array_dimensions(field_schema.data_type); + return Status::OK(); +} + +static Status field_to_variant_field(const FieldSchema& field_schema, 
const Field& field, + FieldWithDataType* value, bool* present) { + if (field.is_null()) { + *present = false; + return Status::OK(); + } + *present = true; + if (is_uuid_typed_value_field(field_schema)) { + return fill_uuid_variant_field(field, value); + } + const DataTypePtr& type = remove_nullable(field_schema.data_type); + if (is_temporal_variant_leaf_type(type->get_primitive_type())) { + return fill_temporal_variant_field(type->get_primitive_type(), field, value); + } + switch (type->get_primitive_type()) { + case TYPE_BOOLEAN: + case TYPE_TINYINT: + case TYPE_SMALLINT: + case TYPE_INT: + case TYPE_BIGINT: + case TYPE_LARGEINT: + case TYPE_DECIMALV2: + case TYPE_DECIMAL32: + case TYPE_DECIMAL64: + case TYPE_DECIMAL128I: + case TYPE_DECIMAL256: + case TYPE_STRING: + case TYPE_CHAR: + case TYPE_VARCHAR: + case TYPE_VARBINARY: + case TYPE_ARRAY: + value->field = field; + fill_variant_field_info(value); + fill_variant_leaf_type_info(type, value); + return Status::OK(); + case TYPE_FLOAT: + case TYPE_DOUBLE: + return fill_floating_point_variant_field(field, value); + default: + return Status::Corruption("Unsupported Parquet VARIANT typed_value Doris type {}", + type->get_name()); + } +} + +static Status typed_value_to_json(const FieldSchema& typed_value_field, const Field& field, + const std::string& metadata, std::string* json, bool* present); +static Status typed_map_to_variant_map(const FieldSchema& typed_value_field, const Field& field, + const std::string& metadata, PathInDataBuilder* path, + VariantMap* values, bool* present, + std::deque* string_values); + +static Status serialize_field_to_json(const DataTypePtr& data_type, const Field& field, + std::string* json) { + MutableColumnPtr column = data_type->create_column(); + column->insert(field); + + auto json_column = ColumnString::create(); + VectorBufferWriter writer(*json_column); + auto serde = data_type->get_serde(); + DataTypeSerDe::FormatOptions options; + 
RETURN_IF_ERROR(serde->serialize_one_cell_to_json(*column, 0, writer, options)); + writer.commit(); + *json = json_column->get_data_at(0).to_string(); + return Status::OK(); +} + +static Status scalar_typed_value_to_json(const FieldSchema& field_schema, const Field& field, + std::string* json, bool* present) { + FieldWithDataType value; + RETURN_IF_ERROR(field_to_variant_field(field_schema, field, &value, present)); + if (!*present) { + return Status::OK(); + } + if (value.field.is_null()) { + *json = "null"; + return Status::OK(); + } + if (!is_uuid_typed_value_field(field_schema) && + remove_nullable(field_schema.data_type)->get_primitive_type() == TYPE_VARBINARY) { + return Status::NotSupported( + "Parquet VARIANT binary typed_value cannot be serialized to JSON"); + } + + DataTypePtr json_type; + if (value.base_scalar_type_id != PrimitiveType::INVALID_TYPE) { + json_type = DataTypeFactory::instance().create_data_type(value.base_scalar_type_id, false, + value.precision, value.scale); + } else { + json_type = remove_nullable(field_schema.data_type); + } + return serialize_field_to_json(json_type, value.field, json); +} + +static Status resolve_variant_metadata(const FieldSchema& variant_field, const Struct& fields, + const std::string* inherited_metadata, std::string* metadata, + bool* has_metadata) { + *has_metadata = false; + if (inherited_metadata != nullptr) { + *metadata = *inherited_metadata; + *has_metadata = true; + } + + const int metadata_idx = find_child_idx(variant_field, "metadata"); + if (metadata_idx >= 0) { + bool metadata_present = false; + RETURN_IF_ERROR(get_binary_field(fields[metadata_idx], metadata, &metadata_present)); + *has_metadata = metadata_present; + } + return Status::OK(); +} + +static Status variant_typed_value_to_json(const FieldSchema& variant_field, const Struct& fields, + const std::string& metadata, std::string* typed_json, + bool* typed_present) { + *typed_present = false; + const int typed_value_idx = 
find_child_idx(variant_field, "typed_value"); + if (typed_value_idx < 0) { + return Status::OK(); + } + return typed_value_to_json(variant_field.children[typed_value_idx], fields[typed_value_idx], + metadata, typed_json, typed_present); +} + +static Status variant_residual_value_to_json(const FieldSchema& variant_field, const Struct& fields, + const std::string& metadata, bool has_metadata, + std::string* value_json, bool* value_present) { + *value_present = false; + const int value_idx = find_child_idx(variant_field, "value"); + if (value_idx < 0) { + return Status::OK(); + } + + std::string value; + RETURN_IF_ERROR(get_binary_field(fields[value_idx], &value, value_present)); + if (!*value_present) { + return Status::OK(); + } + if (!has_metadata) { + return Status::Corruption("Parquet VARIANT value is present without metadata"); + } + return parquet::decode_variant_to_json(StringRef(metadata.data(), metadata.size()), + StringRef(value.data(), value.size()), value_json); +} + +static Status merge_variant_value_and_typed_json(const std::string& value_json, + const std::string& typed_json, std::string* json) { + VariantMap value_values; + RETURN_IF_ERROR(parse_json_to_variant_map(value_json, PathInData(), &value_values)); + VariantMap typed_values; + RETURN_IF_ERROR(parse_json_to_variant_map(typed_json, PathInData(), &typed_values)); + erase_shadowed_empty_object_markers(&value_values, &typed_values); + auto root_value = value_values.find(PathInData()); + if (root_value != value_values.end() && !is_empty_object_marker(root_value->second)) { + return Status::Corruption( + "Parquet VARIANT has conflicting non-object value and typed_value"); + } + RETURN_IF_ERROR( + check_no_shredded_value_typed_duplicates(value_values, typed_values, PathInData())); + value_values.merge(std::move(typed_values)); + return variant_map_to_json(std::move(value_values), json); +} + +static Status variant_to_json(const FieldSchema& variant_field, const Field& field, + const std::string* 
inherited_metadata, std::string* json, + bool* present) { + if (field.is_null()) { + *present = false; + return Status::OK(); + } + + const auto& fields = field.get(); + std::string metadata; + bool has_metadata = false; + RETURN_IF_ERROR(resolve_variant_metadata(variant_field, fields, inherited_metadata, &metadata, + &has_metadata)); + + std::string typed_json; + bool typed_present = false; + RETURN_IF_ERROR(variant_typed_value_to_json(variant_field, fields, metadata, &typed_json, + &typed_present)); + + std::string value_json; + bool value_present = false; + RETURN_IF_ERROR(variant_residual_value_to_json(variant_field, fields, metadata, has_metadata, + &value_json, &value_present)); + + if (value_present && typed_present) { + RETURN_IF_ERROR(merge_variant_value_and_typed_json(value_json, typed_json, json)); + *present = true; + return Status::OK(); + } + + if (typed_present) { + *json = std::move(typed_json); + *present = true; + return Status::OK(); + } + if (value_present) { + *json = std::move(value_json); + *present = true; + return Status::OK(); + } + + *present = false; + return Status::OK(); +} + +static Status shredded_field_to_json(const FieldSchema& field_schema, const Field& field, + const std::string& metadata, std::string* json, bool* present, + bool allow_scalar_typed_value_only_wrapper) { + if (is_variant_wrapper_field(field_schema, allow_scalar_typed_value_only_wrapper)) { + return variant_to_json(field_schema, field, &metadata, json, present); + } + if (is_value_only_variant_wrapper_candidate(field_schema)) { + Status st = variant_to_json(field_schema, field, &metadata, json, present); + if (st.ok()) { + return st; + } + if (!st.is()) { + return st; + } + } + return typed_value_to_json(field_schema, field, metadata, json, present); +} + +static Status typed_array_to_json(const FieldSchema& typed_value_field, const Field& field, + const std::string& metadata, std::string* json, bool* present) { + if (field.is_null()) { + *present = false; + return 
Status::OK(); + } + if (typed_value_field.children.empty()) { + return Status::Corruption("Parquet VARIANT array typed_value has no element schema"); + } + + const auto& elements = field.get(); + const auto& element_schema = typed_value_field.children[0]; + json->clear(); + json->push_back('['); + for (size_t i = 0; i < elements.size(); ++i) { + if (i != 0) { + json->push_back(','); + } + std::string element_json; + bool element_present = false; + RETURN_IF_ERROR(shredded_field_to_json(element_schema, elements[i], metadata, &element_json, + &element_present, true)); + if (!element_present) { + if (elements[i].is_null()) { + json->append("null"); + continue; + } + return Status::Corruption("Parquet VARIANT array element is missing"); + } + json->append(element_json); + } + json->push_back(']'); + *present = true; + return Status::OK(); +} + +static Status typed_struct_to_json(const FieldSchema& typed_value_field, const Field& field, + const std::string& metadata, std::string* json, bool* present) { + if (field.is_null()) { + *present = false; + return Status::OK(); + } + + const auto& fields = field.get(); + json->clear(); + json->push_back('{'); + bool first = true; + for (int i = 0; i < typed_value_field.children.size(); ++i) { + std::string child_json; + bool child_present = false; + RETURN_IF_ERROR(shredded_field_to_json(typed_value_field.children[i], fields[i], metadata, + &child_json, &child_present, false)); + if (!child_present) { + continue; + } + if (!first) { + json->push_back(','); + } + append_json_string(typed_value_field.children[i].name, json); + json->push_back(':'); + json->append(child_json); + first = false; + } + json->push_back('}'); + *present = true; + return Status::OK(); +} + +static Status typed_value_to_json(const FieldSchema& typed_value_field, const Field& field, + const std::string& metadata, std::string* json, bool* present) { + const DataTypePtr& typed_type = remove_nullable(typed_value_field.data_type); + switch 
(typed_type->get_primitive_type()) { + case TYPE_STRUCT: + return typed_struct_to_json(typed_value_field, field, metadata, json, present); + case TYPE_ARRAY: + return typed_array_to_json(typed_value_field, field, metadata, json, present); + case TYPE_MAP: { + VariantMap values; + PathInDataBuilder path; + std::deque string_values; + RETURN_IF_ERROR(typed_map_to_variant_map(typed_value_field, field, metadata, &path, &values, + present, &string_values)); + if (!*present) { + return Status::OK(); + } + return variant_map_to_json(std::move(values), json); + } + default: + return scalar_typed_value_to_json(typed_value_field, field, json, present); + } +} + +static Status typed_value_to_variant_map(const FieldSchema& typed_value_field, const Field& field, + const std::string& metadata, PathInDataBuilder* path, + VariantMap* values, bool* present, + std::deque* string_values); + +static Status variant_to_variant_map(const FieldSchema& variant_field, const Field& field, + const std::string* inherited_metadata, PathInDataBuilder* path, + VariantMap* values, bool* present, + std::deque* string_values) { + if (field.is_null()) { + *present = false; + return Status::OK(); + } + const auto& fields = field.get(); + const int metadata_idx = find_child_idx(variant_field, "metadata"); + const int value_idx = find_child_idx(variant_field, "value"); + const int typed_value_idx = find_child_idx(variant_field, "typed_value"); + + std::string metadata; + bool has_metadata = false; + if (inherited_metadata != nullptr) { + metadata = *inherited_metadata; + has_metadata = true; + } + if (metadata_idx >= 0) { + bool metadata_present = false; + RETURN_IF_ERROR(get_binary_field(fields[metadata_idx], &metadata, &metadata_present)); + has_metadata = metadata_present; + } + + VariantMap value_values; + bool value_present = false; + const PathInData current_path = path->build(); + if (value_idx >= 0) { + std::string value; + RETURN_IF_ERROR(get_binary_field(fields[value_idx], &value, 
&value_present));
+        if (value_present) {
+            if (!has_metadata) {
+                return Status::Corruption("Parquet VARIANT value is present without metadata");
+            }
+            RETURN_IF_ERROR(parquet::decode_variant_to_variant_map(
+                    StringRef(metadata.data(), metadata.size()),
+                    StringRef(value.data(), value.size()), current_path, &value_values,
+                    string_values));
+        }
+    }
+
+    VariantMap typed_values;
+    bool typed_present = false;
+    if (typed_value_idx >= 0) {
+        RETURN_IF_ERROR(typed_value_to_variant_map(variant_field.children[typed_value_idx],
+                                                   fields[typed_value_idx], metadata, path,
+                                                   &typed_values, &typed_present, string_values));
+    }
+
+    erase_shadowed_empty_object_markers(&value_values, &typed_values);
+    auto current_value = value_values.find(current_path);
+    if (value_present && typed_present && current_value != value_values.end() &&
+        !is_empty_object_marker(current_value->second)) {
+        return Status::Corruption(
+                "Parquet VARIANT has conflicting non-object value and typed_value");
+    }
+    RETURN_IF_ERROR(
+            check_no_shredded_value_typed_duplicates(value_values, typed_values, current_path));
+    values->merge(std::move(value_values));
+    values->merge(std::move(typed_values));
+    *present = value_present || typed_present;
+    return Status::OK();
+}
+
+static Status shredded_field_to_variant_map(const FieldSchema& field_schema, const Field& field,
+                                            const std::string& metadata, PathInDataBuilder* path,
+                                            VariantMap* values, bool* present,
+                                            std::deque<std::string>* string_values) {
+    if (is_variant_wrapper_field(field_schema, false)) {
+        return variant_to_variant_map(field_schema, field, &metadata, path, values, present,
+                                      string_values);
+    }
+    if (is_value_only_variant_wrapper_candidate(field_schema)) {
+        Status st = variant_to_variant_map(field_schema, field, &metadata, path, values, present,
+                                           string_values);
+        if (st.ok()) {
+            return st;
+        }
+        if (!st.is()) {
+            return st;
+        }
+    }
+    return typed_value_to_variant_map(field_schema, field, metadata, path, values, present,
+                                      string_values);
+}
+
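The wrapper handling above merges fields recovered from the residual binary `value` column with fields read from shredded `typed_value` leaves, and treats a path that appears non-trivially in both as corruption. A minimal standalone sketch of that merge rule, using an illustrative `std::map`-based path map and an exception instead of the Doris `VariantMap`/`Status` types (names here are hypothetical, not Doris APIs):

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Toy model: each reconstructed row maps a dotted path to a scalar value.
// "residual" holds fields decoded from the binary `value` column,
// "typed" holds fields read from shredded `typed_value` leaves.
using PathMap = std::map<std::string, std::string>;

PathMap merge_shredded_row(const PathMap& residual, const PathMap& typed) {
    PathMap merged = residual;
    for (const auto& [path, value] : typed) {
        // A path may come from only one source; a duplicate means the writer
        // emitted the same field both shredded and residual, which the reader
        // rejects as corruption.
        auto [it, inserted] = merged.emplace(path, value);
        if (!inserted) {
            throw std::runtime_error("conflicting value and typed_value at " + path);
        }
    }
    return merged;
}
```

The real reader additionally tolerates empty-object markers on the conflicting path, which this sketch omits.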
+static Status append_typed_field_to_variant_map(const FieldSchema& typed_value_field,
+                                                const Field& field, PathInDataBuilder* path,
+                                                VariantMap* values, bool* present) {
+    FieldWithDataType value;
+    RETURN_IF_ERROR(field_to_variant_field(typed_value_field, field, &value, present));
+    if (*present) {
+        (*values)[path->build()] = std::move(value);
+    }
+    return Status::OK();
+}
+
+static void move_variant_map_to_field(VariantMap&& element_values, FieldWithDataType* value) {
+    if (element_values.size() == 1 && element_values.begin()->first.empty()) {
+        *value = std::move(element_values.begin()->second);
+        return;
+    }
+    value->field = Field::create_field<TYPE_VARIANT>(std::move(element_values));
+    fill_variant_field_info(value);
+}
+
+static Status typed_array_to_variant_map(const FieldSchema& typed_value_field, const Field& field,
+                                         const std::string& metadata, PathInDataBuilder* path,
+                                         VariantMap* values, bool* present,
+                                         std::deque<std::string>* string_values) {
+    if ((contains_uuid_typed_value_field(typed_value_field) ||
+         contains_temporal_variant_leaf_type(typed_value_field.data_type) ||
+         contains_floating_point_variant_leaf_type(typed_value_field.data_type)) &&
+        is_direct_variant_leaf_type(typed_value_field.data_type)) {
+        FieldWithDataType value;
+        RETURN_IF_ERROR(fill_direct_array_variant_field(typed_value_field, field, &value, present));
+        if (*present) {
+            (*values)[path->build()] = std::move(value);
+        }
+        return Status::OK();
+    }
+    if (is_direct_variant_leaf_type(typed_value_field.data_type)) {
+        return append_typed_field_to_variant_map(typed_value_field, field, path, values, present);
+    }
+
+    if (field.is_null()) {
+        *present = false;
+        return Status::OK();
+    }
+    if (typed_value_field.children.empty()) {
+        return Status::Corruption("Parquet VARIANT array typed_value has no element schema");
+    }
+
+    const auto& elements = field.get<Array>();
+    const auto& element_schema = typed_value_field.children[0];
+    Array array;
+    array.reserve(elements.size());
+    for (const auto& element :
elements) {
+        VariantMap element_values;
+        bool element_present = false;
+        PathInDataBuilder element_path;
+        RETURN_IF_ERROR(shredded_field_to_variant_map(element_schema, element, metadata,
+                                                      &element_path, &element_values,
+                                                      &element_present, string_values));
+        if (!element_present) {
+            if (element.is_null()) {
+                array.push_back(Field());
+                continue;
+            }
+            return Status::Corruption("Parquet VARIANT array element is missing");
+        }
+
+        FieldWithDataType element_value;
+        move_variant_map_to_field(std::move(element_values), &element_value);
+        array.push_back(std::move(element_value.field));
+    }
+
+    FieldWithDataType value;
+    const size_t elements_count = array.size();
+    value.field = Field::create_field<TYPE_ARRAY>(std::move(array));
+    fill_variant_field_info(&value);
+    if (value.base_scalar_type_id == INVALID_TYPE) {
+        RETURN_IF_ERROR(make_jsonb_field(make_null_array_json(elements_count), &value));
+    }
+    (*values)[path->build()] = std::move(value);
+    *present = true;
+    return Status::OK();
+}
+
+static Status typed_map_to_variant_map(const FieldSchema& typed_value_field, const Field& field,
+                                       const std::string& metadata, PathInDataBuilder* path,
+                                       VariantMap* values, bool* present,
+                                       std::deque<std::string>* string_values) {
+    if (field.is_null()) {
+        *present = false;
+        return Status::OK();
+    }
+    if (typed_value_field.children.size() != 2) {
+        return Status::Corruption("Parquet VARIANT map typed_value has {} child fields",
+                                  typed_value_field.children.size());
+    }
+
+    const auto& map = field.get<Map>();
+    DORIS_CHECK(map.size() == 2);
+    DORIS_CHECK(map[0].get_type() == TYPE_ARRAY);
+    DORIS_CHECK(map[1].get_type() == TYPE_ARRAY);
+    const auto& keys = map[0].get<Array>();
+    const auto& value_fields = map[1].get<Array>();
+    DORIS_CHECK(keys.size() == value_fields.size());
+
+    if (keys.empty()) {
+        RETURN_IF_ERROR(insert_empty_object_marker(path->build(), values));
+        *present = true;
+        return Status::OK();
+    }
+
+    std::set<std::string> object_keys;
+    const FieldSchema& key_field = typed_value_field.children[0];
+    const
FieldSchema& value_field = typed_value_field.children[1];
+    for (size_t i = 0; i < keys.size(); ++i) {
+        std::string key;
+        bool key_present = false;
+        RETURN_IF_ERROR(get_binary_field(keys[i], &key, &key_present));
+        if (!key_present) {
+            return Status::Corruption("Parquet VARIANT map typed_value has null key {}",
+                                      key_field.name);
+        }
+        if (!object_keys.insert(key).second) {
+            return Status::Corruption("Parquet VARIANT map typed_value has duplicate key {}", key);
+        }
+
+        path->append(key, false);
+        bool value_present = false;
+        Status st = shredded_field_to_variant_map(value_field, value_fields[i], metadata, path,
+                                                  values, &value_present, string_values);
+        if (!st.ok()) {
+            path->pop_back();
+            return st;
+        }
+        if (!value_present) {
+            (*values)[path->build()] = FieldWithDataType {.field = Field()};
+        }
+        path->pop_back();
+    }
+    *present = true;
+    return Status::OK();
+}
+
+static Status typed_value_to_variant_map(const FieldSchema& typed_value_field, const Field& field,
+                                         const std::string& metadata, PathInDataBuilder* path,
+                                         VariantMap* values, bool* present,
+                                         std::deque<std::string>* string_values) {
+    if (field.is_null()) {
+        *present = false;
+        return Status::OK();
+    }
+    const DataTypePtr& typed_type = remove_nullable(typed_value_field.data_type);
+    if (typed_type->get_primitive_type() == TYPE_STRUCT) {
+        const auto& fields = field.get<Tuple>();
+        *present = true;
+        bool has_present_child = false;
+        for (int i = 0; i < typed_value_field.children.size(); ++i) {
+            path->append(typed_value_field.children[i].name, false);
+            bool child_present = false;
+            RETURN_IF_ERROR(shredded_field_to_variant_map(typed_value_field.children[i], fields[i],
+                                                          metadata, path, values, &child_present,
+                                                          string_values));
+            has_present_child |= child_present;
+            path->pop_back();
+        }
+        if (!has_present_child) {
+            RETURN_IF_ERROR(insert_empty_object_marker(path->build(), values));
+        }
+        return Status::OK();
+    }
+    if (typed_type->get_primitive_type() == TYPE_ARRAY) {
+        return
typed_array_to_variant_map(typed_value_field, field, metadata, path, values, present,
+                                          string_values);
+    }
+    if (typed_type->get_primitive_type() == TYPE_MAP) {
+        return typed_map_to_variant_map(typed_value_field, field, metadata, path, values, present,
+                                        string_values);
+    }
+
+    return append_typed_field_to_variant_map(typed_value_field, field, path, values, present);
+}
+
+static bool direct_typed_value_present_at(const FieldSchema& field_schema, const IColumn& column,
+                                          size_t row, bool allow_variant_wrapper,
+                                          const std::set& column_ids,
+                                          const std::vector<const NullMap*>& parent_null_maps) {
+    if (!has_selected_column(field_schema, column_ids) ||
+        has_direct_typed_parent_null(parent_null_maps, row)) {
+        return false;
+    }
+
+    const IColumn* value_column = &column;
+    if (const auto* nullable_column = check_and_get_column<ColumnNullable>(&column)) {
+        const auto& null_map = nullable_column->get_null_map_data();
+        DCHECK_LT(row, null_map.size());
+        if (null_map[row]) {
+            return false;
+        }
+        value_column = &nullable_column->get_nested_column();
+    }
+
+    if (allow_variant_wrapper && is_variant_wrapper_field(field_schema, false)) {
+        const int typed_value_idx = find_child_idx(field_schema, "typed_value");
+        DCHECK_GE(typed_value_idx, 0);
+        const auto& typed_struct = assert_cast<const ColumnStruct&>(*value_column);
+        return direct_typed_value_present_at(field_schema.children[typed_value_idx],
+                                             typed_struct.get_column(typed_value_idx), row, false,
+                                             column_ids, parent_null_maps);
+    }
+
+    return true;
+}
+
+static Status append_direct_typed_empty_object_markers(
+        const FieldSchema& field_schema, const ColumnStruct& struct_column, size_t start,
+        size_t rows, PathInDataBuilder* path, ColumnVariant* batch,
+        const std::set& column_ids, const std::vector<const NullMap*>& parent_null_maps) {
+    DataTypePtr marker_type = make_nullable(std::make_shared());
+    MutableColumnPtr marker_column = marker_type->create_column();
+    marker_column->insert_default();
+    bool has_marker = false;
+
+    const PathInData marker_path = path->build();
+    Field
empty_object;
+    RETURN_IF_ERROR(make_empty_object_field(&empty_object));
+    for (size_t i = 0; i < rows; ++i) {
+        const size_t row = start + i;
+        if (has_direct_typed_parent_null(parent_null_maps, row)) {
+            marker_column->insert_default();
+            has_marker |= marker_path.empty();
+            continue;
+        }
+
+        bool has_present_child = false;
+        for (int child_idx = 0; child_idx < field_schema.children.size(); ++child_idx) {
+            if (direct_typed_value_present_at(field_schema.children[child_idx],
+                                              struct_column.get_column(child_idx), row, true,
+                                              column_ids, parent_null_maps)) {
+                has_present_child = true;
+                break;
+            }
+        }
+
+        if (has_present_child) {
+            marker_column->insert_default();
+            continue;
+        }
+        marker_column->insert(empty_object);
+        has_marker = true;
+    }
+
+    if (!has_marker) {
+        return Status::OK();
+    }
+    if (!batch->add_sub_column(marker_path, std::move(marker_column), marker_type)) {
+        return Status::Corruption("Failed to add Parquet VARIANT empty typed object marker {}",
+                                  marker_path.get_path());
+    }
+    return Status::OK();
+}
+
+static Status append_direct_typed_column_to_batch(const FieldSchema& field_schema,
+                                                  const IColumn& column, size_t start, size_t rows,
+                                                  PathInDataBuilder* path, ColumnVariant* batch,
+                                                  bool allow_variant_wrapper,
+                                                  const std::set& column_ids,
+                                                  std::vector<const NullMap*> parent_null_maps) {
+    if (!has_selected_column(field_schema, column_ids)) {
+        return Status::OK();
+    }
+
+    const IColumn* value_column = &column;
+    if (const auto* nullable_column = check_and_get_column<ColumnNullable>(&column)) {
+        parent_null_maps.push_back(&nullable_column->get_null_map_data());
+        value_column = &nullable_column->get_nested_column();
+    }
+
+    if (allow_variant_wrapper && is_variant_wrapper_field(field_schema, false)) {
+        const int typed_value_idx = find_child_idx(field_schema, "typed_value");
+        DCHECK_GE(typed_value_idx, 0);
+        const auto& typed_struct = assert_cast<const ColumnStruct&>(*value_column);
+        return append_direct_typed_column_to_batch(
+                field_schema.children[typed_value_idx],
typed_struct.get_column(typed_value_idx),
+                start, rows, path, batch, false, column_ids, parent_null_maps);
+    }
+
+    const auto& type = remove_nullable(field_schema.data_type);
+    if (type->get_primitive_type() == TYPE_STRUCT) {
+        const auto& struct_column = assert_cast<const ColumnStruct&>(*value_column);
+        for (int i = 0; i < field_schema.children.size(); ++i) {
+            if (!has_selected_column(field_schema.children[i], column_ids)) {
+                continue;
+            }
+            path->append(field_schema.children[i].name, false);
+            RETURN_IF_ERROR(append_direct_typed_column_to_batch(
+                    field_schema.children[i], struct_column.get_column(i), start, rows, path,
+                    batch, true, column_ids, parent_null_maps));
+            path->pop_back();
+        }
+        return append_direct_typed_empty_object_markers(field_schema, struct_column, start, rows,
+                                                        path, batch, column_ids, parent_null_maps);
+    }
+
+    DataTypePtr variant_leaf_type = make_nullable(direct_variant_leaf_type(field_schema));
+    MutableColumnPtr variant_leaf = variant_leaf_type->create_column();
+    variant_leaf->insert_default();
+    if (type->get_primitive_type() == TYPE_ARRAY &&
+        (contains_uuid_typed_value_field(field_schema) ||
+         contains_temporal_variant_leaf_type(field_schema.data_type) ||
+         contains_floating_point_variant_leaf_type(field_schema.data_type))) {
+        RETURN_IF_ERROR(insert_direct_typed_array_leaf_range(
+                field_schema, *value_column, start, rows, parent_null_maps, variant_leaf.get()));
+    } else if (is_uuid_typed_value_field(field_schema)) {
+        RETURN_IF_ERROR(insert_direct_typed_uuid_leaf_range(*value_column, start, rows,
+                                                            parent_null_maps, variant_leaf.get()));
+    } else if (is_temporal_variant_leaf_type(type->get_primitive_type())) {
+        insert_direct_typed_temporal_leaf_range(type->get_primitive_type(), *value_column, start,
+                                                rows, parent_null_maps, variant_leaf.get());
+    } else {
+        insert_direct_typed_leaf_range(*value_column, start, rows, parent_null_maps,
+                                       variant_leaf.get());
+    }
+    if (!batch->add_sub_column(path->build(), std::move(variant_leaf), variant_leaf_type))
{
+        return Status::Corruption("Failed to add Parquet VARIANT typed subcolumn {}",
+                                  path->build().get_path());
+    }
+    return Status::OK();
+}
+
+static Status append_variant_struct_rows_to_column(
+        const FieldSchema& field_schema, const ColumnStruct& variant_struct_column,
+        const NullMap* struct_null_map, size_t start, size_t rows,
+        const std::set& column_ids, ColumnPtr& doris_column,
+        ParquetColumnReader::ColumnStatistics* variant_statistics) {
+    DCHECK_LE(start + rows, variant_struct_column.size());
+
+    MutableColumnPtr variant_column_ptr;
+    NullMap* null_map_ptr = nullptr;
+    auto mutable_column = doris_column->assume_mutable();
+    if (doris_column->is_nullable()) {
+        auto* nullable_column = assert_cast<ColumnNullable*>(mutable_column.get());
+        variant_column_ptr = nullable_column->get_nested_column_ptr();
+        null_map_ptr = &nullable_column->get_null_map_data();
+    } else {
+        if (field_schema.data_type->is_nullable()) {
+            return Status::Corruption("Not nullable column has null values in parquet file");
+        }
+        variant_column_ptr = std::move(mutable_column);
+    }
+    auto* variant_column = assert_cast<ColumnVariant*>(variant_column_ptr.get());
+
+    const int typed_value_idx = find_child_idx(field_schema, "typed_value");
+    if (can_use_direct_typed_only_value(field_schema, column_ids)) {
+        variant_statistics->variant_direct_typed_value_read_rows += static_cast<int64_t>(rows);
+        MutableColumnPtr batch_variant_column =
+                ColumnVariant::create(variant_column->max_subcolumns_count(),
+                                      variant_column->enable_doc_mode(), rows + 1);
+        auto* batch_variant = assert_cast<ColumnVariant*>(batch_variant_column.get());
+        PathInDataBuilder path;
+        RETURN_IF_ERROR(append_direct_typed_column_to_batch(
+                field_schema.children[typed_value_idx],
+                variant_struct_column.get_column(typed_value_idx), start, rows, &path,
+                batch_variant, false, column_ids, {}));
+        variant_column->insert_range_from(*batch_variant_column, 1, rows);
+        if (null_map_ptr != nullptr) {
+            for (size_t i = start; i < start + rows; ++i) {
+
null_map_ptr->push_back(struct_null_map != nullptr && (*struct_null_map)[i]);
+            }
+        }
+        return Status::OK();
+    }
+
+    variant_statistics->variant_rowwise_read_rows += static_cast<int64_t>(rows);
+    for (size_t i = start; i < start + rows; ++i) {
+        if (struct_null_map != nullptr && (*struct_null_map)[i]) {
+            if (null_map_ptr == nullptr) {
+                return Status::Corruption("Not nullable column has null values in parquet file");
+            }
+            variant_column->insert_default();
+            null_map_ptr->push_back(1);
+            continue;
+        }
+        VariantMap values;
+        bool present = false;
+        PathInDataBuilder path;
+        std::deque<std::string> string_values;
+        RETURN_IF_ERROR(variant_to_variant_map(field_schema, variant_struct_column[i], nullptr,
+                                               &path, &values, &present, &string_values));
+        if (!present) {
+            values[PathInData()] = FieldWithDataType {.field = Field()};
+        }
+        RETURN_IF_CATCH_EXCEPTION(
+                variant_column->insert(Field::create_field<TYPE_VARIANT>(std::move(values))));
+        if (null_map_ptr != nullptr) {
+            null_map_ptr->push_back(0);
+        }
+    }
+    return Status::OK();
+}
+
+#ifdef BE_TEST
+namespace parquet_variant_reader_test {
+bool can_direct_read_typed_value_for_test(const FieldSchema& typed_value_field) {
+    const std::set column_ids;
+    return can_direct_read_typed_value(typed_value_field, false, column_ids);
+}
+
+bool can_use_direct_typed_only_value_for_test(const FieldSchema& variant_field,
+                                              const std::set& column_ids) {
+    return can_use_direct_typed_only_value(variant_field, column_ids);
+}
+
+Status append_direct_typed_column_to_batch_for_test(const FieldSchema& typed_value_field,
+                                                    const IColumn& typed_value_column, size_t start,
+                                                    size_t rows, ColumnVariant* batch) {
+    PathInDataBuilder path;
+    const std::set column_ids;
+    return append_direct_typed_column_to_batch(typed_value_field, typed_value_column, start, rows,
+                                               &path, batch, false, column_ids, {});
+}
+
+Status read_variant_row_for_test(const FieldSchema& variant_field, const Field& field,
+                                 bool output_nullable, Field* result, bool* sql_null) {
+    if (field.is_null()) {
+        if (!output_nullable) {
+            return Status::Corruption("Not nullable column has null values in parquet file");
+        }
+        *sql_null = true;
+        return Status::OK();
+    }
+
+    VariantMap values;
+    bool present = false;
+    PathInDataBuilder path;
+    std::deque<std::string> string_values;
+    RETURN_IF_ERROR(variant_to_variant_map(variant_field, field, nullptr, &path, &values, &present,
+                                           &string_values));
+    if (!present) {
+        values[PathInData()] = FieldWithDataType {.field = Field()};
+    }
+
+    auto variant_column = ColumnVariant::create(0, false);
+    RETURN_IF_CATCH_EXCEPTION(
+            variant_column->insert(Field::create_field<TYPE_VARIANT>(std::move(values))));
+    variant_column->get(0, *result);
+    *sql_null = false;
+    return Status::OK();
+}
+
+Status read_variant_rows_for_test(const FieldSchema& variant_field, const IColumn& struct_column,
+                                  const std::set& column_ids, ColumnPtr& doris_column,
+                                  int64_t* direct_rows, int64_t* rowwise_rows) {
+    const IColumn* struct_source = &struct_column;
+    const NullMap* struct_null_map = nullptr;
+    if (const auto* nullable_struct = check_and_get_column<ColumnNullable>(struct_source)) {
+        struct_null_map = &nullable_struct->get_null_map_data();
+        struct_source = &nullable_struct->get_nested_column();
+    }
+    const auto& variant_struct_column = assert_cast<const ColumnStruct&>(*struct_source);
+
+    ParquetColumnReader::ColumnStatistics variant_statistics;
+    RETURN_IF_ERROR(append_variant_struct_rows_to_column(
+            variant_field, variant_struct_column, struct_null_map, 0, variant_struct_column.size(),
+            column_ids, doris_column, &variant_statistics));
+    *direct_rows = variant_statistics.variant_direct_typed_value_read_rows;
+    *rowwise_rows = variant_statistics.variant_rowwise_read_rows;
+    return Status::OK();
+}
+
+Status variant_to_json_for_test(const FieldSchema& variant_field, const Field& field,
+                                const std::string& inherited_metadata, std::string* json,
+                                bool* present) {
+    return variant_to_json(variant_field, field, &inherited_metadata, json, present);
+}
+
+bool
variant_struct_reader_type_is_nullable_for_test(const FieldSchema& variant_field) { + return make_variant_struct_reader_type(variant_field)->is_nullable(); +} + +bool variant_struct_reader_column_is_nullable_for_test(const FieldSchema& variant_field) { + auto variant_struct_type = make_variant_struct_reader_type(variant_field); + return make_variant_struct_read_column(variant_field, variant_struct_type)->is_nullable(); +} +} // namespace parquet_variant_reader_test +#endif + +// Existing recursive factory keeps nested reader wiring and shared state in one dispatch point. +// NOLINTNEXTLINE(readability-function-cognitive-complexity,readability-function-size) Status ParquetColumnReader::create(io::FileReaderSPtr file, FieldSchema* field, const tparquet::RowGroup& row_group, const RowRanges& row_ranges, const cctz::time_zone* ctz, io::IOContext* io_ctx, @@ -113,29 +1968,33 @@ Status ParquetColumnReader::create(io::FileReaderSPtr file, FieldSchema* field, const std::set& column_ids, const std::set& filter_column_ids) { size_t total_rows = row_group.num_rows; - if (field->data_type->get_primitive_type() == TYPE_ARRAY) { + const auto field_primitive_type = remove_nullable(field->data_type)->get_primitive_type(); + if (field_primitive_type == TYPE_ARRAY) { + const bool offset_only = !column_ids.empty() && + column_ids.contains(field->get_column_id()) && + !column_ids.contains(field->children[0].get_column_id()); std::unique_ptr element_reader; - RETURN_IF_ERROR(create(file, &field->children[0], row_group, row_ranges, ctz, io_ctx, + RETURN_IF_ERROR(create(file, field->children.data(), row_group, row_ranges, ctz, io_ctx, element_reader, max_buf_size, col_offsets, state, true, column_ids, filter_column_ids)); auto array_reader = ArrayColumnReader::create_unique(row_ranges, total_rows, ctz, io_ctx); element_reader->set_column_in_nested(); - RETURN_IF_ERROR(array_reader->init(std::move(element_reader), field)); + RETURN_IF_ERROR(array_reader->init(std::move(element_reader), 
field, offset_only)); array_reader->_filter_column_ids = filter_column_ids; reader.reset(array_reader.release()); - } else if (field->data_type->get_primitive_type() == TYPE_MAP) { + } else if (field_primitive_type == TYPE_MAP) { std::unique_ptr key_reader; std::unique_ptr value_reader; if (column_ids.empty() || column_ids.find(field->children[0].get_column_id()) != column_ids.end()) { // Create key reader - RETURN_IF_ERROR(create(file, &field->children[0], row_group, row_ranges, ctz, io_ctx, + RETURN_IF_ERROR(create(file, field->children.data(), row_group, row_ranges, ctz, io_ctx, key_reader, max_buf_size, col_offsets, state, true, column_ids, filter_column_ids)); } else { auto skip_reader = std::make_unique(row_ranges, total_rows, ctz, - io_ctx, &field->children[0]); + io_ctx, field->children.data()); key_reader = std::move(skip_reader); } @@ -147,7 +2006,7 @@ Status ParquetColumnReader::create(io::FileReaderSPtr file, FieldSchema* field, filter_column_ids)); } else { auto skip_reader = std::make_unique(row_ranges, total_rows, ctz, - io_ctx, &field->children[0]); + io_ctx, &field->children[1]); value_reader = std::move(skip_reader); } @@ -157,7 +2016,7 @@ Status ParquetColumnReader::create(io::FileReaderSPtr file, FieldSchema* field, RETURN_IF_ERROR(map_reader->init(std::move(key_reader), std::move(value_reader), field)); map_reader->_filter_column_ids = filter_column_ids; reader.reset(map_reader.release()); - } else if (field->data_type->get_primitive_type() == TYPE_STRUCT) { + } else if (field_primitive_type == TYPE_STRUCT) { std::unordered_map> child_readers; child_readers.reserve(field->children.size()); int non_skip_reader_idx = -1; @@ -184,7 +2043,7 @@ Status ParquetColumnReader::create(io::FileReaderSPtr file, FieldSchema* field, // If all children are SkipReadingReader, force the first child to call create if (non_skip_reader_idx == -1) { std::unique_ptr child_reader; - RETURN_IF_ERROR(create(file, &field->children[0], row_group, row_ranges, ctz, io_ctx, 
+ RETURN_IF_ERROR(create(file, field->children.data(), row_group, row_ranges, ctz, io_ctx, child_reader, max_buf_size, col_offsets, state, in_collection, column_ids, filter_column_ids)); child_reader->set_column_in_nested(); @@ -194,6 +2053,13 @@ Status ParquetColumnReader::create(io::FileReaderSPtr file, FieldSchema* field, RETURN_IF_ERROR(struct_reader->init(std::move(child_readers), field)); struct_reader->_filter_column_ids = filter_column_ids; reader.reset(struct_reader.release()); + } else if (field_primitive_type == TYPE_VARIANT) { + auto variant_reader = + VariantColumnReader::create_unique(row_ranges, total_rows, ctz, io_ctx); + RETURN_IF_ERROR(variant_reader->init(file, field, row_group, max_buf_size, col_offsets, + state, in_collection, column_ids, filter_column_ids)); + variant_reader->_filter_column_ids = filter_column_ids; + reader.reset(variant_reader.release()); } else { auto physical_index = field->physical_column_index; const tparquet::OffsetIndex* offset_index = @@ -288,7 +2154,7 @@ Status ScalarColumnReader::_skip_values(size_t num_ size_t loop_skip = def_decoder.get_next_run(&def_level, num_values - skipped); if (loop_skip == 0) { std::stringstream ss; - auto& bit_reader = def_decoder.rle_decoder().bit_reader(); + const auto& bit_reader = def_decoder.rle_decoder().bit_reader(); ss << "def_decoder buffer (hex): "; for (size_t i = 0; i < bit_reader.max_bytes(); ++i) { ss << std::hex << std::setw(2) << std::setfill('0') @@ -346,7 +2212,7 @@ Status ScalarColumnReader::_read_values(size_t num_ size_t loop_read = def_decoder.get_next_run(&def_level, num_values - has_read); if (loop_read == 0) { std::stringstream ss; - auto& bit_reader = def_decoder.rle_decoder().bit_reader(); + const auto& bit_reader = def_decoder.rle_decoder().bit_reader(); ss << "def_decoder buffer (hex): "; for (size_t i = 0; i < bit_reader.max_bytes(); ++i) { ss << std::hex << std::setw(2) << std::setfill('0') @@ -377,7 +2243,7 @@ Status ScalarColumnReader::_read_values(size_t 
num_ } data_column = doris_column->assume_mutable(); } - if (null_map.size() == 0) { + if (null_map.empty()) { size_t remaining = num_values; while (remaining > USHRT_MAX) { null_map.emplace_back(USHRT_MAX); @@ -402,6 +2268,8 @@ Status ScalarColumnReader::_read_values(size_t num_ * whether the reader should read the remaining value of the last row in previous page. */ template +// Existing nested scalar reader is the central row/page alignment loop for complex values. +// NOLINTNEXTLINE(readability-function-cognitive-complexity,readability-function-size) Status ScalarColumnReader::_read_nested_column( ColumnPtr& doris_column, DataTypePtr& type, FilterMap& filter_map, size_t batch_size, size_t* read_rows, bool* eof, bool is_dict_filter) { @@ -455,7 +2323,7 @@ Status ScalarColumnReader::_read_nested_column( RETURN_IF_ERROR( _chunk_reader->decode_values(data_column, type, select_vector, is_dict_filter)); - if (ancestor_null_indices.size() != 0) { + if (!ancestor_null_indices.empty()) { RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_null_indices.size(), false)); } if (filter_map.has_filter()) { @@ -503,6 +2371,77 @@ Status ScalarColumnReader::_read_nested_column( return Status::OK(); } +template +Status ScalarColumnReader::_read_and_skip_nested_levels( + FilterMap& filter_map, size_t before_rep_level_sz, size_t filter_map_index, + std::vector& nested_filter_map_data) { + RETURN_IF_ERROR(_chunk_reader->fill_def(_def_levels)); + RETURN_IF_ERROR(_chunk_reader->skip_nested_values(_def_levels, before_rep_level_sz, + _def_levels.size())); + if (!filter_map.has_filter()) { + return Status::OK(); + } + + std::unique_ptr nested_filter_map = std::make_unique(); + RETURN_IF_ERROR(gen_filter_map(filter_map, filter_map_index, before_rep_level_sz, + _rep_levels.size(), nested_filter_map_data, &nested_filter_map)); + auto new_rep_sz = before_rep_level_sz; + for (size_t idx = before_rep_level_sz; idx < _rep_levels.size(); idx++) { + if (nested_filter_map_data[idx - 
before_rep_level_sz]) { + _rep_levels[new_rep_sz] = _rep_levels[idx]; + _def_levels[new_rep_sz] = _def_levels[idx]; + new_rep_sz++; + } + } + _rep_levels.resize(new_rep_sz); + _def_levels.resize(new_rep_sz); + return Status::OK(); +} + +template +Status ScalarColumnReader::read_nested_levels(FilterMap& filter_map, + size_t batch_size, + size_t* read_rows, + bool* eof) { + _rep_levels.clear(); + _def_levels.clear(); + *read_rows = 0; + + std::vector nested_filter_map_data; + + while (_current_range_idx < _row_ranges.range_size()) { + size_t left_row = + std::max(_current_row_index, _row_ranges.get_range_from(_current_range_idx)); + size_t right_row = std::min(left_row + batch_size - *read_rows, + (size_t)_row_ranges.get_range_to(_current_range_idx)); + _current_row_index = left_row; + RETURN_IF_ERROR(_chunk_reader->seek_to_nested_row(left_row)); + size_t load_rows = 0; + bool cross_page = false; + size_t before_rep_level_sz = _rep_levels.size(); + RETURN_IF_ERROR(_chunk_reader->load_page_nested_rows(_rep_levels, right_row - left_row, + &load_rows, &cross_page)); + RETURN_IF_ERROR(_read_and_skip_nested_levels(filter_map, before_rep_level_sz, + _filter_map_index, nested_filter_map_data)); + _filter_map_index += load_rows; + while (cross_page) { + before_rep_level_sz = _rep_levels.size(); + RETURN_IF_ERROR(_chunk_reader->load_cross_page_nested_row(_rep_levels, &cross_page)); + RETURN_IF_ERROR(_read_and_skip_nested_levels(filter_map, before_rep_level_sz, + _filter_map_index - 1, + nested_filter_map_data)); + } + *read_rows += load_rows; + _current_row_index += load_rows; + _current_range_idx += (_current_row_index == _row_ranges.get_range_to(_current_range_idx)); + if (*read_rows == batch_size) { + break; + } + } + *eof = _current_range_idx == _row_ranges.range_size(); + return Status::OK(); +} + template Status ScalarColumnReader::read_dict_values_to_column( MutableColumnPtr& doris_column, bool* has_dict) { @@ -530,6 +2469,8 @@ Status 
ScalarColumnReader::_try_load_dict_page(bool } template +// Existing scalar read path handles page iteration, filtering, and conversion in one dispatch loop. +// NOLINTNEXTLINE(readability-function-cognitive-complexity,readability-function-size) Status ScalarColumnReader::read_column_data( ColumnPtr& doris_column, const DataTypePtr& type, const std::shared_ptr& root_node, FilterMap& filter_map, @@ -645,9 +2586,10 @@ Status ScalarColumnReader::read_column_data( } Status ArrayColumnReader::init(std::unique_ptr element_reader, - FieldSchema* field) { + FieldSchema* field, bool offset_only) { _field_schema = field; _element_reader = std::move(element_reader); + _offset_only = offset_only; return Status::OK(); } @@ -678,10 +2620,15 @@ Status ArrayColumnReader::read_column_data( ColumnPtr& element_column = assert_cast(*data_column).get_data_ptr(); const DataTypePtr& element_type = (assert_cast(remove_nullable(type).get()))->get_nested_type(); - // read nested column - RETURN_IF_ERROR(_element_reader->read_column_data(element_column, element_type, - root_node->get_element_node(), filter_map, - batch_size, read_rows, eof, is_dict_filter)); + if (_offset_only) { + // Cardinality needs collection levels and offsets, but not element payloads. 
+ RETURN_IF_ERROR( + _element_reader->read_nested_levels(filter_map, batch_size, read_rows, eof)); + } else { + RETURN_IF_ERROR(_element_reader->read_column_data( + element_column, element_type, root_node->get_element_node(), filter_map, batch_size, + read_rows, eof, is_dict_filter)); + } if (*read_rows == 0) { return Status::OK(); } @@ -690,6 +2637,11 @@ Status ArrayColumnReader::read_column_data( // fill offset and null map fill_array_offset(_field_schema, offsets_data, null_map_ptr, _element_reader->get_rep_level(), _element_reader->get_def_level()); + if (_offset_only && offsets_data.back() > element_column->size()) { + auto mutable_element_column = element_column->assume_mutable(); + mutable_element_column->insert_many_defaults(offsets_data.back() - element_column->size()); + element_column = std::move(mutable_element_column); + } DCHECK_EQ(element_column->size(), offsets_data.back()); #ifndef NDEBUG doris_column->sanity_check(); @@ -782,6 +2734,25 @@ Status StructColumnReader::init( _child_readers = std::move(child_readers); return Status::OK(); } + +Status StructColumnReader::read_nested_levels(FilterMap& filter_map, size_t batch_size, + size_t* read_rows, bool* eof) { + _read_column_names.clear(); + for (const auto& child : _field_schema->children) { + auto it = _child_readers.find(child.name); + if (it == _child_readers.end() || + dynamic_cast(it->second.get()) != nullptr) { + continue; + } + _read_column_names.emplace_back(child.name); + return it->second->read_nested_levels(filter_map, batch_size, read_rows, eof); + } + return Status::Corruption("Cannot read struct '{}' levels without a reference column", + _field_schema->name); +} + +// Existing struct reader coordinates child readers, missing columns, and selection state. 
+// NOLINTNEXTLINE(readability-function-cognitive-complexity,readability-function-size) Status StructColumnReader::read_column_data( ColumnPtr& doris_column, const DataTypePtr& type, const std::shared_ptr& root_node, FilterMap& filter_map, @@ -818,8 +2789,8 @@ Status StructColumnReader::read_column_data( for (size_t i = 0; i < doris_struct.tuple_size(); ++i) { ColumnPtr& doris_field = doris_struct.get_column_ptr(i); - auto& doris_type = doris_struct_type->get_element(i); - auto& doris_name = doris_struct_type->get_element_name(i); + const auto& doris_type = doris_struct_type->get_element(i); + const auto& doris_name = doris_struct_type->get_element_name(i); if (!root_node->children_column_exists(doris_name)) { missing_column_idxs.push_back(i); VLOG_DEBUG << "[ParquetReader] Missing column in schema: column_idx[" << i @@ -984,7 +2955,7 @@ Status StructColumnReader::read_column_data( // Fill truly missing columns (not in root_node) with null or default value for (auto idx : missing_column_idxs) { auto& doris_field = doris_struct.get_column_ptr(idx); - auto& doris_type = doris_struct_type->get_element(idx); + const auto& doris_type = doris_struct_type->get_element(idx); DCHECK(doris_type->is_nullable()); auto mutable_column = doris_field->assume_mutable(); auto* nullable_column = static_cast(mutable_column.get()); @@ -1001,6 +2972,69 @@ Status StructColumnReader::read_column_data( return Status::OK(); } +Status VariantColumnReader::init(io::FileReaderSPtr file, FieldSchema* field, + const tparquet::RowGroup& row_group, size_t max_buf_size, + std::unordered_map& col_offsets, + RuntimeState* state, bool in_collection, + const std::set& column_ids, + const std::set& filter_column_ids) { + _field_schema = field; + _column_ids = column_ids; + _variant_struct_field = std::make_unique(*field); + + DataTypePtr variant_struct_type = make_variant_struct_reader_type(*field); + _variant_struct_field->data_type = variant_struct_type; + + 
RETURN_IF_ERROR(ParquetColumnReader::create(file, _variant_struct_field.get(), row_group, + _row_ranges, _ctz, _io_ctx, _struct_reader, + max_buf_size, col_offsets, state, in_collection, + column_ids, filter_column_ids)); + _struct_reader->set_column_in_nested(); + return Status::OK(); +} + +Status VariantColumnReader::read_column_data( + ColumnPtr& doris_column, const DataTypePtr& type, + const std::shared_ptr<TableSchemaChangeHelper::Node>& root_node, FilterMap& filter_map, + size_t batch_size, size_t* read_rows, bool* eof, bool is_dict_filter, + int64_t real_column_size) { + (void)root_node; + if (remove_nullable(type)->get_primitive_type() != PrimitiveType::TYPE_VARIANT) { + return Status::Corruption( + "Wrong data type for column '{}', expected Variant type, actual type: {}.", + _field_schema->name, type->get_name()); + } + + const auto& variant_struct_type = _variant_struct_field->data_type; + ColumnPtr struct_column = make_variant_struct_read_column(*_field_schema, variant_struct_type); + const size_t old_struct_rows = struct_column->size(); + auto const_node = TableSchemaChangeHelper::ConstNode::get_instance(); + RETURN_IF_ERROR(_struct_reader->read_column_data(struct_column, variant_struct_type, const_node, + filter_map, batch_size, read_rows, eof, + is_dict_filter, real_column_size)); + + const size_t new_struct_rows = struct_column->size() - old_struct_rows; + if (new_struct_rows == 0) { + return Status::OK(); + } + + const IColumn* variant_struct_source = struct_column.get(); + const NullMap* struct_null_map = nullptr; + if (const auto* nullable_struct = check_and_get_column<ColumnNullable>(variant_struct_source)) { + struct_null_map = &nullable_struct->get_null_map_data(); + variant_struct_source = &nullable_struct->get_nested_column(); + } + const auto& variant_struct_column = assert_cast<const ColumnStruct&>(*variant_struct_source); + + RETURN_IF_ERROR(append_variant_struct_rows_to_column( + *_field_schema, variant_struct_column, struct_null_map, old_struct_rows, + new_struct_rows, _column_ids, doris_column, 
&_variant_statistics)); +#ifndef NDEBUG + doris_column->sanity_check(); +#endif + return Status::OK(); +} + template class ScalarColumnReader; template class ScalarColumnReader; template class ScalarColumnReader; diff --git a/be/src/format/parquet/vparquet_column_reader.h b/be/src/format/parquet/vparquet_column_reader.h index 9d9fd2280c88f8..f05276d4a574ba 100644 --- a/be/src/format/parquet/vparquet_column_reader.h +++ b/be/src/format/parquet/vparquet_column_reader.h @@ -18,13 +18,16 @@ #pragma once #include #include -#include -#include +#include +#include #include #include #include +#include +#include #include +#include #include #include "common/status.h" @@ -48,11 +51,35 @@ struct IOContext; } // namespace doris::io namespace doris { +class Field; struct FieldSchema; +class IColumn; +class ColumnVariant; template class ColumnStr; using ColumnString = ColumnStr; +#ifdef BE_TEST +namespace parquet_variant_reader_test { +bool can_direct_read_typed_value_for_test(const FieldSchema& typed_value_field); +bool can_use_direct_typed_only_value_for_test(const FieldSchema& variant_field, + const std::set& column_ids); +Status append_direct_typed_column_to_batch_for_test(const FieldSchema& typed_value_field, + const IColumn& typed_value_column, size_t start, + size_t rows, ColumnVariant* batch); +Status read_variant_row_for_test(const FieldSchema& variant_field, const Field& field, + bool output_nullable, Field* result, bool* sql_null); +Status read_variant_rows_for_test(const FieldSchema& variant_field, const IColumn& struct_column, + const std::set& column_ids, ColumnPtr& doris_column, + int64_t* direct_rows, int64_t* rowwise_rows); +Status variant_to_json_for_test(const FieldSchema& variant_field, const Field& field, + const std::string& inherited_metadata, std::string* json, + bool* present); +bool variant_struct_reader_type_is_nullable_for_test(const FieldSchema& variant_field); +bool variant_struct_reader_column_is_nullable_for_test(const FieldSchema& variant_field); 
+} // namespace parquet_variant_reader_test +#endif + class ParquetColumnReader { public: struct ColumnStatistics { @@ -76,7 +103,9 @@ class ParquetColumnReader { page_cache_hit_counter(0), page_cache_missing_counter(0), page_cache_compressed_hit_counter(0), - page_cache_decompressed_hit_counter(0) {} + page_cache_decompressed_hit_counter(0), + variant_direct_typed_value_read_rows(0), + variant_rowwise_read_rows(0) {} ColumnStatistics(ColumnChunkReaderStatistics& cs, int64_t null_map_time, int64_t convert_time_) @@ -99,7 +128,9 @@ class ParquetColumnReader { page_cache_hit_counter(cs.page_cache_hit_counter), page_cache_missing_counter(cs.page_cache_missing_counter), page_cache_compressed_hit_counter(cs.page_cache_compressed_hit_counter), - page_cache_decompressed_hit_counter(cs.page_cache_decompressed_hit_counter) {} + page_cache_decompressed_hit_counter(cs.page_cache_decompressed_hit_counter), + variant_direct_typed_value_read_rows(0), + variant_rowwise_read_rows(0) {} int64_t page_index_read_calls; int64_t decompress_time; @@ -121,6 +152,8 @@ class ParquetColumnReader { int64_t page_cache_missing_counter; int64_t page_cache_compressed_hit_counter; int64_t page_cache_decompressed_hit_counter; + int64_t variant_direct_typed_value_read_rows; + int64_t variant_rowwise_read_rows; void merge(ColumnStatistics& col_statistics) { page_index_read_calls += col_statistics.page_index_read_calls; @@ -146,6 +179,9 @@ class ParquetColumnReader { page_cache_compressed_hit_counter += col_statistics.page_cache_compressed_hit_counter; page_cache_decompressed_hit_counter += col_statistics.page_cache_decompressed_hit_counter; + variant_direct_typed_value_read_rows += + col_statistics.variant_direct_typed_value_read_rows; + variant_rowwise_read_rows += col_statistics.variant_rowwise_read_rows; } }; @@ -158,6 +194,10 @@ class ParquetColumnReader { FilterMap& filter_map, size_t batch_size, size_t* read_rows, bool* eof, bool is_dict_filter, int64_t real_column_size = -1) = 0; + virtual 
Status read_nested_levels(FilterMap& filter_map, size_t batch_size, size_t* read_rows, + bool* eof) { + return Status::NotSupported("read_nested_levels is not supported for parquet field"); + } virtual Status read_dict_values_to_column(MutableColumnPtr& doris_column, bool* has_dict) { return Status::NotSupported("read_dict_values_to_column is not supported"); @@ -211,11 +251,10 @@ class ScalarColumnReader : public ParquetColumnReader { ENABLE_FACTORY_CREATOR(ScalarColumnReader) public: ScalarColumnReader(const RowRanges& row_ranges, size_t total_rows, - const tparquet::ColumnChunk& chunk_meta, - const tparquet::OffsetIndex* offset_index, const cctz::time_zone* ctz, - io::IOContext* io_ctx) + tparquet::ColumnChunk chunk_meta, const tparquet::OffsetIndex* offset_index, + const cctz::time_zone* ctz, io::IOContext* io_ctx) : ParquetColumnReader(row_ranges, total_rows, ctz, io_ctx), - _chunk_meta(chunk_meta), + _chunk_meta(std::move(chunk_meta)), _offset_index(offset_index) {} ~ScalarColumnReader() override { close(); } Status init(io::FileReaderSPtr file, FieldSchema* field, size_t max_buf_size, @@ -325,6 +364,11 @@ class ScalarColumnReader : public ParquetColumnReader { Status _read_nested_column(ColumnPtr& doris_column, DataTypePtr& type, FilterMap& filter_map, size_t batch_size, size_t* read_rows, bool* eof, bool is_dict_filter); + Status _read_and_skip_nested_levels(FilterMap& filter_map, size_t before_rep_level_sz, + size_t filter_map_index, + std::vector& nested_filter_map_data); + Status read_nested_levels(FilterMap& filter_map, size_t batch_size, size_t* read_rows, + bool* eof) override; Status _try_load_dict_page(bool* loaded, bool* has_dict); }; @@ -335,11 +379,16 @@ class ArrayColumnReader : public ParquetColumnReader { io::IOContext* io_ctx) : ParquetColumnReader(row_ranges, total_rows, ctz, io_ctx) {} ~ArrayColumnReader() override { close(); } - Status init(std::unique_ptr element_reader, FieldSchema* field); + Status init(std::unique_ptr element_reader, 
FieldSchema* field, + bool offset_only); Status read_column_data(ColumnPtr& doris_column, const DataTypePtr& type, const std::shared_ptr& root_node, FilterMap& filter_map, size_t batch_size, size_t* read_rows, bool* eof, bool is_dict_filter, int64_t real_column_size = -1) override; + Status read_nested_levels(FilterMap& filter_map, size_t batch_size, size_t* read_rows, + bool* eof) override { + return _element_reader->read_nested_levels(filter_map, batch_size, read_rows, eof); + } const std::vector& get_rep_level() const override { return _element_reader->get_rep_level(); } @@ -353,6 +402,7 @@ class ArrayColumnReader : public ParquetColumnReader { private: std::unique_ptr _element_reader; + bool _offset_only = false; }; class MapColumnReader : public ParquetColumnReader { @@ -369,6 +419,10 @@ class MapColumnReader : public ParquetColumnReader { const std::shared_ptr& root_node, FilterMap& filter_map, size_t batch_size, size_t* read_rows, bool* eof, bool is_dict_filter, int64_t real_column_size = -1) override; + Status read_nested_levels(FilterMap& filter_map, size_t batch_size, size_t* read_rows, + bool* eof) override { + return _key_reader->read_nested_levels(filter_map, batch_size, read_rows, eof); + } const std::vector& get_rep_level() const override { return _key_reader->get_rep_level(); @@ -411,6 +465,8 @@ class StructColumnReader : public ParquetColumnReader { const std::shared_ptr& root_node, FilterMap& filter_map, size_t batch_size, size_t* read_rows, bool* eof, bool is_dict_filter, int64_t real_column_size = -1) override; + Status read_nested_levels(FilterMap& filter_map, size_t batch_size, size_t* read_rows, + bool* eof) override; const std::vector& get_rep_level() const override { if (!_read_column_names.empty()) { @@ -460,6 +516,49 @@ class StructColumnReader : public ParquetColumnReader { //Need to use vector instead of set,see `get_rep_level()` for the reason. 
}; +class VariantColumnReader : public ParquetColumnReader { + ENABLE_FACTORY_CREATOR(VariantColumnReader) +public: + VariantColumnReader(const RowRanges& row_ranges, size_t total_rows, const cctz::time_zone* ctz, + io::IOContext* io_ctx) + : ParquetColumnReader(row_ranges, total_rows, ctz, io_ctx) {} + ~VariantColumnReader() override { close(); } + + Status init(io::FileReaderSPtr file, FieldSchema* field, const tparquet::RowGroup& row_group, + size_t max_buf_size, std::unordered_map& col_offsets, + RuntimeState* state, bool in_collection, const std::set& column_ids, + const std::set& filter_column_ids); + Status read_column_data(ColumnPtr& doris_column, const DataTypePtr& type, + const std::shared_ptr& root_node, + FilterMap& filter_map, size_t batch_size, size_t* read_rows, bool* eof, + bool is_dict_filter, int64_t real_column_size = -1) override; + Status read_nested_levels(FilterMap& filter_map, size_t batch_size, size_t* read_rows, + bool* eof) override { + return _struct_reader->read_nested_levels(filter_map, batch_size, read_rows, eof); + } + + const std::vector& get_rep_level() const override { + return _struct_reader->get_rep_level(); + } + const std::vector& get_def_level() const override { + return _struct_reader->get_def_level(); + } + ColumnStatistics column_statistics() override { + auto statistics = _struct_reader->column_statistics(); + statistics.merge(_variant_statistics); + return statistics; + } + void close() override {} + + void reset_filter_map_index() override { _struct_reader->reset_filter_map_index(); } + +private: + std::unique_ptr _variant_struct_field; + std::unique_ptr _struct_reader; + std::set _column_ids; + ColumnStatistics _variant_statistics; +}; + // A special reader that skips actual reading but provides empty data with correct structure // This is used when a column is not needed but its structure is required (e.g., for map keys) class SkipReadingReader : public ParquetColumnReader { @@ -532,9 +631,7 @@ class SkipReadingReader 
: public ParquetColumnReader { } // Implement required pure virtual methods from base class - ColumnStatistics column_statistics() override { - return ColumnStatistics(); // Return empty statistics - } + ColumnStatistics column_statistics() override { return {}; } void close() override { // Nothing to close for skip reading diff --git a/be/src/format/parquet/vparquet_reader.cpp b/be/src/format/parquet/vparquet_reader.cpp index a2f2356085b171..9e1b7e700fab01 100644 --- a/be/src/format/parquet/vparquet_reader.cpp +++ b/be/src/format/parquet/vparquet_reader.cpp @@ -24,6 +24,7 @@ #include #include +#include #include #include "common/config.h" @@ -46,12 +47,14 @@ #include "format/column_type_convert.h" #include "format/parquet/parquet_block_split_bloom_filter.h" #include "format/parquet/parquet_common.h" +#include "format/parquet/parquet_nested_column_utils.h" #include "format/parquet/parquet_predicate.h" #include "format/parquet/parquet_thrift_util.h" #include "format/parquet/schema_desc.h" #include "format/parquet/vparquet_file_metadata.h" #include "format/parquet/vparquet_group_reader.h" #include "format/parquet/vparquet_page_index.h" +#include "format/table/nested_column_access_helper.h" #include "information_schema/schema_scanner.h" #include "io/file_factory.h" #include "io/fs/buffered_reader.h" @@ -194,6 +197,7 @@ void ParquetReader::set_file_reader(io::FileReaderSPtr file_reader) { } #endif +// NOLINTNEXTLINE(readability-function-size): existing Parquet counter initialization stays grouped. 
void ParquetReader::_init_profile() { if (_profile != nullptr) { static const char* parquet_profile = "ParquetReader"; @@ -287,6 +291,10 @@ void ParquetReader::_init_profile() { ADD_CHILD_TIMER_WITH_LEVEL(_profile, "ConvertTime", parquet_profile, 1); _parquet_profile.bloom_filter_read_time = ADD_CHILD_TIMER_WITH_LEVEL(_profile, "BloomFilterReadTime", parquet_profile, 1); + _parquet_profile.variant_direct_typed_value_read_rows = ADD_CHILD_COUNTER_WITH_LEVEL( + _profile, "VariantDirectTypedValueReadRows", TUnit::UNIT, parquet_profile, 1); + _parquet_profile.variant_rowwise_read_rows = ADD_CHILD_COUNTER_WITH_LEVEL( + _profile, "VariantRowWiseReadRows", TUnit::UNIT, parquet_profile, 1); } } @@ -372,10 +380,21 @@ Status ParquetReader::_open_file() { Status ParquetReader::get_file_metadata_schema(const FieldDescriptor** ptr) { RETURN_IF_ERROR(_open_file()); DCHECK(_file_metadata != nullptr); - *ptr = &_file_metadata->schema(); + *ptr = &parquet_file_schema(); return Status::OK(); } +const FieldDescriptor& ParquetReader::parquet_file_schema() const { + if (_file_schema_with_ids.has_value()) { + return *_file_schema_with_ids; + } + return _file_metadata->schema(); +} + +void ParquetReader::prepare_parquet_file_schema_with_ids(const FieldDescriptor* field_desc) { + _file_schema_with_ids = field_desc->copy_with_assigned_ids(); +} + void ParquetReader::_init_system_properties() { if (_scan_range.__isset.file_type) { // for compatibility @@ -430,13 +449,115 @@ Status ParquetReader::on_before_init_reader(ReaderInitContext* ctx) { if (ctx->tuple_descriptor != nullptr) { const FieldDescriptor* field_desc = nullptr; RETURN_IF_ERROR(get_file_metadata_schema(&field_desc)); + prepare_parquet_file_schema_with_ids(field_desc); + field_desc = &parquet_file_schema(); RETURN_IF_ERROR(TableSchemaChangeHelper::BuildTableInfoUtil::by_parquet_name( ctx->tuple_descriptor, *field_desc, ctx->table_info_node)); + auto column_id_result = _create_column_ids_by_name(field_desc, 
ctx->tuple_descriptor); + ctx->column_ids = std::move(column_id_result.column_ids); + ctx->filter_column_ids = std::move(column_id_result.filter_column_ids); } return Status::OK(); } +ColumnIdResult ParquetReader::_create_column_ids_by_name(const FieldDescriptor* field_desc, + const TupleDescriptor* tuple_descriptor) { + FieldDescriptor field_desc_with_ids = field_desc->copy_with_assigned_ids(); + field_desc = &field_desc_with_ids; + + std::unordered_map table_col_name_to_field_schema_map; + for (int i = 0; i < field_desc->size(); ++i) { + const auto* field_schema = field_desc->get_column(i); + if (!field_schema) { + continue; + } + table_col_name_to_field_schema_map[field_schema->lower_case_name] = field_schema; + } + + std::set column_ids; + std::set filter_column_ids; + + auto process_access_paths = [](const FieldSchema* parquet_field, + const std::vector& access_paths, + std::set& out_ids) { + process_nested_access_paths( + parquet_field, access_paths, out_ids, + [](const FieldSchema* field) { return field->get_column_id(); }, + [](const FieldSchema* field) { return field->get_max_column_id(); }, + ParquetNestedColumnUtils::extract_nested_column_ids_by_name); + }; + + for (const auto* slot : tuple_descriptor->slots()) { + auto it = table_col_name_to_field_schema_map.find(slot->col_name_lower_case()); + if (it == table_col_name_to_field_schema_map.end()) { + continue; + } + const auto* field_schema = it->second; + + if ((slot->col_type() != TYPE_STRUCT && slot->col_type() != TYPE_ARRAY && + slot->col_type() != TYPE_MAP && slot->col_type() != TYPE_VARIANT)) { + column_ids.insert(field_schema->column_id); + if (slot->is_predicate()) { + filter_column_ids.insert(field_schema->column_id); + } + continue; + } + + process_access_paths(field_schema, slot->all_access_paths(), column_ids); + if (!slot->predicate_access_paths().empty()) { + process_access_paths(field_schema, slot->predicate_access_paths(), filter_column_ids); + } + } + + return {std::move(column_ids), 
std::move(filter_column_ids)}; +} + +std::string ParquetReader::_selected_leaf_column_paths() const { + if (_file_metadata == nullptr) { + return ""; + } + + std::vector<std::string> leaf_paths; + const auto& schema_desc = parquet_file_schema(); + std::function<void(const FieldSchema*, const std::string&)> collect = + [&](const FieldSchema* field, const std::string& path) { + if (!_column_ids.empty() && !_column_ids.contains(field->get_column_id())) { + return; + } + + if (field->children.empty()) { + if (field->physical_column_index >= 0) { + leaf_paths.push_back(path); + } + return; + } + + for (const auto& child : field->children) { + collect(&child, path + "." + child.name); + } + }; + + for (const auto& read_col : _read_file_columns) { + const FieldSchema* field = schema_desc.get_column(read_col); + if (field != nullptr) { + collect(field, field->name); + } + } + + std::sort(leaf_paths.begin(), leaf_paths.end()); + leaf_paths.erase(std::unique(leaf_paths.begin(), leaf_paths.end()), leaf_paths.end()); + + std::stringstream result; + for (size_t i = 0; i < leaf_paths.size(); ++i) { + if (i != 0) { + result << ", "; + } + result << leaf_paths[i]; + } + return result.str(); +} + Status ParquetReader::_open_file_reader(ReaderInitContext* /*ctx*/) { return _open_file(); } @@ -490,6 +611,9 @@ Status ParquetReader::_do_init_reader(ReaderInitContext* base_ctx) { // _init_read_columns handles both normal path (missing cols populated above) // and standalone path (_fill_missing_cols empty, _table_info_node_ptr may be null). _init_read_columns(base_ctx->column_names); + if (_profile != nullptr) { + _profile->add_info_string("ParquetReadColumnPaths", _selected_leaf_column_paths()); + } // build column predicates for column lazy read if (ctx->conjuncts != nullptr) { @@ -534,7 +658,7 @@ void ParquetReader::_init_read_columns(const std::vector<std::string>& column_na // Build file_col_name → table_col_name map, skipping missing columns. 
// Must iterate file schema in physical order so that _generate_random_access_ranges // sees monotonically increasing chunk offsets. - auto schema_desc = _file_metadata->schema(); + const auto& schema_desc = parquet_file_schema(); std::map required_file_columns; for (const auto& col_name : column_names) { if (_fill_missing_cols.contains(col_name)) { @@ -572,7 +696,7 @@ bool ParquetReader::_type_matches(const int cid) const { const auto& file_col_name = _table_info_node_ptr->children_file_column_name(slot->col_name()); const auto& file_col_type = - remove_nullable(_file_metadata->schema().get_column(file_col_name)->data_type); + remove_nullable(parquet_file_schema().get_column(file_col_name)->data_type); return (table_col_type->get_primitive_type() == file_col_type->get_primitive_type()) && !is_complex_type(table_col_type->get_primitive_type()); @@ -635,7 +759,7 @@ void ParquetReader::_classify_columns_for_lazy_read( const std::unordered_map>& partition_columns, const std::unordered_map& missing_columns) { - const FieldDescriptor& schema = _file_metadata->schema(); + const FieldDescriptor& schema = parquet_file_schema(); auto predicate_columns = predicate_conjuncts_columns; #ifndef BE_TEST for (const auto& [col_name, _] : _generated_col_handlers) { @@ -745,7 +869,7 @@ Status ParquetReader::init_schema_reader() { Status ParquetReader::get_parsed_schema(std::vector* col_names, std::vector* col_types) { _total_groups = _t_metadata->row_groups.size(); - auto schema_desc = _file_metadata->schema(); + const auto& schema_desc = parquet_file_schema(); for (int i = 0; i < schema_desc.size(); ++i) { // Get the Column Reader for the boolean column col_names->emplace_back(schema_desc.get_column(i)->name); @@ -756,7 +880,7 @@ Status ParquetReader::get_parsed_schema(std::vector* col_names, Status ParquetReader::_get_columns_impl( std::unordered_map* name_to_type) { - const auto& schema_desc = _file_metadata->schema(); + const auto& schema_desc = parquet_file_schema(); 
std::unordered_set column_names; schema_desc.get_column_names(&column_names); for (auto& name : column_names) { @@ -839,7 +963,7 @@ Status ParquetReader::_do_get_next_block(Block* block, size_t* read_rows, bool* RowGroupReader::PositionDeleteContext ParquetReader::_get_position_delete_ctx( const tparquet::RowGroup& row_group, const RowGroupReader::RowGroupIndex& row_group_index) { if (_delete_rows == nullptr) { - return RowGroupReader::PositionDeleteContext(row_group.num_rows, row_group_index.first_row); + return {row_group.num_rows, row_group_index.first_row}; } const int64_t* delete_rows = &(*_delete_rows)[0]; const int64_t* delete_rows_end = delete_rows + _delete_rows->size(); @@ -890,7 +1014,7 @@ Status ParquetReader::_next_row_group_reader() { }; int64_t group_size = 0; // only calculate the needed columns for (auto& read_col : _read_file_columns) { - const FieldSchema* field = _file_metadata->schema().get_column(read_col); + const FieldSchema* field = parquet_file_schema().get_column(read_col); group_size += column_compressed_size(field); } @@ -960,7 +1084,7 @@ Status ParquetReader::_next_row_group_reader() { _current_group_reader->set_table_format_reader(this); _current_group_reader->_table_info_node_ptr = _table_info_node_ptr; - return _current_group_reader->init(_file_metadata->schema(), candidate_row_ranges, _col_offsets, + return _current_group_reader->init(parquet_file_schema(), candidate_row_ranges, _col_offsets, _tuple_descriptor, _row_descriptor, _colname_to_slot_id, _not_single_slot_filter_conjuncts, _slot_id_to_filter_conjuncts); @@ -975,14 +1099,15 @@ std::vector ParquetReader::_generate_random_access_ranges( [&](const FieldSchema* field, const tparquet::RowGroup& row_group) { if (_column_ids.empty() || _column_ids.find(field->get_column_id()) != _column_ids.end()) { - if (field->data_type->get_primitive_type() == TYPE_ARRAY) { - scalar_range(&field->children[0], row_group); - } else if (field->data_type->get_primitive_type() == TYPE_MAP) { - 
scalar_range(&field->children[0], row_group); - scalar_range(&field->children[1], row_group); - } else if (field->data_type->get_primitive_type() == TYPE_STRUCT) { - for (int i = 0; i < field->children.size(); ++i) { - scalar_range(&field->children[i], row_group); + const auto field_type = remove_nullable(field->data_type)->get_primitive_type(); + if (field_type == TYPE_ARRAY) { + scalar_range(field->children.data(), row_group); + } else if (field_type == TYPE_MAP) { + scalar_range(field->children.data(), row_group); + scalar_range(field->children.data() + 1, row_group); + } else if (field_type == TYPE_STRUCT || field_type == TYPE_VARIANT) { + for (const auto& child : field->children) { + scalar_range(&child, row_group); } } else { const tparquet::ColumnChunk& chunk = @@ -1001,7 +1126,7 @@ std::vector ParquetReader::_generate_random_access_ranges( }; const tparquet::RowGroup& row_group = _t_metadata->row_groups[group.row_group_id]; for (const auto& read_col : _read_file_columns) { - const FieldSchema* field = _file_metadata->schema().get_column(read_col); + const FieldSchema* field = parquet_file_schema().get_column(read_col); scalar_range(field, row_group); } if (!result.empty()) { @@ -1025,8 +1150,12 @@ bool ParquetReader::_is_misaligned_range_group(const tparquet::RowGroup& row_gro } int64_t ParquetReader::get_total_rows() const { - if (!_t_metadata) return 0; - if (!_filter_groups) return _t_metadata->num_rows; + if (!_t_metadata) { + return 0; + } + if (!_filter_groups) { + return _t_metadata->num_rows; + } int64_t total = 0; for (const auto& rg : _t_metadata->row_groups) { if (!_is_misaligned_range_group(rg)) { @@ -1079,22 +1208,23 @@ Status ParquetReader::_process_page_index_filter( if (!_colname_to_slot_id->contains(read_table_col)) { continue; } - auto* field = _file_metadata->schema().get_column(read_file_col); + const auto* field = parquet_file_schema().get_column(read_file_col); - std::function f = [&](FieldSchema* field) { + std::function f = [&](const 
FieldSchema* field) { if (!_column_ids.empty() && _column_ids.find(field->get_column_id()) == _column_ids.end()) { return; } - if (field->data_type->get_primitive_type() == TYPE_ARRAY) { - f(&field->children[0]); - } else if (field->data_type->get_primitive_type() == TYPE_MAP) { - f(&field->children[0]); - f(&field->children[1]); - } else if (field->data_type->get_primitive_type() == TYPE_STRUCT) { - for (int i = 0; i < field->children.size(); ++i) { - f(&field->children[i]); + const auto field_type = remove_nullable(field->data_type)->get_primitive_type(); + if (field_type == TYPE_ARRAY) { + f(field->children.data()); + } else if (field_type == TYPE_MAP) { + f(field->children.data()); + f(field->children.data() + 1); + } else if (field_type == TYPE_STRUCT || field_type == TYPE_VARIANT) { + for (const auto& child : field->children) { + f(&child); } } else { int parquet_col_id = field->physical_column_index; @@ -1175,7 +1305,7 @@ Status ParquetReader::_process_page_index_filter( const auto& file_col_name = _table_info_node_ptr->children_file_column_name(slot->col_name()); - const FieldSchema* col_schema = _file_metadata->schema().get_column(file_col_name); + const FieldSchema* col_schema = parquet_file_schema().get_column(file_col_name); int parquet_col_id = col_schema->physical_column_index; if (parquet_col_id < 0) { @@ -1322,6 +1452,18 @@ Status ParquetReader::_process_column_stat_filter( // when there are multiple predicates on the same column std::unordered_map> bloom_filter_cache; + auto find_physical_column = [&](const SlotDescriptor* slot, const FieldSchema** col_schema, + int* parquet_col_id) -> bool { + if (!_table_info_node_ptr->children_column_exists(slot->col_name())) { + return false; + } + const auto& file_col_name = + _table_info_node_ptr->children_file_column_name(slot->col_name()); + *col_schema = parquet_file_schema().get_column(file_col_name); + *parquet_col_id = (*col_schema)->physical_column_index; + return *parquet_col_id >= 0; + }; + // 
        // Initialize output parameters
        *filtered_by_min_max = false;
        *filtered_by_bloom_filter = false;
@@ -1333,15 +1475,12 @@ Status ParquetReader::_process_column_stat_filter(
         if (!_enable_filter_by_min_max) {
             return false;
         }
+        const FieldSchema* col_schema = nullptr;
+        int parquet_col_id = -1;
         auto* slot = _tuple_descriptor->slots()[cid];
-        if (!_table_info_node_ptr->children_column_exists(slot->col_name())) {
+        if (!find_physical_column(slot, &col_schema, &parquet_col_id)) {
             return false;
         }
-        const auto& file_col_name =
-                _table_info_node_ptr->children_file_column_name(slot->col_name());
-        const FieldSchema* col_schema =
-                _file_metadata->schema().get_column(file_col_name);
-        int parquet_col_id = col_schema->physical_column_index;
         auto meta_data = row_group.columns[parquet_col_id].meta_data;
         stat->col_schema = col_schema;
         return ParquetPredicate::read_column_stats(col_schema, meta_data,
@@ -1351,15 +1490,12 @@ Status ParquetReader::_process_column_stat_filter(
     };
     std::function get_bloom_filter_func = [&](ParquetPredicate::ColumnStat* stat,
                                               const int cid) {
+        const FieldSchema* col_schema = nullptr;
+        int parquet_col_id = -1;
         auto* slot = _tuple_descriptor->slots()[cid];
-        if (!_table_info_node_ptr->children_column_exists(slot->col_name())) {
+        if (!find_physical_column(slot, &col_schema, &parquet_col_id)) {
             return false;
         }
-        const auto& file_col_name =
-                _table_info_node_ptr->children_file_column_name(slot->col_name());
-        const FieldSchema* col_schema =
-                _file_metadata->schema().get_column(file_col_name);
-        int parquet_col_id = col_schema->physical_column_index;
         auto meta_data = row_group.columns[parquet_col_id].meta_data;
         if (!meta_data.__isset.bloom_filter_offset) {
             return false;
@@ -1423,16 +1559,14 @@ Status ParquetReader::_process_column_stat_filter(
         if (stat.bloom_filter) {
             // Find the column id for caching
             for (auto* slot : _tuple_descriptor->slots()) {
-                if (_table_info_node_ptr->children_column_exists(slot->col_name())) {
-                    const auto& file_col_name =
-                            _table_info_node_ptr->children_file_column_name(slot->col_name());
-                    const FieldSchema* col_schema =
-                            _file_metadata->schema().get_column(file_col_name);
-                    int parquet_col_id = col_schema->physical_column_index;
-                    if (stat.col_schema == col_schema) {
-                        bloom_filter_cache[parquet_col_id] = std::move(stat.bloom_filter);
-                        break;
-                    }
+                const FieldSchema* col_schema = nullptr;
+                int parquet_col_id = -1;
+                if (!find_physical_column(slot, &col_schema, &parquet_col_id)) {
+                    continue;
+                }
+                if (stat.col_schema == col_schema) {
+                    bloom_filter_cache[parquet_col_id] = std::move(stat.bloom_filter);
+                    break;
+                }
             }
         }
@@ -1522,6 +1656,10 @@ void ParquetReader::_collect_profile() {
     COUNTER_UPDATE(_parquet_profile.decode_dict_time, _column_statistics.decode_dict_time);
     COUNTER_UPDATE(_parquet_profile.decode_level_time, _column_statistics.decode_level_time);
    COUNTER_UPDATE(_parquet_profile.decode_null_map_time, _column_statistics.decode_null_map_time);
+    COUNTER_UPDATE(_parquet_profile.variant_direct_typed_value_read_rows,
+                   _column_statistics.variant_direct_typed_value_read_rows);
+    COUNTER_UPDATE(_parquet_profile.variant_rowwise_read_rows,
+                   _column_statistics.variant_rowwise_read_rows);
 }
 
 void ParquetReader::_collect_profile_before_close() {
diff --git a/be/src/format/parquet/vparquet_reader.h b/be/src/format/parquet/vparquet_reader.h
index 68979bf9e4f027..e40714ffe84c6d 100644
--- a/be/src/format/parquet/vparquet_reader.h
+++ b/be/src/format/parquet/vparquet_reader.h
@@ -18,11 +18,13 @@
 #pragma once
 
 #include
-#include
-#include
+#include
+#include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -239,8 +241,14 @@ class ParquetReader : public TableFormatReader {
     const TupleDescriptor* get_tuple_descriptor() const { return _tuple_descriptor; }
     const RowDescriptor* get_row_descriptor() const { return _row_descriptor; }
     const FileMetaData* get_file_metadata() const { return _file_metadata; }
+    const FieldDescriptor& parquet_file_schema() const;
+    void prepare_parquet_file_schema_with_ids(const FieldDescriptor* field_desc);
 
 private:
+    static ColumnIdResult _create_column_ids_by_name(const FieldDescriptor* field_desc,
+                                                     const TupleDescriptor* tuple_descriptor);
+    std::string _selected_leaf_column_paths() const;
+
     struct ParquetProfile {
         RuntimeProfile::Counter* filtered_row_groups = nullptr;
         RuntimeProfile::Counter* filtered_row_groups_by_min_max = nullptr;
@@ -286,6 +294,8 @@ class ParquetReader : public TableFormatReader {
         RuntimeProfile::Counter* dict_filter_rewrite_time = nullptr;
         RuntimeProfile::Counter* convert_time = nullptr;
         RuntimeProfile::Counter* bloom_filter_read_time = nullptr;
+        RuntimeProfile::Counter* variant_direct_typed_value_read_rows = nullptr;
+        RuntimeProfile::Counter* variant_rowwise_read_rows = nullptr;
     };
 
     // ---- set_fill_columns sub-functions ----
@@ -361,6 +371,7 @@ class ParquetReader : public TableFormatReader {
     // after _file_reader. Otherwise, there may be heap-use-after-free bug.
     ObjLRUCache::CacheHandle _meta_cache_handle;
     std::unique_ptr _file_metadata_ptr;
+    std::optional _file_schema_with_ids;
     const tparquet::FileMetaData* _t_metadata = nullptr;
 
     // _tracing_file_reader wraps _file_reader.
diff --git a/be/src/format/table/hive/hive_parquet_nested_column_utils.cpp b/be/src/format/table/hive/hive_parquet_nested_column_utils.cpp
index d9d7642afeb888..b0d222f9d36797 100644
--- a/be/src/format/table/hive/hive_parquet_nested_column_utils.cpp
+++ b/be/src/format/table/hive/hive_parquet_nested_column_utils.cpp
@@ -17,154 +17,14 @@
 
 #include "format/table/hive/hive_parquet_nested_column_utils.h"
 
-#include
-#include
-#include
-#include
-#include
-#include
-
-#include "format/parquet/schema_desc.h"
-#include "format/table/table_schema_change_helper.h"
+#include "format/parquet/parquet_nested_column_utils.h"
 
 namespace doris {
 
 void HiveParquetNestedColumnUtils::extract_nested_column_ids(
         const FieldSchema& field_schema, const std::vector>& paths,
         std::set& column_ids) {
-    // Group paths by first field_id
-    std::unordered_map>>
-            child_paths_by_table_col_name;
-
-    for (const auto& path : paths) {
-        if (!path.empty()) {
-            std::string first_table_col_name = path[0];
-            std::vector remaining;
-            if (path.size() > 1) {
-                remaining.assign(path.begin() + 1, path.end());
-            }
-            child_paths_by_table_col_name[first_table_col_name].push_back(std::move(remaining));
-        }
-    }
-
-    // Track whether any child column was added to determine if parent should be included
-    bool has_child_columns = false;
-
-    // For MAP type, normalize wildcard "*" to explicit KEYS/VALUES access
-    // Wildcard in MAP context means accessing both map keys and values
-    // Normalization logic:
-    //   path: ["map_col", "*"] → ["map_col", "VALUES"] + ["map_col", "KEYS"]
-    //   path: ["map_col", "*", "field"] → ["map_col", "VALUES", "field"] + ["map_col", "KEYS"]
-    if (field_schema.data_type->get_primitive_type() == PrimitiveType::TYPE_MAP) {
-        auto wildcard_it = child_paths_by_table_col_name.find("*");
-        if (wildcard_it != child_paths_by_table_col_name.end()) {
-            auto& wildcard_paths = wildcard_it->second;
-
-            // All wildcard paths go to VALUES
-            auto& values_paths = child_paths_by_table_col_name["VALUES"];
-            values_paths.insert(values_paths.end(), wildcard_paths.begin(), wildcard_paths.end());
-
-            // Always add KEYS for wildcard access
-            auto& keys_paths = child_paths_by_table_col_name["KEYS"];
-            // Add an empty path to request full KEYS
-            std::vector empty_path;
-            keys_paths.push_back(empty_path);
-
-            // Remove wildcard entry as it's been expanded
-            child_paths_by_table_col_name.erase(wildcard_it);
-        }
-    }
-
-    // Efficiently traverse children
-    for (uint64_t i = 0; i < field_schema.children.size(); ++i) {
-        const auto& child = field_schema.children[i];
-        std::string child_field_name;
-
-        bool is_list = field_schema.data_type->get_primitive_type() == PrimitiveType::TYPE_ARRAY;
-        bool is_map = field_schema.data_type->get_primitive_type() == PrimitiveType::TYPE_MAP;
-
-        if (is_list) {
-            child_field_name = "*";
-        } else if (is_map) {
-            // After wildcard normalization above, all MAP accesses are explicit KEYS/VALUES
-            // Simply assign the appropriate field name based on which child we're processing
-            if (i == 0) {
-                child_field_name = "KEYS";
-            } else if (i == 1) {
-                child_field_name = "VALUES";
-            }
-
-            // Special handling for Parquet MAP structure:
-            // When accessing only VALUES, we still need KEY structure for levels
-            // Check if we're at key child (i==0) and only VALUES is requested (no KEYS)
-            if (i == 0) {
-                bool has_keys_access = child_paths_by_table_col_name.find("KEYS") !=
-                                       child_paths_by_table_col_name.end();
-                bool has_values_access = child_paths_by_table_col_name.find("VALUES") !=
-                                         child_paths_by_table_col_name.end();
-
-                // If only VALUES is accessed (not KEYS), still include key structure for RL/DL
-                if (!has_keys_access && has_values_access) {
-                    // For map_values() queries, we need key's structure for correct RL/DL parsing.
-                    // If key is a nested type (e.g., STRUCT), RL/DL info is stored at leaf columns.
-                    // Add all column IDs from key's start to max (all leaves + intermediate nodes).
-                    uint64_t key_start_id = child.get_column_id();
-                    uint64_t key_max_id = child.get_max_column_id();
-                    for (uint64_t id = key_start_id; id <= key_max_id; ++id) {
-                        column_ids.insert(id);
-                    }
-                    has_child_columns = true;
-                    continue; // Skip further processing of key child
-                }
-            }
-
-        } else {
-            child_field_name = child.lower_case_name;
-        }
-
-        if (child_field_name.empty()) {
-            continue;
-        }
-
-        auto child_paths_it = child_paths_by_table_col_name.find(child_field_name);
-        if (child_paths_it != child_paths_by_table_col_name.end()) {
-            const auto& child_paths = child_paths_it->second;
-
-            // Check if any child path is empty (meaning full child needed)
-            bool needs_full_child =
-                    std::any_of(child_paths.begin(), child_paths.end(),
-                                [](const std::vector& path) { return path.empty(); });
-
-            if (needs_full_child) {
-                // Add all column IDs from current child node to max_column_id
-                // This efficiently handles all nested/complex cases in one loop
-                uint64_t start_id = child.get_column_id();
-                uint64_t max_column_id = child.get_max_column_id();
-                for (uint64_t id = start_id; id <= max_column_id; ++id) {
-                    column_ids.insert(id);
-                }
-                has_child_columns = true;
-            } else {
-                // Store current size to check if recursive call added any columns
-                size_t before_size = column_ids.size();
-
-                // Recursively extract from child
-                extract_nested_column_ids(child, child_paths, column_ids);
-
-                // Check if recursive call added any columns
-                if (column_ids.size() > before_size) {
-                    has_child_columns = true;
-                }
-            }
-        }
-    }
-
-    // If any child columns were added, also add the parent column ID
-    // This ensures parent struct/container nodes are included when their children are needed
-    if (has_child_columns) {
-        // Set automatically handles deduplication, so no need to check if it already exists
-        column_ids.insert(field_schema.get_column_id());
-    }
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field_schema, paths, column_ids);
 }
 
 } // namespace doris
diff --git a/be/src/format/table/hive/hive_parquet_nested_column_utils.h b/be/src/format/table/hive/hive_parquet_nested_column_utils.h
index be960c9da8fcd1..ddd237877859af 100644
--- a/be/src/format/table/hive/hive_parquet_nested_column_utils.h
+++ b/be/src/format/table/hive/hive_parquet_nested_column_utils.h
@@ -17,14 +17,11 @@
 
 #pragma once
 
-#include
+#include
 #include
 #include
-#include
 #include
 
-#include "format/table/table_schema_change_helper.h"
-
 namespace doris {
 
 struct FieldSchema;
diff --git a/be/src/format/table/hive_reader.cpp b/be/src/format/table/hive_reader.cpp
index 1a8d8f79bd9774..4a917398932a8f 100644
--- a/be/src/format/table/hive_reader.cpp
+++ b/be/src/format/table/hive_reader.cpp
@@ -137,7 +137,7 @@ ColumnIdResult HiveOrcReader::_create_column_ids(const orc::Type* orc_type,
         // primitive (non-nested) types
         if ((slot->col_type() != TYPE_STRUCT && slot->col_type() != TYPE_ARRAY &&
-             slot->col_type() != TYPE_MAP)) {
+             slot->col_type() != TYPE_MAP && slot->col_type() != TYPE_VARIANT)) {
             column_ids.insert(orc_field->getColumnId());
             if (slot->is_predicate()) {
                 filter_column_ids.insert(orc_field->getColumnId());
@@ -193,7 +193,7 @@ ColumnIdResult HiveOrcReader::_create_column_ids_by_top_level_col_index(
         // primitive (non-nested) types
         if ((slot->col_type() != TYPE_STRUCT && slot->col_type() != TYPE_ARRAY &&
-             slot->col_type() != TYPE_MAP)) {
+             slot->col_type() != TYPE_MAP && slot->col_type() != TYPE_VARIANT)) {
             column_ids.insert(orc_field->getColumnId());
             if (slot->is_predicate()) {
                 filter_column_ids.insert(orc_field->getColumnId());
@@ -240,6 +240,8 @@ Status HiveParquetReader::on_before_init_reader(ReaderInitContext* ctx) {
     const FieldDescriptor* field_desc = nullptr;
     RETURN_IF_ERROR(get_file_metadata_schema(&field_desc));
     DCHECK(field_desc != nullptr);
+    prepare_parquet_file_schema_with_ids(field_desc);
+    field_desc = &parquet_file_schema();
 
     // Build table_info_node based on config
     if (get_state()->query_options().hive_parquet_use_column_names) {
@@ -279,8 +281,9 @@ Status HiveParquetReader::on_before_init_reader(ReaderInitContext* ctx) {
     if (get_state()->query_options().hive_parquet_use_column_names) {
         column_id_result = _create_column_ids(field_desc, ctx->tuple_descriptor);
     } else {
-        column_id_result =
-                _create_column_ids_by_top_level_col_index(field_desc, ctx->tuple_descriptor);
+        column_id_result = _create_column_ids_by_top_level_col_index(
+                field_desc, ctx->tuple_descriptor, ctx->column_names,
+                get_scan_params().column_idxs);
     }
     ctx->column_ids = std::move(column_id_result.column_ids);
     ctx->filter_column_ids = std::move(column_id_result.filter_column_ids);
@@ -291,9 +294,8 @@ Status HiveParquetReader::on_before_init_reader(ReaderInitContext* ctx) {
 
 ColumnIdResult HiveParquetReader::_create_column_ids(const FieldDescriptor* field_desc,
                                                      const TupleDescriptor* tuple_descriptor) {
-    // First, assign column IDs to the field descriptor
-    auto* mutable_field_desc = const_cast(field_desc);
-    mutable_field_desc->assign_ids();
+    FieldDescriptor field_desc_with_ids = field_desc->copy_with_assigned_ids();
+    field_desc = &field_desc_with_ids;
 
     // map top-level table column name (lower-cased) -> FieldSchema*
     std::unordered_map table_col_name_to_field_schema_map;
@@ -328,7 +330,7 @@ ColumnIdResult HiveParquetReader::_create_column_ids(const FieldDescriptor* fiel
         // primitive (non-nested) types
         if ((slot->col_type() != TYPE_STRUCT && slot->col_type() != TYPE_ARRAY &&
-             slot->col_type() != TYPE_MAP)) {
+             slot->col_type() != TYPE_MAP && slot->col_type() != TYPE_VARIANT)) {
             column_ids.insert(field_schema->column_id);
 
             if (slot->is_predicate()) {
@@ -351,18 +353,24 @@ ColumnIdResult HiveParquetReader::_create_column_ids(const FieldDescriptor* fiel
 }
 
 ColumnIdResult HiveParquetReader::_create_column_ids_by_top_level_col_index(
-        const FieldDescriptor* field_desc, const TupleDescriptor* tuple_descriptor) {
-    // First, assign column IDs to the field descriptor
-    auto* mutable_field_desc = const_cast(field_desc);
-    mutable_field_desc->assign_ids();
-
-    // map top-level table column position -> FieldSchema*
-    std::unordered_map table_col_pos_to_field_schema_map;
-    for (int i = 0; i < field_desc->size(); ++i) {
-        auto field_schema = field_desc->get_column(i);
-        if (!field_schema) continue;
-
-        table_col_pos_to_field_schema_map[i] = field_schema;
+        const FieldDescriptor* field_desc, const TupleDescriptor* tuple_descriptor,
+        const std::vector& table_column_names,
+        const std::vector& file_column_idxs) {
+    FieldDescriptor field_desc_with_ids = field_desc->copy_with_assigned_ids();
+    field_desc = &field_desc_with_ids;
+
+    // map top-level table column name -> file FieldSchema* using the same by-position mapping
+    // that builds table_info_node.
+    DORIS_CHECK(table_column_names.size() == file_column_idxs.size());
+    std::unordered_map table_col_name_to_field_schema_map;
+    const auto& parquet_fields_schema = field_desc->get_fields_schema();
+    for (size_t idx = 0; idx < file_column_idxs.size(); ++idx) {
+        const int32_t file_index = file_column_idxs[idx];
+        if (file_index >= parquet_fields_schema.size()) {
+            continue;
+        }
+        table_col_name_to_field_schema_map[to_lower(table_column_names[idx])] =
+                &parquet_fields_schema[file_index];
     }
 
     std::set column_ids;
@@ -380,8 +388,8 @@ ColumnIdResult HiveParquetReader::_create_column_ids_by_top_level_col_index(
     };
 
     for (const auto* slot : tuple_descriptor->slots()) {
-        auto it = table_col_pos_to_field_schema_map.find(slot->col_pos());
-        if (it == table_col_pos_to_field_schema_map.end()) {
+        auto it = table_col_name_to_field_schema_map.find(slot->col_name_lower_case());
+        if (it == table_col_name_to_field_schema_map.end()) {
             // Column not found in file
             continue;
         }
@@ -389,7 +397,7 @@ ColumnIdResult HiveParquetReader::_create_column_ids_by_top_level_col_index(
         // primitive (non-nested) types
         if ((slot->col_type() != TYPE_STRUCT && slot->col_type() != TYPE_ARRAY &&
-             slot->col_type() != TYPE_MAP)) {
+             slot->col_type() != TYPE_MAP && slot->col_type() != TYPE_VARIANT)) {
             column_ids.insert(field_schema->column_id);
 
             if (slot->is_predicate()) {
diff --git a/be/src/format/table/hive_reader.h b/be/src/format/table/hive_reader.h
index 9bcaa0536e7374..57ea175264219d 100644
--- a/be/src/format/table/hive_reader.h
+++ b/be/src/format/table/hive_reader.h
@@ -72,7 +72,9 @@ class HiveParquetReader final : public ParquetReader, public TableSchemaChangeHe
                                              const TupleDescriptor* tuple_descriptor);
 
     static ColumnIdResult _create_column_ids_by_top_level_col_index(
-            const FieldDescriptor* field_desc, const TupleDescriptor* tuple_descriptor);
+            const FieldDescriptor* field_desc, const TupleDescriptor* tuple_descriptor,
+            const std::vector& table_column_names,
+            const std::vector& file_column_idxs);
 
     const std::set* _is_file_slot = nullptr;
 };
diff --git a/be/src/format/table/iceberg/arrow_schema_util.cpp b/be/src/format/table/iceberg/arrow_schema_util.cpp
index e0bf830dfc8168..aa6e6a7ad60e5f 100644
--- a/be/src/format/table/iceberg/arrow_schema_util.cpp
+++ b/be/src/format/table/iceberg/arrow_schema_util.cpp
@@ -119,6 +119,9 @@ Status ArrowSchemaUtil::convert_to(const iceberg::NestedField& field,
         break;
     }
 
+    case iceberg::TypeID::VARIANT:
+        return Status::NotSupported("Iceberg VARIANT write is not supported");
+
     case iceberg::TypeID::TIME:
     default:
         return Status::InternalError("Unsupported field type:" + field.field_type()->to_string());
diff --git a/be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp b/be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp
index 726a66b580f541..d8a51f2ec17e05 100644
--- a/be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp
+++ b/be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp
@@ -17,155 +17,15 @@
 
 #include "format/table/iceberg/iceberg_parquet_nested_column_utils.h"
 
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-
-#include "format/parquet/schema_desc.h"
-#include "format/table/table_schema_change_helper.h"
+#include "format/parquet/parquet_nested_column_utils.h"
 
 namespace doris {
 
 void IcebergParquetNestedColumnUtils::extract_nested_column_ids(
         const FieldSchema& field_schema, const std::vector>& paths,
         std::set& column_ids) {
-    // Group paths by first field_id
-    std::unordered_map>> child_paths_by_field_id;
-
-    for (const auto& path : paths) {
-        if (!path.empty()) {
-            std::string first_field_id = path[0];
-            std::vector remaining;
-            if (path.size() > 1) {
-                remaining.assign(path.begin() + 1, path.end());
-            }
-            child_paths_by_field_id[first_field_id].push_back(std::move(remaining));
-        }
-    }
-
-    // Track whether any child column was added to determine if parent should be included
-    bool has_child_columns = false;
-
-    // For MAP type, normalize wildcard "*" to explicit KEYS/VALUES access
-    // Wildcard in MAP context means accessing both map keys and values
-    // Normalization logic:
-    //   path: ["map_col", "*"] → ["map_col", "VALUES"] + ["map_col", "KEYS"]
-    //   path: ["map_col", "*", "field"] → ["map_col", "VALUES", "field"] + ["map_col", "KEYS"]
-    if (field_schema.data_type->get_primitive_type() == PrimitiveType::TYPE_MAP) {
-        auto wildcard_it = child_paths_by_field_id.find("*");
-        if (wildcard_it != child_paths_by_field_id.end()) {
-            auto& wildcard_paths = wildcard_it->second;
-
-            // All wildcard paths go to VALUES
-            auto& values_paths = child_paths_by_field_id["VALUES"];
-            values_paths.insert(values_paths.end(), wildcard_paths.begin(), wildcard_paths.end());
-
-            // Always add KEYS for wildcard access
-            auto& keys_paths = child_paths_by_field_id["KEYS"];
-            // Add an empty path to request full KEYS
-            std::vector empty_path;
-            keys_paths.push_back(empty_path);
-
-            // Remove wildcard entry as it's been expanded
-            child_paths_by_field_id.erase(wildcard_it);
-        }
-    }
-
-    // Efficiently traverse children
-    for (uint64_t i = 0; i < field_schema.children.size(); ++i) {
-        const auto& child = field_schema.children[i];
-
-        std::string child_field_id;
-
-        bool is_list = field_schema.data_type->get_primitive_type() == PrimitiveType::TYPE_ARRAY;
-        bool is_map = field_schema.data_type->get_primitive_type() == PrimitiveType::TYPE_MAP;
-
-        if (is_list) {
-            child_field_id = "*";
-        } else if (is_map) {
-            // After wildcard normalization above, all MAP accesses are explicit KEYS/VALUES
-            // Simply assign the appropriate field name based on which child we're processing
-            if (i == 0) {
-                child_field_id = "KEYS";
-            } else if (i == 1) {
-                child_field_id = "VALUES";
-            }
-
-            // Special handling for Parquet MAP structure:
-            // When accessing only VALUES, we still need KEY structure for levels
-            // Check if we're at key child (i==0) and only VALUES is requested (no KEYS)
-            if (i == 0) {
-                bool has_keys_access =
-                        child_paths_by_field_id.find("KEYS") != child_paths_by_field_id.end();
-                bool has_values_access =
-                        child_paths_by_field_id.find("VALUES") != child_paths_by_field_id.end();
-
-                // If only VALUES is accessed (not KEYS), still include key structure for RL/DL
-                if (!has_keys_access && has_values_access) {
-                    // For map_values() queries, we need key's structure for correct RL/DL parsing.
-                    // If key is a nested type (e.g., STRUCT), RL/DL info is stored at leaf columns.
-                    // Add all column IDs from key's start to max (all leaves + intermediate nodes).
-                    uint64_t key_start_id = child.get_column_id();
-                    uint64_t key_max_id = child.get_max_column_id();
-                    for (uint64_t id = key_start_id; id <= key_max_id; ++id) {
-                        column_ids.insert(id);
-                    }
-                    has_child_columns = true;
-                    continue; // Skip further processing of key child
-                }
-            }
-
-        } else {
-            child_field_id = std::to_string(child.field_id);
-        }
-
-        if (child_field_id.empty() || child_field_id == "-1") {
-            continue;
-        }
-
-        auto child_paths_it = child_paths_by_field_id.find(child_field_id);
-        if (child_paths_it != child_paths_by_field_id.end()) {
-            const auto& child_paths = child_paths_it->second;
-
-            // Check if any child path is empty (meaning full child needed)
-            bool needs_full_child =
-                    std::any_of(child_paths.begin(), child_paths.end(),
-                                [](const std::vector& path) { return path.empty(); });
-
-            if (needs_full_child) {
-                // Add all column IDs from current child node to max_column_id
-                // This efficiently handles all nested/complex cases in one loop
-                uint64_t start_id = child.get_column_id();
-                uint64_t max_column_id = child.get_max_column_id();
-                for (uint64_t id = start_id; id <= max_column_id; ++id) {
-                    column_ids.insert(id);
-                }
-                has_child_columns = true;
-            } else {
-                // Store current size to check if recursive call added any columns
-                size_t before_size = column_ids.size();
-
-                // Recursively extract from child
-                extract_nested_column_ids(child, child_paths, column_ids);
-
-                // Check if recursive call added any columns
-                if (column_ids.size() > before_size) {
-                    has_child_columns = true;
-                }
-            }
-        }
-    }
-
-    // If any child columns were added, also add the parent column ID
-    // This ensures parent struct/container nodes are included when their children are needed
-    if (has_child_columns) {
-        // Set automatically handles deduplication, so no need to check if it already exists
-        column_ids.insert(field_schema.get_column_id());
-    }
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_field_id(field_schema, paths,
+                                                                    column_ids);
 }
 
 } // namespace doris
diff --git a/be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.h b/be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.h
index 39c1b90fac0977..bf54823b7a32f8 100644
--- a/be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.h
+++ b/be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.h
@@ -17,17 +17,13 @@
 
 #pragma once
 
-#include
+#include
 #include
 #include
-#include
 #include
 
-#include "format/table/table_schema_change_helper.h"
-
 namespace doris {
 
-class FieldDescriptor;
 struct FieldSchema;
 
 class IcebergParquetNestedColumnUtils {
diff --git a/be/src/format/table/iceberg/types.cpp b/be/src/format/table/iceberg/types.cpp
index 252f9035518c0b..75a9f5c939d467 100644
--- a/be/src/format/table/iceberg/types.cpp
+++ b/be/src/format/table/iceberg/types.cpp
@@ -170,6 +170,8 @@ std::unique_ptr Types::from_primitive_string(const std::string& t
         return std::make_unique();
     } else if (lower_type_string == "binary") {
         return std::make_unique();
+    } else if (lower_type_string == "variant") {
+        return std::make_unique();
     } else {
         std::regex fixed(R"(fixed\[\s*(\d+)\s*\])");
         std::regex decimal(R"(decimal\(\s*(\d+)\s*,\s*(\d+)\s*\))");
diff --git a/be/src/format/table/iceberg/types.h b/be/src/format/table/iceberg/types.h
index 53c54e238fa255..09b00defb9bcde 100644
--- a/be/src/format/table/iceberg/types.h
+++ b/be/src/format/table/iceberg/types.h
@@ -46,6 +46,7 @@ enum TypeID {
     FIXED,
     BINARY,
     DECIMAL,
+    VARIANT,
     STRUCT,
     LIST,
     MAP
@@ -394,6 +395,15 @@ class BooleanType : public PrimitiveType {
     std::string to_string() const override { return "boolean"; }
 };
 
+class VariantType : public PrimitiveType {
+public:
+    ~VariantType() override = default;
+
+    TypeID type_id() const override { return TypeID::VARIANT; }
+
+    std::string to_string() const override { return "variant"; }
+};
+
 class Types {
 public:
     static std::unique_ptr from_primitive_string(const std::string& type_string);
diff --git a/be/src/format/table/iceberg_reader.cpp b/be/src/format/table/iceberg_reader.cpp
index 7a74431a05851b..2c1e40c236fbdc 100644
--- a/be/src/format/table/iceberg_reader.cpp
+++ b/be/src/format/table/iceberg_reader.cpp
@@ -136,6 +136,8 @@ Status IcebergParquetReader::on_before_init_reader(ReaderInitContext* ctx) {
     const FieldDescriptor* field_desc = nullptr;
     RETURN_IF_ERROR(this->get_file_metadata_schema(&field_desc));
    DCHECK(field_desc != nullptr);
+    this->prepare_parquet_file_schema_with_ids(field_desc);
+    field_desc = &this->parquet_file_schema();
 
     // Build table_info_node by field_id or name matching.
     // This must happen BEFORE column classification so we can use children_column_exists
@@ -312,8 +314,8 @@ Status IcebergParquetReader::on_before_init_reader(ReaderInitContext* ctx) {
 // ============================================================================
 
 ColumnIdResult IcebergParquetReader::_create_column_ids(const FieldDescriptor* field_desc,
                                                         const TupleDescriptor* tuple_descriptor) {
-    auto* mutable_field_desc = const_cast(field_desc);
-    mutable_field_desc->assign_ids();
+    FieldDescriptor field_desc_with_ids = field_desc->copy_with_assigned_ids();
+    field_desc = &field_desc_with_ids;
 
     std::unordered_map iceberg_id_to_field_schema_map;
     for (int i = 0; i < field_desc->size(); ++i) {
@@ -344,7 +346,7 @@ ColumnIdResult IcebergParquetReader::_create_column_ids(const FieldDescriptor* f
         auto field_schema = it->second;
 
         if ((slot->col_type() != TYPE_STRUCT && slot->col_type() != TYPE_ARRAY &&
-             slot->col_type() != TYPE_MAP)) {
+             slot->col_type() != TYPE_MAP && slot->col_type() != TYPE_VARIANT)) {
             column_ids.insert(field_schema->column_id);
             if (slot->is_predicate()) {
                 filter_column_ids.insert(field_schema->column_id);
diff --git a/be/test/format/parquet/delta_byte_array_decoder_test.cpp b/be/test/format/parquet/delta_byte_array_decoder_test.cpp
index 1b039da3d2344d..4ebab87320f3f1 100644
--- a/be/test/format/parquet/delta_byte_array_decoder_test.cpp
+++ b/be/test/format/parquet/delta_byte_array_decoder_test.cpp
@@ -20,9 +20,13 @@
 #include
 
 #include "arrow/api.h"
+#include "core/column/column_nullable.h"
+#include "core/column/column_varbinary.h"
 #include "core/column/column_vector.h"
+#include "core/data_type/data_type_nullable.h"
 #include "core/data_type/data_type_number.h"
 #include "core/data_type/data_type_string.h"
+#include "core/data_type/data_type_varbinary.h"
 #include "format/parquet/delta_bit_pack_decoder.h"
 #include "parquet/encoding.h"
 #include "parquet/schema.h"
@@ -38,6 +42,28 @@ class DeltaByteArrayDecoderTest : public ::testing::Test {
     std::unique_ptr _decoder;
 };
 
+static std::vector make_byte_array_values(
+        const std::vector& values) {
+    std::vector byte_array_values;
+    byte_array_values.reserve(values.size());
+    for (const auto& value : values) {
+        byte_array_values.emplace_back(static_cast(value.size()),
+                                       reinterpret_cast(value.data()));
+    }
+    return byte_array_values;
+}
+
+static Status init_all_selected_nullable_vector(size_t num_values,
+                                                std::vector* run_length_null_map,
+                                                std::vector* filter_data,
+                                                FilterMap* filter_map, NullMap* null_map,
+                                                ColumnSelectVector* select_vector) {
+    run_length_null_map->assign(num_values, 1);
+    filter_data->assign(num_values, 1);
+    RETURN_IF_ERROR(filter_map->init(filter_data->data(), filter_data->size(), false));
+    return select_vector->init(*run_length_null_map, num_values, null_map, filter_map, 0);
+}
+
 // Test basic decoding byte array functionality
 TEST_F(DeltaByteArrayDecoderTest, test_basic_decode_byte_array) {
     // Create ColumnDescriptor
@@ -47,12 +73,7 @@ TEST_F(DeltaByteArrayDecoderTest, test_basic_decode_byte_array) {
 
     // Prepare original data
     std::vector values = {"Hello", "World", "Foobar", "ABCDEF"};
-    std::vector byte_array_values;
-    for (const auto& value : values) {
-        byte_array_values.emplace_back(
-                parquet::ByteArray {static_cast(value.size()),
-                                    reinterpret_cast(value.data())});
-    }
+    auto byte_array_values = make_byte_array_values(values);
 
     // Create encoder
     auto encoder = MakeTypedEncoder(parquet::Encoding::DELTA_BYTE_ARRAY,
@@ -100,12 +121,7 @@ TEST_F(DeltaByteArrayDecoderTest, test_decode_byte_array_with_filter) {
 
     // Prepare original data
     std::vector values = {"Hello", "World", "Foobar", "ABCDEF"};
-    std::vector byte_array_values;
-    for (const auto& value : values) {
-        byte_array_values.emplace_back(
-                parquet::ByteArray {static_cast(value.size()),
-                                    reinterpret_cast(value.data())});
-    }
+    auto byte_array_values = make_byte_array_values(values);
 
     // Create encoder
     auto encoder = MakeTypedEncoder(parquet::Encoding::DELTA_BYTE_ARRAY,
@@ -152,12 +168,7 @@ TEST_F(DeltaByteArrayDecoderTest, test_decode_byte_array_with_filter_and_null) {
 
     // Prepare original data
     std::vector values = {"Hello", "World", "ABCDEF"};
-    std::vector byte_array_values;
-    for (const auto& value : values) {
-        byte_array_values.emplace_back(
-                parquet::ByteArray {static_cast(value.size()),
-                                    reinterpret_cast(value.data())});
-    }
+    auto byte_array_values = make_byte_array_values(values);
 
     // Create encoder
     auto encoder = MakeTypedEncoder(parquet::Encoding::DELTA_BYTE_ARRAY,
@@ -209,6 +220,49 @@ TEST_F(DeltaByteArrayDecoderTest, test_decode_byte_array_with_filter_and_null) {
     }
 }
 
+TEST_F(DeltaByteArrayDecoderTest, test_decode_nullable_varbinary) {
+    auto node = parquet::schema::PrimitiveNode::Make("test_column", parquet::Repetition::OPTIONAL,
+                                                     parquet::Type::BYTE_ARRAY);
+    auto descr = std::make_shared(node, 0, 1);
+
+    std::vector values = {"hello", std::string("\x01\xff", 2)};
+    auto byte_array_values = make_byte_array_values(values);
+
+    auto encoder = MakeTypedEncoder(parquet::Encoding::DELTA_BYTE_ARRAY,
+                                    /*use_dictionary=*/false, descr.get());
+    ASSERT_NO_THROW(
+            encoder->Put(byte_array_values.data(), static_cast(byte_array_values.size())));
+
+    auto encoded_buffer = encoder->FlushValues();
+    Slice data_slice(encoded_buffer->data(), encoded_buffer->size());
+    ASSERT_TRUE(_decoder->set_data(&data_slice).ok());
+
+    DataTypePtr data_type = make_nullable(std::make_shared());
+    MutableColumnPtr column = data_type->create_column();
+
+    constexpr size_t num_values = 3;
+    std::vector run_length_null_map;
+    std::vector filter_data;
+    FilterMap filter_map;
+    ColumnSelectVector select_vector;
+    NullMap null_map;
+    ASSERT_TRUE(init_all_selected_nullable_vector(num_values, &run_length_null_map, &filter_data,
+                                                  &filter_map, &null_map, &select_vector)
+                        .ok());
+
+    ASSERT_TRUE(_decoder->decode_values(column, data_type, select_vector, false).ok());
+
+    ASSERT_EQ(column->size(), num_values);
+    const auto* nullable_column = assert_cast(column.get());
+    const auto& result_column =
+            assert_cast(nullable_column->get_nested_column());
+    EXPECT_EQ(nullable_column->get_null_map_data()[0], 0);
+    EXPECT_EQ(nullable_column->get_null_map_data()[1], 1);
+    EXPECT_EQ(nullable_column->get_null_map_data()[2], 0);
+    EXPECT_EQ(result_column.get_data_at(0).to_string(), values[0]);
+    EXPECT_EQ(result_column.get_data_at(2).to_string(), values[1]);
+}
+
 // Test skipping values for byte array decoding
 TEST_F(DeltaByteArrayDecoderTest, test_skip_value_for_byte_array) {
     // Create ColumnDescriptor
diff --git a/be/test/format/parquet/parquet_expr_test.cpp b/be/test/format/parquet/parquet_expr_test.cpp
index 83a83e71d3098d..def801cf39cd48 100644
--- a/be/test/format/parquet/parquet_expr_test.cpp
+++ b/be/test/format/parquet/parquet_expr_test.cpp
@@ -83,6 +83,37 @@ class VExprContext;
 //using namespace iceberg;
 using namespace parquet;
 
+namespace {
+
+std::vector make_variant_root_schema(const std::string& column_name) {
+    tparquet::SchemaElement root;
+    root.__set_name("schema");
+    root.__set_num_children(1);
+
+    tparquet::LogicalType variant_type;
+    variant_type.__set_VARIANT(tparquet::VariantType());
+
+    tparquet::SchemaElement variant;
+    variant.__set_name(column_name);
+    variant.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED);
+    variant.__set_num_children(2);
+    variant.__set_logicalType(variant_type);
+
+    tparquet::SchemaElement metadata;
+    metadata.__set_name("metadata");
+    metadata.__set_type(tparquet::Type::BYTE_ARRAY);
+    metadata.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED);
+
+    tparquet::SchemaElement value;
+    value.__set_name("value");
+    value.__set_type(tparquet::Type::BYTE_ARRAY);
+    value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL);
+
+    return {root, variant, metadata, value};
+}
+
+} // namespace
+
 class ParquetExprTest : public testing::Test {
 public:
     ParquetExprTest() {}
@@ -1173,6 +1204,37 @@ TEST_F(ParquetExprTest, test_expr_push_down_and) {
     ASSERT_TRUE(filter_group);
 }
 
+TEST_F(ParquetExprTest, test_row_group_stats_skip_top_level_variant_root) {
+    FieldDescriptor descriptor;
+    Status st = descriptor.parse_from_thrift(make_variant_root_schema("int64_col"));
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    p_reader->prepare_parquet_file_schema_with_ids(&descriptor);
+
+    std::unique_ptr pred = AndBlockColumnPredicate::create_unique();
+    pred->add_column_predicate(SingleColumnBlockPredicate::create_unique(
+            ComparisonPredicateBase::create_shared(
+                    2, "", Field::create_field(10000000001))));
+
+    p_reader->_push_down_predicates.clear();
+    p_reader->_push_down_predicates.push_back(std::move(pred));
+    p_reader->_enable_filter_by_min_max = true;
+    p_reader->_enable_filter_by_bloom_filter = true;
+
+    tparquet::RowGroup row_group;
+    row_group.__set_num_rows(3);
+
+    bool filter_group = false;
+    bool filtered_by_min_max = false;
+    bool filtered_by_bloom_filter = false;
+    ASSERT_TRUE(p_reader->_process_column_stat_filter(row_group, p_reader->_push_down_predicates,
+                                                      &filter_group, &filtered_by_min_max,
+                                                      &filtered_by_bloom_filter)
+                        .ok());
+    EXPECT_FALSE(filter_group);
+    EXPECT_FALSE(filtered_by_min_max);
+    EXPECT_FALSE(filtered_by_bloom_filter);
+}
+
 TEST_F(ParquetExprTest, test_expr_push_down_or_string) {
     auto or_expr = std::make_shared();
     or_expr->_op = TExprOpcode::COMPOUND_OR;
diff --git a/be/test/format/parquet/parquet_variant_reader_test.cpp b/be/test/format/parquet/parquet_variant_reader_test.cpp
new file mode 100644
index 00000000000000..74ad54453b7428
--- /dev/null
+++ b/be/test/format/parquet/parquet_variant_reader_test.cpp
@@ -0,0 +1,2994 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+ +#include "format/parquet/parquet_variant_reader.h" + +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "core/column/column_nullable.h" +#include "core/column/column_variant.h" +#include "core/column/column_vector.h" +#include "core/data_type/data_type_array.h" +#include "core/data_type/data_type_decimal.h" +#include "core/data_type/data_type_map.h" +#include "core/data_type/data_type_nullable.h" +#include "core/data_type/data_type_number.h" +#include "core/data_type/data_type_string.h" +#include "core/data_type/data_type_struct.h" +#include "core/data_type/data_type_time.h" +#include "core/data_type/data_type_varbinary.h" +#include "core/data_type/data_type_variant.h" +#include "core/data_type/primitive_type.h" +#include "core/data_type_serde/data_type_serde.h" +#include "core/field.h" +#include "format/parquet/parquet_column_convert.h" +#include "format/parquet/schema_desc.h" +#include "format/parquet/vparquet_column_reader.h" + +namespace doris::parquet { +namespace { + +StringRef bytes_ref(const std::vector& bytes) { + return {bytes.data(), bytes.size()}; +} + +void append_int64_le(std::vector* bytes, int64_t value) { + auto unsigned_value = static_cast(value); + for (int i = 0; i < 8; ++i) { + bytes->push_back(static_cast(unsigned_value >> (i * 8))); + } +} + +std::vector make_metadata(std::initializer_list keys, + bool sorted_strings = false) { + const uint8_t header = sorted_strings ? 
0x11 : 0x01; + std::vector metadata {header, static_cast(keys.size())}; + uint8_t offset = 0; + metadata.push_back(offset); + for (std::string_view key : keys) { + offset += static_cast(key.size()); + metadata.push_back(offset); + } + for (std::string_view key : keys) { + metadata.insert(metadata.end(), key.begin(), key.end()); + } + return metadata; +} + +void expect_variant_json(const std::vector& metadata, std::initializer_list value, + std::string_view expected) { + std::vector value_bytes(value); + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value_bytes), &json); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_EQ(expected, json); +} + +void expect_variant_corruption(const std::vector& metadata, + std::initializer_list value) { + std::vector value_bytes(value); + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value_bytes), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +FieldSchema make_int32_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::INT32; + field.data_type = make_nullable(std::make_shared()); + return field; +} + +FieldSchema make_int64_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::INT64; + field.data_type = make_nullable(std::make_shared()); + return field; +} + +FieldSchema make_float_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::FLOAT; + field.data_type = make_nullable(std::make_shared()); + return field; +} + +FieldSchema make_double_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::DOUBLE; + 
field.data_type = make_nullable(std::make_shared()); + return field; +} + +FieldSchema make_varbinary_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::BYTE_ARRAY; + field.data_type = make_nullable(std::make_shared()); + return field; +} + +Field make_varbinary_field(std::initializer_list bytes) { + const auto* data = reinterpret_cast(bytes.begin()); + return Field::create_field( + StringView(data, static_cast(bytes.size()))); +} + +Field make_varbinary_field(std::string_view bytes) { + return Field::create_field(StringView(bytes)); +} + +std::string varbinary_field_bytes(const Field& field) { + auto ref = field.get().to_string_ref(); + return {ref.data, ref.size}; +} + +std::string test_uuid_bytes() { + return {"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", 16}; +} + +FieldSchema make_uuid_field_schema(std::string name) { + FieldSchema field = make_varbinary_field_schema(std::move(name)); + field.physical_type = tparquet::Type::FIXED_LEN_BYTE_ARRAY; + field.parquet_schema.__set_logicalType(tparquet::LogicalType()); + field.parquet_schema.logicalType.__set_UUID(tparquet::UUIDType()); + field.parquet_schema.__set_type_length(16); + return field; +} + +FieldSchema make_datev2_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::INT32; + field.data_type = make_nullable(std::make_shared()); + return field; +} + +FieldSchema make_timev2_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::INT64; + field.data_type = make_nullable(std::make_shared(6)); + return field; +} + +FieldSchema make_datetimev2_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = 
field.name; + field.physical_type = tparquet::Type::INT64; + field.data_type = make_nullable(std::make_shared(6)); + return field; +} + +FieldSchema make_required_int64_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::INT64; + field.data_type = std::make_shared(); + return field; +} + +FieldSchema make_binary_field_schema(std::string name, bool nullable) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::BYTE_ARRAY; + field.data_type = std::make_shared(); + if (nullable) { + field.data_type = make_nullable(field.data_type); + } + return field; +} + +FieldSchema make_string_field_schema(std::string name, bool nullable) { + FieldSchema field = make_binary_field_schema(std::move(name), nullable); + tparquet::LogicalType logical_type; + logical_type.__set_STRING(tparquet::StringType()); + field.parquet_schema.__set_logicalType(logical_type); + return field; +} + +FieldSchema make_required_shredded_variant_schema() { + FieldSchema field; + field.name = "measurement"; + field.lower_case_name = field.name; + field.data_type = std::make_shared(0, false); + field.children = {make_binary_field_schema("metadata", false), + make_binary_field_schema("value", true), + make_int64_field_schema("typed_value")}; + return field; +} + +std::string serialize_variant_field(const Field& field) { + auto variant_column = ColumnVariant::create(0, false); + variant_column->insert(field); + std::string json; + DataTypeSerDe::FormatOptions options; + variant_column->serialize_one_row_to_string(0, &json, options); + return json; +} + +} // namespace + +TEST(ParquetVariantReaderTest, ParseTypedOnlyVariantSchemaWithoutTopLevelValue) { + tparquet::SchemaElement root; + root.__set_name("schema"); + root.__set_num_children(1); + + tparquet::LogicalType variant_type; + 
variant_type.__set_VARIANT(tparquet::VariantType()); + + tparquet::SchemaElement variant; + variant.__set_name("v"); + variant.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + variant.__set_num_children(2); + variant.__set_logicalType(variant_type); + + tparquet::SchemaElement metadata; + metadata.__set_name("metadata"); + metadata.__set_type(tparquet::Type::BYTE_ARRAY); + metadata.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + + tparquet::SchemaElement typed_value; + typed_value.__set_name("typed_value"); + typed_value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + typed_value.__set_num_children(1); + + tparquet::SchemaElement metric; + metric.__set_name("metric"); + metric.__set_type(tparquet::Type::INT64); + metric.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + + FieldDescriptor descriptor; + Status st = descriptor.parse_from_thrift({root, variant, metadata, typed_value, metric}); + ASSERT_TRUE(st.ok()) << st.to_string(); + + const auto* variant_field = descriptor.get_column("v"); + ASSERT_NE(variant_field, nullptr); + EXPECT_EQ(variant_field->data_type->get_primitive_type(), TYPE_VARIANT); + ASSERT_EQ(variant_field->children.size(), 2); + EXPECT_EQ(variant_field->children[0].name, "metadata"); + EXPECT_EQ(variant_field->children[1].name, "typed_value"); +} + +TEST(ParquetVariantReaderTest, RejectVariantSchemaWithUnexpectedChild) { + tparquet::SchemaElement root; + root.__set_name("schema"); + root.__set_num_children(1); + + tparquet::LogicalType variant_type; + variant_type.__set_VARIANT(tparquet::VariantType()); + + tparquet::SchemaElement variant; + variant.__set_name("v"); + variant.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + variant.__set_num_children(3); + variant.__set_logicalType(variant_type); + + tparquet::SchemaElement metadata; + metadata.__set_name("metadata"); + metadata.__set_type(tparquet::Type::BYTE_ARRAY); + 
metadata.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + + tparquet::SchemaElement value; + value.__set_name("value"); + value.__set_type(tparquet::Type::BYTE_ARRAY); + value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + + tparquet::SchemaElement extra; + extra.__set_name("extra"); + extra.__set_type(tparquet::Type::INT32); + extra.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + + FieldDescriptor descriptor; + Status st = descriptor.parse_from_thrift({root, variant, metadata, value, extra}); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectVariantSchemaWithDuplicateStructuralChild) { + tparquet::SchemaElement root; + root.__set_name("schema"); + root.__set_num_children(1); + + tparquet::LogicalType variant_type; + variant_type.__set_VARIANT(tparquet::VariantType()); + + tparquet::SchemaElement variant; + variant.__set_name("v"); + variant.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + variant.__set_num_children(3); + variant.__set_logicalType(variant_type); + + tparquet::SchemaElement metadata; + metadata.__set_name("metadata"); + metadata.__set_type(tparquet::Type::BYTE_ARRAY); + metadata.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + + tparquet::SchemaElement value; + value.__set_name("value"); + value.__set_type(tparquet::Type::BYTE_ARRAY); + value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + + tparquet::SchemaElement duplicate_value = value; + + FieldDescriptor descriptor; + Status st = descriptor.parse_from_thrift({root, variant, metadata, value, duplicate_value}); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectVariantSchemaWithNonBinaryValueChild) { + tparquet::SchemaElement root; + root.__set_name("schema"); + root.__set_num_children(1); + + tparquet::LogicalType variant_type; + variant_type.__set_VARIANT(tparquet::VariantType()); + + tparquet::SchemaElement variant; + 
variant.__set_name("v"); + variant.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + variant.__set_num_children(3); + variant.__set_logicalType(variant_type); + + tparquet::SchemaElement metadata; + metadata.__set_name("metadata"); + metadata.__set_type(tparquet::Type::BYTE_ARRAY); + metadata.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + + tparquet::SchemaElement value; + value.__set_name("value"); + value.__set_type(tparquet::Type::INT32); + value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + + tparquet::SchemaElement typed_value; + typed_value.__set_name("typed_value"); + typed_value.__set_type(tparquet::Type::INT64); + typed_value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + + FieldDescriptor descriptor; + Status st = descriptor.parse_from_thrift({root, variant, metadata, value, typed_value}); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectVariantSchemaWithAnnotatedMetadataChild) { + tparquet::SchemaElement root; + root.__set_name("schema"); + root.__set_num_children(1); + + tparquet::LogicalType variant_type; + variant_type.__set_VARIANT(tparquet::VariantType()); + + tparquet::SchemaElement variant; + variant.__set_name("v"); + variant.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + variant.__set_num_children(2); + variant.__set_logicalType(variant_type); + + tparquet::SchemaElement metadata; + metadata.__set_name("metadata"); + metadata.__set_type(tparquet::Type::BYTE_ARRAY); + metadata.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + tparquet::LogicalType metadata_type; + metadata_type.__set_STRING(tparquet::StringType()); + metadata.__set_logicalType(metadata_type); + + tparquet::SchemaElement typed_value; + typed_value.__set_name("typed_value"); + typed_value.__set_type(tparquet::Type::INT64); + typed_value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + + FieldDescriptor descriptor; + Status 
st = descriptor.parse_from_thrift({root, variant, metadata, typed_value}); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectVariantSchemaWithAnnotatedValueChild) { + tparquet::SchemaElement root; + root.__set_name("schema"); + root.__set_num_children(1); + + tparquet::LogicalType variant_type; + variant_type.__set_VARIANT(tparquet::VariantType()); + + tparquet::SchemaElement variant; + variant.__set_name("v"); + variant.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + variant.__set_num_children(3); + variant.__set_logicalType(variant_type); + + tparquet::SchemaElement metadata; + metadata.__set_name("metadata"); + metadata.__set_type(tparquet::Type::BYTE_ARRAY); + metadata.__set_repetition_type(tparquet::FieldRepetitionType::REQUIRED); + + tparquet::SchemaElement value; + value.__set_name("value"); + value.__set_type(tparquet::Type::BYTE_ARRAY); + value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + tparquet::LogicalType value_type; + value_type.__set_STRING(tparquet::StringType()); + value.__set_logicalType(value_type); + + tparquet::SchemaElement typed_value; + typed_value.__set_name("typed_value"); + typed_value.__set_type(tparquet::Type::INT64); + typed_value.__set_repetition_type(tparquet::FieldRepetitionType::OPTIONAL); + + FieldDescriptor descriptor; + Status st = descriptor.parse_from_thrift({root, variant, metadata, value, typed_value}); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, OptionalTopLevelVariantUsesNullableReadColumnOnly) { + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared(0, false)); + variant_field.children = {make_binary_field_schema("metadata", false), + make_binary_field_schema("value", true)}; + + EXPECT_FALSE(parquet_variant_reader_test::variant_struct_reader_type_is_nullable_for_test( + variant_field)); + 
EXPECT_TRUE(parquet_variant_reader_test::variant_struct_reader_column_is_nullable_for_test( + variant_field)); + + variant_field.data_type = std::make_shared(0, false); + EXPECT_FALSE(parquet_variant_reader_test::variant_struct_reader_column_is_nullable_for_test( + variant_field)); +} + +TEST(ParquetVariantReaderTest, DecodeSimpleObject) { + auto metadata = make_metadata({"a"}); + std::vector value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x01, // one field + 0x00, // dictionary id 0 + 0x00, 0x02, // field value offsets + 0x0c, 0x07 // int8(7) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_EQ("{\"a\":7}", json); +} + +TEST(ParquetVariantReaderTest, DecodeUnsortedMetadataMayContainDuplicateDictionaryStrings) { + auto metadata = make_metadata({"a", "a"}); + std::vector value { + 0x02, // object + 0x01, // one field + 0x01, // dictionary id 1 + 0x00, 0x02, // field value offsets + 0x0c, 0x07 // int8(7) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_EQ("{\"a\":7}", json); +} + +TEST(ParquetVariantReaderTest, RejectDuplicateObjectKeysFromDuplicateMetadataEntries) { + auto metadata = make_metadata({"a", "a"}); + std::vector value { + 0x02, // object + 0x02, // two fields + 0x00, 0x01, // strictly increasing dictionary ids + 0x00, 0x02, 0x04, // valid physical value offsets + 0x0c, 0x01, 0x0c, 0x02 // int8(1), int8(2) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectInvalidSortedMetadataDictionaryStrings) { + expect_variant_corruption(make_metadata({"a", "a"}, true), {0x00}); + expect_variant_corruption(make_metadata({"b", "a"}, true), {0x00}); +} + 
+TEST(ParquetVariantReaderTest, RejectMetadataTrailingBytes) { + auto metadata = make_metadata({"a"}); + metadata.push_back(0xff); + expect_variant_corruption(metadata, {0x00}); +} + +TEST(ParquetVariantReaderTest, RejectMetadataFirstDictionaryOffsetNotZero) { + std::vector metadata { + 0x01, // version 1, one-byte offsets + 0x01, // one dictionary entry + 0x01, 0x02, // invalid dictionary offsets: first offset must be zero + 'x', 'a' // no trailing bytes; offset[1] consumes both bytes + }; + expect_variant_corruption(metadata, {0x00}); +} + +TEST(ParquetVariantReaderTest, RejectMetadataReservedHeaderBits) { + std::vector metadata { + 0x21, // reserved bit 5 is set + 0x00, // zero dictionary entries + 0x00 // offset[0] + }; + expect_variant_corruption(metadata, {0x00}); +} + +TEST(ParquetVariantReaderTest, RejectInvalidUtf8MetadataAndStrings) { + std::vector invalid_metadata { + 0x01, // version 1, one-byte offsets + 0x01, // one dictionary entry + 0x00, 0x01, // dictionary offsets + 0xff // invalid UTF-8 dictionary key + }; + expect_variant_corruption(invalid_metadata, {0x00}); + + auto metadata = make_metadata({}); + expect_variant_corruption(metadata, {0x05, 0xff}); + expect_variant_corruption(metadata, {0x40, 0x01, 0x00, 0x00, 0x00, 0xff}); +} + +TEST(ParquetVariantReaderTest, DecodeObjectUsesUnsignedByteFieldOrder) { + const std::string e_acute("\xc3\xa9", 2); + auto metadata = make_metadata({std::string_view("z"), std::string_view(e_acute)}); + std::vector value { + 0x02, // object + 0x02, // two fields + 0x00, 0x01, // dictionary ids are sorted by unsigned UTF-8 bytes: z, e acute + 0x00, 0x02, 0x04, // valid physical value offsets + 0x0c, 0x01, 0x0c, 0x02 // int8(1), int8(2) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + ASSERT_TRUE(st.ok()) << st.to_string(); + std::string expected = R"({"z":1,")"; + expected.append(e_acute); + expected.append(R"(":2})"); + EXPECT_EQ(expected, json); +} + 
+TEST(ParquetVariantReaderTest, RejectObjectFieldOrderUsingUnsignedBytes) { + const std::string e_acute("\xc3\xa9", 2); + auto metadata = make_metadata({std::string_view(e_acute), std::string_view("z")}); + std::vector value { + 0x02, // object + 0x02, // two fields + 0x00, 0x01, // dictionary ids are not sorted by unsigned UTF-8 bytes + 0x00, 0x02, 0x04, // valid physical value offsets + 0x0c, 0x02, 0x0c, 0x01 // int8(2), int8(1) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, DecodeDecimal128MinimumValue) { + auto metadata = make_metadata({}); + std::vector value { + 0x28, // decimal128 primitive + 0x00, // scale 0 + 0x00, 0x00, 0x00, 0x00, // + 0x00, 0x00, 0x00, 0x00, // + 0x00, 0x00, 0x00, 0x00, // + 0x00, 0x00, 0x00, 0x80 // -2^127 in little-endian two's complement + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_EQ("-170141183460469231731687303715884105728", json); +} + +TEST(ParquetVariantReaderTest, RejectInvalidDecimalScale) { + auto metadata = make_metadata({}); + + expect_variant_corruption(metadata, {0x20, 0xff, 0x00, 0x00, 0x00, 0x00}); + expect_variant_corruption(metadata, {0x20, 0x27, 0x00, 0x00, 0x00, 0x00}); +} + +TEST(ParquetVariantReaderTest, DecodePrimitiveCoverageExtras) { + auto metadata = make_metadata({}); + + expect_variant_json(metadata, {0x00}, "null"); + expect_variant_json(metadata, {0x10, 0xff, 0xff}, "-1"); + expect_variant_json(metadata, {0x20, 0x02, 0x85, 0xff, 0xff, 0xff}, "-1.23"); + expect_variant_json(metadata, {0x24, 0x00, 0xb0, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, + "1200"); + expect_variant_json(metadata, + {0x28, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, + "1"); + expect_variant_json(metadata, {0x1c, 0x00, 
0x00, 0x00, 0x00, 0x00, 0x00, 0xf8, 0x3f}, "1.5"); + expect_variant_json(metadata, {0x38, 0x00, 0x00, 0xc0, 0x3f}, "1.5"); + expect_variant_json(metadata, {0x2c, 0x2a, 0x00, 0x00, 0x00}, "42"); + expect_variant_json(metadata, {0x30, 0x2a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, "42"); + expect_variant_json(metadata, {0x40, 0x04, 0x00, 0x00, 0x00, 't', 'e', 'x', 't'}, "\"text\""); + expect_variant_json(metadata, {0x3c, 0x03, 0x00, 0x00, 0x00, 0xff, 0x00, 'A'}, + R"("\u00ff\u0000A")"); + expect_variant_json(metadata, {0x21, '"', '\\', '\b', '\f', '\n', '\r', '\t', 0x01}, + R"("\"\\\b\f\n\r\t\u0001")"); + expect_variant_json(metadata, + {0x50, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, + 0x0b, 0x0c, 0x0d, 0x0e, 0x0f}, + "\"00010203-0405-0607-0809-0a0b0c0d0e0f\""); +} + +TEST(ParquetVariantReaderTest, DecodeResidualPrimitiveToVariantMapPreservesScalarTypes) { + auto metadata = make_metadata({}); + auto decode_root = [&](std::initializer_list value_bytes) { + std::vector value(value_bytes); + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), + PathInData(), &values, &string_values); + EXPECT_TRUE(st.ok()) << st.to_string(); + auto root = values.find(PathInData()); + EXPECT_NE(root, values.end()); + if (!st.ok() || root == values.end()) { + return FieldWithDataType {}; + } + return root->second; + }; + + auto int16_value = decode_root({0x10, 0xff, 0xff}); + EXPECT_EQ(int16_value.base_scalar_type_id, TYPE_SMALLINT); + EXPECT_EQ(int16_value.field.get(), -1); + + auto decimal_value = decode_root({0x20, 0x02, 0x85, 0xff, 0xff, 0xff}); + EXPECT_EQ(decimal_value.base_scalar_type_id, TYPE_DECIMAL32); + EXPECT_EQ(decimal_value.precision, BeConsts::MAX_DECIMAL32_PRECISION); + EXPECT_EQ(decimal_value.scale, 2); + EXPECT_EQ(decimal_value.field.to_debug_string(decimal_value.scale), "-1.23"); + + auto string_value = decode_root({0x40, 0x04, 0x00, 0x00, 0x00, 't', 'e', 'x', 't'}); 
+ EXPECT_EQ(string_value.base_scalar_type_id, TYPE_STRING); + EXPECT_EQ(string_value.field.get(), "text"); +} + +TEST(ParquetVariantReaderTest, RejectInvalidVariantEncodingsCoverageExtras) { + expect_variant_corruption(std::vector {0x02}, {0x00}); + expect_variant_corruption(std::vector {0x01, 0x01, 0x01, 0x00}, {0x00}); + + auto metadata = make_metadata({}); + expect_variant_corruption(metadata, {0x0c, 0x01, 0x00}); + expect_variant_corruption(metadata, {0x03, 0x01, 0x01, 0x02, 0x0c, 0x07}); + expect_variant_corruption(metadata, {0x03, 0x02, 0x00, 0x02, 0x01, 0x0c, 0x01}); + expect_variant_corruption(metadata, {0x02, 0x00, 0x01, 0x00}); + expect_variant_corruption(metadata, {0x54}); + + auto object_metadata = make_metadata({"a"}); + expect_variant_corruption(object_metadata, {0x02, 0x01, 0x00, 0x01, 0x02, 0x0c, 0x07}); + expect_variant_corruption(metadata, {0x02, 0x01, 0x00, 0x00, 0x02, 0x0c, 0x07}); +} + +TEST(ParquetVariantReaderTest, DecodeResidualRootBinaryToVariantMap) { + auto metadata = make_metadata({}); + std::vector value {0x3c, // binary primitive, 3 bytes + 0x03, 0x00, 0x00, 0x00, 0xff, 0x00, 0x41}; + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + const auto& binary = values.at(PathInData()); + EXPECT_EQ(binary.base_scalar_type_id, TYPE_VARBINARY); + EXPECT_EQ(varbinary_field_bytes(binary.field), std::string("\xff\x00\x41", 3)); +} + +TEST(ParquetVariantReaderTest, DecodeResidualBinaryToVariantMap) { + auto metadata = make_metadata({"b"}); + std::vector value {0x02, // object + 0x01, // one field + 0x00, // dictionary id 0: b + 0x00, 0x08, // field value offsets + 0x3c, 0x03, 0x00, 0x00, 0x00, // binary primitive, 3 bytes + 0xff, 0x00, 0x41}; + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), 
PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + const auto& binary = values.at(PathInData("b")); + EXPECT_EQ(binary.base_scalar_type_id, TYPE_VARBINARY); + EXPECT_EQ(varbinary_field_bytes(binary.field), std::string("\xff\x00\x41", 3)); +} + +TEST(ParquetVariantReaderTest, DecodeResidualBinaryArrayToVariantMap) { + auto metadata = make_metadata({}); + std::vector value {0x03, // array + 0x02, // two elements + 0x00, 0x07, 0x08, // element value offsets + 0x3c, 0x02, 0x00, 0x00, 0x00, // binary primitive, 2 bytes + 0xc3, 0x28, 0x00}; // variant null + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + const auto& binary_array = values.at(PathInData()); + EXPECT_EQ(binary_array.base_scalar_type_id, TYPE_VARBINARY); + EXPECT_EQ(binary_array.num_dimensions, 1); + const auto& array = binary_array.field.get(); + ASSERT_EQ(array.size(), 2); + EXPECT_EQ(varbinary_field_bytes(array[0]), std::string("\xc3\x28", 2)); + EXPECT_TRUE(array[1].is_null()); +} + +TEST(ParquetVariantReaderTest, DecodeResidualNonFiniteDoubleArrayToVariantMap) { + auto metadata = make_metadata({}); + std::vector value {0x03, // array + 0x02, // two elements + 0x00, 0x09, 0x12, // element value offsets + 0x1c}; // double primitive + append_int64_le(&value, static_cast(0x7ff8000000000000ULL)); + value.push_back(0x1c); // double primitive + append_int64_le(&value, static_cast(0x7ff0000000000000ULL)); + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + const auto& double_array = values.at(PathInData()); + EXPECT_EQ(double_array.base_scalar_type_id, TYPE_DOUBLE); + EXPECT_EQ(double_array.num_dimensions, 1); + const auto& array = 
double_array.field.get(); + ASSERT_EQ(array.size(), 2); + EXPECT_TRUE(std::isnan(array[0].get())); + EXPECT_TRUE(std::isinf(array[1].get())); +} + +TEST(ParquetVariantReaderTest, DecodeResidualBinaryObjectArrayToVariantMap) { + auto metadata = make_metadata({"b"}); + std::vector value {0x03, // array + 0x01, // one element + 0x00, 0x0c, // element value offsets + 0x02, // object + 0x01, // one field + 0x00, // dictionary id 0: b + 0x00, 0x07, // field value offsets + 0x3c, 0x02, 0x00, 0x00, 0x00, // binary primitive, 2 bytes + 0xc3, 0x28}; + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + const auto& object_array = values.at(PathInData()); + EXPECT_EQ(object_array.base_scalar_type_id, TYPE_VARIANT); + EXPECT_EQ(object_array.num_dimensions, 1); + const auto& array = object_array.field.get(); + ASSERT_EQ(array.size(), 1); + ASSERT_EQ(array[0].get_type(), TYPE_VARIANT); + const auto& object = array[0].get(); + const auto& binary = object.at(PathInData("b")); + EXPECT_EQ(binary.base_scalar_type_id, TYPE_VARBINARY); + EXPECT_EQ(varbinary_field_bytes(binary.field), std::string("\xc3\x28", 2)); +} + +TEST(ParquetVariantReaderTest, DecodeObjectOutOfOrderPhysicalValuesToVariantMap) { + auto metadata = make_metadata({"a", "b", "c"}); + std::vector value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x03, // three fields + 0x00, 0x01, 0x02, // dictionary ids: a, b, c + 0x04, 0x02, 0x00, 0x06, // field offsets in key order; values are c, b, a + 0x0c, 0x03, // c: int8(3) + 0x0c, 0x02, // b: int8(2) + 0x0c, 0x01 // a: int8(1) + }; + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + Field result = 
Field::create_field(std::move(values)); + EXPECT_EQ("{\"a\":1,\"b\":2,\"c\":3}", serialize_variant_field(result)); +} + +TEST(ParquetVariantReaderTest, DecodeResidualNullToVariantMap) { + auto metadata = make_metadata({}); + std::vector value {0x00}; // variant null + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + auto root = values.find(PathInData()); + ASSERT_NE(root, values.end()); + EXPECT_TRUE(root->second.field.is_null()); +} + +TEST(ParquetVariantReaderTest, DecodeResidualObjectNullChildToVariantMap) { + auto metadata = make_metadata({"a"}); + std::vector value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x01, // one field + 0x00, // dictionary id 0: a + 0x00, 0x01, // field value offsets + 0x00 // variant null + }; + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + auto child = values.find(PathInData("a")); + ASSERT_NE(child, values.end()); + EXPECT_TRUE(child->second.field.is_null()); +} + +TEST(ParquetVariantReaderTest, DecodeNonFiniteDoublePrimitive) { + auto metadata = make_metadata({}); + std::vector value {0x1c}; // primitive double + append_int64_le(&value, static_cast(0x7ff8000000000000ULL)); + + VariantMap values; + std::deque string_values; + Status st = decode_variant_to_variant_map(bytes_ref(metadata), bytes_ref(value), PathInData(), + &values, &string_values); + ASSERT_TRUE(st.ok()) << st.to_string(); + auto root = values.find(PathInData()); + ASSERT_NE(root, values.end()); + ASSERT_EQ(root->second.field.get_type(), TYPE_DOUBLE); + EXPECT_TRUE(std::isnan(root->second.field.get())); + + std::string json; + st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), 
&json);
+    EXPECT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(json.empty());
+}
+
+TEST(ParquetVariantReaderTest, DecodeNanosecondTimestampAsMicros) {
+    auto metadata = make_metadata({});
+    std::vector<uint8_t> value {0x48}; // primitive timestamptz nanos
+    append_int64_le(&value, 1234567890);
+
+    std::string json;
+    Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_EQ("1234567", json);
+}
+
+TEST(ParquetVariantReaderTest, TimeV2ConverterRequiresVariantContext) {
+    FieldSchema time_field;
+    time_field.name = "timestamp";
+    time_field.lower_case_name = time_field.name;
+    time_field.physical_type = tparquet::Type::INT64;
+    time_field.parquet_schema.__set_name(time_field.name);
+    time_field.parquet_schema.__set_type(tparquet::Type::INT64);
+    time_field.parquet_schema.__set_converted_type(tparquet::ConvertedType::TIME_MICROS);
+    time_field.data_type = make_nullable(std::make_shared<DataTypeTimeV2>(6));
+
+    auto converter = PhysicalToLogicalConverter::get_converter(
+            &time_field, time_field.data_type, time_field.data_type, nullptr, false);
+    EXPECT_FALSE(converter->support());
+
+    time_field.is_in_variant = true;
+    converter = PhysicalToLogicalConverter::get_converter(&time_field, time_field.data_type,
+                                                          time_field.data_type, nullptr, false);
+    EXPECT_TRUE(converter->support());
+
+    auto physical_column = ColumnInt64::create();
+    physical_column->insert_value(3723004005);
+    ColumnPtr physical = std::move(physical_column);
+    ColumnPtr logical = time_field.data_type->create_column();
+    Status st = converter->convert(physical, time_field.data_type, time_field.data_type, logical,
+                                   false);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    const auto& nullable = assert_cast<const ColumnNullable&>(*logical);
+    const auto& time_column = assert_cast<const ColumnFloat64&>(nullable.get_nested_column());
+    ASSERT_EQ(1, time_column.size());
+    EXPECT_DOUBLE_EQ(3723004005, time_column.get_data()[0]);
+}
+
+TEST(ParquetVariantReaderTest,
DirectTypedOnlyKeepsStructuralNameUserKeys) {
+    auto int_type = make_nullable(std::make_shared<DataTypeInt32>());
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = "typed_value";
+    typed_value_field.children = {make_int32_field_schema("typed_value"),
+                                  make_int32_field_schema("value")};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {int_type, int_type}, Strings {"typed_value", "value"}));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    Struct row;
+    row.push_back(Field::create_field<TYPE_INT>(42));
+    row.push_back(Field::create_field<TYPE_INT>(7));
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(row));
+
+    auto batch = ColumnVariant::create(0, false, 2);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 1, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field result;
+    batch_variant->get(1, result);
+    const auto& values = result.get<VariantMap>();
+    EXPECT_EQ(values.find(PathInData()), values.end());
+    EXPECT_EQ(values.at(PathInData("typed_value")).field.get<Int64>(), 42);
+    EXPECT_EQ(values.at(PathInData("value")).field.get<Int64>(), 7);
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyKeepsNestedStructuralNameUserKeys) {
+    auto int_type = make_nullable(std::make_shared<DataTypeInt32>());
+    FieldSchema nested_field;
+    nested_field.name = "nested";
+    nested_field.lower_case_name = nested_field.name;
+    nested_field.children = {make_int32_field_schema("typed_value"),
+                             make_int32_field_schema("value")};
+    nested_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {int_type, int_type}, Strings {"typed_value", "value"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {nested_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {nested_field.data_type}, Strings {"nested"}));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    Struct nested;
+    nested.push_back(Field::create_field<TYPE_INT>(42));
+    nested.push_back(Field::create_field<TYPE_INT>(7));
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRUCT>(nested));
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(row));
+
+    auto batch = ColumnVariant::create(0, false, 2);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 1, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field result;
+    batch_variant->get(1, result);
+    const auto& values = result.get<VariantMap>();
+    EXPECT_EQ(values.find(PathInData("nested")), values.end());
+    EXPECT_EQ(values.at(PathInData("nested.typed_value")).field.get<Int64>(), 42);
+    EXPECT_EQ(values.at(PathInData("nested.value")).field.get<Int64>(), 7);
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyConvertsTemporalLeavesToVariantMicros) {
+    FieldSchema date_field = make_datev2_field_schema("d");
+    FieldSchema time_field = make_timev2_field_schema("t");
+    FieldSchema timestamp_field = make_datetimev2_field_schema("ts");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {date_field, time_field, timestamp_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {date_field.data_type, time_field.data_type, timestamp_field.data_type},
+            Strings {"d", "t", "ts"}));
+
+    DateV2Value<DateV2ValueType> date;
+    std::string date_text = "1970-01-03";
+    std::string date_format = "%Y-%m-%d";
+    ASSERT_TRUE(date.from_date_format_str(date_format.data(),
date_format.size(), date_text.data(),
+                                          date_text.size()));
+
+    DateV2Value<DateTimeV2ValueType> timestamp;
+    std::string timestamp_text = "1970-01-01 00:00:01.000002";
+    std::string timestamp_format = "%Y-%m-%d %H:%i:%s.%f";
+    ASSERT_TRUE(timestamp.from_date_format_str(timestamp_format.data(), timestamp_format.size(),
+                                               timestamp_text.data(), timestamp_text.size()));
+    int64_t timestamp_seconds = 0;
+    timestamp.unix_timestamp(&timestamp_seconds, cctz::utc_time_zone());
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_DATEV2>(date));
+    row.push_back(Field::create_field<TYPE_TIMEV2>(3723004005.0));
+    row.push_back(Field::create_field<TYPE_DATETIMEV2>(timestamp));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(row));
+
+    auto batch = ColumnVariant::create(0, false, 2);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 1, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field result;
+    batch_variant->get(1, result);
+    const auto& values = result.get<VariantMap>();
+    EXPECT_EQ(values.at(PathInData("d")).field.get<Int64>(), 2);
+    EXPECT_EQ(values.at(PathInData("t")).field.get<Int64>(), 3723004005);
+    EXPECT_EQ(values.at(PathInData("ts")).field.get<Int64>(),
+              timestamp_seconds * 1000000 + timestamp.microsecond());
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesTemporalLeafNull) {
+    FieldSchema date_field = make_datev2_field_schema("d");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {date_field};
+    typed_value_field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {date_field.data_type}, Strings {"d"}));
+
+    Struct row;
+    row.push_back(Field());
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(row));
+
+    auto batch = ColumnVariant::create(0, false, 2);
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 1, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    const auto* date_subcolumn = batch_variant->get_subcolumn(PathInData("d"));
+    ASSERT_NE(date_subcolumn, nullptr);
+    EXPECT_TRUE(date_subcolumn->is_null_at(1));
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyConvertsTemporalArrayLeavesToVariantMicros) {
+    FieldSchema element = make_timev2_field_schema("element");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    Array array;
+    array.push_back(Field::create_field<TYPE_TIMEV2>(3723004005.0));
+    array.push_back(Field());
+    typed_value_column->insert(Field::create_field<TYPE_ARRAY>(array));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field null_result;
+    batch_variant->get(1, null_result);
+    EXPECT_TRUE(null_result.is_null());
+    const auto* root_subcolumn = batch_variant->get_subcolumn(PathInData());
+    ASSERT_NE(root_subcolumn, nullptr);
+    EXPECT_TRUE(root_subcolumn->is_null_at(1));
+
+    Field present_result;
+    batch_variant->get(2, present_result);
+    const auto& values =
present_result.get<VariantMap>();
+    auto array_value = values.find(PathInData());
+    ASSERT_NE(array_value, values.end());
+    EXPECT_EQ(array_value->second.base_scalar_type_id, TYPE_BIGINT);
+    EXPECT_EQ(array_value->second.num_dimensions, 1);
+    const auto& result_array = array_value->second.field.get<Array>();
+    ASSERT_EQ(result_array.size(), 2);
+    EXPECT_EQ(result_array[0].get<Int64>(), 3723004005);
+    EXPECT_TRUE(result_array[1].is_null());
+}
+
+TEST(ParquetVariantReaderTest, TypedOnlyKeepsUserMetadataAndValueFields) {
+    FieldSchema object_field;
+    object_field.name = "obj";
+    object_field.lower_case_name = object_field.name;
+    object_field.children = {make_binary_field_schema("metadata", true),
+                             make_binary_field_schema("value", true)};
+    object_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {object_field.children[0].data_type, object_field.children[1].data_type},
+            Strings {"metadata", "value"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {object_field};
+    typed_value_field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {object_field.data_type}, Strings {"obj"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Struct object;
+    object.push_back(Field::create_field<TYPE_STRING>(String("user-metadata")));
+    object.push_back(Field::create_field<TYPE_STRING>(String("\0", 1)));
+    Struct typed_value;
+    typed_value.push_back(Field::create_field<TYPE_STRUCT>(object));
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    EXPECT_EQ(values.at(PathInData("obj.metadata")).field.get<String>(), "user-metadata");
+    EXPECT_EQ(values.at(PathInData("obj.value")).field.get<String>(), std::string("\0", 1));
+}
+
+TEST(ParquetVariantReaderTest, TypedOnlyKeepsUserValueOnlyField) {
+    FieldSchema object_field;
+    object_field.name = "obj";
+    object_field.lower_case_name = object_field.name;
+    object_field.children = {make_string_field_schema("value", true)};
+    object_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {object_field.children[0].data_type}, Strings {"value"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {object_field};
+    typed_value_field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {object_field.data_type}, Strings {"obj"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Struct object;
+    object.push_back(Field::create_field<TYPE_STRING>(String("\0", 1)));
+    Struct typed_value;
+    typed_value.push_back(Field::create_field<TYPE_STRUCT>(object));
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values =
result.get<VariantMap>();
+    EXPECT_EQ(values.at(PathInData("obj.value")).field.get<String>(), std::string("\0", 1));
+}
+
+TEST(ParquetVariantReaderTest, TypedOnlyKeepsAnnotatedValueAndTypedValueUserFields) {
+    auto int_type = make_nullable(std::make_shared<DataTypeInt32>());
+    FieldSchema nested_typed_value;
+    nested_typed_value.name = "typed_value";
+    nested_typed_value.lower_case_name = nested_typed_value.name;
+    nested_typed_value.children = {make_int32_field_schema("x")};
+    nested_typed_value.data_type =
+            make_nullable(std::make_shared<DataTypeStruct>(DataTypes {int_type}, Strings {"x"}));
+
+    FieldSchema object_field;
+    object_field.name = "obj";
+    object_field.lower_case_name = object_field.name;
+    object_field.children = {make_string_field_schema("value", true), nested_typed_value};
+    object_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {object_field.children[0].data_type, nested_typed_value.data_type},
+            Strings {"value", "typed_value"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {object_field};
+    typed_value_field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {object_field.data_type}, Strings {"obj"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Struct nested;
+    nested.push_back(Field::create_field<TYPE_INT>(42));
+    Struct object;
+    object.push_back(Field::create_field<TYPE_STRING>(String("abc")));
+    object.push_back(Field::create_field<TYPE_STRUCT>(nested));
+    Struct typed_value;
+    typed_value.push_back(Field::create_field<TYPE_STRUCT>(object));
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    EXPECT_EQ(values.at(PathInData("obj.value")).field.get<String>(), "abc");
+    EXPECT_EQ(values.at(PathInData("obj.typed_value.x")).field.get<Int64>(), 42);
+}
+
+TEST(ParquetVariantReaderTest, TypedOnlyKeepsUserMetadataAndTypedValueFields) {
+    auto int_type = make_nullable(std::make_shared<DataTypeInt32>());
+    FieldSchema nested_typed_value;
+    nested_typed_value.name = "typed_value";
+    nested_typed_value.lower_case_name = nested_typed_value.name;
+    nested_typed_value.children = {make_int32_field_schema("x")};
+    nested_typed_value.data_type =
+            make_nullable(std::make_shared<DataTypeStruct>(DataTypes {int_type}, Strings {"x"}));
+
+    FieldSchema object_field;
+    object_field.name = "obj";
+    object_field.lower_case_name = object_field.name;
+    object_field.children = {make_string_field_schema("metadata", true), nested_typed_value};
+    object_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {object_field.children[0].data_type, nested_typed_value.data_type},
+            Strings {"metadata", "typed_value"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {object_field};
+    typed_value_field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {object_field.data_type}, Strings {"obj"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Struct nested;
+    nested.push_back(Field::create_field<TYPE_INT>(42));
+    Struct object;
+    object.push_back(Field::create_field<TYPE_STRING>(String("user-metadata")));
+    object.push_back(Field::create_field<TYPE_STRUCT>(nested));
+    Struct typed_value;
+    typed_value.push_back(Field::create_field<TYPE_STRUCT>(object));
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    EXPECT_EQ(values.at(PathInData("obj.metadata")).field.get<String>(), "user-metadata");
+    EXPECT_EQ(values.at(PathInData("obj.typed_value.x")).field.get<Int64>(), 42);
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyRequiresSelectedTypedLeaf) {
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {make_int64_field_schema("metric")};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {typed_value_field.children[0].data_type}, Strings {"metric"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    uint64_t next_id = 1;
+    variant_field.assign_ids(next_id);
+    const auto& typed_value = variant_field.children[1];
+    const auto& metric = typed_value.children[0];
+
+    std::set<uint64_t> missing_path_ids {variant_field.get_column_id(),
+                                         variant_field.children[0].get_column_id()};
+    EXPECT_FALSE(parquet_variant_reader_test::can_use_direct_typed_only_value_for_test(
+            variant_field, missing_path_ids));
+
+    std::set<uint64_t> typed_root_only_ids {typed_value.get_column_id()};
+    EXPECT_FALSE(parquet_variant_reader_test::can_use_direct_typed_only_value_for_test(
+            variant_field, typed_root_only_ids));
+
+    std::set<uint64_t> metric_ids {metric.get_column_id()};
+    EXPECT_TRUE(parquet_variant_reader_test::can_use_direct_typed_only_value_for_test(variant_field,
+                                                                                      metric_ids));
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyAllowsUnselectedTopLevelResidualValue) {
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {make_int64_field_schema("metric")};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {typed_value_field.children[0].data_type}, Strings {"metric"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false),
+                              make_binary_field_schema("value", true), typed_value_field};
+
+    uint64_t next_id = 1;
+    variant_field.assign_ids(next_id);
+    const auto& value = variant_field.children[1];
+    const auto& metric = variant_field.children[2].children[0];
+
+    std::set<uint64_t> metric_ids {metric.get_column_id()};
+    EXPECT_TRUE(parquet_variant_reader_test::can_use_direct_typed_only_value_for_test(variant_field,
+                                                                                      metric_ids));
+
+    std::set<uint64_t> metric_with_residual_ids {value.get_column_id(), metric.get_column_id()};
+    EXPECT_FALSE(parquet_variant_reader_test::can_use_direct_typed_only_value_for_test(
+            variant_field, metric_with_residual_ids));
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyReaderCountersUseNativePath) {
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {make_int64_field_schema("metric")};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes
{typed_value_field.children[0].data_type}, Strings {"metric"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    uint64_t next_id = 1;
+    variant_field.assign_ids(next_id);
+    const auto& metric = variant_field.children[1].children[0];
+
+    auto variant_struct_type = std::make_shared<DataTypeStruct>(
+            DataTypes {variant_field.children[0].data_type, typed_value_field.data_type},
+            Strings {"metadata", "typed_value"});
+    MutableColumnPtr struct_column = variant_struct_type->create_column();
+    for (int64_t metric_value : {7, 11}) {
+        Struct typed_value;
+        typed_value.push_back(Field::create_field<TYPE_BIGINT>(metric_value));
+        Struct row;
+        row.push_back(Field::create_field<TYPE_STRING>(String("")));
+        row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+        struct_column->insert(Field::create_field<TYPE_STRUCT>(row));
+    }
+
+    ColumnPtr output = ColumnVariant::create(0, false);
+    int64_t direct_rows = 0;
+    int64_t rowwise_rows = 0;
+    Status st = parquet_variant_reader_test::read_variant_rows_for_test(
+            variant_field, *struct_column, {metric.get_column_id()}, output, &direct_rows,
+            &rowwise_rows);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_EQ(2, direct_rows);
+    EXPECT_EQ(0, rowwise_rows);
+    ASSERT_EQ(2, output->size());
+
+    Field first;
+    output->get(0, first);
+    const auto& first_values = first.get<VariantMap>();
+    EXPECT_EQ(first_values.at(PathInData("metric")).field.get<Int64>(), 7);
+}
+
+TEST(ParquetVariantReaderTest, VariantReaderCountersUseRowWiseWhenResidualValueSelected) {
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {make_int64_field_schema("metric")};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {typed_value_field.children[0].data_type}, Strings {"metric"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    variant_field.children = {make_binary_field_schema("metadata", false),
+                              make_binary_field_schema("value", true), typed_value_field};
+
+    uint64_t next_id = 1;
+    variant_field.assign_ids(next_id);
+    const auto& value = variant_field.children[1];
+    const auto& metric = variant_field.children[2].children[0];
+
+    auto variant_struct_type = std::make_shared<DataTypeStruct>(
+            DataTypes {variant_field.children[0].data_type, value.data_type,
+                       typed_value_field.data_type},
+            Strings {"metadata", "value", "typed_value"});
+    MutableColumnPtr struct_column = variant_struct_type->create_column();
+    for (int64_t metric_value : {7, 11}) {
+        Struct typed_value;
+        typed_value.push_back(Field::create_field<TYPE_BIGINT>(metric_value));
+        Struct row;
+        row.push_back(Field::create_field<TYPE_STRING>(String("")));
+        row.push_back(Field());
+        row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+        struct_column->insert(Field::create_field<TYPE_STRUCT>(row));
+    }
+
+    ColumnPtr output = ColumnVariant::create(0, false);
+    int64_t direct_rows = 0;
+    int64_t rowwise_rows = 0;
+    Status st = parquet_variant_reader_test::read_variant_rows_for_test(
+            variant_field, *struct_column, {value.get_column_id(), metric.get_column_id()}, output,
+            &direct_rows, &rowwise_rows);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_EQ(0, direct_rows);
+    EXPECT_EQ(2, rowwise_rows);
+    ASSERT_EQ(2, output->size());
+
+    Field second;
+    output->get(1, second);
+    const auto& second_values = second.get<VariantMap>();
+    EXPECT_EQ(second_values.at(PathInData("metric")).field.get<Int64>(), 11);
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesNullableTypedStructNull) {
+    FieldSchema metric_field = make_required_int64_field_schema("metric");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children =
{metric_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {metric_field.data_type}, Strings {"metric"}));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    Struct typed_value;
+    typed_value.push_back(Field::create_field<TYPE_BIGINT>(7));
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field null_result;
+    batch_variant->get(1, null_result);
+    EXPECT_TRUE(null_result.is_null());
+
+    Field present_result;
+    batch_variant->get(2, present_result);
+    const auto& values = present_result.get<VariantMap>();
+    EXPECT_EQ(values.at(PathInData("metric")).field.get<Int64>(), 7);
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesEmptyTypedObject) {
+    FieldSchema metric_field = make_int64_field_schema("metric");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {metric_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {metric_field.data_type}, Strings {"metric"}));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    Struct empty_object;
+    empty_object.push_back(Field());
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(empty_object));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field null_result;
+    batch_variant->get(1, null_result);
+    EXPECT_TRUE(null_result.is_null());
+
+    Field empty_result;
+    batch_variant->get(2, empty_result);
+    EXPECT_FALSE(empty_result.is_null());
+
+    std::string json;
+    DataTypeSerDe::FormatOptions options;
+    batch_variant->serialize_one_row_to_string(2, &json, options);
+    EXPECT_EQ(json, "{}");
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesNestedEmptyTypedObject) {
+    FieldSchema metric_field = make_int64_field_schema("metric");
+
+    FieldSchema nested_field;
+    nested_field.name = "nested";
+    nested_field.lower_case_name = nested_field.name;
+    nested_field.children = {metric_field};
+    nested_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {metric_field.data_type}, Strings {"metric"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {nested_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {nested_field.data_type}, Strings {"nested"}));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    Struct nested_object;
+    nested_object.push_back(Field());
+    Struct typed_value;
+    typed_value.push_back(Field::create_field<TYPE_STRUCT>(nested_object));
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    std::string json;
+    DataTypeSerDe::FormatOptions options;
+    batch_variant->serialize_one_row_to_string(2, &json, options);
+    EXPECT_EQ(json, "{\"nested\":{}}");
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesVarbinaryLeaf) {
+    FieldSchema payload_field = make_varbinary_field_schema("payload");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {payload_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {payload_field.data_type}, Strings {"payload"}));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    Struct typed_value;
+    typed_value.push_back(make_varbinary_field({0xff, 0x00, 0x41}));
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field null_result;
+    batch_variant->get(1, null_result);
+    EXPECT_TRUE(null_result.is_null());
+
+    Field present_result;
+    batch_variant->get(2, present_result);
+    const auto& values = present_result.get<VariantMap>();
+    const auto& payload = values.at(PathInData("payload"));
+    EXPECT_EQ(payload.base_scalar_type_id, TYPE_VARBINARY);
+    EXPECT_EQ(varbinary_field_bytes(payload.field), std::string("\xff\x00\x41", 3));
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesFloatingPointLeaves) {
+    FieldSchema float_field = make_float_field_schema("f");
+    FieldSchema double_field = make_double_field_schema("d");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {float_field, double_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {float_field.data_type, double_field.data_type}, Strings {"f", "d"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+    uint64_t next_id = 1;
+    variant_field.assign_ids(next_id);
+    std::set<uint64_t> typed_leaf_ids {variant_field.children[1].children[0].get_column_id(),
+                                       variant_field.children[1].children[1].get_column_id()};
+    EXPECT_TRUE(parquet_variant_reader_test::can_use_direct_typed_only_value_for_test(
+            variant_field, typed_leaf_ids));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    Struct typed_value;
+    typed_value.push_back(Field::create_field<TYPE_FLOAT>(1.25F));
+    typed_value.push_back(Field::create_field<TYPE_DOUBLE>(2.5));
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field null_result;
+    batch_variant->get(1, null_result);
+    EXPECT_TRUE(null_result.is_null());
+
+    Field present_result;
+    batch_variant->get(2, present_result);
+    const auto& values = present_result.get<VariantMap>();
+    const auto& float_value = values.at(PathInData("f"));
+    EXPECT_EQ(float_value.base_scalar_type_id, TYPE_FLOAT);
+    EXPECT_FLOAT_EQ(float_value.field.get<Float32>(), 1.25F);
+    const auto& double_value = values.at(PathInData("d"));
+    EXPECT_EQ(double_value.base_scalar_type_id, TYPE_DOUBLE);
+    EXPECT_DOUBLE_EQ(double_value.field.get<Float64>(), 2.5);
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesNonFiniteFloatingPointLeaf) {
+    FieldSchema nan_field = make_double_field_schema("nan");
+    FieldSchema inf_field = make_double_field_schema("inf");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {nan_field, inf_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {nan_field.data_type, inf_field.data_type}, Strings {"nan", "inf"}));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    Struct typed_value;
+    typed_value.push_back(
+            Field::create_field<TYPE_DOUBLE>(std::numeric_limits<double>::quiet_NaN()));
+    typed_value.push_back(
+            Field::create_field<TYPE_DOUBLE>(std::numeric_limits<double>::infinity()));
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    auto batch = ColumnVariant::create(0, false, 2);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 1, batch.get());
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field present_result;
+    batch->get(1, present_result);
+    const auto& values = present_result.get<VariantMap>();
+    EXPECT_TRUE(std::isnan(values.at(PathInData("nan")).field.get<Float64>()));
+    EXPECT_TRUE(std::isinf(values.at(PathInData("inf")).field.get<Float64>()));
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesFloatingPointArrayLeaf) {
+    FieldSchema element = make_double_field_schema("element");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type =
+            make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    Array array;
+    array.push_back(Field::create_field<TYPE_DOUBLE>(1.5));
+    array.push_back(Field());
+    array.push_back(Field::create_field<TYPE_DOUBLE>(2.25));
+    typed_value_column->insert(Field::create_field<TYPE_ARRAY>(array));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch.get());
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    const auto* root_subcolumn = batch->get_subcolumn(PathInData());
+    ASSERT_NE(root_subcolumn, nullptr);
+    EXPECT_TRUE(root_subcolumn->is_null_at(1));
+
+    Field present_result;
+    batch->get(2, present_result);
+    const auto& values = present_result.get<VariantMap>();
+    auto array_value = values.find(PathInData());
+    ASSERT_NE(array_value, values.end());
+    EXPECT_EQ(array_value->second.base_scalar_type_id, TYPE_DOUBLE);
+    EXPECT_EQ(array_value->second.num_dimensions, 1);
+    const auto& result_array = array_value->second.field.get<Array>();
+    ASSERT_EQ(result_array.size(), 3);
+    EXPECT_DOUBLE_EQ(result_array[0].get<Float64>(), 1.5);
+    EXPECT_TRUE(result_array[1].is_null());
+    EXPECT_DOUBLE_EQ(result_array[2].get<Float64>(), 2.25);
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesNonFiniteFloatingPointArrayLeaf) {
+    FieldSchema element = make_double_field_schema("element");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    Array array;
+    array.push_back(Field::create_field<TYPE_DOUBLE>(std::numeric_limits<double>::quiet_NaN()));
+    typed_value_column->insert(Field::create_field<TYPE_ARRAY>(array));
+
+    auto batch = ColumnVariant::create(0, false, 2);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 1, batch.get());
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field present_result;
+    batch->get(1, present_result);
+    const auto& values = present_result.get<VariantMap>();
+    auto array_value = values.find(PathInData());
+    ASSERT_NE(array_value, values.end());
+    const auto& result_array = array_value->second.field.get<Array>();
+    ASSERT_EQ(result_array.size(), 1);
+    EXPECT_TRUE(std::isnan(result_array[0].get<Float64>()));
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesUuidSemantics) {
+    FieldSchema uuid_field = make_uuid_field_schema("u");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {uuid_field};
+    typed_value_field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {uuid_field.data_type}, Strings {"u"}));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    std::string uuid_bytes = test_uuid_bytes();
+    Struct typed_value;
+    typed_value.push_back(make_varbinary_field(uuid_bytes));
+    typed_value_column->insert(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field present_result;
+    batch_variant->get(2, present_result);
+    const auto& values = present_result.get<VariantMap>();
+    const auto& uuid = values.at(PathInData("u"));
+    EXPECT_EQ(uuid.base_scalar_type_id, TYPE_STRING);
+    EXPECT_EQ(uuid.field.get<String>(), "00010203-0405-0607-0809-0a0b0c0d0e0f");
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesTypedUuidSemantics) {
+    FieldSchema uuid_field = make_uuid_field_schema("u");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {uuid_field};
+    typed_value_field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {uuid_field.data_type}, Strings {"u"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    std::string uuid_bytes = test_uuid_bytes();
+    Struct typed_value;
+    typed_value.push_back(make_varbinary_field(uuid_bytes));
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    const auto& uuid = values.at(PathInData("u"));
+    EXPECT_EQ(uuid.base_scalar_type_id, TYPE_STRING);
+    EXPECT_EQ(uuid.field.get<String>(), "00010203-0405-0607-0809-0a0b0c0d0e0f");
+}
+
+TEST(ParquetVariantReaderTest, DirectTypedOnlyPreservesTypedUuidArraySemantics) {
+    FieldSchema element = make_uuid_field_schema("element");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    MutableColumnPtr typed_value_column = typed_value_field.data_type->create_column();
+    typed_value_column->insert(Field());
+
+    std::string uuid_bytes = test_uuid_bytes();
+    Array array;
+    array.push_back(make_varbinary_field(uuid_bytes));
+    array.push_back(Field());
+    typed_value_column->insert(Field::create_field<TYPE_ARRAY>(array));
+
+    auto batch = ColumnVariant::create(0, false, 3);
+    ASSERT_TRUE(
+            parquet_variant_reader_test::can_direct_read_typed_value_for_test(typed_value_field));
+    auto* batch_variant = batch.get();
+    Status st = parquet_variant_reader_test::append_direct_typed_column_to_batch_for_test(
+            typed_value_field, *typed_value_column, 0, 2, batch_variant);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+
+    Field null_result;
+    batch_variant->get(1, null_result);
+    EXPECT_TRUE(null_result.is_null());
+
+    Field present_result;
+    batch_variant->get(2, present_result);
+    const auto& values = present_result.get<VariantMap>();
+    auto array_value = values.find(PathInData());
+    ASSERT_NE(array_value, values.end());
+    EXPECT_EQ(array_value->second.base_scalar_type_id, TYPE_STRING);
+    EXPECT_EQ(array_value->second.num_dimensions, 1);
+    const auto& result_array = array_value->second.field.get<Array>();
+    ASSERT_EQ(result_array.size(), 2);
+    EXPECT_EQ(result_array[0].get<String>(), "00010203-0405-0607-0809-0a0b0c0d0e0f");
+    EXPECT_TRUE(result_array[1].is_null());
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesTypedUuidArraySemantics) {
+    FieldSchema element = make_uuid_field_schema("element");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    std::string uuid_bytes = test_uuid_bytes();
+    Array array;
+    array.push_back(make_varbinary_field(uuid_bytes));
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_ARRAY>(array));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    auto array_value = values.find(PathInData());
+    ASSERT_NE(array_value, values.end());
+    EXPECT_EQ(array_value->second.base_scalar_type_id, TYPE_STRING);
+    EXPECT_EQ(array_value->second.num_dimensions, 1);
+    const auto& result_array = array_value->second.field.get<Array>();
+    ASSERT_EQ(result_array.size(), 1);
+    EXPECT_EQ(result_array[0].get<String>(), "00010203-0405-0607-0809-0a0b0c0d0e0f");
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesExplicitVariantNullShreddedArrayElement) {
+    FieldSchema element;
+    element.name = "element";
+    element.lower_case_name = element.name;
+    element.children = {make_binary_field_schema("value", true),
+                        make_int64_field_schema("typed_value")};
+    element.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {element.children[0].data_type, element.children[1].data_type},
+            Strings {"value", "typed_value"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false),
+                              make_binary_field_schema("value", true), typed_value_field};
+
+    auto metadata = make_metadata({});
+    std::vector<uint8_t> variant_null {0x00};
+    Struct element_row;
+    element_row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(variant_null.data()), variant_null.size())));
+    element_row.push_back(Field());
+    Array array;
+    array.push_back(Field::create_field<TYPE_STRUCT>(element_row));
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field());
+    row.push_back(Field::create_field<TYPE_ARRAY>(array));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    auto array_value = values.find(PathInData());
+    ASSERT_NE(array_value, values.end());
+    EXPECT_EQ("[null]", serialize_variant_field(result));
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesNullComplexTypedArrayElement) {
+    FieldSchema payload_field = make_int64_field_schema("payload");
+
+    FieldSchema element;
+    element.name = "element";
+    element.lower_case_name = element.name;
+    element.children = {payload_field};
+    element.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {payload_field.data_type}, Strings {"payload"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Struct element_value;
+    element_value.push_back(Field::create_field<TYPE_BIGINT>(7));
+    Array array;
+    array.push_back(Field());
+    array.push_back(Field::create_field<TYPE_STRUCT>(element_value));
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_ARRAY>(array));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    const auto& object_array = values.at(PathInData());
+    EXPECT_EQ(object_array.base_scalar_type_id, TYPE_VARIANT);
+    EXPECT_EQ(object_array.num_dimensions, 1);
+    const auto& result_array = object_array.field.get<Array>();
+    ASSERT_EQ(result_array.size(), 2);
+    EXPECT_TRUE(result_array[0].is_null());
+    ASSERT_EQ(result_array[1].get_type(), TYPE_VARIANT);
+    const auto& object = result_array[1].get<VariantMap>();
+    const auto& payload = object.at(PathInData("payload"));
+    EXPECT_EQ(payload.base_scalar_type_id, TYPE_BIGINT);
+    EXPECT_EQ(payload.field.get<Int64>(), 7);
+}
+
+TEST(ParquetVariantReaderTest, RowWiseRejectsMissingShreddedArrayElement) {
+    FieldSchema element;
+    element.name = "element";
+    element.lower_case_name = element.name;
+    element.children = {make_binary_field_schema("value", true),
+                        make_int64_field_schema("typed_value")};
+    element.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {element.children[0].data_type, element.children[1].data_type},
+            Strings {"value", "typed_value"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false),
+                              make_binary_field_schema("value", true), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Struct element_row;
+    element_row.push_back(Field());
+    element_row.push_back(Field());
+    Array array;
+    array.push_back(Field::create_field<TYPE_STRUCT>(element_row));
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field());
+    row.push_back(Field::create_field<TYPE_ARRAY>(array));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    EXPECT_TRUE(st.is<ErrorCode::INVALID_ARGUMENT>()) << st.to_string();
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesTypedDecimalArrayMetadata) {
+    FieldSchema element;
+    element.name = "element";
+    element.lower_case_name = element.name;
+    element.physical_type = tparquet::Type::INT64;
+    element.data_type = make_nullable(std::make_shared<DataTypeDecimal64>(18, 2));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false),
+                              make_binary_field_schema("value", true), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Array array;
+    array.push_back(Field::create_field<TYPE_DECIMAL64>(Decimal64(12345)));
+    array.push_back(Field::create_field<TYPE_DECIMAL64>(Decimal64(67890)));
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field());
+    row.push_back(Field::create_field<TYPE_ARRAY>(array));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    auto array_value = values.find(PathInData());
+    ASSERT_NE(array_value, values.end());
+    EXPECT_EQ(array_value->second.base_scalar_type_id, TYPE_DECIMAL64);
+    EXPECT_EQ(array_value->second.num_dimensions, 1);
+    EXPECT_EQ(array_value->second.precision, 18);
+    EXPECT_EQ(array_value->second.scale, 2);
+    const auto& result_array = array_value->second.field.get<Array>();
+    ASSERT_EQ(result_array.size(), 2);
+    EXPECT_EQ(result_array[0].get<Decimal64>(), Decimal64(12345));
+    EXPECT_EQ(result_array[1].get<Decimal64>(), Decimal64(67890));
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesTypedVarbinaryObjectField) {
+    FieldSchema payload_field = make_varbinary_field_schema("payload");
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {payload_field};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {payload_field.data_type}, Strings {"payload"}));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Struct typed_value;
+    typed_value.push_back(make_varbinary_field({0xc3, 0x28}));
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    const auto& payload = values.at(PathInData("payload"));
+    EXPECT_EQ(payload.base_scalar_type_id, TYPE_VARBINARY);
+    EXPECT_EQ(varbinary_field_bytes(payload.field), std::string("\xc3\x28", 2));
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesTypedVarbinaryObjectArrayField) {
+    FieldSchema payload_field = make_varbinary_field_schema("payload");
+
+    FieldSchema element;
+    element.name = "element";
+    element.lower_case_name = element.name;
+    element.children = {payload_field};
+    element.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {payload_field.data_type}, Strings {"payload"}));
+
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {element};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field};
+
+    auto metadata = make_metadata({});
+    Struct element_value;
+    element_value.push_back(make_varbinary_field({0xc3, 0x28}));
+    Array array;
+    array.push_back(Field::create_field<TYPE_STRUCT>(element_value));
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_ARRAY>(array));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    const auto& object_array = values.at(PathInData());
+    EXPECT_EQ(object_array.base_scalar_type_id, TYPE_VARIANT);
+    EXPECT_EQ(object_array.num_dimensions, 1);
+    const auto& result_array = object_array.field.get<Array>();
+    ASSERT_EQ(result_array.size(), 1);
+    ASSERT_EQ(result_array[0].get_type(), TYPE_VARIANT);
+    const auto& object = result_array[0].get<VariantMap>();
+    const auto& payload = object.at(PathInData("payload"));
+    EXPECT_EQ(payload.base_scalar_type_id, TYPE_VARBINARY);
+    EXPECT_EQ(varbinary_field_bytes(payload.field), std::string("\xc3\x28", 2));
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesResidualBinaryObjectField) {
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false),
+                              make_binary_field_schema("value", true)};
+
+    auto metadata = make_metadata({"b"});
+    std::vector<uint8_t> residual_value {0x02,                         // object
+                                         0x01,                         // one field
+                                         0x00,                         // dictionary id 0: b
+                                         0x00, 0x07,                   // field value offsets
+                                         0x3c, 0x02, 0x00, 0x00, 0x00, // binary primitive, 2 bytes
+                                         0xc3, 0x28};
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size())));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    const auto& payload = values.at(PathInData("b"));
+    EXPECT_EQ(payload.base_scalar_type_id, TYPE_VARBINARY);
+    EXPECT_EQ(varbinary_field_bytes(payload.field), std::string("\xc3\x28", 2));
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesResidualBinaryArray) {
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false),
+                              make_binary_field_schema("value", true)};
+
+    auto metadata = make_metadata({});
+    std::vector<uint8_t> residual_value {0x03,                         // array
+                                         0x02,                         // two elements
+                                         0x00, 0x07, 0x08,             // element value offsets
+                                         0x3c, 0x02, 0x00, 0x00, 0x00, // binary primitive, 2 bytes
+                                         0xc3, 0x28, 0x00};            // variant null
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size())));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    const auto& binary_array = values.at(PathInData());
+    EXPECT_EQ(binary_array.base_scalar_type_id, TYPE_VARBINARY);
+    EXPECT_EQ(binary_array.num_dimensions, 1);
+    const auto& array = binary_array.field.get<Array>();
+    ASSERT_EQ(array.size(), 2);
+    EXPECT_EQ(varbinary_field_bytes(array[0]), std::string("\xc3\x28", 2));
+    EXPECT_TRUE(array[1].is_null());
+}
+
+TEST(ParquetVariantReaderTest, RowWisePreservesResidualBinaryObjectArray) {
+    FieldSchema variant_field;
+    variant_field.name = "v";
+    variant_field.lower_case_name = variant_field.name;
+    variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false));
+    variant_field.children = {make_binary_field_schema("metadata", false),
+                              make_binary_field_schema("value", true)};
+
+    auto metadata = make_metadata({"b"});
+    std::vector<uint8_t> residual_value {0x03,                         // array
+                                         0x01,                         // one element
+                                         0x00, 0x0c,                   // element value offsets
+                                         0x02,                         // object
+                                         0x01,                         // one field
+                                         0x00,                         // dictionary id 0: b
+                                         0x00, 0x07,                   // field value offsets
+                                         0x3c, 0x02, 0x00, 0x00, 0x00, // binary primitive, 2 bytes
+                                         0xc3, 0x28};
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(metadata.data()), metadata.size())));
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size())));
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+
+    const auto& values = result.get<VariantMap>();
+    const auto& object_array = values.at(PathInData());
+    EXPECT_EQ(object_array.base_scalar_type_id, TYPE_VARIANT);
+    EXPECT_EQ(object_array.num_dimensions, 1);
+    const auto& array = object_array.field.get<Array>();
+    ASSERT_EQ(array.size(), 1);
+    ASSERT_EQ(array[0].get_type(), TYPE_VARIANT);
+    const auto& object = array[0].get<VariantMap>();
+    const auto& binary = object.at(PathInData("b"));
+    EXPECT_EQ(binary.base_scalar_type_id, TYPE_VARBINARY);
+    EXPECT_EQ(varbinary_field_bytes(binary.field), std::string("\xc3\x28", 2));
+}
+
+TEST(ParquetVariantReaderTest, RequiredMissingPayloadIsVariantNull) {
+    FieldSchema variant_field = make_required_shredded_variant_schema();
+
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(String("")));
+    row.push_back(Field());
+    row.push_back(Field());
+
+    Field result;
+    bool sql_null = true;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(
+            variant_field, Field::create_field<TYPE_STRUCT>(row), true, &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_FALSE(sql_null);
+    EXPECT_TRUE(result.is_null());
+}
+
+TEST(ParquetVariantReaderTest, NullableTopLevelGroupIsSqlNull) {
+    FieldSchema variant_field = make_required_shredded_variant_schema();
+
+    Field result;
+    bool sql_null = false;
+    Status st = parquet_variant_reader_test::read_variant_row_for_test(variant_field, Field(), true,
+                                                                       &result, &sql_null);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_TRUE(sql_null);
+}
+
+TEST(ParquetVariantReaderTest, NestedWrapperMergesResidualValueAndTypedValue) {
+    FieldSchema typed_value_field;
+    typed_value_field.name = "typed_value";
+    typed_value_field.lower_case_name = typed_value_field.name;
+    typed_value_field.children = {make_int32_field_schema("metric")};
+    typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {make_nullable(std::make_shared<DataTypeInt32>())}, Strings {"metric"}));
+
+    FieldSchema wrapper_field;
+    wrapper_field.name = "element";
+    wrapper_field.lower_case_name = wrapper_field.name;
+    wrapper_field.children = {make_binary_field_schema("value", true), typed_value_field};
+    wrapper_field.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {wrapper_field.children[0].data_type, typed_value_field.data_type},
+            Strings {"value", "typed_value"}));
+
+    auto metadata = make_metadata({"extra"});
+    std::vector<uint8_t> residual_value {
+            0x02,       // object, 1-byte offsets, 1-byte field ids, 1-byte element count
+            0x01,       // one field
+            0x00,       // dictionary id 0: extra
+            0x00, 0x02, // field value offsets
+            0x0c, 0x07  // int8(7)
+    };
+
+    Struct typed_value;
+    typed_value.push_back(Field::create_field<TYPE_INT>(1));
+    Struct row;
+    row.push_back(Field::create_field<TYPE_STRING>(
+            String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size())));
+    row.push_back(Field::create_field<TYPE_STRUCT>(typed_value));
+
+    std::string json;
+    bool present = false;
+    Status st = parquet_variant_reader_test::variant_to_json_for_test(
+            wrapper_field, Field::create_field<TYPE_STRUCT>(row),
+            std::string(reinterpret_cast<const char*>(metadata.data()), metadata.size()), &json,
+            &present);
+    ASSERT_TRUE(st.ok()) << st.to_string();
+    EXPECT_TRUE(present);
EXPECT_NE(json.find("\"extra\":7"), std::string::npos); + EXPECT_NE(json.find("\"metric\":1"), std::string::npos); +} + +TEST(ParquetVariantReaderTest, NestedWrapperMergesEmptyResidualObjectAndTypedValue) { + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = {make_int32_field_schema("metric")}; + typed_value_field.data_type = make_nullable(std::make_shared( + DataTypes {typed_value_field.children[0].data_type}, Strings {"metric"})); + + FieldSchema wrapper_field; + wrapper_field.name = "element"; + wrapper_field.lower_case_name = wrapper_field.name; + wrapper_field.children = {make_binary_field_schema("value", true), typed_value_field}; + wrapper_field.data_type = make_nullable(std::make_shared( + DataTypes {wrapper_field.children[0].data_type, typed_value_field.data_type}, + Strings {"value", "typed_value"})); + + auto metadata = make_metadata({}); + std::vector residual_value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x00, // zero fields + 0x00 // total field value size + }; + + Struct typed_value; + typed_value.push_back(Field::create_field(1)); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast(residual_value.data()), residual_value.size()))); + row.push_back(Field::create_field(typed_value)); + + std::string json; + bool present = false; + Status st = parquet_variant_reader_test::variant_to_json_for_test( + wrapper_field, Field::create_field(row), + std::string(reinterpret_cast(metadata.data()), metadata.size()), &json, + &present); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_TRUE(present); + EXPECT_EQ("{\"metric\":1}", json); +} + +TEST(ParquetVariantReaderTest, NestedWrapperRejectsResidualTypedKeyCollision) { + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = 
{make_int32_field_schema("metric")}; + typed_value_field.data_type = make_nullable(std::make_shared( + DataTypes {make_nullable(std::make_shared())}, Strings {"metric"})); + + FieldSchema wrapper_field; + wrapper_field.name = "element"; + wrapper_field.lower_case_name = wrapper_field.name; + wrapper_field.children = {make_binary_field_schema("value", true), typed_value_field}; + wrapper_field.data_type = make_nullable(std::make_shared( + DataTypes {wrapper_field.children[0].data_type, typed_value_field.data_type}, + Strings {"value", "typed_value"})); + + auto metadata = make_metadata({"metric"}); + std::vector residual_value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x01, // one field + 0x00, // dictionary id 0: metric + 0x00, 0x02, // field value offsets + 0x0c, 0x02 // int8(2) + }; + + Struct typed_value; + typed_value.push_back(Field::create_field(1)); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast(residual_value.data()), residual_value.size()))); + row.push_back(Field::create_field(typed_value)); + + std::string json; + bool present = false; + Status st = parquet_variant_reader_test::variant_to_json_for_test( + wrapper_field, Field::create_field(row), + std::string(reinterpret_cast(metadata.data()), metadata.size()), &json, + &present); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RowWiseRejectsResidualTypedKeyCollision) { + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = {make_int32_field_schema("metric")}; + typed_value_field.data_type = make_nullable(std::make_shared( + DataTypes {make_nullable(std::make_shared())}, Strings {"metric"})); + + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared(0, false)); + variant_field.children 
= {make_binary_field_schema("metadata", false), + make_binary_field_schema("value", true), typed_value_field}; + + auto metadata = make_metadata({"metric"}); + std::vector<uint8_t> residual_value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x01, // one field + 0x00, // dictionary id 0: metric + 0x00, 0x02, // field value offsets + 0x0c, 0x02 // int8(2) + }; + + Struct typed_value; + typed_value.push_back(Field::create_field(1)); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(metadata.data()), metadata.size()))); + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size()))); + row.push_back(Field::create_field(typed_value)); + + Field result; + bool sql_null = false; + Status st = parquet_variant_reader_test::read_variant_row_for_test( + variant_field, Field::create_field(row), true, &result, &sql_null); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RowWisePreservesEmptyTypedObject) { + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = {make_int32_field_schema("metric")}; + typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {typed_value_field.children[0].data_type}, Strings {"metric"})); + + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false)); + variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field}; + + auto metadata = make_metadata({}); + Struct typed_value; + typed_value.push_back(Field()); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(metadata.data()), metadata.size()))); + row.push_back(Field::create_field(typed_value)); + + Field result; + bool sql_null = true; + Status st = 
parquet_variant_reader_test::read_variant_row_for_test( + variant_field, Field::create_field(row), true, &result, &sql_null); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_FALSE(sql_null); + EXPECT_EQ("{}", serialize_variant_field(result)); + + const auto& values = result.get<VariantMap>(); + EXPECT_NE(values.find(PathInData()), values.end()); +} + +TEST(ParquetVariantReaderTest, RowWiseReadsRootTypedMapObject) { + FieldSchema key_field = make_binary_field_schema("key", false); + FieldSchema value_field = make_int32_field_schema("value"); + + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = {key_field, value_field}; + typed_value_field.data_type = make_nullable( + std::make_shared<DataTypeMap>(key_field.data_type, value_field.data_type)); + + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false)); + variant_field.children = {make_binary_field_schema("metadata", false), typed_value_field}; + + Array keys; + keys.push_back(Field::create_field(String("a"))); + keys.push_back(Field::create_field(String("b"))); + Array values; + values.push_back(Field::create_field(7)); + values.push_back(Field::create_field(8)); + Map typed_map {Field::create_field(keys), Field::create_field(values)}; + + auto metadata = make_metadata({}); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(metadata.data()), metadata.size()))); + row.push_back(Field::create_field(typed_map)); + + Field result; + bool sql_null = true; + Status st = parquet_variant_reader_test::read_variant_row_for_test( + variant_field, Field::create_field(row), true, &result, &sql_null); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_FALSE(sql_null); + EXPECT_EQ("{\"a\":7,\"b\":8}", serialize_variant_field(result)); + + const auto& variant_values = result.get<VariantMap>(); + 
EXPECT_EQ(variant_values.at(PathInData("a")).field.get(), 7); + EXPECT_EQ(variant_values.at(PathInData("b")).field.get(), 8); +} + +TEST(ParquetVariantReaderTest, RowWisePreservesEmptyResidualObject) { + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false)); + variant_field.children = {make_binary_field_schema("metadata", false), + make_binary_field_schema("value", true)}; + + auto metadata = make_metadata({}); + std::vector<uint8_t> residual_value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x00, // zero fields + 0x00 // total field value size + }; + + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(metadata.data()), metadata.size()))); + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size()))); + + Field result; + bool sql_null = true; + Status st = parquet_variant_reader_test::read_variant_row_for_test( + variant_field, Field::create_field(row), true, &result, &sql_null); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_FALSE(sql_null); + EXPECT_EQ("{}", serialize_variant_field(result)); + + const auto& values = result.get<VariantMap>(); + EXPECT_NE(values.find(PathInData()), values.end()); +} + +TEST(ParquetVariantReaderTest, RowWiseMergesEmptyResidualObjectAndTypedValue) { + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = {make_int32_field_schema("metric")}; + typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {typed_value_field.children[0].data_type}, Strings {"metric"})); + + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false)); + variant_field.children = 
{make_binary_field_schema("metadata", false), + make_binary_field_schema("value", true), typed_value_field}; + + auto metadata = make_metadata({}); + std::vector<uint8_t> residual_value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x00, // zero fields + 0x00 // total field value size + }; + + Struct typed_value; + typed_value.push_back(Field::create_field(1)); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(metadata.data()), metadata.size()))); + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size()))); + row.push_back(Field::create_field(typed_value)); + + Field result; + bool sql_null = true; + Status st = parquet_variant_reader_test::read_variant_row_for_test( + variant_field, Field::create_field(row), true, &result, &sql_null); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_FALSE(sql_null); + EXPECT_EQ("{\"metric\":1}", serialize_variant_field(result)); + + const auto& values = result.get<VariantMap>(); + EXPECT_EQ(values.find(PathInData()), values.end()); + EXPECT_NE(values.find(PathInData("metric")), values.end()); +} + +TEST(ParquetVariantReaderTest, RowWiseMergesResidualObjectAndEmptyTypedValue) { + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = {make_int32_field_schema("metric")}; + typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {typed_value_field.children[0].data_type}, Strings {"metric"})); + + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false)); + variant_field.children = {make_binary_field_schema("metadata", false), + make_binary_field_schema("value", true), typed_value_field}; + + auto metadata = make_metadata({"x"}); + std::vector<uint8_t> residual_value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x01, // one field + 0x00, // dictionary id 0: x + 0x00, 0x02, // field value offsets + 0x0c, 0x07 // int8(7) + }; + + Struct typed_value; + typed_value.push_back(Field()); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(metadata.data()), metadata.size()))); + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size()))); + row.push_back(Field::create_field(typed_value)); + + Field result; + bool sql_null = true; + Status st = parquet_variant_reader_test::read_variant_row_for_test( + variant_field, Field::create_field(row), true, &result, &sql_null); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_FALSE(sql_null); + EXPECT_EQ("{\"x\":7}", serialize_variant_field(result)); + + const auto& values = result.get<VariantMap>(); + EXPECT_EQ(values.find(PathInData()), values.end()); + EXPECT_NE(values.find(PathInData("x")), values.end()); +} + +TEST(ParquetVariantReaderTest, RowWiseMergesMatchingEmptyResidualAndTypedObjects) { + FieldSchema metric_field; + metric_field.name = "metric"; + metric_field.lower_case_name = metric_field.name; + metric_field.children = {make_int32_field_schema("x")}; + metric_field.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {metric_field.children[0].data_type}, Strings {"x"})); + + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = {metric_field}; + typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {metric_field.data_type}, Strings {"metric"})); + + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false)); + variant_field.children = {make_binary_field_schema("metadata", false), + make_binary_field_schema("value", true), typed_value_field}; + + auto metadata = 
make_metadata({"metric"}); + std::vector<uint8_t> residual_value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x01, // one field + 0x00, // dictionary id 0: metric + 0x00, 0x03, // field value offsets + 0x02, 0x00, 0x00 // metric: empty object + }; + + Struct metric; + metric.push_back(Field()); + Struct typed_value; + typed_value.push_back(Field::create_field(metric)); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(metadata.data()), metadata.size()))); + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size()))); + row.push_back(Field::create_field(typed_value)); + + Field result; + bool sql_null = true; + Status st = parquet_variant_reader_test::read_variant_row_for_test( + variant_field, Field::create_field(row), true, &result, &sql_null); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_FALSE(sql_null); + EXPECT_EQ("{\"metric\":{}}", serialize_variant_field(result)); + + const auto& values = result.get<VariantMap>(); + EXPECT_NE(values.find(PathInData("metric")), values.end()); +} + +TEST(ParquetVariantReaderTest, RowWiseReadsValueOnlyNestedResidualField) { + FieldSchema metric_field; + metric_field.name = "metric"; + metric_field.lower_case_name = metric_field.name; + metric_field.children = {make_binary_field_schema("value", true)}; + metric_field.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {metric_field.children[0].data_type}, Strings {"value"})); + + FieldSchema typed_value_field; + typed_value_field.name = "typed_value"; + typed_value_field.lower_case_name = typed_value_field.name; + typed_value_field.children = {metric_field}; + typed_value_field.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {metric_field.data_type}, Strings {"metric"})); + + FieldSchema variant_field; + variant_field.name = "v"; + variant_field.lower_case_name = variant_field.name; + variant_field.data_type = make_nullable(std::make_shared<DataTypeVariant>(0, false)); + 
variant_field.children = {make_binary_field_schema("metadata", false), + make_binary_field_schema("value", true), typed_value_field}; + + auto metadata = make_metadata({"x"}); + std::vector<uint8_t> residual_value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x01, // one field + 0x00, // dictionary id 0: x + 0x00, 0x02, // field value offsets + 0x0c, 0x07 // int8(7) + }; + + Struct metric; + metric.push_back(Field::create_field( + String(reinterpret_cast<const char*>(residual_value.data()), residual_value.size()))); + Struct typed_value; + typed_value.push_back(Field::create_field(metric)); + Struct row; + row.push_back(Field::create_field( + String(reinterpret_cast<const char*>(metadata.data()), metadata.size()))); + row.push_back(Field()); + row.push_back(Field::create_field(typed_value)); + + Field result; + bool sql_null = true; + Status st = parquet_variant_reader_test::read_variant_row_for_test( + variant_field, Field::create_field(row), true, &result, &sql_null); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_FALSE(sql_null); + + const auto& values = result.get<VariantMap>(); + auto metric_x = values.find(PathInData("metric.x")); + ASSERT_NE(metric_x, values.end()); + EXPECT_EQ(metric_x->second.field.get(), 7); + EXPECT_EQ(values.find(PathInData("metric.value")), values.end()); +} + +TEST(ParquetVariantReaderTest, DecodeObjectWithOutOfOrderPhysicalValues) { + auto metadata = make_metadata({"a", "b", "c"}); + std::vector<uint8_t> value { + 0x02, // object, 1-byte offsets, 1-byte field ids, 1-byte element count + 0x03, // three fields + 0x00, 0x01, 0x02, // dictionary ids: a, b, c + 0x04, 0x02, 0x00, 0x06, // field offsets in key order; values are c, b, a + 0x0c, 0x03, // c: int8(3) + 0x0c, 0x02, // b: int8(2) + 0x0c, 0x01 // a: int8(1) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_EQ("{\"a\":1,\"b\":2,\"c\":3}", json); +} + +TEST(ParquetVariantReaderTest, 
RejectObjectChildTrailingBytes) { + auto metadata = make_metadata({"a"}); + std::vector<uint8_t> value { + 0x02, // object + 0x01, // one field + 0x00, // dictionary id 0 + 0x00, 0x03, // child is declared as 3 bytes + 0x0c, 0x07, 0x00 // int8(7) plus one trailing byte inside the child range + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectObjectDuplicatePhysicalOffsets) { + auto metadata = make_metadata({"a", "b"}); + std::vector<uint8_t> value { + 0x02, // object + 0x02, // two fields + 0x00, 0x01, // dictionary ids + 0x00, 0x00, 0x02, // both fields point at the same physical value + 0x0c, 0x07 // int8(7) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectObjectDuplicateFieldIds) { + auto metadata = make_metadata({"a"}); + std::vector<uint8_t> value { + 0x02, // object + 0x02, // two fields + 0x00, 0x00, // duplicate dictionary id 0 + 0x00, 0x02, 0x04, // valid physical value offsets + 0x0c, 0x01, 0x0c, 0x02 // int8(1), int8(2) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, DecodeObjectWithLexicographicFieldOrderAndNonMonotonicIds) { + auto metadata = make_metadata({"b", "a"}); + std::vector<uint8_t> value { + 0x02, // object + 0x02, // two fields + 0x01, 0x00, // dictionary ids are sorted by field name: a, b + 0x00, 0x02, 0x04, // valid physical value offsets + 0x0c, 0x01, 0x0c, 0x02 // int8(1), int8(2) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + ASSERT_TRUE(st.ok()) << st.to_string(); + EXPECT_EQ("{\"a\":1,\"b\":2}", json); +} + +TEST(ParquetVariantReaderTest, RejectObjectOutOfOrderFieldNames) { + 
auto metadata = make_metadata({"b", "a"}); + std::vector<uint8_t> value { + 0x02, // object + 0x02, // two fields + 0x00, 0x01, // dictionary ids are not sorted by field name + 0x00, 0x02, 0x04, // valid physical value offsets + 0x0c, 0x02, 0x0c, 0x01 // int8(2), int8(1) + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectArrayChildTrailingBytes) { + auto metadata = make_metadata({}); + std::vector<uint8_t> value { + 0x03, // array, 1-byte offsets, 1-byte element count + 0x01, // one element + 0x00, 0x03, // element is declared as 3 bytes + 0x0c, 0x07, 0x00 // int8(7) plus one trailing byte inside the element range + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +TEST(ParquetVariantReaderTest, RejectOversizedPrimitiveLength) { + auto metadata = make_metadata({}); + std::vector<uint8_t> value { + 0x40, // primitive string + 0xff, 0xff, 0xff, 0xff // length exceeds the remaining buffer + }; + + std::string json; + Status st = decode_variant_to_json(bytes_ref(metadata), bytes_ref(value), &json); + EXPECT_TRUE(st.is()) << st.to_string(); +} + +} // namespace doris::parquet diff --git a/be/test/format/table/hive/hive_reader_create_column_ids_test.cpp b/be/test/format/table/hive/hive_reader_create_column_ids_test.cpp index 7a884359027d73..27b16c9041b34f 100644 --- a/be/test/format/table/hive/hive_reader_create_column_ids_test.cpp +++ b/be/test/format/table/hive/hive_reader_create_column_ids_test.cpp @@ -722,7 +722,8 @@ class HiveReaderCreateColumnIdsTest : public ::testing::Test { const std::vector& access_configs, const std::set& expected_column_ids, const std::set& expected_filter_column_ids, - bool use_top_level_method = false, bool should_skip_assertion = false) { + bool use_top_level_method = false, bool should_skip_assertion = false, + const 
std::vector& top_level_file_column_idxs = {}) { std::string test_file = "./be/test/exec/test_data/nested_user_profiles_parquet/" "part-00000-64a7a390-1a03-4efc-ab51-557e9369a1f9-c000.snappy.parquet"; @@ -775,8 +776,13 @@ class HiveReaderCreateColumnIdsTest : public ::testing::Test { // Execute test based on method choice ColumnIdResult actual_result; if (use_top_level_method) { + std::vector file_column_idxs = top_level_file_column_idxs; + if (file_column_idxs.empty()) { + file_column_idxs.assign(table_column_positions.begin(), + table_column_positions.end()); + } actual_result = HiveParquetReader::_create_column_ids_by_top_level_col_index( - field_desc, tuple_descriptor); + field_desc, tuple_descriptor, table_column_names, file_column_idxs); } else { actual_result = HiveParquetReader::_create_column_ids(field_desc, tuple_descriptor); } @@ -931,6 +937,15 @@ TEST_F(HiveReaderCreateColumnIdsTest, test_create_column_ids_2) { expected_filter_column_ids, true); } +TEST_F(HiveReaderCreateColumnIdsTest, test_parquet_top_level_index_uses_scan_column_mapping) { + std::vector table_column_names = {"friends"}; + std::set expected_column_ids = {26, 27, 28, 29, 30, 31, 32}; + std::set expected_filter_column_ids = {}; + + run_parquet_test(table_column_names, {}, expected_column_ids, expected_filter_column_ids, true, + false, {5}); +} + TEST_F(HiveReaderCreateColumnIdsTest, test_create_column_ids_3) { // ORC column IDs are assigned in a tree-like incremental manner: the root node is 0, and child nodes increase sequentially. // Currently, Parquet uses a similar design. 
@@ -1171,4 +1186,4 @@ TEST_F(HiveReaderCreateColumnIdsTest, test_create_column_ids_6) { } } -} // namespace doris \ No newline at end of file +} // namespace doris diff --git a/be/test/format/table/iceberg/iceberg_reader_create_column_ids_test.cpp b/be/test/format/table/iceberg/iceberg_reader_create_column_ids_test.cpp index e32153d1ef7f74..cb0fb7264354e7 100644 --- a/be/test/format/table/iceberg/iceberg_reader_create_column_ids_test.cpp +++ b/be/test/format/table/iceberg/iceberg_reader_create_column_ids_test.cpp @@ -175,7 +175,8 @@ class IcebergReaderCreateColumnIdsTest : public ::testing::Test { {"id", 1}, {"name", 2}, {"profile", 3}, {"tags", 4}, {"friends", 5}, {"recent_activity", 6}, - {"attributes", 7}, {"complex_attributes", 8}}; + {"attributes", 7}, {"complex_attributes", 8}, + {"v", 100}}; auto it = column_to_field_id.find(column_name); if (it != column_to_field_id.end()) { @@ -185,6 +186,7 @@ class IcebergReaderCreateColumnIdsTest : public ::testing::Test { } // Helper function to create tuple descriptor + // NOLINTNEXTLINE(readability-function-size): test descriptor setup mirrors thrift fixtures. 
const TupleDescriptor* create_tuple_descriptor( DescriptorTbl** desc_tbl, ObjectPool& obj_pool, TDescriptorTable& t_desc_table, TTableDescriptor& t_table_desc, const std::vector& column_names, @@ -573,6 +575,16 @@ class IcebergReaderCreateColumnIdsTest : public ::testing::Test { hobby_level_node.__set_scalar_type(hobby_level_scalar); type.types.push_back(hobby_level_node); tslot_desc.__set_slotType(type); + } else if (types[i] == TPrimitiveType::VARIANT) { + TTypeNode node; + node.__set_type(TTypeNodeType::SCALAR); + TScalarType scalar_type; + scalar_type.__set_type(TPrimitiveType::VARIANT); + scalar_type.__set_variant_max_subcolumns_count(2048); + scalar_type.__set_variant_enable_doc_mode(false); + node.__set_scalar_type(scalar_type); + type.types.push_back(node); + tslot_desc.__set_slotType(type); } else { // 普通类型 TTypeNode node; @@ -621,6 +633,68 @@ class IcebergReaderCreateColumnIdsTest : public ::testing::Test { return (*desc_tbl)->get_tuple_descriptor(0); } + static tparquet::SchemaElement make_root_schema(int num_children) { + tparquet::SchemaElement schema; + schema.__set_name("schema"); + schema.__set_num_children(num_children); + return schema; + } + + static tparquet::SchemaElement make_group_schema( + std::string name, int num_children, tparquet::FieldRepetitionType::type repetition_type, + int field_id = -1, bool is_variant = false) { + tparquet::SchemaElement schema; + schema.__set_name(name); + schema.__set_num_children(num_children); + schema.__set_repetition_type(repetition_type); + if (field_id >= 0) { + schema.__set_field_id(field_id); + } + if (is_variant) { + tparquet::LogicalType logical_type; + logical_type.__set_VARIANT(tparquet::VariantType()); + schema.__set_logicalType(logical_type); + } + return schema; + } + + static tparquet::SchemaElement make_primitive_schema( + std::string name, tparquet::Type::type type, + tparquet::FieldRepetitionType::type repetition_type, int field_id = -1) { + tparquet::SchemaElement schema; + 
schema.__set_name(name); + schema.__set_type(type); + schema.__set_repetition_type(repetition_type); + if (field_id >= 0) { + schema.__set_field_id(field_id); + } + return schema; + } + + FieldDescriptor make_iceberg_variant_field_id_descriptor() { + std::vector schemas; + schemas.push_back(make_root_schema(1)); + schemas.push_back( + make_group_schema("v", 3, tparquet::FieldRepetitionType::OPTIONAL, 100, true)); + schemas.push_back(make_primitive_schema("metadata", tparquet::Type::BYTE_ARRAY, + tparquet::FieldRepetitionType::REQUIRED, 101)); + schemas.push_back(make_primitive_schema("value", tparquet::Type::BYTE_ARRAY, + tparquet::FieldRepetitionType::OPTIONAL, 102)); + schemas.push_back( + make_group_schema("typed_value", 1, tparquet::FieldRepetitionType::OPTIONAL, 103)); + schemas.push_back( + make_group_schema("metric", 1, tparquet::FieldRepetitionType::REQUIRED, 104)); + schemas.push_back( + make_group_schema("typed_value", 1, tparquet::FieldRepetitionType::OPTIONAL, 105)); + schemas.push_back(make_group_schema("x", 1, tparquet::FieldRepetitionType::REQUIRED, 106)); + schemas.push_back(make_primitive_schema("typed_value", tparquet::Type::INT64, + tparquet::FieldRepetitionType::OPTIONAL, 107)); + + FieldDescriptor field_desc; + EXPECT_TRUE(field_desc.parse_from_thrift(schemas).ok()); + return field_desc; + } + // Helper function: set column access paths on a slot descriptor void set_column_access_paths(TSlotDescriptor& tslot_desc, const ColumnAccessPathConfig& config) { @@ -1166,4 +1240,81 @@ TEST_F(IcebergReaderCreateColumnIdsTest, test_create_column_ids_6) { } } -} // namespace doris \ No newline at end of file +TEST_F(IcebergReaderCreateColumnIdsTest, test_variant_field_id_pruning_uses_typed_value_columns) { + auto field_desc = make_iceberg_variant_field_id_descriptor(); + + ColumnAccessPathConfig access_config; + access_config.column_name = "v"; + access_config.all_column_paths = {{"100", "metric", "x"}}; + access_config.predicate_paths = {{"100", "metric", 
"x"}}; + + DescriptorTbl* desc_tbl; + ObjectPool obj_pool; + TDescriptorTable t_desc_table; + TTableDescriptor t_table_desc; + const TupleDescriptor* tuple_descriptor = + create_tuple_descriptor(&desc_tbl, obj_pool, t_desc_table, t_table_desc, {"v"}, {0}, + {TPrimitiveType::VARIANT}, {access_config}); + + auto actual_result = IcebergParquetReader::_create_column_ids(&field_desc, tuple_descriptor); + + const std::set expected_typed_value_column_ids = {1, 4, 5, 6, 7, 8}; + EXPECT_EQ(actual_result.column_ids, expected_typed_value_column_ids); + EXPECT_EQ(actual_result.filter_column_ids, expected_typed_value_column_ids); + EXPECT_FALSE(actual_result.column_ids.contains(2)); // top-level metadata + EXPECT_FALSE(actual_result.column_ids.contains(3)); // top-level residual value +} + +TEST_F(IcebergReaderCreateColumnIdsTest, test_parquet_column_id_creation_does_not_mutate_schema) { + auto field_desc = make_iceberg_variant_field_id_descriptor(); + ASSERT_EQ(field_desc.get_column(0)->get_column_id(), UNASSIGNED_COLUMN_ID); + + ColumnAccessPathConfig access_config; + access_config.column_name = "v"; + access_config.all_column_paths = {{"100", "metric", "x"}}; + access_config.predicate_paths = {{"100", "metric", "x"}}; + + DescriptorTbl* desc_tbl; + ObjectPool obj_pool; + TDescriptorTable t_desc_table; + TTableDescriptor t_table_desc; + const TupleDescriptor* tuple_descriptor = + create_tuple_descriptor(&desc_tbl, obj_pool, t_desc_table, t_table_desc, {"v"}, {0}, + {TPrimitiveType::VARIANT}, {access_config}); + + auto actual_result = IcebergParquetReader::_create_column_ids(&field_desc, tuple_descriptor); + + EXPECT_FALSE(actual_result.column_ids.empty()); + EXPECT_EQ(field_desc.get_column(0)->get_column_id(), UNASSIGNED_COLUMN_ID); + EXPECT_EQ(field_desc.get_column(0)->get_max_column_id(), 0); +} + +TEST_F(IcebergReaderCreateColumnIdsTest, + test_variant_field_id_pruning_treats_numeric_keys_as_variant_names) { + auto field_desc = make_iceberg_variant_field_id_descriptor(); + 
+ ColumnAccessPathConfig access_config; + access_config.column_name = "v"; + access_config.all_column_paths = {{"100", "104", "106"}}; + access_config.predicate_paths = {{"100", "104", "106"}}; + + DescriptorTbl* desc_tbl; + ObjectPool obj_pool; + TDescriptorTable t_desc_table; + TTableDescriptor t_table_desc; + const TupleDescriptor* tuple_descriptor = + create_tuple_descriptor(&desc_tbl, obj_pool, t_desc_table, t_table_desc, {"v"}, {0}, + {TPrimitiveType::VARIANT}, {access_config}); + + auto actual_result = IcebergParquetReader::_create_column_ids(&field_desc, tuple_descriptor); + + const std::set expected_residual_value_column_ids = {1, 2, 3}; + EXPECT_EQ(actual_result.column_ids, expected_residual_value_column_ids); + EXPECT_EQ(actual_result.filter_column_ids, expected_residual_value_column_ids); + EXPECT_FALSE(actual_result.column_ids.contains(4)); // typed_value + EXPECT_FALSE(actual_result.column_ids.contains(5)); // typed_value.metric + EXPECT_FALSE(actual_result.column_ids.contains(6)); // typed_value.metric.typed_value + EXPECT_FALSE(actual_result.column_ids.contains(7)); // typed_value.metric.typed_value.x +} + +} // namespace doris diff --git a/be/test/format/table/nested_column_access_helper_test.cpp b/be/test/format/table/nested_column_access_helper_test.cpp new file mode 100644 index 00000000000000..7967cd08f33c4d --- /dev/null +++ b/be/test/format/table/nested_column_access_helper_test.cpp @@ -0,0 +1,1113 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#include "format/table/nested_column_access_helper.h" + +#include + +#include +#include +#include +#include + +#include "common/exception.h" +#include "core/data_type/data_type_array.h" +#include "core/data_type/data_type_map.h" +#include "core/data_type/data_type_nullable.h" +#include "core/data_type/data_type_number.h" +#include "core/data_type/data_type_string.h" +#include "core/data_type/data_type_struct.h" +#include "core/data_type/data_type_variant.h" +#include "format/parquet/parquet_nested_column_utils.h" +#include "format/parquet/schema_desc.h" +#include "format/table/hive/hive_parquet_nested_column_utils.h" +#include "format/table/iceberg/iceberg_parquet_nested_column_utils.h" + +namespace doris { +namespace { + +FieldSchema make_variant_field_for_access_path_test() { + FieldSchema field; + field.name = "v"; + field.lower_case_name = "v"; + field.column_id = 10; + field.max_column_id = 16; + return field; +} + +FieldSchema make_binary_field_schema(std::string name, bool nullable) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::BYTE_ARRAY; + field.data_type = std::make_shared<DataTypeString>(); + if (nullable) { + field.data_type = make_nullable(field.data_type); + } + return field; +} + +FieldSchema make_string_field_schema(std::string name, bool nullable) { + FieldSchema field = make_binary_field_schema(std::move(name), nullable); + tparquet::LogicalType logical_type; + logical_type.__set_STRING(tparquet::StringType()); + 
field.parquet_schema.__set_logicalType(logical_type); + return field; +} + +FieldSchema make_int32_field_schema(std::string name) { + FieldSchema field; + field.name = std::move(name); + field.lower_case_name = field.name; + field.physical_type = tparquet::Type::INT32; + field.data_type = make_nullable(std::make_shared<DataTypeInt32>()); + return field; +} + +FieldSchema make_variant_field_with_nested_structural_name_keys() { + FieldSchema nested; + nested.name = "nested"; + nested.lower_case_name = nested.name; + nested.children = {make_int32_field_schema("typed_value"), make_int32_field_schema("value")}; + nested.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {nested.children[0].data_type, nested.children[1].data_type}, + Strings {"typed_value", "value"})); + + FieldSchema typed_value; + typed_value.name = "typed_value"; + typed_value.lower_case_name = typed_value.name; + typed_value.children = {nested}; + typed_value.data_type = make_nullable( + std::make_shared<DataTypeStruct>(DataTypes {nested.data_type}, Strings {"nested"})); + + FieldSchema field; + field.name = "v"; + field.lower_case_name = field.name; + field.data_type = std::make_shared<DataTypeVariant>(0, false); + field.children = {make_binary_field_schema("metadata", false), typed_value}; + uint64_t next_id = 10; + field.assign_ids(next_id); + return field; +} + +FieldSchema make_variant_field_with_annotated_value_user_field() { + FieldSchema object; + object.name = "obj"; + object.lower_case_name = object.name; + object.children = {make_string_field_schema("value", true)}; + object.data_type = make_nullable(std::make_shared<DataTypeStruct>( + DataTypes {object.children[0].data_type}, Strings {"value"})); + + FieldSchema typed_value; + typed_value.name = "typed_value"; + typed_value.lower_case_name = typed_value.name; + typed_value.children = {object}; + typed_value.data_type = make_nullable( + std::make_shared<DataTypeStruct>(DataTypes {object.data_type}, Strings {"obj"})); + + FieldSchema field; + field.name = "v"; + field.lower_case_name = field.name; + 
+    field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    field.children = {make_binary_field_schema("metadata", false), typed_value};
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_variant_field_with_typed_only_nested_shredded_object() {
+    FieldSchema nested_x = make_int32_field_schema("x");
+
+    FieldSchema nested_typed_value;
+    nested_typed_value.name = "typed_value";
+    nested_typed_value.lower_case_name = nested_typed_value.name;
+    nested_typed_value.children = {nested_x};
+    nested_typed_value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {nested_x.data_type}, Strings {"x"}));
+
+    FieldSchema nested;
+    nested.name = "nested";
+    nested.lower_case_name = nested.name;
+    nested.children = {nested_typed_value};
+    nested.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {nested_typed_value.data_type}, Strings {"typed_value"}));
+
+    FieldSchema typed_value;
+    typed_value.name = "typed_value";
+    typed_value.lower_case_name = typed_value.name;
+    typed_value.children = {nested};
+    typed_value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {nested.data_type}, Strings {"nested"}));
+
+    FieldSchema field;
+    field.name = "v";
+    field.lower_case_name = field.name;
+    field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    field.children = {make_binary_field_schema("metadata", false), typed_value};
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_variant_field_with_typed_only_array_field() {
+    FieldSchema element_n = make_int32_field_schema("n");
+
+    FieldSchema element;
+    element.name = "element";
+    element.lower_case_name = element.name;
+    element.children = {element_n};
+    element.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {element_n.data_type}, Strings {"n"}));
+
+    FieldSchema items_typed_value;
+    items_typed_value.name = "typed_value";
+    items_typed_value.lower_case_name = items_typed_value.name;
+    items_typed_value.children = {element};
+    items_typed_value.data_type =
+            make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    FieldSchema items;
+    items.name = "items";
+    items.lower_case_name = items.name;
+    items.children = {items_typed_value};
+    items.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {items_typed_value.data_type}, Strings {"typed_value"}));
+
+    FieldSchema typed_value;
+    typed_value.name = "typed_value";
+    typed_value.lower_case_name = typed_value.name;
+    typed_value.children = {items};
+    typed_value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {items.data_type}, Strings {"items"}));
+
+    FieldSchema field;
+    field.name = "v";
+    field.lower_case_name = field.name;
+    field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    field.children = {make_binary_field_schema("metadata", false), typed_value};
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_variant_field_with_root_typed_only_array() {
+    FieldSchema element_n = make_int32_field_schema("n");
+
+    FieldSchema element;
+    element.name = "element";
+    element.lower_case_name = element.name;
+    element.children = {element_n};
+    element.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {element_n.data_type}, Strings {"n"}));
+
+    FieldSchema typed_value;
+    typed_value.name = "typed_value";
+    typed_value.lower_case_name = typed_value.name;
+    typed_value.children = {element};
+    typed_value.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+
+    FieldSchema field;
+    field.name = "v";
+    field.lower_case_name = field.name;
+    field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    field.children = {make_binary_field_schema("metadata", false), typed_value};
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_variant_field_with_typed_only_map_field() {
+    FieldSchema key = make_binary_field_schema("key", false);
+
+    FieldSchema value_n = make_int32_field_schema("n");
+    FieldSchema value;
+    value.name = "value";
+    value.lower_case_name = value.name;
+    value.children = {value_n};
+    value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {value_n.data_type}, Strings {"n"}));
+
+    FieldSchema attrs;
+    attrs.name = "attrs";
+    attrs.lower_case_name = attrs.name;
+    attrs.children = {key, value};
+    attrs.data_type = make_nullable(std::make_shared<DataTypeMap>(key.data_type, value.data_type));
+
+    FieldSchema typed_value;
+    typed_value.name = "typed_value";
+    typed_value.lower_case_name = typed_value.name;
+    typed_value.children = {attrs};
+    typed_value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {attrs.data_type}, Strings {"attrs"}));
+
+    FieldSchema field;
+    field.name = "v";
+    field.lower_case_name = field.name;
+    field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    field.children = {make_binary_field_schema("metadata", false), typed_value};
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_variant_field_with_value_only_residual_field() {
+    FieldSchema metric;
+    metric.name = "metric";
+    metric.lower_case_name = metric.name;
+    metric.children = {make_binary_field_schema("value", true)};
+    metric.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {metric.children[0].data_type}, Strings {"value"}));
+
+    FieldSchema typed_value;
+    typed_value.name = "typed_value";
+    typed_value.lower_case_name = typed_value.name;
+    typed_value.children = {metric};
+    typed_value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {metric.data_type}, Strings {"metric"}));
+
+    FieldSchema field;
+    field.name = "v";
+    field.lower_case_name = field.name;
+    field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    field.children = {make_binary_field_schema("metadata", false),
+                      make_binary_field_schema("value", true), typed_value};
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_variant_field_with_partially_shredded_metric() {
+    FieldSchema metric_x = make_int32_field_schema("x");
+
+    FieldSchema metric_typed_value;
+    metric_typed_value.name = "typed_value";
+    metric_typed_value.lower_case_name = metric_typed_value.name;
+    metric_typed_value.children = {metric_x};
+    metric_typed_value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {metric_x.data_type}, Strings {"x"}));
+
+    FieldSchema metric;
+    metric.name = "metric";
+    metric.lower_case_name = metric.name;
+    metric.children = {make_binary_field_schema("value", true), metric_typed_value};
+    metric.data_type = make_nullable(std::make_shared<DataTypeStruct>(
+            DataTypes {metric.children[0].data_type, metric_typed_value.data_type},
+            Strings {"value", "typed_value"}));
+
+    FieldSchema typed_value;
+    typed_value.name = "typed_value";
+    typed_value.lower_case_name = typed_value.name;
+    typed_value.children = {metric};
+    typed_value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {metric.data_type}, Strings {"metric"}));
+
+    FieldSchema field;
+    field.name = "v";
+    field.lower_case_name = field.name;
+    field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    field.children = {make_binary_field_schema("metadata", false),
+                      make_binary_field_schema("value", true), typed_value};
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_variant_field_with_numeric_key_field_id_collision() {
+    FieldSchema metric = make_int32_field_schema("metric");
+    metric.field_id = 20;
+
+    FieldSchema typed_value;
+    typed_value.name = "typed_value";
+    typed_value.lower_case_name = typed_value.name;
+    typed_value.children = {metric};
+    typed_value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {metric.data_type}, Strings {"metric"}));
+
+    FieldSchema field;
+    field.name = "v";
+    field.lower_case_name = field.name;
+    field.data_type = std::make_shared<DataTypeVariant>(0, false);
+    field.children = {make_binary_field_schema("metadata", false),
+                      make_binary_field_schema("value", true), typed_value};
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_optional_array_field_for_pruning() {
+    FieldSchema element_n = make_int32_field_schema("n");
+    element_n.field_id = 102;
+
+    FieldSchema element;
+    element.name = "element";
+    element.lower_case_name = element.name;
+    element.field_id = 101;
+    element.children = {element_n};
+    element.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {element_n.data_type}, Strings {"n"}));
+
+    FieldSchema field;
+    field.name = "items";
+    field.lower_case_name = field.name;
+    field.field_id = 100;
+    field.children = {element};
+    field.data_type = make_nullable(std::make_shared<DataTypeArray>(element.data_type));
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_optional_map_field_for_pruning() {
+    FieldSchema key = make_binary_field_schema("key", false);
+    key.field_id = 101;
+
+    FieldSchema value_n = make_int32_field_schema("n");
+    value_n.field_id = 103;
+    FieldSchema value;
+    value.name = "value";
+    value.lower_case_name = value.name;
+    value.field_id = 102;
+    value.children = {value_n};
+    value.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {value_n.data_type}, Strings {"n"}));
+
+    FieldSchema field;
+    field.name = "attrs";
+    field.lower_case_name = field.name;
+    field.field_id = 100;
+    field.children = {key, value};
+    field.data_type = make_nullable(std::make_shared<DataTypeMap>(key.data_type, value.data_type));
+    uint64_t next_id = 10;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_struct_with_optional_map_field_for_pruning() {
+    FieldSchema attrs = make_optional_map_field_for_pruning();
+
+    FieldSchema field;
+    field.name = "s";
+    field.lower_case_name = field.name;
+    field.field_id = 200;
+    field.children = {attrs};
+    field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {attrs.data_type}, Strings {"attrs"}));
+    uint64_t next_id = 20;
+    field.assign_ids(next_id);
+    return field;
+}
+
+FieldSchema make_struct_with_optional_array_field_for_pruning() {
+    FieldSchema items = make_optional_array_field_for_pruning();
+
+    FieldSchema field;
+    field.name = "s";
+    field.lower_case_name = field.name;
+    field.field_id = 200;
+    field.children = {items};
+    field.data_type = make_nullable(
+            std::make_shared<DataTypeStruct>(DataTypes {items.data_type}, Strings {"items"}));
+    uint64_t next_id = 20;
+    field.assign_ids(next_id);
+    return field;
+}
+
+const FieldSchema& child_by_name(const FieldSchema& field, const std::string& name) {
+    for (const auto& child : field.children) {
+        if (child.name == name) {
+            return child;
+        }
+    }
+    throw Exception(Status::InternalError("missing test child {}", name));
+}
+
+std::set<uint64_t> collect_ids(const FieldSchema& field,
+                               const std::vector<TColumnAccessPath>& access_paths) {
+    std::set<uint64_t> ids;
+    process_nested_access_paths(
+            &field, access_paths, ids,
+            [](const FieldSchema* field) { return field->get_column_id(); },
+            [](const FieldSchema* field) { return field->get_max_column_id(); },
+            [](const FieldSchema&, const std::vector<std::vector<std::string>>&,
+               std::set<uint64_t>&) { FAIL() << "full projection should not call extractor"; });
+    return ids;
+}
+
+void expect_map_offset_only_ids(const FieldSchema& field, const std::set<uint64_t>& ids) {
+    const auto& key = child_by_name(field, "key");
+    const auto& value = child_by_name(field, "value");
+    const auto& value_n = child_by_name(value, "n");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(key.get_column_id()));
+    EXPECT_FALSE(ids.contains(value.get_column_id()));
+    EXPECT_FALSE(ids.contains(value_n.get_column_id()));
+}
+
+void expect_array_offset_only_ids(const FieldSchema& field, const std::set<uint64_t>& ids) {
+    const auto& element = child_by_name(field, "element");
+    const auto& element_n = child_by_name(element, "n");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(element.get_column_id()));
+    EXPECT_FALSE(ids.contains(element_n.get_column_id()));
+}
+
+void expect_nested_array_offset_only_ids(const FieldSchema& field, const std::set<uint64_t>& ids) {
+    const auto& items = child_by_name(field, "items");
+    const auto& element = child_by_name(items, "element");
+    const auto& element_n = child_by_name(element, "n");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(items.get_column_id()));
+    EXPECT_FALSE(ids.contains(element.get_column_id()));
+    EXPECT_FALSE(ids.contains(element_n.get_column_id()));
+}
+
+void expect_nested_map_offset_only_ids(const FieldSchema& field, const std::set<uint64_t>& ids) {
+    const auto& attrs = child_by_name(field, "attrs");
+    const auto& key = child_by_name(attrs, "key");
+    const auto& value = child_by_name(attrs, "value");
+    const auto& value_n = child_by_name(value, "n");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(attrs.get_column_id()));
+    EXPECT_TRUE(ids.contains(key.get_column_id()));
+    EXPECT_FALSE(ids.contains(value.get_column_id()));
+    EXPECT_FALSE(ids.contains(value_n.get_column_id()));
+}
+
+} // namespace
+
+TEST(NestedColumnAccessHelperTest, EmptyAccessPathsSelectFullFieldRange) {
+    const auto field = make_variant_field_for_access_path_test();
+    const std::set<uint64_t> expected {10, 11, 12, 13, 14, 15, 16};
+    EXPECT_EQ(collect_ids(field, {}), expected);
+}
+
+TEST(NestedColumnAccessHelperTest, NoRecognizedAccessPathsDoNotSelectFieldRange) {
+    const auto field = make_variant_field_for_access_path_test();
+    TColumnAccessPath ignored_path;
+    ignored_path.__set_type(static_cast<TAccessPathType::type>(0));
+
+    const std::set<uint64_t> expected;
+    EXPECT_EQ(collect_ids(field, {ignored_path}), expected);
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetPruningUnwrapsOptionalVariant) {
+    auto field = make_variant_field_with_typed_only_nested_shredded_object();
+    field.data_type = make_nullable(field.data_type);
+
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"nested", "x"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& nested = child_by_name(top_typed_value, "nested");
+    const auto& nested_typed_value = child_by_name(nested, "typed_value");
+    const auto& nested_x = child_by_name(nested_typed_value, "x");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_x.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetPruningUnwrapsOptionalArray) {
+    const auto field = make_optional_array_field_for_pruning();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"*", "n"}}, ids);
+
+    const auto& element = child_by_name(field, "element");
+    const auto& element_n = child_by_name(element, "n");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(element.get_column_id()));
+    EXPECT_TRUE(ids.contains(element_n.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetArrayOffsetOnlyKeepsArrayContainer) {
+    const auto field = make_optional_array_field_for_pruning();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"OFFSET"}}, ids);
+
+    expect_array_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetNestedArrayOffsetOnlyKeepsArrayContainer) {
+    const auto field = make_struct_with_optional_array_field_for_pruning();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"items", "OFFSET"}}, ids);
+
+    expect_nested_array_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetPruningUnwrapsOptionalMap) {
+    const auto field = make_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"*", "n"}}, ids);
+
+    const auto& key = child_by_name(field, "key");
+    const auto& value = child_by_name(field, "value");
+    const auto& value_n = child_by_name(value, "n");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(key.get_column_id()));
+    EXPECT_TRUE(ids.contains(value.get_column_id()));
+    EXPECT_TRUE(ids.contains(value_n.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetMapOffsetOnlyKeepsKeyReference) {
+    const auto field = make_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"OFFSET"}}, ids);
+
+    expect_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetMapNullOnlyKeepsKeyReference) {
+    const auto field = make_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"NULL"}}, ids);
+
+    expect_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetNestedMapOffsetOnlyKeepsKeyReference) {
+    const auto field = make_struct_with_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"attrs", "OFFSET"}}, ids);
+
+    expect_nested_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetNestedMapNullOnlyKeepsKeyReference) {
+    const auto field = make_struct_with_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(field, {{"attrs", "NULL"}}, ids);
+
+    expect_nested_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, HiveArrayOffsetOnlyKeepsArrayContainer) {
+    const auto field = make_optional_array_field_for_pruning();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"OFFSET"}}, ids);
+
+    expect_array_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, HiveNestedArrayOffsetOnlyKeepsArrayContainer) {
+    const auto field = make_struct_with_optional_array_field_for_pruning();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"items", "OFFSET"}}, ids);
+
+    expect_nested_array_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergArrayOffsetOnlyKeepsArrayContainer) {
+    const auto field = make_optional_array_field_for_pruning();
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"OFFSET"}}, ids);
+
+    expect_array_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergNestedArrayOffsetOnlyKeepsArrayContainer) {
+    const auto field = make_struct_with_optional_array_field_for_pruning();
+    const auto& items = child_by_name(field, "items");
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(
+            field, {{std::to_string(items.field_id), "OFFSET"}}, ids);
+
+    expect_nested_array_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, HiveMapOffsetOnlyKeepsKeyReference) {
+    const auto field = make_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"OFFSET"}}, ids);
+
+    expect_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, HiveMapNullOnlyKeepsKeyReference) {
+    const auto field = make_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"NULL"}}, ids);
+
+    expect_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, HiveNestedMapOffsetOnlyKeepsKeyReference) {
+    const auto field = make_struct_with_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"attrs", "OFFSET"}}, ids);
+
+    expect_nested_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, HiveNestedMapNullOnlyKeepsKeyReference) {
+    const auto field = make_struct_with_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"attrs", "NULL"}}, ids);
+
+    expect_nested_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergMapOffsetOnlyKeepsKeyReference) {
+    const auto field = make_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"OFFSET"}}, ids);
+
+    expect_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergMapNullOnlyKeepsKeyReference) {
+    const auto field = make_optional_map_field_for_pruning();
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"NULL"}}, ids);
+
+    expect_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergNestedMapOffsetOnlyKeepsKeyReference) {
+    const auto field = make_struct_with_optional_map_field_for_pruning();
+    const auto& attrs = child_by_name(field, "attrs");
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(
+            field, {{std::to_string(attrs.field_id), "OFFSET"}}, ids);
+
+    expect_nested_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergNestedMapNullOnlyKeepsKeyReference) {
+    const auto field = make_struct_with_optional_map_field_for_pruning();
+    const auto& attrs = child_by_name(field, "attrs");
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(
+            field, {{std::to_string(attrs.field_id), "NULL"}}, ids);
+
+    expect_nested_map_offset_only_ids(field, ids);
+}
+
+TEST(NestedColumnAccessHelperTest, HiveVariantPruningKeepsNestedStructuralNameUserKey) {
+    const auto field = make_variant_field_with_nested_structural_name_keys();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"nested", "typed_value"}},
+                                                            ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& nested = child_by_name(top_typed_value, "nested");
+    const auto& nested_typed_value = child_by_name(nested, "typed_value");
+    const auto& nested_value = child_by_name(nested, "value");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_typed_value.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_FALSE(ids.contains(nested_value.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergVariantPruningKeepsNestedStructuralNameUserKey) {
+    const auto field = make_variant_field_with_nested_structural_name_keys();
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"nested", "typed_value"}},
+                                                               ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& nested = child_by_name(top_typed_value, "nested");
+    const auto& nested_typed_value = child_by_name(nested, "typed_value");
+    const auto& nested_value = child_by_name(nested, "value");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_typed_value.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_FALSE(ids.contains(nested_value.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, HiveVariantPruningKeepsAnnotatedValueUserField) {
+    const auto field = make_variant_field_with_annotated_value_user_field();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"obj", "value"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& object = child_by_name(top_typed_value, "obj");
+    const auto& object_value = child_by_name(object, "value");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(object.get_column_id()));
+    EXPECT_TRUE(ids.contains(object_value.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergVariantPruningKeepsAnnotatedValueUserField) {
+    const auto field = make_variant_field_with_annotated_value_user_field();
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"obj", "value"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& object = child_by_name(top_typed_value, "obj");
+    const auto& object_value = child_by_name(object, "value");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(object.get_column_id()));
+    EXPECT_TRUE(ids.contains(object_value.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, HiveVariantPruningSkipsTypedOnlyNestedMissingKey) {
+    const auto field = make_variant_field_with_typed_only_nested_shredded_object();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"nested", "missing"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& nested = child_by_name(top_typed_value, "nested");
+    const auto& nested_typed_value = child_by_name(nested, "typed_value");
+    const auto& nested_x = child_by_name(nested_typed_value, "x");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(metadata.get_column_id()));
+    EXPECT_FALSE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_FALSE(ids.contains(nested.get_column_id()));
+    EXPECT_FALSE(ids.contains(nested_typed_value.get_column_id()));
+    EXPECT_FALSE(ids.contains(nested_x.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergVariantPruningSkipsTypedOnlyNestedMissingKey) {
+    const auto field = make_variant_field_with_typed_only_nested_shredded_object();
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"nested", "missing"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& nested = child_by_name(top_typed_value, "nested");
+    const auto& nested_typed_value = child_by_name(nested, "typed_value");
+    const auto& nested_x = child_by_name(nested_typed_value, "x");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(metadata.get_column_id()));
+    EXPECT_FALSE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_FALSE(ids.contains(nested.get_column_id()));
+    EXPECT_FALSE(ids.contains(nested_typed_value.get_column_id()));
+    EXPECT_FALSE(ids.contains(nested_x.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, HiveVariantPruningStripsTerminalMetaSuffix) {
+    const auto field = make_variant_field_with_typed_only_nested_shredded_object();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(
+            field, {{"nested", "x", "NULL"}, {"nested", "x", "OFFSET"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& nested = child_by_name(top_typed_value, "nested");
+    const auto& nested_typed_value = child_by_name(nested, "typed_value");
+    const auto& nested_x = child_by_name(nested_typed_value, "x");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_x.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, GenericParquetVariantPruningStripsTerminalMetaSuffix) {
+    const auto field = make_variant_field_with_typed_only_nested_shredded_object();
+    std::set<uint64_t> ids;
+    ParquetNestedColumnUtils::extract_nested_column_ids_by_name(
+            field, {{"nested", "x", "NULL"}, {"nested", "x", "OFFSET"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& nested = child_by_name(top_typed_value, "nested");
+    const auto& nested_typed_value = child_by_name(nested, "typed_value");
+    const auto& nested_x = child_by_name(nested_typed_value, "x");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_x.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergVariantPruningStripsTerminalMetaSuffix) {
+    const auto field = make_variant_field_with_typed_only_nested_shredded_object();
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(
+            field, {{"nested", "x", "NULL"}, {"nested", "x", "OFFSET"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& nested = child_by_name(top_typed_value, "nested");
+    const auto& nested_typed_value = child_by_name(nested, "typed_value");
+    const auto& nested_x = child_by_name(nested_typed_value, "x");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(nested_x.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, HiveVariantPruningSelectsValueOnlyResidualField) {
+    const auto field = make_variant_field_with_value_only_residual_field();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"metric", "x"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_value = child_by_name(field, "value");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& metric = child_by_name(top_typed_value, "metric");
+    const auto& metric_value = child_by_name(metric, "value");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(metadata.get_column_id()));
+    EXPECT_FALSE(ids.contains(top_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(metric.get_column_id()));
+    EXPECT_TRUE(ids.contains(metric_value.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, HiveVariantPruningKeepsResidualForTypedNestedField) {
+    const auto field = make_variant_field_with_partially_shredded_metric();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"metric", "x"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_value = child_by_name(field, "value");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& metric = child_by_name(top_typed_value, "metric");
+    const auto& metric_value = child_by_name(metric, "value");
+    const auto& metric_typed_value = child_by_name(metric, "typed_value");
+    const auto& metric_x = child_by_name(metric_typed_value, "x");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_TRUE(ids.contains(metadata.get_column_id()));
+    EXPECT_FALSE(ids.contains(top_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(metric.get_column_id()));
+    EXPECT_TRUE(ids.contains(metric_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(metric_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(metric_x.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, HiveVariantPruningMapsArraySubscriptToTypedElement) {
+    const auto field = make_variant_field_with_typed_only_array_field();
+    std::set<uint64_t> ids;
+    HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"items", "1", "n"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& items = child_by_name(top_typed_value, "items");
+    const auto& items_typed_value = child_by_name(items, "typed_value");
+    const auto& element = child_by_name(items_typed_value, "element");
+    const auto& element_n = child_by_name(element, "n");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
+    EXPECT_TRUE(ids.contains(top_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(items.get_column_id()));
+    EXPECT_TRUE(ids.contains(items_typed_value.get_column_id()));
+    EXPECT_TRUE(ids.contains(element.get_column_id()));
+    EXPECT_TRUE(ids.contains(element_n.get_column_id()));
+}
+
+TEST(NestedColumnAccessHelperTest, IcebergVariantPruningMapsArraySubscriptToTypedElement) {
+    const auto field = make_variant_field_with_typed_only_array_field();
+    std::set<uint64_t> ids;
+    IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"items", "1", "n"}}, ids);
+
+    const auto& metadata = child_by_name(field, "metadata");
+    const auto& top_typed_value = child_by_name(field, "typed_value");
+    const auto& items = child_by_name(top_typed_value, "items");
+    const auto& items_typed_value = child_by_name(items, "typed_value");
+    const auto& element = child_by_name(items_typed_value, "element");
+    const auto& element_n = child_by_name(element, "n");
+    EXPECT_TRUE(ids.contains(field.get_column_id()));
+    EXPECT_FALSE(ids.contains(metadata.get_column_id()));
EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(items.get_column_id())); + EXPECT_TRUE(ids.contains(items_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(element.get_column_id())); + EXPECT_TRUE(ids.contains(element_n.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, HiveVariantPruningMapsArrayElementPathToTypedElement) { + const auto field = make_variant_field_with_typed_only_array_field(); + std::set ids; + HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"items", "n"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& items = child_by_name(top_typed_value, "items"); + const auto& items_typed_value = child_by_name(items, "typed_value"); + const auto& element = child_by_name(items_typed_value, "element"); + const auto& element_n = child_by_name(element, "n"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_FALSE(ids.contains(metadata.get_column_id())); + EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(items.get_column_id())); + EXPECT_TRUE(ids.contains(items_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(element.get_column_id())); + EXPECT_TRUE(ids.contains(element_n.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, IcebergVariantPruningMapsArrayElementPathToTypedElement) { + const auto field = make_variant_field_with_typed_only_array_field(); + std::set ids; + IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"items", "n"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& items = child_by_name(top_typed_value, "items"); + const auto& items_typed_value = child_by_name(items, "typed_value"); + const auto& element = child_by_name(items_typed_value, "element"); + const auto& element_n = 
child_by_name(element, "n"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_FALSE(ids.contains(metadata.get_column_id())); + EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(items.get_column_id())); + EXPECT_TRUE(ids.contains(items_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(element.get_column_id())); + EXPECT_TRUE(ids.contains(element_n.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, HiveVariantPruningMapsRootArraySubscriptToTypedElement) { + const auto field = make_variant_field_with_root_typed_only_array(); + std::set ids; + HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"1", "n"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& element = child_by_name(top_typed_value, "element"); + const auto& element_n = child_by_name(element, "n"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_FALSE(ids.contains(metadata.get_column_id())); + EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(element.get_column_id())); + EXPECT_TRUE(ids.contains(element_n.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, IcebergVariantPruningMapsRootArraySubscriptToTypedElement) { + const auto field = make_variant_field_with_root_typed_only_array(); + std::set ids; + IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"1", "n"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& element = child_by_name(top_typed_value, "element"); + const auto& element_n = child_by_name(element, "n"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_FALSE(ids.contains(metadata.get_column_id())); + EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(element.get_column_id())); + 
EXPECT_TRUE(ids.contains(element_n.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, HiveVariantPruningMapsTypedMapKeyToValueSubtree) { + const auto field = make_variant_field_with_typed_only_map_field(); + std::set ids; + HiveParquetNestedColumnUtils::extract_nested_column_ids(field, {{"attrs", "k", "n"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& attrs = child_by_name(top_typed_value, "attrs"); + const auto& key = child_by_name(attrs, "key"); + const auto& value = child_by_name(attrs, "value"); + const auto& value_n = child_by_name(value, "n"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_FALSE(ids.contains(metadata.get_column_id())); + EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(attrs.get_column_id())); + EXPECT_TRUE(ids.contains(key.get_column_id())); + EXPECT_TRUE(ids.contains(value.get_column_id())); + EXPECT_TRUE(ids.contains(value_n.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, IcebergVariantPruningMapsTypedMapKeyToValueSubtree) { + const auto field = make_variant_field_with_typed_only_map_field(); + std::set ids; + IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"attrs", "k", "n"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& attrs = child_by_name(top_typed_value, "attrs"); + const auto& key = child_by_name(attrs, "key"); + const auto& value = child_by_name(attrs, "value"); + const auto& value_n = child_by_name(value, "n"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_FALSE(ids.contains(metadata.get_column_id())); + EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(attrs.get_column_id())); + EXPECT_TRUE(ids.contains(key.get_column_id())); + EXPECT_TRUE(ids.contains(value.get_column_id())); + 
EXPECT_TRUE(ids.contains(value_n.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, IcebergVariantPruningSelectsValueOnlyResidualField) { + const auto field = make_variant_field_with_value_only_residual_field(); + std::set ids; + IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"metric", "x"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_value = child_by_name(field, "value"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& metric = child_by_name(top_typed_value, "metric"); + const auto& metric_value = child_by_name(metric, "value"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_TRUE(ids.contains(metadata.get_column_id())); + EXPECT_FALSE(ids.contains(top_value.get_column_id())); + EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(metric.get_column_id())); + EXPECT_TRUE(ids.contains(metric_value.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, IcebergVariantPruningKeepsResidualForTypedNestedField) { + const auto field = make_variant_field_with_partially_shredded_metric(); + std::set ids; + IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"metric", "x"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_value = child_by_name(field, "value"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& metric = child_by_name(top_typed_value, "metric"); + const auto& metric_value = child_by_name(metric, "value"); + const auto& metric_typed_value = child_by_name(metric, "typed_value"); + const auto& metric_x = child_by_name(metric_typed_value, "x"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_TRUE(ids.contains(metadata.get_column_id())); + EXPECT_FALSE(ids.contains(top_value.get_column_id())); + EXPECT_TRUE(ids.contains(top_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(metric.get_column_id())); + 
EXPECT_TRUE(ids.contains(metric_value.get_column_id())); + EXPECT_TRUE(ids.contains(metric_typed_value.get_column_id())); + EXPECT_TRUE(ids.contains(metric_x.get_column_id())); +} + +TEST(NestedColumnAccessHelperTest, IcebergVariantPruningTreatsNumericKeyAsNameNotFieldId) { + const auto field = make_variant_field_with_numeric_key_field_id_collision(); + std::set ids; + IcebergParquetNestedColumnUtils::extract_nested_column_ids(field, {{"20"}}, ids); + + const auto& metadata = child_by_name(field, "metadata"); + const auto& top_value = child_by_name(field, "value"); + const auto& top_typed_value = child_by_name(field, "typed_value"); + const auto& metric = child_by_name(top_typed_value, "metric"); + EXPECT_TRUE(ids.contains(field.get_column_id())); + EXPECT_TRUE(ids.contains(metadata.get_column_id())); + EXPECT_TRUE(ids.contains(top_value.get_column_id())); + EXPECT_FALSE(ids.contains(top_typed_value.get_column_id())); + EXPECT_FALSE(ids.contains(metric.get_column_id())); +} + +} // namespace doris diff --git a/fe/fe-connector/fe-connector-iceberg/src/main/java/org/apache/doris/connector/iceberg/IcebergTypeMapping.java b/fe/fe-connector/fe-connector-iceberg/src/main/java/org/apache/doris/connector/iceberg/IcebergTypeMapping.java index 9539e2547d4a01..e894d29f9070e7 100644 --- a/fe/fe-connector/fe-connector-iceberg/src/main/java/org/apache/doris/connector/iceberg/IcebergTypeMapping.java +++ b/fe/fe-connector/fe-connector-iceberg/src/main/java/org/apache/doris/connector/iceberg/IcebergTypeMapping.java @@ -52,6 +52,8 @@ public static ConnectorType fromIcebergType(Type icebergType, enableMappingVarbinary, enableMappingTimestampTz); } switch (icebergType.typeId()) { + case VARIANT: + return ConnectorType.of("VARIANT"); case LIST: Types.ListType list = (Types.ListType) icebergType; ConnectorType elemType = fromIcebergType( @@ -118,6 +120,8 @@ private static ConnectorType fromPrimitive(Type.PrimitiveType primitive, return ConnectorType.of("DATETIMEV2", 
ICEBERG_DATETIME_SCALE_MS, 0); case TIME: return ConnectorType.of("UNSUPPORTED"); + case VARIANT: + return ConnectorType.of("VARIANT"); default: return ConnectorType.of("UNSUPPORTED"); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergUtils.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergUtils.java index dd600b13725aeb..4b44bee8bcdec2 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergUtils.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergUtils.java @@ -606,6 +606,8 @@ private static Type icebergPrimitiveTypeToDorisType(org.apache.iceberg.types.Typ return ScalarType.createDatetimeV2Type(ICEBERG_DATETIME_SCALE_MS); case TIME: return Type.UNSUPPORTED; + case VARIANT: + return Type.VARIANT; default: throw new IllegalArgumentException("Cannot transform unknown type: " + primitive); } @@ -618,6 +620,8 @@ public static Type icebergTypeToDorisType(org.apache.iceberg.types.Type type, bo enableMappingVarbinary, enableMappingTimestampTz); } switch (type.typeId()) { + case VARIANT: + return Type.VARIANT; case LIST: Types.ListType list = (Types.ListType) type; return ArrayType.create( @@ -1680,6 +1684,52 @@ public static Schema appendRowLineageFieldsForV3(Schema schema) { MetadataColumns.ROW_ID, MetadataColumns.LAST_UPDATED_SEQUENCE_NUMBER)); } + public static void validateVariantWriteUnsupported(Schema schema) throws AnalysisException { + Optional<String> variantPath = findVariantFieldPath(schema); + if (variantPath.isPresent()) { + throw new AnalysisException("Writing Iceberg VARIANT columns is not supported: " + + variantPath.get()); + } + } + + private static Optional<String> findVariantFieldPath(Schema schema) { + for (NestedField field : schema.columns()) { + Optional<String> variantPath = findVariantFieldPath(field.type(), field.name()); + if (variantPath.isPresent()) { + return variantPath; + } + } + return Optional.empty(); + } + + private static Optional<String> 
findVariantFieldPath( + org.apache.iceberg.types.Type type, String path) { + switch (type.typeId()) { + case VARIANT: + return Optional.of(path); + case STRUCT: + for (NestedField field : type.asStructType().fields()) { + Optional<String> variantPath = + findVariantFieldPath(field.type(), path + "." + field.name()); + if (variantPath.isPresent()) { + return variantPath; + } + } + return Optional.empty(); + case LIST: + return findVariantFieldPath(type.asListType().elementType(), path + "[]"); + case MAP: + Optional<String> keyVariantPath = + findVariantFieldPath(type.asMapType().keyType(), path + ".key"); + if (keyVariantPath.isPresent()) { + return keyVariantPath; + } + return findVariantFieldPath(type.asMapType().valueType(), path + ".value"); + default: + return Optional.empty(); + } + } + public static int getFormatVersion(Table table) { int formatVersion = 2; // default format version : 2 if (table instanceof BaseTable) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java index adc2507e2490a3..5f380a7e4c0075 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java @@ -22,9 +22,14 @@ import org.apache.doris.analysis.TableScanParams; import org.apache.doris.analysis.TableSnapshot; import org.apache.doris.analysis.TupleDescriptor; +import org.apache.doris.catalog.ArrayType; import org.apache.doris.catalog.Column; import org.apache.doris.catalog.Env; +import org.apache.doris.catalog.MapType; +import org.apache.doris.catalog.StructField; +import org.apache.doris.catalog.StructType; import org.apache.doris.catalog.TableIf; +import org.apache.doris.catalog.Type; import org.apache.doris.common.DdlException; import org.apache.doris.common.UserException; import org.apache.doris.common.profile.SummaryProfile; 
@@ -817,6 +822,7 @@ private LocationPath createLocationPathWithCache(String path) { private Split createIcebergSplit(FileScanTask fileScanTask) { DataFile dataFile = fileScanTask.file(); String originalPath = dataFile.path().toString(); + validateVariantDataFileFormat(dataFile.format(), originalPath); LocationPath locationPath = createLocationPathWithCache(originalPath); IcebergSplit split = new IcebergSplit( locationPath, @@ -1058,6 +1064,7 @@ public TFileFormatType getFileFormatType() throws UserException { if (icebergFormat.equalsIgnoreCase("parquet")) { type = TFileFormatType.FORMAT_PARQUET; } else if (icebergFormat.equalsIgnoreCase("orc")) { + validateVariantReadSupported(icebergFormat); type = TFileFormatType.FORMAT_ORC; } else { throw new DdlException(String.format("Unsupported format name: %s for iceberg table.", icebergFormat)); @@ -1065,6 +1072,58 @@ public TFileFormatType getFileFormatType() throws UserException { return type; } + private void validateVariantReadSupported(String icebergFormat) throws DdlException { + String variantColumnName = findVariantReadColumnName(); + if (variantColumnName != null) { + throw new DdlException("Reading Iceberg VARIANT columns is only supported for Parquet files, " + + "but table file format is " + icebergFormat + ": " + variantColumnName); + } + } + + @VisibleForTesting + void validateVariantDataFileFormat(FileFormat dataFileFormat, String path) { + if (dataFileFormat == FileFormat.PARQUET) { + return; + } + String variantColumnName = findVariantReadColumnName(); + if (variantColumnName != null) { + throw new NotSupportedException("Reading Iceberg VARIANT columns is only supported for Parquet files, " + + "but data file format is " + dataFileFormat.name() + ": " + variantColumnName + + " (" + path + ")"); + } + } + + private String findVariantReadColumnName() { + for (SlotDescriptor slot : desc.getSlots()) { + Column column = slot.getColumn(); + if (containsVariantType(column.getType())) { + return column.getName(); 
+ } + return null; + } + + private static boolean containsVariantType(Type type) { + if (type.isVariantType()) { + return true; + } + if (type.isArrayType()) { + return containsVariantType(((ArrayType) type).getItemType()); + } + if (type.isMapType()) { + MapType mapType = (MapType) type; + return containsVariantType(mapType.getKeyType()) || containsVariantType(mapType.getValueType()); + } + if (type.isStructType()) { + for (StructField field : ((StructType) type).getFields()) { + if (containsVariantType(field.getType())) { + return true; + } + } + } + return false; + } + @Override public List<String> getPathPartitionKeys() throws UserException { // return icebergTable.spec().fields().stream().map(PartitionField::name).map(String::toLowerCase) diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java index b2ca6cfa2b622a..07d5859893c532 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java @@ -637,7 +638,8 @@ public PlanFragment visitPhysicalIcebergMergeSink(PhysicalIcebergMergeSink builderPath = context.accessPathBuilder.getPathList(); + if (dataType instanceof VariantType && builderPath.size() == 1 + && AccessPathInfo.ACCESS_NULL.equals(builderPath.get(0))) { + recordVariantRootAccessPath(slotReference, context); + return null; + } if (dataType instanceof VariantType && (slotReference.hasSubColPath() || !context.accessPathBuilder.isEmpty())) { List<String> path = new ArrayList<>(); @@ -122,7 +153,6 @@ public Void visitSlotReference(SlotReference slotReference, CollectorContext con } // Strip NULL suffix for variant sub-column access — null-flag-only optimization // does not apply to variant sub-column data layout. 
- List<String> builderPath = context.accessPathBuilder.getPathList(); if (builderPath.size() > 1 && AccessPathInfo.ACCESS_NULL.equals(builderPath.get(builderPath.size() - 1))) { builderPath = new ArrayList<>(builderPath.subList(0, builderPath.size() - 1)); @@ -133,6 +163,10 @@ public Void visitSlotReference(SlotReference slotReference, CollectorContext con path, context.bottomFilter, ColumnAccessPathType.DATA)); return null; } + if (dataType instanceof VariantType && context.collectVariantRoot) { + recordVariantRootAccessPath(slotReference, context); + return null; + } if (dataType instanceof NestedColumnPrunable) { context.accessPathBuilder.addPrefix(slotReference.getName().toLowerCase()); ImmutableList<String> path = Utils.fastToImmutableList(context.accessPathBuilder.accessPath); @@ -265,39 +299,94 @@ public Void visitAlias(Alias alias, CollectorContext context) { @Override public Void visitCast(Cast cast, CollectorContext context) { + Expression child = cast.child(0); + if (child.getDataType() instanceof VariantType && context.accessPathBuilder.isEmpty()) { + if (isVariantLiteralPathAccess(child)) { + return continueCollectAccessPath(child, context); + } + CollectorContext variantRootContext = context.copy(); + variantRootContext.setCollectVariantRoot(true); + return continueCollectAccessPath(child, variantRootContext); + } + if (child.getDataType() instanceof VariantType && !context.accessPathBuilder.isEmpty() + && (cast.getDataType() instanceof VariantType + || cast.getDataType() instanceof NestedColumnPrunable)) { + return continueCollectAccessPath(child, context); + } if (!context.accessPathBuilder.isEmpty() && cast.getDataType() instanceof NestedColumnPrunable - && cast.child().getDataType() instanceof NestedColumnPrunable - && !mapTypeIsChanged(cast.child().getDataType(), cast.getDataType(), false)) { + && child.getDataType() instanceof NestedColumnPrunable + && !mapTypeIsChanged(child.getDataType(), cast.getDataType(), false)) { DataTypeAccessTree castTree = 
DataTypeAccessTree.of( cast.getDataType(), ColumnAccessPathType.DATA); DataTypeAccessTree originTree = DataTypeAccessTree.of( - cast.child().getDataType(), ColumnAccessPathType.DATA); + child.getDataType(), ColumnAccessPathType.DATA); List<String> replacePath = new ArrayList<>(context.accessPathBuilder.getPathList()); if (originTree.replacePathByAnotherTree(castTree, replacePath, 0)) { CollectorContext castContext = new CollectorContext(context.statementContext, context.bottomFilter); castContext.accessPathBuilder.accessPath.addAll(replacePath); - return continueCollectAccessPath(cast.child(), castContext); + return continueCollectAccessPath(child, castContext); } } - return cast.child(0).accept(this, + return child.accept(this, new CollectorContext(context.statementContext, context.bottomFilter) ); } + @Override + public Void visitGetVariantType(GetVariantType getVariantType, CollectorContext context) { + Expression child = getVariantType.child(0); + if (child.getDataType() instanceof VariantType && context.accessPathBuilder.isEmpty()) { + CollectorContext variantRootContext = context.copy(); + variantRootContext.setCollectVariantRoot(true); + return continueCollectAccessPath(child, variantRootContext); + } + return visit(getVariantType, context); + } + + @Override + public Void visitComparisonPredicate(ComparisonPredicate comparisonPredicate, CollectorContext context) { + if (context.collectVariantRoot) { + return visit(comparisonPredicate, context); + } + for (Expression child : comparisonPredicate.children()) { + CollectorContext childContext = + new CollectorContext(context.statementContext, context.bottomFilter); + if (child.getDataType() instanceof VariantType && context.accessPathBuilder.isEmpty() + && !isVariantLiteralPathAccess(child)) { + childContext.setCollectVariantRoot(true); + } + child.accept(this, childContext); + } + return null; + } + + private boolean isVariantLiteralPathAccess(Expression expression) { + if (expression instanceof SlotReference) { + 
return ((SlotReference) expression).hasSubColPath(); + } + if (!(expression instanceof ElementAt)) { + return false; + } + ElementAt elementAt = (ElementAt) expression; + return elementAt.child(0).getDataType().isVariantType() && elementAt.child(1).isLiteral(); + } + // array element at @Override public Void visitElementAt(ElementAt elementAt, CollectorContext context) { List<Expression> arguments = elementAt.getArguments(); Expression first = arguments.get(0); if (first.getDataType().isArrayType() || first.getDataType().isMapType()) { - context.accessPathBuilder.addPrefix(AccessPathInfo.ACCESS_ALL); - continueCollectAccessPath(first, context); + CollectorContext valueContext = context.copy(); + valueContext.accessPathBuilder.addPrefix(AccessPathInfo.ACCESS_ALL); + continueCollectAccessPath(first, valueContext); for (int i = 1; i < arguments.size(); i++) { - visit(arguments.get(i), context); + arguments.get(i).accept(this, + new CollectorContext(context.statementContext, context.bottomFilter)); } return null; } else if (first.getDataType().isVariantType() && arguments.size() >= 2 @@ -313,6 +402,18 @@ public Void visitElementAt(ElementAt elementAt, CollectorContext context) { return continueCollectAccessPath(first, context); } return visit(elementAt, context); + } else if (first.getDataType().isVariantType() && arguments.size() >= 2) { + // Dynamic keys can hit any VARIANT field. Drop any outer literal suffix, e.g. + // v[cast(id AS string)]['x'], and require the first argument's root instead. 
+ CollectorContext variantRootContext = + new CollectorContext(context.statementContext, context.bottomFilter); + variantRootContext.setCollectVariantRoot(true); + continueCollectAccessPath(first, variantRootContext); + for (int i = 1; i < arguments.size(); i++) { + arguments.get(i).accept(this, + new CollectorContext(context.statementContext, context.bottomFilter)); + } + return null; } else { return visit(elementAt, context); } @@ -346,6 +447,55 @@ public Void visitStructElement(StructElement structElement, CollectorContext con return null; } + @Override + public Void visitCreateNamedStruct(CreateNamedStruct createNamedStruct, CollectorContext context) { + List<String> path = context.accessPathBuilder.getPathList(); + if (!path.isEmpty()) { + String fieldName = path.get(0); + for (int i = 0; i + 1 < createNamedStruct.arity(); i += 2) { + Expression fieldNameExpr = createNamedStruct.child(i); + if (fieldNameExpr.isLiteral() && fieldNameExpr.getDataType().isStringLikeType() + && fieldName.equalsIgnoreCase(((Literal) fieldNameExpr).getStringValue())) { + return collectConstructedStructField(createNamedStruct.child(i + 1), context); + } + } + } + return context.accessPathBuilder.isEmpty() + ? visit(createNamedStruct, context) + : collectChildrenWithoutAccessPath(createNamedStruct, context); + } + + @Override + public Void visitCreateStruct(CreateStruct createStruct, CollectorContext context) { + List<String> path = context.accessPathBuilder.getPathList(); + if (!path.isEmpty()) { + String fieldName = path.get(0); + for (int i = 0; i < createStruct.arity(); i++) { + if (fieldName.equalsIgnoreCase(StructLiteral.COL_PREFIX + (i + 1))) { + return collectConstructedStructField(createStruct.child(i), context); + } + } + } + return context.accessPathBuilder.isEmpty() + ? 
visit(createStruct, context) + : collectChildrenWithoutAccessPath(createStruct, context); + } + + private Void collectConstructedStructField(Expression fieldValue, CollectorContext context) { + List<String> path = context.accessPathBuilder.getPathList(); + CollectorContext fieldContext = new CollectorContext(context.statementContext, context.bottomFilter); + fieldContext.setType(context.type); + fieldContext.getAccessPathBuilder().addSuffix(path.subList(1, path.size())); + return continueCollectAccessPath(fieldValue, fieldContext); + } + + private Void collectChildrenWithoutAccessPath(Expression expression, CollectorContext context) { + for (Expression child : expression.children()) { + child.accept(this, new CollectorContext(context.statementContext, context.bottomFilter)); + } + return null; + } + @Override public Void visitMapKeys(MapKeys mapKeys, CollectorContext context) { LinkedList<String> suffixPath = context.accessPathBuilder.accessPath; @@ -388,22 +538,39 @@ private static boolean isFunctionNullCheckPath(List<String> suffixPath) { return suffixPath.size() == 1 && AccessPathInfo.ACCESS_NULL.equals(suffixPath.get(0)); } + private void collectArgumentsAfterFirst( + List<Expression> arguments, CollectorContext context) { + for (int i = 1; i < arguments.size(); i++) { + arguments.get(i).accept(this, + new CollectorContext(context.statementContext, context.bottomFilter)); + } + } + @Override public Void visitMapContainsKey(MapContainsKey mapContainsKey, CollectorContext context) { - context.accessPathBuilder.addPrefix(AccessPathInfo.ACCESS_MAP_KEYS); - return continueCollectAccessPath(mapContainsKey.getArgument(0), context); + CollectorContext keyContext = context.copy(); + keyContext.accessPathBuilder.addPrefix(AccessPathInfo.ACCESS_MAP_KEYS); + continueCollectAccessPath(mapContainsKey.getArgument(0), keyContext); + collectArgumentsAfterFirst(mapContainsKey.getArguments(), context); + return null; } @Override public Void visitMapContainsValue(MapContainsValue mapContainsValue, CollectorContext 
context) { - context.accessPathBuilder.addPrefix(AccessPathInfo.ACCESS_MAP_VALUES); - return continueCollectAccessPath(mapContainsValue.getArgument(0), context); + CollectorContext valueContext = context.copy(); + valueContext.accessPathBuilder.addPrefix(AccessPathInfo.ACCESS_MAP_VALUES); + continueCollectAccessPath(mapContainsValue.getArgument(0), valueContext); + collectArgumentsAfterFirst(mapContainsValue.getArguments(), context); + return null; } @Override public Void visitMapContainsEntry(MapContainsEntry mapContainsEntry, CollectorContext context) { - context.accessPathBuilder.addPrefix(AccessPathInfo.ACCESS_ALL); - return continueCollectAccessPath(mapContainsEntry.getArgument(0), context); + CollectorContext entryContext = context.copy(); + entryContext.accessPathBuilder.addPrefix(AccessPathInfo.ACCESS_ALL); + continueCollectAccessPath(mapContainsEntry.getArgument(0), entryContext); + collectArgumentsAfterFirst(mapContainsEntry.getArguments(), context); + return null; } @Override @@ -619,12 +786,14 @@ public static class CollectorContext { private AccessPathBuilder accessPathBuilder; private boolean bottomFilter; private ColumnAccessPathType type; + private boolean collectVariantRoot; public CollectorContext(StatementContext statementContext, boolean bottomFilter) { this.statementContext = statementContext; this.accessPathBuilder = new AccessPathBuilder(); this.bottomFilter = bottomFilter; this.type = ColumnAccessPathType.DATA; + this.collectVariantRoot = false; } public ColumnAccessPathType getType() { @@ -638,6 +807,18 @@ public void setType(ColumnAccessPathType type) { public AccessPathBuilder getAccessPathBuilder() { return accessPathBuilder; } + + public void setCollectVariantRoot(boolean collectVariantRoot) { + this.collectVariantRoot = collectVariantRoot; + } + + public CollectorContext copy() { + CollectorContext context = new CollectorContext(statementContext, bottomFilter); + 
context.accessPathBuilder.accessPath.addAll(accessPathBuilder.accessPath); + context.type = type; + context.collectVariantRoot = collectVariantRoot; + return context; + } } /** AccessPathBuilder */ diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AccessPathPlanCollector.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AccessPathPlanCollector.java index d7ff817f12c6d8..c7ff996f3ddc1f 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AccessPathPlanCollector.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AccessPathPlanCollector.java @@ -25,11 +25,13 @@ import org.apache.doris.nereids.trees.expressions.Expression; import org.apache.doris.nereids.trees.expressions.NamedExpression; import org.apache.doris.nereids.trees.expressions.Slot; +import org.apache.doris.nereids.trees.expressions.SlotReference; import org.apache.doris.nereids.trees.expressions.functions.Function; import org.apache.doris.nereids.trees.expressions.functions.generator.Explode; import org.apache.doris.nereids.trees.expressions.functions.generator.ExplodeMap; import org.apache.doris.nereids.trees.expressions.functions.generator.ExplodeMapOuter; import org.apache.doris.nereids.trees.expressions.functions.generator.ExplodeOuter; +import org.apache.doris.nereids.trees.expressions.functions.generator.ExplodeVariantArray; import org.apache.doris.nereids.trees.expressions.functions.generator.PosExplode; import org.apache.doris.nereids.trees.expressions.functions.generator.PosExplodeOuter; import org.apache.doris.nereids.trees.expressions.literal.StructLiteral; @@ -64,6 +66,7 @@ public class AccessPathPlanCollector extends DefaultPlanVisitor { private Multimap allSlotToAccessPaths = LinkedHashMultimap.create(); private Map> scanSlotToAccessPaths = new LinkedHashMap<>(); + private boolean collectWholeVariantOutputEnabled = true; public Map> collect(Plan root, StatementContext context) { root.accept(this, 
context); @@ -89,11 +92,12 @@ public Void visitLogicalGenerate(LogicalGenerate generate, State Function function = generators.get(i); Collection accessPaths = allSlotToAccessPaths.get( generatorOutput.getExprId().asInt()); - if (function instanceof Explode || function instanceof ExplodeOuter) { + if (function instanceof Explode || function instanceof ExplodeOuter + || function instanceof ExplodeVariantArray) { if (accessPaths.isEmpty()) { // use the whole column for (Expression child : function.children()) { - exprCollector.collect(child); + exprCollector.collectWholeVariantExpression(child); } } else { for (CollectAccessPathResult accessPath : accessPaths) { @@ -105,6 +109,7 @@ public Void visitLogicalGenerate(LogicalGenerate generate, State if (function.child(0).getDataType().isVariantType()) { argumentContext.getAccessPathBuilder() .addSuffix(path.subList(1, path.size())); + argumentContext.setCollectVariantRoot(path.size() == 1); } else { argumentContext.getAccessPathBuilder() .addSuffix(AccessPathInfo.ACCESS_ALL) @@ -122,6 +127,7 @@ public Void visitLogicalGenerate(LogicalGenerate generate, State if (function.child(colIndex).getDataType().isVariantType()) { argumentContext.getAccessPathBuilder() .addSuffix(path.subList(2, path.size())); + argumentContext.setCollectVariantRoot(path.size() == 2); } else { argumentContext.getAccessPathBuilder() .addSuffix(AccessPathInfo.ACCESS_ALL) @@ -132,7 +138,7 @@ public Void visitLogicalGenerate(LogicalGenerate generate, State } // use the whole column for (Expression child : function.children()) { - exprCollector.collect(child); + exprCollector.collectWholeVariantExpression(child); } } } @@ -140,7 +146,7 @@ public Void visitLogicalGenerate(LogicalGenerate generate, State if (accessPaths.isEmpty()) { // use the whole column for (Expression child : function.children()) { - exprCollector.collect(child); + exprCollector.collectWholeVariantExpression(child); } } else { for (CollectAccessPathResult accessPath : accessPaths) { @@ 
-171,14 +177,14 @@ public Void visitLogicalGenerate(LogicalGenerate generate, State } } // use the whole column - exprCollector.collect(function.child(0)); + exprCollector.collectWholeVariantExpression(function.child(0)); } } } else if (function instanceof PosExplode || function instanceof PosExplodeOuter) { if (accessPaths.isEmpty()) { // use the whole column for (Expression child : function.children()) { - exprCollector.collect(child); + exprCollector.collectWholeVariantExpression(child); } } else { boolean useWholeItem = false; @@ -209,7 +215,7 @@ public Void visitLogicalGenerate(LogicalGenerate generate, State if (useWholeItem) { // use the whole column for (Expression child : function.children()) { - exprCollector.collect(child); + exprCollector.collectWholeVariantExpression(child); } } else { for (int j = 0; j < function.arity(); j++) { @@ -223,7 +229,13 @@ public Void visitLogicalGenerate(LogicalGenerate generate, State exprCollector.collect(function); } } - return generate.child().accept(this, context); + boolean previousCollectWholeVariantOutputEnabled = collectWholeVariantOutputEnabled; + collectWholeVariantOutputEnabled = false; + try { + return generate.child().accept(this, context); + } finally { + collectWholeVariantOutputEnabled = previousCollectWholeVariantOutputEnabled; + } } @Override @@ -231,34 +243,52 @@ public Void visitLogicalProject(LogicalProject project, Statemen AccessPathExpressionCollector exprCollector = new AccessPathExpressionCollector(context, allSlotToAccessPaths, false); for (NamedExpression output : project.getProjects()) { + Collection outputAccessPaths = + allSlotToAccessPaths.get(output.getExprId().asInt()); // e.g. 
select struct_element(s, 'city') from (select s from tbl)a; // we will not treat the inner `s` access all path - if (output instanceof Slot && allSlotToAccessPaths.containsKey(output.getExprId().asInt())) { + if (collectWholeVariantOutputEnabled && output instanceof Slot + && collectWholeVariantOutput((Slot) output)) { continue; - } else if (output instanceof Alias && output.child(0) instanceof Slot - && allSlotToAccessPaths.containsKey(output.getExprId().asInt())) { - Slot innerSlot = (Slot) output.child(0); - Collection outerSlotAccessPaths = allSlotToAccessPaths.get( - output.getExprId().asInt()); - for (CollectAccessPathResult outerSlotAccessPath : outerSlotAccessPaths) { - List outerPath = outerSlotAccessPath.getPath(); - List replaceSlotNamePath = new ArrayList<>(); - replaceSlotNamePath.add(innerSlot.getName()); - replaceSlotNamePath.addAll(outerPath.subList(1, outerPath.size())); - allSlotToAccessPaths.put( - innerSlot.getExprId().asInt(), - new CollectAccessPathResult( - replaceSlotNamePath, - outerSlotAccessPath.isPredicate(), - outerSlotAccessPath.getType() - ) - ); - } + } else if (output instanceof Slot && !outputAccessPaths.isEmpty()) { + continue; + } else if (output instanceof Alias && !outputAccessPaths.isEmpty()) { + collectAliasAccessPaths((Alias) output, outputAccessPaths, exprCollector, context); + } else if (collectWholeVariantOutputEnabled && output instanceof Alias && output.child(0) instanceof Slot + && collectWholeVariantOutput((Slot) output.child(0))) { + continue; + } else if (collectWholeVariantOutputEnabled && output.getDataType().isVariantType()) { + exprCollector.collectWholeVariantExpression(output); } else { exprCollector.collect(output); } } - return project.child().accept(this, context); + boolean previousCollectWholeVariantOutputEnabled = collectWholeVariantOutputEnabled; + collectWholeVariantOutputEnabled = false; + try { + return project.child().accept(this, context); + } finally { + collectWholeVariantOutputEnabled = 
previousCollectWholeVariantOutputEnabled; + } + } + + private void collectAliasAccessPaths(Alias alias, Collection aliasAccessPaths, + AccessPathExpressionCollector exprCollector, StatementContext context) { + Expression child = alias.child(0); + if (collectWholeVariantOutputEnabled && child.getDataType().isVariantType()) { + exprCollector.collectWholeVariantExpression(child); + } + for (CollectAccessPathResult aliasAccessPath : aliasAccessPaths) { + List aliasPath = aliasAccessPath.getPath(); + CollectorContext childContext = new CollectorContext(context, aliasAccessPath.isPredicate()); + childContext.setType(aliasAccessPath.getType()); + if (aliasPath.size() == 1 && child.getDataType().isVariantType()) { + childContext.setCollectVariantRoot(true); + } else { + childContext.getAccessPathBuilder().addSuffix(aliasPath.subList(1, aliasPath.size())); + } + child.accept(exprCollector, childContext); + } } @Override @@ -305,7 +335,13 @@ public Void visitLogicalCTEConsumer(LogicalCTEConsumer cteConsumer, StatementCon @Override public Void visitLogicalCTEProducer(LogicalCTEProducer cteProducer, StatementContext context) { - return cteProducer.child().accept(this, context); + boolean previousCollectWholeVariantOutputEnabled = collectWholeVariantOutputEnabled; + collectWholeVariantOutputEnabled = false; + try { + return cteProducer.child().accept(this, context); + } finally { + collectWholeVariantOutputEnabled = previousCollectWholeVariantOutputEnabled; + } } @Override @@ -391,6 +427,18 @@ private void collectByExpressions(Plan plan, StatementContext context, boolean b } } + private boolean collectWholeVariantOutput(Slot slot) { + if (!slot.getDataType().isVariantType() + || (slot instanceof SlotReference && ((SlotReference) slot).hasSubColPath())) { + return false; + } + List path = new ArrayList<>(); + path.add(slot.getName()); + allSlotToAccessPaths.put(slot.getExprId().asInt(), + new CollectAccessPathResult(path, false, ColumnAccessPathType.DATA)); + return true; + } + 
    static List<CollectAccessPathResult> normalizeDataSkippingOnlyAccessPaths(
            Collection<CollectAccessPathResult> accessPaths) {
        List<CollectAccessPathResult> normalizedAccessPaths = new ArrayList<>();
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/NestedColumnPruning.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/NestedColumnPruning.java
index 0da73d3e936cff..3621b3db9c45a0 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/NestedColumnPruning.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/NestedColumnPruning.java
@@ -391,6 +391,7 @@ && containsDataSkippingOnlyAccessPath(collectAccessPathResults)) {
         for (Entry kv : variantSlots.entrySet()) {
             Slot slot = kv.getKey();
+            stripVariantSubpathsCoveredByFullPath(slot, allAccessPaths);
             List allPaths = buildColumnAccessPaths(slot, allAccessPaths);
             result.put(slot.getExprId().asInt(),
                     new AccessPathInfo(slot.getDataType(), allPaths, new ArrayList<>()));
@@ -421,6 +422,34 @@ && containsDataSkippingOnlyAccessPath(collectAccessPathResults)) {
         return result;
     }

+    private static void stripVariantSubpathsCoveredByFullPath(
+            Slot slot, Multimap<Integer, Pair<ColumnAccessPathType, List<String>>> allAccessPaths) {
+        int slotId = slot.getExprId().asInt();
+        Collection<Pair<ColumnAccessPathType, List<String>>> paths = allAccessPaths.get(slotId);
+        boolean hasFullPath = false;
+        for (Pair<ColumnAccessPathType, List<String>> path : paths) {
+            if (path.first == ColumnAccessPathType.DATA
+                    && path.second.size() == 1
+                    && path.second.get(0).equalsIgnoreCase(slot.getName())) {
+                hasFullPath = true;
+                break;
+            }
+        }
+        if (!hasFullPath) {
+            return;
+        }
+
+        List<Pair<ColumnAccessPathType, List<String>>> pathsToRemove = new ArrayList<>();
+        for (Pair<ColumnAccessPathType, List<String>> path : paths) {
+            if (path.first == ColumnAccessPathType.DATA
+                    && path.second.size() > 1
+                    && path.second.get(0).equalsIgnoreCase(slot.getName())) {
+                pathsToRemove.add(path);
+            }
+        }
+        paths.removeAll(pathsToRemove);
+    }
+
     private static boolean containsDataSkippingOnlyAccessPath(
             List<CollectAccessPathResult> collectAccessPathResults) {
         for (CollectAccessPathResult collectAccessPathResult : collectAccessPathResults) {
@@ -1007,6 +1036,9 @@ public
void setAccessByPath(List path, int accessIndex, ColumnAccessPath // Any other sub-path on a string column means full data is needed. accessAll = true; return; + } else if (type.isVariantType()) { + accessAll = true; + return; } else if (isRoot) { children.get(path.get(accessIndex).toLowerCase()).setAccessByPath(path, accessIndex + 1, pathType); return; diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SlotTypeReplacer.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SlotTypeReplacer.java index 6a8fd28b902ba7..b734499c65e510 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SlotTypeReplacer.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SlotTypeReplacer.java @@ -64,6 +64,7 @@ import org.apache.doris.nereids.types.MapType; import org.apache.doris.nereids.types.NestedColumnPrunable; import org.apache.doris.nereids.types.StructType; +import org.apache.doris.nereids.types.VariantType; import org.apache.doris.nereids.util.MoreFieldsThread; import com.google.common.collect.ImmutableCollection; @@ -236,8 +237,8 @@ public Plan visitLogicalExcept(LogicalExcept except, Void context) { = replaceExpressions(except.getOutputs(), true, false); if (replacedRegularChildrenOutputs.first || replacedOutputs.first) { - return new LogicalExcept(except.getQualifier(), except.getOutputs(), - except.getRegularChildrenOutputs(), except.children()); + return new LogicalExcept(except.getQualifier(), replacedOutputs.second, + replacedRegularChildrenOutputs.second, except.children()); } return except; @@ -254,8 +255,8 @@ public Plan visitLogicalIntersect(LogicalIntersect intersect, Void context) { = replaceExpressions(intersect.getOutputs(), true, false); if (replacedRegularChildrenOutputs.first || replacedOutputs.first) { - return new LogicalIntersect(intersect.getQualifier(), intersect.getOutputs(), - intersect.getRegularChildrenOutputs(), intersect.children()); + return new 
LogicalIntersect(intersect.getQualifier(), replacedOutputs.second, + replacedRegularChildrenOutputs.second, intersect.children()); } return intersect; } @@ -654,16 +655,22 @@ private void replaceIcebergAccessPathToId(List originPath, int index, Da replaceIcebergAccessPathToId( originPath, index + 1, ((MapType) type).getValueType(), column.getChildren().get(1) ); + } else if (fieldName.equals(AccessPathInfo.ACCESS_MAP_KEYS)) { + replaceIcebergAccessPathToId( + originPath, index + 1, ((MapType) type).getKeyType(), column.getChildren().get(0) + ); } } else if (type instanceof StructType) { for (Column child : column.getChildren()) { - if (child.getName().equals(fieldName)) { + if (child.getName().equalsIgnoreCase(fieldName)) { originPath.set(index, String.valueOf(child.getUniqueId())); - DataType childType = ((StructType) type).getNameToFields().get(fieldName).getDataType(); + DataType childType = ((StructType) type).getField(fieldName).getDataType(); replaceIcebergAccessPathToId(originPath, index + 1, childType, child); break; } } + } else if (type instanceof VariantType) { + replaceIcebergAccessPathToId(originPath, index + 1, type, column); } else { originPath.set(index, String.valueOf(column.getUniqueId())); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/IcebergMergeCommand.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/IcebergMergeCommand.java index 7770d147812682..e66557108cc2c3 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/IcebergMergeCommand.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/IcebergMergeCommand.java @@ -237,12 +237,16 @@ private List buildDeleteProjection(Expression rowIdExpr, List nameParts = Lists.newArrayList(targetNameInPlan); + nameParts.add(column.getName()); + projection.add(new UnboundSlot(nameParts)); + continue; + } + if (!column.isVisible()) { continue; } - List nameParts = 
Lists.newArrayList(targetNameInPlan);
-            nameParts.add(column.getName());
-            projection.add(new UnboundSlot(nameParts));
+            projection.add(new NullLiteral(DataType.fromCatalogType(column.getType())));
         }
         return projection;
     }
@@ -462,11 +466,25 @@ private LogicalPlan buildMergePlan(ConnectContext ctx, IcebergExternalTable iceb
                 icebergTable.getBaseSchema(true),
                 outputExprs,
                 deleteCtx,
+                writesDataFiles(matchedClauses, notMatchedClauses),
                 Optional.empty(),
                 Optional.empty(),
                 projectPlan);
     }

+    static boolean writesDataFiles(List<MergeMatchedClause> matchedClauses,
+            List notMatchedClauses) {
+        if (!notMatchedClauses.isEmpty()) {
+            return true;
+        }
+        for (MergeMatchedClause clause : matchedClauses) {
+            if (!clause.isDelete()) {
+                return true;
+            }
+        }
+        return false;
+    }
+
     private boolean executeMergePlan(ConnectContext ctx, StmtExecutor executor,
             IcebergExternalTable icebergTable, LogicalPlan logicalPlan) throws Exception {
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/IcebergInsertExecutor.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/IcebergInsertExecutor.java
index 6f9b951a9a6e06..be068206a89ae6 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/IcebergInsertExecutor.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/IcebergInsertExecutor.java
@@ -17,10 +17,12 @@
 package org.apache.doris.nereids.trees.plans.commands.insert;

+import org.apache.doris.common.AnalysisException;
 import org.apache.doris.common.UserException;
 import org.apache.doris.datasource.NameMapping;
 import org.apache.doris.datasource.iceberg.IcebergExternalTable;
 import org.apache.doris.datasource.iceberg.IcebergTransaction;
+import org.apache.doris.datasource.iceberg.IcebergUtils;
 import org.apache.doris.nereids.NereidsPlanner;
 import org.apache.doris.qe.ConnectContext;
 import org.apache.doris.transaction.TransactionType;
@@ -46,10 +48,20 @@ public
IcebergInsertExecutor(ConnectContext ctx, IcebergExternalTable table,
         super(ctx, table, labelName, planner, insertCtx, emptyInsert, jobId);
     }

+    private static void rejectVariantWrites(IcebergExternalTable table) throws UserException {
+        try {
+            IcebergUtils.validateVariantWriteUnsupported(table.getIcebergTable().schema());
+        } catch (AnalysisException e) {
+            throw new UserException(e.getMessage(), e);
+        }
+    }
+
     @Override
     protected void beforeExec() throws UserException {
+        IcebergExternalTable icebergTable = (IcebergExternalTable) table;
+        rejectVariantWrites(icebergTable);
         IcebergTransaction transaction = (IcebergTransaction) transactionManager.getTransaction(txnId);
-        transaction.beginInsert((IcebergExternalTable) table, insertCtx);
+        transaction.beginInsert(icebergTable, insertCtx);
     }

     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalIcebergMergeSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalIcebergMergeSink.java
index 7f528020890dde..0e1acae31877a9 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalIcebergMergeSink.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalIcebergMergeSink.java
@@ -47,6 +47,7 @@ public class LogicalIcebergMergeSink extends LogicalTab
     private final IcebergExternalDatabase database;
     private final IcebergExternalTable targetTable;
     private final DeleteCommandContext deleteContext;
+    private final boolean writeDataFiles;

     /**
      * Constructor
      */
@@ -59,10 +60,24 @@ public LogicalIcebergMergeSink(IcebergExternalDatabase database,
             Optional<GroupExpression> groupExpression,
             Optional<LogicalProperties> logicalProperties,
             CHILD_TYPE child) {
+        this(database, targetTable, cols, outputExprs, deleteContext, true, groupExpression,
+                logicalProperties, child);
+    }
+
+    public LogicalIcebergMergeSink(IcebergExternalDatabase database,
+            IcebergExternalTable targetTable,
+            List<Column> cols,
+            List<NamedExpression> outputExprs,
+            DeleteCommandContext
deleteContext, + boolean writeDataFiles, + Optional groupExpression, + Optional logicalProperties, + CHILD_TYPE child) { super(PlanType.LOGICAL_ICEBERG_MERGE_SINK, outputExprs, groupExpression, logicalProperties, cols, child); this.database = Objects.requireNonNull(database, "database != null in LogicalIcebergMergeSink"); this.targetTable = Objects.requireNonNull(targetTable, "targetTable != null in LogicalIcebergMergeSink"); this.deleteContext = Objects.requireNonNull(deleteContext, "deleteContext != null in LogicalIcebergMergeSink"); + this.writeDataFiles = writeDataFiles; } public Plan withChildAndUpdateOutput(Plan child) { @@ -70,19 +85,19 @@ public Plan withChildAndUpdateOutput(Plan child) { .map(NamedExpression.class::cast) .collect(ImmutableList.toImmutableList()); return new LogicalIcebergMergeSink<>(database, targetTable, cols, output, - deleteContext, Optional.empty(), Optional.empty(), child); + deleteContext, writeDataFiles, Optional.empty(), Optional.empty(), child); } @Override public Plan withChildren(List children) { Preconditions.checkArgument(children.size() == 1, "LogicalIcebergMergeSink only accepts one child"); return new LogicalIcebergMergeSink<>(database, targetTable, cols, outputExprs, - deleteContext, Optional.empty(), Optional.empty(), children.get(0)); + deleteContext, writeDataFiles, Optional.empty(), Optional.empty(), children.get(0)); } public LogicalIcebergMergeSink withOutputExprs(List outputExprs) { return new LogicalIcebergMergeSink<>(database, targetTable, cols, outputExprs, - deleteContext, Optional.empty(), Optional.empty(), child()); + deleteContext, writeDataFiles, Optional.empty(), Optional.empty(), child()); } public IcebergExternalDatabase getDatabase() { @@ -97,6 +112,10 @@ public DeleteCommandContext getDeleteContext() { return deleteContext; } + public boolean writeDataFiles() { + return writeDataFiles; + } + @Override public boolean equals(Object o) { if (this == o) { @@ -112,12 +131,13 @@ public boolean equals(Object 
o) { return Objects.equals(database, that.database) && Objects.equals(targetTable, that.targetTable) && Objects.equals(deleteContext, that.deleteContext) + && writeDataFiles == that.writeDataFiles && Objects.equals(cols, that.cols); } @Override public int hashCode() { - return Objects.hash(super.hashCode(), database, targetTable, cols, deleteContext); + return Objects.hash(super.hashCode(), database, targetTable, cols, deleteContext, writeDataFiles); } @Override @@ -127,7 +147,8 @@ public String toString() { "database", database.getFullName(), "targetTable", targetTable.getName(), "cols", cols, - "deleteFileType", deleteContext.getDeleteFileType()); + "deleteFileType", deleteContext.getDeleteFileType(), + "writeDataFiles", writeDataFiles); } @Override @@ -138,13 +159,13 @@ public R accept(PlanVisitor visitor, C context) { @Override public Plan withGroupExpression(Optional groupExpression) { return new LogicalIcebergMergeSink<>(database, targetTable, cols, outputExprs, - deleteContext, groupExpression, Optional.of(getLogicalProperties()), child()); + deleteContext, writeDataFiles, groupExpression, Optional.of(getLogicalProperties()), child()); } @Override public Plan withGroupExprLogicalPropChildren(Optional groupExpression, Optional logicalProperties, List children) { return new LogicalIcebergMergeSink<>(database, targetTable, cols, outputExprs, - deleteContext, groupExpression, logicalProperties, children.get(0)); + deleteContext, writeDataFiles, groupExpression, logicalProperties, children.get(0)); } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalIcebergMergeSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalIcebergMergeSink.java index 0281ad23243496..d5fa3d1fabc4c0 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalIcebergMergeSink.java +++ 
b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalIcebergMergeSink.java @@ -56,6 +56,7 @@ */ public class PhysicalIcebergMergeSink extends PhysicalBaseExternalTableSink { private final DeleteCommandContext deleteContext; + private final boolean writeDataFiles; /** * Constructor @@ -68,10 +69,23 @@ public PhysicalIcebergMergeSink(IcebergExternalDatabase database, Optional groupExpression, LogicalProperties logicalProperties, CHILD_TYPE child) { - this(database, targetTable, cols, outputExprs, deleteContext, groupExpression, logicalProperties, + this(database, targetTable, cols, outputExprs, deleteContext, true, groupExpression, logicalProperties, PhysicalProperties.GATHER, null, child); } + public PhysicalIcebergMergeSink(IcebergExternalDatabase database, + IcebergExternalTable targetTable, + List cols, + List outputExprs, + DeleteCommandContext deleteContext, + boolean writeDataFiles, + Optional groupExpression, + LogicalProperties logicalProperties, + CHILD_TYPE child) { + this(database, targetTable, cols, outputExprs, deleteContext, writeDataFiles, groupExpression, + logicalProperties, PhysicalProperties.GATHER, null, child); + } + /** * Constructor */ @@ -80,6 +94,7 @@ public PhysicalIcebergMergeSink(IcebergExternalDatabase database, List cols, List outputExprs, DeleteCommandContext deleteContext, + boolean writeDataFiles, Optional groupExpression, LogicalProperties logicalProperties, PhysicalProperties physicalProperties, @@ -89,17 +104,22 @@ public PhysicalIcebergMergeSink(IcebergExternalDatabase database, logicalProperties, physicalProperties, statistics, child); this.deleteContext = Objects.requireNonNull( deleteContext, "deleteContext != null in PhysicalIcebergMergeSink"); + this.writeDataFiles = writeDataFiles; } public DeleteCommandContext getDeleteContext() { return deleteContext; } + public boolean writeDataFiles() { + return writeDataFiles; + } + @Override public Plan withChildren(List children) { return new 
PhysicalIcebergMergeSink<>( (IcebergExternalDatabase) database, (IcebergExternalTable) targetTable, - cols, outputExprs, deleteContext, groupExpression, + cols, outputExprs, deleteContext, writeDataFiles, groupExpression, getLogicalProperties(), physicalProperties, statistics, children.get(0)); } @@ -112,7 +132,7 @@ public R accept(PlanVisitor visitor, C context) { public Plan withGroupExpression(Optional groupExpression) { return new PhysicalIcebergMergeSink<>( (IcebergExternalDatabase) database, (IcebergExternalTable) targetTable, cols, outputExprs, - deleteContext, groupExpression, getLogicalProperties(), child()); + deleteContext, writeDataFiles, groupExpression, getLogicalProperties(), child()); } @Override @@ -120,14 +140,15 @@ public Plan withGroupExprLogicalPropChildren(Optional groupExpr Optional logicalProperties, List children) { return new PhysicalIcebergMergeSink<>( (IcebergExternalDatabase) database, (IcebergExternalTable) targetTable, cols, outputExprs, - deleteContext, groupExpression, logicalProperties.get(), children.get(0)); + deleteContext, writeDataFiles, groupExpression, logicalProperties.get(), children.get(0)); } @Override public PhysicalPlan withPhysicalPropertiesAndStats(PhysicalProperties physicalProperties, Statistics statistics) { return new PhysicalIcebergMergeSink<>( (IcebergExternalDatabase) database, (IcebergExternalTable) targetTable, cols, outputExprs, - deleteContext, groupExpression, getLogicalProperties(), physicalProperties, statistics, child()); + deleteContext, writeDataFiles, groupExpression, getLogicalProperties(), physicalProperties, + statistics, child()); } @Override @@ -142,12 +163,12 @@ public boolean equals(Object o) { return false; } PhysicalIcebergMergeSink that = (PhysicalIcebergMergeSink) o; - return Objects.equals(deleteContext, that.deleteContext); + return Objects.equals(deleteContext, that.deleteContext) && writeDataFiles == that.writeDataFiles; } @Override public int hashCode() { - return 
Objects.hash(super.hashCode(), deleteContext); + return Objects.hash(super.hashCode(), deleteContext, writeDataFiles); } /** diff --git a/fe/fe-core/src/main/java/org/apache/doris/planner/IcebergMergeSink.java b/fe/fe-core/src/main/java/org/apache/doris/planner/IcebergMergeSink.java index 4af4ba17e18578..d1b468a5f1f46a 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/planner/IcebergMergeSink.java +++ b/fe/fe-core/src/main/java/org/apache/doris/planner/IcebergMergeSink.java @@ -64,6 +64,7 @@ public class IcebergMergeSink extends BaseExternalTableDataSink { private final IcebergExternalTable targetTable; private final DeleteCommandContext deleteContext; + private final boolean writeDataFiles; private List rewritableDeleteFileSets = Collections.emptyList(); private static final HashSet supportedTypes = new HashSet() {{ @@ -75,12 +76,18 @@ public class IcebergMergeSink extends BaseExternalTableDataSink { private Map storagePropertiesMap; public IcebergMergeSink(IcebergExternalTable targetTable, DeleteCommandContext deleteContext) { + this(targetTable, deleteContext, true); + } + + public IcebergMergeSink(IcebergExternalTable targetTable, DeleteCommandContext deleteContext, + boolean writeDataFiles) { super(); if (targetTable.isView()) { throw new UnsupportedOperationException("UPDATE on iceberg view is not supported"); } this.targetTable = targetTable; this.deleteContext = deleteContext; + this.writeDataFiles = writeDataFiles; IcebergExternalCatalog catalog = (IcebergExternalCatalog) targetTable.getCatalog(); storagePropertiesMap = VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials( @@ -129,6 +136,9 @@ public void bindDataSink(Optional insertCtx) if (formatVersion >= 3) { schema = IcebergUtils.appendRowLineageFieldsForV3(schema); } + if (writeDataFiles) { + IcebergUtils.validateVariantWriteUnsupported(schema); + } tSink.setFormatVersion(formatVersion); tSink.setSchemaJson(SchemaParser.toJson(schema)); diff --git 
a/fe/fe-core/src/main/java/org/apache/doris/planner/IcebergTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/planner/IcebergTableSink.java index 0f3b1bb24d26bc..307d1893003d50 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/planner/IcebergTableSink.java +++ b/fe/fe-core/src/main/java/org/apache/doris/planner/IcebergTableSink.java @@ -134,6 +134,7 @@ public void bindDataSink(Optional insertCtx) // iceberg v3 format requires additional row lineage fields when rewrite data files. schema = IcebergUtils.appendRowLineageFieldsForV3(schema); } + IcebergUtils.validateVariantWriteUnsupported(schema); tSink.setSchemaJson(SchemaParser.toJson(schema)); // partition spec diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/iceberg/IcebergUtilsTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/iceberg/IcebergUtilsTest.java index 1fcde27aa95416..a8bf7bd3db1cd3 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/iceberg/IcebergUtilsTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/iceberg/IcebergUtilsTest.java @@ -159,6 +159,12 @@ public void testAppendRowLineageFieldsForV3AddsMetadataFields() { Assert.assertNotNull(schemaWithRowLineage.findField(MetadataColumns.LAST_UPDATED_SEQUENCE_NUMBER.fieldId())); } + @Test + public void testIcebergVariantTypeToDorisVariant() { + Assert.assertEquals(Type.VARIANT, + IcebergUtils.icebergTypeToDorisType(Types.VariantType.get(), false, false)); + } + @Test public void testGetPartitionInfoMapSkipBinaryIdentityPartition() { Schema schema = new Schema( diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/iceberg/source/IcebergScanNodeTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/iceberg/source/IcebergScanNodeTest.java index f55b4f6f8e8027..4f796b58bb5afb 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/iceberg/source/IcebergScanNodeTest.java +++ 
b/fe/fe-core/src/test/java/org/apache/doris/datasource/iceberg/source/IcebergScanNodeTest.java @@ -17,13 +17,24 @@ package org.apache.doris.datasource.iceberg.source; +import org.apache.doris.analysis.SlotDescriptor; +import org.apache.doris.analysis.SlotId; import org.apache.doris.analysis.TupleDescriptor; import org.apache.doris.analysis.TupleId; +import org.apache.doris.catalog.ArrayType; +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.MapType; +import org.apache.doris.catalog.StructField; +import org.apache.doris.catalog.StructType; +import org.apache.doris.catalog.Type; +import org.apache.doris.common.DdlException; import org.apache.doris.common.util.LocationPath; import org.apache.doris.datasource.TableFormatType; +import org.apache.doris.nereids.exceptions.NotSupportedException; import org.apache.doris.planner.PlanNodeId; import org.apache.doris.planner.ScanContext; import org.apache.doris.qe.SessionVariable; +import org.apache.doris.thrift.TFileFormatType; import org.apache.doris.thrift.TFileRangeDesc; import org.apache.doris.thrift.TIcebergDeleteFileDesc; @@ -51,6 +62,10 @@ private static class TestIcebergScanNode extends IcebergScanNode { public boolean isBatchMode() { return false; } + + TupleDescriptor tupleDescriptor() { + return desc; + } } @Test @@ -143,4 +158,73 @@ public void testSetIcebergParamsPropagatesPositionDeleteFileFormat() throws Exce .get(0); Assert.assertEquals(org.apache.doris.thrift.TFileFormatType.FORMAT_ORC, deleteFileDesc.getFileFormat()); } + + @Test + public void testGetFileFormatTypeRejectsVariantForOrc() throws Exception { + SessionVariable sv = new SessionVariable(); + TestIcebergScanNode node = new TestIcebergScanNode(sv); + addSlot(node.tupleDescriptor(), new Column("v", Type.VARIANT, true)); + IcebergSource source = Mockito.mock(IcebergSource.class); + Mockito.when(source.getFileFormat()).thenReturn("orc"); + setSource(node, source); + + DdlException exception = 
Assert.assertThrows(DdlException.class, () -> node.getFileFormatType()); + Assert.assertTrue(exception.getMessage().contains( + "Reading Iceberg VARIANT columns is only supported for Parquet files")); + Assert.assertTrue(exception.getMessage().contains("v")); + } + + @Test + public void testGetFileFormatTypeRejectsNestedVariantForOrc() throws Exception { + SessionVariable sv = new SessionVariable(); + TestIcebergScanNode node = new TestIcebergScanNode(sv); + Type nestedVariantType = new StructType(new StructField("events", + ArrayType.create(new MapType(Type.STRING, Type.VARIANT), true))); + addSlot(node.tupleDescriptor(), new Column("payload", nestedVariantType, true)); + IcebergSource source = Mockito.mock(IcebergSource.class); + Mockito.when(source.getFileFormat()).thenReturn("orc"); + setSource(node, source); + + DdlException exception = Assert.assertThrows(DdlException.class, () -> node.getFileFormatType()); + Assert.assertTrue(exception.getMessage().contains( + "Reading Iceberg VARIANT columns is only supported for Parquet files")); + Assert.assertTrue(exception.getMessage().contains("payload")); + } + + @Test + public void testGetFileFormatTypeAllowsVariantForParquet() throws Exception { + SessionVariable sv = new SessionVariable(); + TestIcebergScanNode node = new TestIcebergScanNode(sv); + addSlot(node.tupleDescriptor(), new Column("v", Type.VARIANT, true)); + IcebergSource source = Mockito.mock(IcebergSource.class); + Mockito.when(source.getFileFormat()).thenReturn("parquet"); + setSource(node, source); + + Assert.assertEquals(TFileFormatType.FORMAT_PARQUET, node.getFileFormatType()); + } + + @Test + public void testValidateVariantDataFileFormatRejectsOrcSplit() { + SessionVariable sv = new SessionVariable(); + TestIcebergScanNode node = new TestIcebergScanNode(sv); + addSlot(node.tupleDescriptor(), new Column("v", Type.VARIANT, true)); + + NotSupportedException exception = Assert.assertThrows(NotSupportedException.class, + () -> 
node.validateVariantDataFileFormat(org.apache.iceberg.FileFormat.ORC, "file:///tmp/v.orc")); + Assert.assertTrue(exception.getMessage().contains( + "Reading Iceberg VARIANT columns is only supported for Parquet files")); + Assert.assertTrue(exception.getMessage().contains("file:///tmp/v.orc")); + } + + private static void addSlot(TupleDescriptor desc, Column column) { + SlotDescriptor slot = new SlotDescriptor(new SlotId(desc.getSlots().size()), desc.getId()); + slot.setColumn(column); + desc.addSlot(slot); + } + + private static void setSource(IcebergScanNode node, IcebergSource source) throws Exception { + Field sourceField = IcebergScanNode.class.getDeclaredField("source"); + sourceField.setAccessible(true); + sourceField.set(node, source); + } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PruneNestedColumnTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PruneNestedColumnTest.java index 8d8d3441fa6008..97020fbe634b82 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PruneNestedColumnTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PruneNestedColumnTest.java @@ -29,13 +29,18 @@ import org.apache.doris.nereids.rules.rewrite.NestedColumnPruning.DataTypeAccessTree; import org.apache.doris.nereids.trees.expressions.Alias; import org.apache.doris.nereids.trees.expressions.ArrayItemReference; +import org.apache.doris.nereids.trees.expressions.EqualTo; import org.apache.doris.nereids.trees.expressions.Expression; import org.apache.doris.nereids.trees.expressions.NamedExpression; import org.apache.doris.nereids.trees.expressions.Slot; import org.apache.doris.nereids.trees.expressions.SlotReference; import org.apache.doris.nereids.trees.expressions.functions.scalar.Coalesce; +import org.apache.doris.nereids.trees.expressions.functions.scalar.CreateNamedStruct; +import org.apache.doris.nereids.trees.expressions.functions.scalar.CreateStruct; +import 
org.apache.doris.nereids.trees.expressions.functions.scalar.ElementAt; import org.apache.doris.nereids.trees.expressions.functions.scalar.StructElement; import org.apache.doris.nereids.trees.expressions.literal.NullLiteral; +import org.apache.doris.nereids.trees.expressions.literal.VarcharLiteral; import org.apache.doris.nereids.trees.plans.Plan; import org.apache.doris.nereids.trees.plans.logical.LogicalOlapScan; import org.apache.doris.nereids.trees.plans.physical.PhysicalCTEConsumer; @@ -45,6 +50,10 @@ import org.apache.doris.nereids.types.DataType; import org.apache.doris.nereids.types.NestedColumnPrunable; import org.apache.doris.nereids.types.NullType; +import org.apache.doris.nereids.types.StringType; +import org.apache.doris.nereids.types.StructField; +import org.apache.doris.nereids.types.StructType; +import org.apache.doris.nereids.types.VariantType; import org.apache.doris.nereids.util.MemoPatternMatchSupported; import org.apache.doris.nereids.util.PlanChecker; import org.apache.doris.planner.OlapScanNode; @@ -52,6 +61,7 @@ import org.apache.doris.utframe.TestWithFeService; import com.google.common.collect.ImmutableList; +import com.google.common.collect.LinkedHashMultimap; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; @@ -101,6 +111,20 @@ public void createTable() throws Exception { + " v variant\n" + ") properties ('replication_num'='1')"); + createTable("create table variant_container_tbl(\n" + + " id int,\n" + + " arr array<int>,\n" + + " m map<string, int>,\n" + + " v variant\n" + + ") properties ('replication_num'='1')"); + + createTable("create table variant_expr_tbl(\n" + + " id int,\n" + + " flag boolean,\n" + + " v1 variant,\n" + + " v2 variant\n" + + ") properties ('replication_num'='1')"); + // Table for string-length offset-only optimization tests createTable("create table str_tbl(\n" + " id int,\n" @@ -183,6 +207,48 @@ public void testVariantAccessPath() throws Exception { ); } + @Test + 
public void testNonVariantInsideNamedStructConstructorCollectsSubPath() throws Exception { + StructType structType = new StructType(ImmutableList.of( + new StructField("city", StringType.INSTANCE, true, ""))); + SlotReference slot = new SlotReference("s", structType, true); + Expression expression = new StructElement( + new StructElement( + new CreateNamedStruct(new VarcharLiteral("a"), slot), + new VarcharLiteral("a")), + new VarcharLiteral("city")); + + LinkedHashMultimap<Integer, CollectAccessPathResult> slotToAccessPaths = + LinkedHashMultimap.create(); + AccessPathExpressionCollector collector = + new AccessPathExpressionCollector(null, slotToAccessPaths, false); + collector.collect(expression); + + Assertions.assertEquals(ImmutableList.of(new CollectAccessPathResult( + ImmutableList.of("s", "city"), false, ColumnAccessPathType.DATA)), + ImmutableList.copyOf(slotToAccessPaths.get(slot.getExprId().asInt()))); + } + + @Test + public void testNonVariantInsideStructConstructorCollectsSubPath() throws Exception { + StructType structType = new StructType(ImmutableList.of( + new StructField("city", StringType.INSTANCE, true, ""))); + SlotReference slot = new SlotReference("s", structType, true); + Expression expression = new StructElement( + new StructElement(new CreateStruct(slot), new VarcharLiteral("col1")), + new VarcharLiteral("city")); + + LinkedHashMultimap<Integer, CollectAccessPathResult> slotToAccessPaths = + LinkedHashMultimap.create(); + AccessPathExpressionCollector collector = + new AccessPathExpressionCollector(null, slotToAccessPaths, false); + collector.collect(expression); + + Assertions.assertEquals(ImmutableList.of(new CollectAccessPathResult( + ImmutableList.of("s", "city"), false, ColumnAccessPathType.DATA)), + ImmutableList.copyOf(slotToAccessPaths.get(slot.getExprId().asInt()))); + } + @Test public void testVariantMultiProjectionAccessPaths() throws Exception { assertVariantSubColumnSlots("select v['a'], v['b']['c'] from variant_tbl", @@ -201,6 +267,177 @@ public void testVariantPredicateAccessPath() throws 
Exception { ); } + @Test + public void testVariantTopLevelNullPredicateUsesRootAccessPath() throws Exception { + assertColumn("select 1 from variant_tbl where v is null", + "variant", + ImmutableList.of(path("v")), + ImmutableList.of(path("v")) + ); + assertColumn("select 1 from variant_tbl where v is not null", + "variant", + ImmutableList.of(path("v")), + ImmutableList.of(path("v")) + ); + } + + @Test + public void testVariantWholeColumnWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select v from variant_tbl where v['k'] is not null"); + } + + @Test + public void testVariantWholeColumnWithSiblingSubPathAccessPath() throws Exception { + assertAllAccessPathsContain( + "select v from (select v, v['k'] as k from variant_tbl) t where k is not null", + ImmutableList.of(path("v")), + ImmutableList.of()); + } + + @Test + public void testVariantAliasWholeOutputWithOrderSubPathAccessPath() throws Exception { + assertAllAccessPathsContain( + "select v as a from variant_tbl order by cast(a['k'] as string)", + ImmutableList.of(path("v"), path("v", "k")), + ImmutableList.of()); + } + + @Test + public void testVariantAliasWholeOutputWithPredicateSubPathAccessPath() throws Exception { + assertAllAccessPathsContain( + "select a from (select v as a from variant_tbl) t where a['k'] is not null", + ImmutableList.of(path("v"), path("v", "k")), + ImmutableList.of()); + } + + @Test + public void testVariantWholeExpressionWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select cast(v as string) from variant_tbl where v['k'] is not null"); + } + + @Test + public void testVariantTypeWholeExpressionWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select variant_type(v) from variant_tbl where v['k'] is not null"); + } + + @Test + public void testVariantComparisonPredicateCollectsWholeVariantOperand() { + SlotReference slot = new 
SlotReference("v", VariantType.INSTANCE, true); + Expression expression = new EqualTo(slot, new NullLiteral(VariantType.INSTANCE)); + + LinkedHashMultimap<Integer, CollectAccessPathResult> slotToAccessPaths = + LinkedHashMultimap.create(); + AccessPathExpressionCollector collector = + new AccessPathExpressionCollector(null, slotToAccessPaths, true); + collector.collect(expression); + + Assertions.assertEquals(ImmutableList.of(new CollectAccessPathResult( + ImmutableList.of("v"), true, ColumnAccessPathType.DATA)), + ImmutableList.copyOf(slotToAccessPaths.get(slot.getExprId().asInt()))); + } + + @Test + public void testVariantLiteralPathComparisonKeepsSubPathOperand() { + SlotReference slot = new SlotReference("v", VariantType.INSTANCE, true); + Expression expression = new EqualTo( + new ElementAt(slot, new VarcharLiteral("k")), + new NullLiteral(VariantType.INSTANCE)); + + LinkedHashMultimap<Integer, CollectAccessPathResult> slotToAccessPaths = + LinkedHashMultimap.create(); + AccessPathExpressionCollector collector = + new AccessPathExpressionCollector(null, slotToAccessPaths, true); + collector.collect(expression); + + Assertions.assertEquals(ImmutableList.of(new CollectAccessPathResult( + ImmutableList.of("v", "k"), true, ColumnAccessPathType.DATA)), + ImmutableList.copyOf(slotToAccessPaths.get(slot.getExprId().asInt()))); + } + + @Test + public void testVariantWholeOrderExpressionWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select id from variant_tbl where v['k'] is not null order by cast(v as string)"); + } + + @Test + public void testVariantWholeGroupExpressionWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select cast(v as string), count(*) from variant_tbl where v['k'] is not null " + + "group by cast(v as string)"); + } + + @Test + public void testVariantDynamicElementAtWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select v[cast(id as string)] from variant_tbl where 
v['k'] is not null"); + } + + @Test + public void testVariantChainedDynamicElementAtWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select v[cast(id as string)]['x'] from variant_tbl where v['k'] is not null"); + } + + @Test + public void testVariantWholeNonSlotExpressionWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select cast(if(id = 1, v, v) as string) from variant_tbl where v['k'] is not null"); + } + + @Test + public void testVariantWholeExpressionOutputWithSiblingPredicateAccessPath() throws Exception { + assertAllAccessPathsContain( + "select if(flag, v1, v2) as a from variant_expr_tbl where v1['p'] is not null", + ImmutableList.of(path("v1"), path("v2"), path("v1", "p")), + ImmutableList.of()); + } + + @Test + public void testVariantDynamicElementAtNonSlotExpressionWithPredicateAccessPath() throws Exception { + assertVariantWholeColumnAndPredicateAccessPaths( + "select element_at(if(id = 1, v, v), cast(id as string)) " + + "from variant_tbl where v['k'] is not null"); + } + + @Test + public void testVariantLiteralElementAtNonSlotExpressionKeepsSubPath() throws Exception { + assertVariantSubColumnSlots( + "select element_at(if(id = 1, v, v), 'a') from variant_tbl where v['k'] is not null", + ImmutableList.of( + ImmutableList.of("a"), + ImmutableList.of("k") + )); + } + + private void assertVariantWholeColumnAndPredicateAccessPaths(String sql) throws Exception { + Pair<?, List<SlotDescriptor>> result = collectComplexSlots(sql); + List<SlotDescriptor> slotDescriptors = result.second; + Assertions.assertEquals(2, slotDescriptors.size()); + + boolean hasWholeColumnSlot = false; + boolean hasPredicateSlot = false; + for (SlotDescriptor slotDescriptor : slotDescriptors) { + TreeSet<ColumnAccessPath> allAccessPaths = + new TreeSet<>(slotDescriptor.getAllAccessPaths()); + TreeSet<ColumnAccessPath> predicateAccessPaths = + new TreeSet<>(slotDescriptor.getPredicateAccessPaths()); + if (allAccessPaths.equals(new 
TreeSet<>(ImmutableList.of(path("v")))) + && predicateAccessPaths.isEmpty()) { + hasWholeColumnSlot = true; + } + if (allAccessPaths.equals(new TreeSet<>(ImmutableList.of(path("v", "k")))) + && predicateAccessPaths.equals(new TreeSet<>(ImmutableList.of(path("v", "k"))))) { + hasPredicateSlot = true; + } + } + Assertions.assertTrue(hasWholeColumnSlot); + Assertions.assertTrue(hasPredicateSlot); + } + @Test public void testVariantProjectAndPredicateAccessPaths() throws Exception { assertVariantSubColumnSlots("select v['a'] from variant_tbl where v['b']['c'] = 1", @@ -210,6 +447,75 @@ public void testVariantProjectAndPredicateAccessPaths() throws Exception { )); } + @Test + public void testVariantCastProjectionKeepsSubPathWithSiblingPredicate() throws Exception { + assertAllAccessPathsContain( + "select cast(v as variant)['a'] from variant_tbl where v['k'] is not null", + ImmutableList.of( + path("v", "a"), + path("v", "k") + ), + ImmutableList.of(path("v"))); + assertAllAccessPathsContain( + "select struct_element(cast(v['obj'] as struct), 'a') " + + "from variant_tbl where v['k'] is not null", + ImmutableList.of( + path("v", "obj", "a"), + path("v", "k") + ), + ImmutableList.of(path("v"))); + } + + @Test + public void testVariantIndexExpressionDoesNotInheritContainerPath() throws Exception { + assertAllAccessPathsContain( + "select arr[cast(v['idx'] as int)] from variant_container_tbl where v['k'] is not null", + ImmutableList.of( + path("v", "idx"), + path("v", "k") + ), + ImmutableList.of(path("v", "idx", AccessPathInfo.ACCESS_ALL))); + assertAllAccessPathsContain( + "select m[cast(v['map_key'] as string)] from variant_container_tbl where v['k'] is not null", + ImmutableList.of( + path("v", "map_key"), + path("v", "k") + ), + ImmutableList.of(path("v", "map_key", AccessPathInfo.ACCESS_ALL))); + } + + @Test + public void testMapFunctionsCollectVariantArguments() throws Exception { + assertAllAccessPathsContain( + "select map_contains_key(m, cast(v['key'] as 
string)) " + + "from variant_container_tbl where v['p'] is not null", + ImmutableList.of( + path("m", "KEYS"), + path("v", "key"), + path("v", "p") + ), + ImmutableList.of()); + assertAllAccessPathsContain( + "select map_contains_value(m, cast(v['value'] as int)) " + + "from variant_container_tbl where v['p'] is not null", + ImmutableList.of( + path("m", "VALUES"), + path("v", "value"), + path("v", "p") + ), + ImmutableList.of()); + assertAllAccessPathsContain( + "select map_contains_entry(m, cast(v['key'] as string), cast(v['value'] as int)) " + + "from variant_container_tbl where v['p'] is not null", + ImmutableList.of( + path("m", AccessPathInfo.ACCESS_ALL), + path("v", "key"), + path("v", "value"), + path("v", "p") + ), + ImmutableList.of()); + } + @Test public void testVariantAliasAccessPathPropagation() throws Exception { assertColumn("select x['b'] from (select v['a'] as x from variant_tbl) t", @@ -228,6 +534,16 @@ public void testVariantCteAccessPathPropagation() throws Exception { ); } + @Test + public void testVariantUnusedCteDoesNotCollectWholeColumn() throws Exception { + assertColumn("with t as (select v from variant_tbl) select 1 from t", null, null, null); + } + + @Test + public void testVariantUnusedSubqueryDoesNotCollectWholeColumn() throws Exception { + assertColumn("select 1 from (select v from variant_tbl) t", null, null, null); + } + @Test public void testVariantJoinAccessPathPropagation() throws Exception { assertVariantSubColumnSlotCount( @@ -247,6 +563,14 @@ public void testExplodeVariantAccessPath() throws Exception { ); } + @Test + public void testExplodeVariantWholeOutputWithPredicateAccessPath() throws Exception { + assertAllAccessPathsContain( + "select x from variant_tbl lateral view explode(v) tmp as x where x['k'] is not null", + ImmutableList.of(path("v")), + ImmutableList.of()); + } + @Test public void testExplodeVariantProjectAndFilterAccessPath() throws Exception { assertColumn("select x['x'] from variant_tbl lateral view 
explode(v['arr']) tmp as x where x['y'] is not null", @@ -956,6 +1280,7 @@ public void testDataTypeAccessTree() { ), "STRUCT>>>" ); + } @Test diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/SlotTypeReplacerTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/SlotTypeReplacerTest.java new file mode 100644 index 00000000000000..c4f6b20cdc5081 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/SlotTypeReplacerTest.java @@ -0,0 +1,210 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +package org.apache.doris.nereids.rules.rewrite; + +import org.apache.doris.analysis.ColumnAccessPath; +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.MapType; +import org.apache.doris.catalog.StructField; +import org.apache.doris.catalog.StructType; +import org.apache.doris.catalog.Type; +import org.apache.doris.datasource.iceberg.IcebergExternalTable; +import org.apache.doris.nereids.trees.expressions.Slot; +import org.apache.doris.nereids.trees.expressions.SlotReference; +import org.apache.doris.nereids.trees.plans.Plan; +import org.apache.doris.nereids.trees.plans.RelationId; +import org.apache.doris.nereids.trees.plans.algebra.SetOperation; +import org.apache.doris.nereids.trees.plans.logical.LogicalExcept; +import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan; +import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions; +import org.apache.doris.nereids.trees.plans.logical.LogicalIntersect; +import org.apache.doris.nereids.types.DataType; + +import com.google.common.collect.ImmutableList; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +class SlotTypeReplacerTest { + + @Test + void testIcebergAccessPathReplacesMixedCaseStructChildByFieldId() { + Column column = mixedCaseStructColumn(); + LogicalFileScan scan = newIcebergScan(column); + SlotReference slot = (SlotReference) scan.getOutput().get(0); + List<ColumnAccessPath> allPaths = ImmutableList.of(path("payload", "mixedfield")); + List<ColumnAccessPath> predicatePaths = ImmutableList.of(path("payload", "mixedfield")); + + SlotReference replacedSlot = replaceSlot(scan, slot, column, allPaths, predicatePaths); + + List<ColumnAccessPath> expectedPaths = ImmutableList.of(path("10", "11")); + Assertions.assertEquals(expectedPaths, replacedSlot.getAllAccessPaths().get()); + 
Assertions.assertEquals(expectedPaths, replacedSlot.getPredicateAccessPaths().get()); + Assertions.assertEquals(allPaths, replacedSlot.getDisplayAllAccessPaths().get()); + Assertions.assertEquals(predicatePaths, replacedSlot.getDisplayPredicateAccessPaths().get()); + } + + @Test + void testIcebergVariantFullProjectionKeepsSubpathNamesAfterRootFieldId() { + Column column = new Column("v", Type.VARIANT, true); + column.setUniqueId(100); + LogicalFileScan scan = newIcebergScan(column); + SlotReference slot = (SlotReference) scan.getOutput().get(0); + List<ColumnAccessPath> allPaths = ImmutableList.of(path("v")); + List<ColumnAccessPath> predicatePaths = ImmutableList.of(path("v", "Metric", "x")); + + SlotReference replacedSlot = replaceSlot(scan, slot, column, allPaths, predicatePaths); + + Assertions.assertEquals(ImmutableList.of(path("100")), replacedSlot.getAllAccessPaths().get()); + Assertions.assertEquals(ImmutableList.of(path("100", "Metric", "x")), + replacedSlot.getPredicateAccessPaths().get()); + Assertions.assertEquals(allPaths, replacedSlot.getDisplayAllAccessPaths().get()); + Assertions.assertEquals(predicatePaths, replacedSlot.getDisplayPredicateAccessPaths().get()); + } + + @Test + void testIcebergMapKeyAccessPathReplacesNestedKeyStructChildByFieldId() { + Column column = mapWithStructKeyColumn(); + LogicalFileScan scan = newIcebergScan(column); + SlotReference slot = (SlotReference) scan.getOutput().get(0); + List<ColumnAccessPath> allPaths = ImmutableList.of(path("payload", AccessPathInfo.ACCESS_MAP_KEYS, + "keyfield")); + List<ColumnAccessPath> predicatePaths = ImmutableList.of(path("payload", AccessPathInfo.ACCESS_MAP_KEYS, + "keyfield")); + + SlotReference replacedSlot = replaceSlot(scan, slot, column, allPaths, predicatePaths); + + List<ColumnAccessPath> expectedPaths = + ImmutableList.of(path("20", AccessPathInfo.ACCESS_MAP_KEYS, "22")); + Assertions.assertEquals(expectedPaths, replacedSlot.getAllAccessPaths().get()); + Assertions.assertEquals(expectedPaths, replacedSlot.getPredicateAccessPaths().get()); + 
Assertions.assertEquals(allPaths, replacedSlot.getDisplayAllAccessPaths().get()); + Assertions.assertEquals(predicatePaths, replacedSlot.getDisplayPredicateAccessPaths().get()); + } + + @Test + void testExceptUsesReplacedOutputsAndChildrenOutputs() { + assertSetOperationUsesReplacedOutputs(true); + } + + @Test + void testIntersectUsesReplacedOutputsAndChildrenOutputs() { + assertSetOperationUsesReplacedOutputs(false); + } + + private void assertSetOperationUsesReplacedOutputs(boolean isExcept) { + Column column = twoFieldStructColumn(); + LogicalFileScan leftScan = newIcebergScan(column); + LogicalFileScan rightScan = newIcebergScan(column); + SlotReference leftSlot = (SlotReference) leftScan.getOutput().get(0); + SlotReference rightSlot = (SlotReference) rightScan.getOutput().get(0); + DataType prunedType = new org.apache.doris.nereids.types.StructType(ImmutableList.of( + new org.apache.doris.nereids.types.StructField( + "keep", org.apache.doris.nereids.types.IntegerType.INSTANCE, true, ""))); + List<ColumnAccessPath> allPaths = ImmutableList.of(path("payload", "keep")); + Map<Integer, AccessPathInfo> accessPaths = new LinkedHashMap<>(); + accessPaths.put(leftSlot.getExprId().asInt(), new AccessPathInfo(prunedType, allPaths, allPaths)); + accessPaths.put(rightSlot.getExprId().asInt(), new AccessPathInfo(prunedType, allPaths, allPaths)); + + Plan setOperation = isExcept + ? new LogicalExcept(SetOperation.Qualifier.DISTINCT, ImmutableList.of(leftSlot), + ImmutableList.of(ImmutableList.of(leftSlot), ImmutableList.of(rightSlot)), + ImmutableList.of(leftScan, rightScan)) + : new LogicalIntersect(SetOperation.Qualifier.DISTINCT, ImmutableList.of(leftSlot), + ImmutableList.of(ImmutableList.of(leftSlot), ImmutableList.of(rightSlot)), + ImmutableList.of(leftScan, rightScan)); + + Plan replacedPlan = new SlotTypeReplacer(accessPaths, setOperation).replace(); + List<Slot> firstChildOutputs = isExcept + ? 
((LogicalExcept) replacedPlan).getRegularChildOutput(0) + : ((LogicalIntersect) replacedPlan).getRegularChildOutput(0); + List<Slot> secondChildOutputs = isExcept + ? ((LogicalExcept) replacedPlan).getRegularChildOutput(1) + : ((LogicalIntersect) replacedPlan).getRegularChildOutput(1); + List<Slot> outputs = replacedPlan.getOutput(); + + Assertions.assertEquals("STRUCT", outputs.get(0).getDataType().toSql()); + Assertions.assertEquals("STRUCT", firstChildOutputs.get(0).getDataType().toSql()); + Assertions.assertEquals("STRUCT", secondChildOutputs.get(0).getDataType().toSql()); + } + + private SlotReference replaceSlot(LogicalFileScan scan, SlotReference slot, Column column, + List<ColumnAccessPath> allPaths, List<ColumnAccessPath> predicatePaths) { + Map<Integer, AccessPathInfo> accessPaths = new LinkedHashMap<>(); + accessPaths.put(slot.getExprId().asInt(), + new AccessPathInfo(DataType.fromCatalogType(column.getType()), allPaths, predicatePaths)); + + Plan replacedPlan = new SlotTypeReplacer(accessPaths, scan).replace(); + Slot replacedSlot = ((LogicalFileScan) replacedPlan).getOutput().get(0); + return (SlotReference) replacedSlot; + } + + private LogicalFileScan newIcebergScan(Column column) { + IcebergExternalTable table = Mockito.mock(IcebergExternalTable.class); + Mockito.when(table.initSelectedPartitions(Mockito.any())).thenReturn(SelectedPartitions.NOT_PRUNED); + Mockito.when(table.getFullSchema()).thenReturn(Collections.singletonList(column)); + Mockito.when(table.getName()).thenReturn("iceberg_tbl"); + return new LogicalFileScan(new RelationId(1), table, + ImmutableList.of("iceberg_catalog", "iceberg_db"), Collections.emptyList(), + Optional.empty(), Optional.empty(), Optional.empty(), Optional.empty()); + } + + private Column mixedCaseStructColumn() { + StructType type = new StructType(new StructField("MixedField", Type.INT)); + Column column = new Column("payload", type, true); + column.setUniqueId(10); + column.getChildren().get(0).setName("MixedField"); + column.getChildren().get(0).setUniqueId(11); + return column; + } + 
+ private Column twoFieldStructColumn() { + StructType type = new StructType( + new StructField("keep", Type.INT), + new StructField("drop", Type.INT)); + Column column = new Column("payload", type, true); + column.setUniqueId(10); + column.getChildren().get(0).setName("keep"); + column.getChildren().get(0).setUniqueId(11); + column.getChildren().get(1).setName("drop"); + column.getChildren().get(1).setUniqueId(12); + return column; + } + + private Column mapWithStructKeyColumn() { + StructType keyType = new StructType(new StructField("KeyField", Type.INT)); + Column column = new Column("payload", new MapType(keyType, Type.INT), true); + column.setUniqueId(20); + Column keyColumn = column.getChildren().get(0); + keyColumn.setUniqueId(21); + keyColumn.getChildren().get(0).setName("KeyField"); + keyColumn.getChildren().get(0).setUniqueId(22); + column.getChildren().get(1).setUniqueId(23); + return column; + } + + private ColumnAccessPath path(String... parts) { + return ColumnAccessPath.data(ImmutableList.copyOf(parts)); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/VariantPruningLogicTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/VariantPruningLogicTest.java index 7a7b4b9ae4aa4b..e2910bfad74a74 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/VariantPruningLogicTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/VariantPruningLogicTest.java @@ -120,6 +120,33 @@ public void testExplodeVariantArrayWithOuterFilterAccessPaths() throws Exception ); } + @Test + public void testExplodeVariantArrayFunctionAccessPaths() throws Exception { + assertAllAccessPathsContain( + "select x['x'] from variant_tbl lateral view explode_variant_array(v['arr']) tmp as x " + + "where v['filter']['k'] = 1 and x['y'] is not null", + ImmutableList.of( + path("v", "arr", "x"), + path("v", "arr", "y"), + path("v", "filter", "k") + ), + ImmutableList.of() + ); + } + + @Test + 
public void testExplodeVariantArrayFunctionFullOutputAccessPath() throws Exception { + assertAllAccessPathsContain( + "select x from variant_tbl lateral view explode_variant_array(v['arr']) tmp as x " + + "where x['k'] is not null", + ImmutableList.of( + path("v", "arr"), + path("v", "arr", "k") + ), + ImmutableList.of() + ); + } + @Test public void testExplodeVariantDeepNestedAccessPaths() throws Exception { assertAllAccessPathsContain( @@ -131,17 +158,25 @@ public void testExplodeVariantDeepNestedAccessPaths() throws Exception { @Test public void testExplodeSubqueryJoinAggAccessPaths() throws Exception { + String sql = "select cast(t2.v['k'] as string) as k, count(*) from (select id, v from variant_tbl) t1 " + + "lateral view explode(t1.v['arr']) tmp as x " + + "join variant_tbl t2 on t1.id=t2.id " + + "where x['a']['b'] = 1 and t2.v['k'] is not null " + + "group by cast(t2.v['k'] as string)"; + assertVariantSubColumnSlots( + sql, + ImmutableList.of( + ImmutableList.of("arr"), + ImmutableList.of("k") + ) + ); assertAllAccessPathsContain( - "select cast(t2.v['k'] as string) as k, count(*) from (select id, v from variant_tbl) t1 " - + "lateral view explode(t1.v['arr']) tmp as x " - + "join variant_tbl t2 on t1.id=t2.id " - + "where x['a']['b'] = 1 and t2.v['k'] is not null " - + "group by cast(t2.v['k'] as string)", + sql, ImmutableList.of( path("v", "arr", "a", "b"), path("v", "k") ), - ImmutableList.of() + ImmutableList.of(path("v")) ); } @@ -226,10 +261,12 @@ private void assertAllAccessPathsContain( allAccessPaths.addAll(slotDescriptor.getAllAccessPaths()); } for (ColumnAccessPath accessPath : expectedContain) { - Assertions.assertTrue(allAccessPaths.contains(accessPath)); + Assertions.assertTrue(allAccessPaths.contains(accessPath), + "expected access path " + accessPath + " in " + allAccessPaths); } for (ColumnAccessPath accessPath : expectedNotContain) { - Assertions.assertFalse(allAccessPaths.contains(accessPath)); + 
Assertions.assertFalse(allAccessPaths.contains(accessPath), + "unexpected access path " + accessPath + " in " + allAccessPaths); } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/IcebergMergeCommandTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/IcebergMergeCommandTest.java index 531e8e30855a2e..1659a540b1bd6a 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/IcebergMergeCommandTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/IcebergMergeCommandTest.java @@ -17,11 +17,28 @@ package org.apache.doris.nereids.trees.plans.commands; +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.Type; +import org.apache.doris.datasource.iceberg.IcebergUtils; +import org.apache.doris.nereids.analyzer.UnboundSlot; +import org.apache.doris.nereids.trees.expressions.Expression; +import org.apache.doris.nereids.trees.expressions.StatementScopeIdGenerator; +import org.apache.doris.nereids.trees.expressions.literal.BooleanLiteral; +import org.apache.doris.nereids.trees.expressions.literal.NullLiteral; +import org.apache.doris.nereids.trees.plans.commands.merge.MergeMatchedClause; +import org.apache.doris.nereids.trees.plans.commands.merge.MergeNotMatchedClause; +import org.apache.doris.nereids.trees.plans.logical.LogicalEmptyRelation; +import org.apache.doris.nereids.types.DataType; import org.apache.doris.qe.ConnectContext; +import com.google.common.collect.ImmutableList; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import java.lang.reflect.Method; +import java.util.List; +import java.util.Optional; + public class IcebergMergeCommandTest { @Test @@ -52,4 +69,47 @@ public void testExecuteWithExternalTableBatchModeDisabledRestoresValueOnExceptio Assertions.assertEquals("expected", exception.getMessage()); Assertions.assertFalse(ctx.getSessionVariable().enableExternalTableBatchMode); } + + @Test 
+ public void testWritesDataFilesForMergeClauses() { + Assertions.assertFalse(IcebergMergeCommand.writesDataFiles( + ImmutableList.of(new MergeMatchedClause(Optional.empty(), ImmutableList.of(), true)), + ImmutableList.of())); + + Assertions.assertTrue(IcebergMergeCommand.writesDataFiles( + ImmutableList.of(new MergeMatchedClause(Optional.empty(), ImmutableList.of(), false)), + ImmutableList.of())); + + Assertions.assertTrue(IcebergMergeCommand.writesDataFiles( + ImmutableList.of(new MergeMatchedClause(Optional.empty(), ImmutableList.of(), true)), + ImmutableList.of(new MergeNotMatchedClause(Optional.empty(), ImmutableList.of(), + ImmutableList.of())))); + } + + @Test + public void testDeleteProjectionDoesNotReadVisibleTargetColumns() throws Exception { + IcebergMergeCommand command = new IcebergMergeCommand( + ImmutableList.of("catalog", "db", "target"), + Optional.of("t"), + Optional.empty(), + new LogicalEmptyRelation(StatementScopeIdGenerator.newRelationId(), ImmutableList.of()), + BooleanLiteral.TRUE, + ImmutableList.of(new MergeMatchedClause(Optional.empty(), ImmutableList.of(), true)), + ImmutableList.of()); + Method method = IcebergMergeCommand.class.getDeclaredMethod( + "buildDeleteProjection", Expression.class, List.class); + method.setAccessible(true); + + Column id = new Column("id", Type.INT, true); + Column variant = new Column("v", Type.VARIANT, true); + Column rowId = new Column(IcebergUtils.ICEBERG_ROW_ID_COL, Type.BIGINT, true); + List<?> projection = (List<?>) method.invoke( + command, new NullLiteral(DataType.fromCatalogType(Type.BIGINT)), ImmutableList.of(id, variant, rowId)); + + Assertions.assertTrue(projection.get(2) instanceof NullLiteral); + Assertions.assertTrue(projection.get(3) instanceof NullLiteral); + Assertions.assertTrue(projection.get(4) instanceof UnboundSlot); + Assertions.assertEquals(ImmutableList.of("t", IcebergUtils.ICEBERG_ROW_ID_COL), + ((UnboundSlot) projection.get(4)).getNameParts()); + } } diff --git 
a/fe/fe-core/src/test/java/org/apache/doris/planner/IcebergMergeSinkTest.java b/fe/fe-core/src/test/java/org/apache/doris/planner/IcebergMergeSinkTest.java index 23dcb4403ce731..71f96c3bd54596 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/planner/IcebergMergeSinkTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/planner/IcebergMergeSinkTest.java @@ -18,6 +18,7 @@ package org.apache.doris.planner; import org.apache.doris.catalog.DatabaseIf; +import org.apache.doris.common.AnalysisException; import org.apache.doris.datasource.CatalogProperty; import org.apache.doris.datasource.iceberg.IcebergExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalTable; @@ -74,6 +75,34 @@ public void testBindDataSinkSkipsRewritableDeleteFileSetsAndRowLineageSchemaForV IcebergUtils.ICEBERG_LAST_UPDATED_SEQUENCE_NUMBER_COL)); } + @Test + public void testBindDataSinkRejectsVariantSchema() { + Schema schema = new Schema( + Types.NestedField.required(1, "id", Types.IntegerType.get()), + Types.NestedField.optional(2, "payload", Types.VariantType.get())); + IcebergMergeSink sink = new IcebergMergeSink( + mockIcebergExternalTable(2, schema), new DeleteCommandContext()); + + AnalysisException exception = Assertions.assertThrows( + AnalysisException.class, () -> sink.bindDataSink(Optional.empty())); + Assertions.assertTrue(exception.getMessage().contains( + "Writing Iceberg VARIANT columns is not supported: payload")); + } + + @Test + public void testBindDataSinkAllowsVariantSchemaForDeleteOnlyMerge() throws Exception { + Schema schema = new Schema( + Types.NestedField.required(1, "id", Types.IntegerType.get()), + Types.NestedField.optional(2, "payload", Types.VariantType.get())); + IcebergMergeSink sink = new IcebergMergeSink( + mockIcebergExternalTable(2, schema), new DeleteCommandContext(), false); + + sink.bindDataSink(Optional.empty()); + + TIcebergMergeSink thriftSink = sink.tDataSink.getIcebergMergeSink(); + 
Assertions.assertTrue(thriftSink.getSchemaJson().contains("\"payload\"")); + } + private static TIcebergRewritableDeleteFileSet buildDeleteFileSet() { TIcebergDeleteFileDesc deleteFileDesc = new TIcebergDeleteFileDesc(); deleteFileDesc.setPath("file:///tmp/delete.puffin"); @@ -85,6 +114,10 @@ private static TIcebergRewritableDeleteFileSet buildDeleteFileSet() { private static IcebergExternalTable mockIcebergExternalTable(int formatVersion) { Schema schema = new Schema(Types.NestedField.required(1, "id", Types.IntegerType.get())); + return mockIcebergExternalTable(formatVersion, schema); + } + + private static IcebergExternalTable mockIcebergExternalTable(int formatVersion, Schema schema) { PartitionSpec spec = PartitionSpec.unpartitioned(); Map properties = new HashMap<>(); properties.put(TableProperties.FORMAT_VERSION, String.valueOf(formatVersion)); diff --git a/fe/fe-core/src/test/java/org/apache/doris/planner/IcebergTableSinkTest.java b/fe/fe-core/src/test/java/org/apache/doris/planner/IcebergTableSinkTest.java new file mode 100644 index 00000000000000..7ac49bea63d6d2 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/planner/IcebergTableSinkTest.java @@ -0,0 +1,89 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. 
See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.planner; + +import org.apache.doris.catalog.DatabaseIf; +import org.apache.doris.common.AnalysisException; +import org.apache.doris.datasource.CatalogProperty; +import org.apache.doris.datasource.iceberg.IcebergExternalCatalog; +import org.apache.doris.datasource.iceberg.IcebergExternalTable; + +import org.apache.iceberg.PartitionSpec; +import org.apache.iceberg.Schema; +import org.apache.iceberg.SortOrder; +import org.apache.iceberg.Table; +import org.apache.iceberg.TableProperties; +import org.apache.iceberg.types.Types; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + +public class IcebergTableSinkTest { + @Test + public void testBindDataSinkRejectsVariantSchema() { + Schema schema = new Schema( + Types.NestedField.required(1, "id", Types.IntegerType.get()), + Types.NestedField.optional(2, "payload", Types.VariantType.get())); + IcebergTableSink sink = new IcebergTableSink(mockIcebergExternalTable(schema)); + + AnalysisException exception = Assertions.assertThrows( + AnalysisException.class, () -> sink.bindDataSink(Optional.empty())); + Assertions.assertTrue(exception.getMessage().contains( + "Writing Iceberg VARIANT columns is not supported: payload")); + } + + private static IcebergExternalTable mockIcebergExternalTable(Schema schema) { + PartitionSpec spec = PartitionSpec.unpartitioned(); + Map properties = new HashMap<>(); + properties.put(TableProperties.FORMAT_VERSION, "2"); + properties.put(TableProperties.DEFAULT_FILE_FORMAT, "parquet"); + properties.put(TableProperties.PARQUET_COMPRESSION, "snappy"); + properties.put(TableProperties.WRITE_DATA_LOCATION, "file:///tmp/iceberg_tbl/data"); + + Table icebergTable = Mockito.mock(Table.class); + 
Mockito.when(icebergTable.properties()).thenReturn(properties); + Mockito.when(icebergTable.spec()).thenReturn(spec); + Mockito.when(icebergTable.specs()).thenReturn(Collections.singletonMap(spec.specId(), spec)); + Mockito.when(icebergTable.location()).thenReturn("file:///tmp/iceberg_tbl"); + Mockito.when(icebergTable.schema()).thenReturn(schema); + Mockito.when(icebergTable.sortOrder()).thenReturn(SortOrder.unsorted()); + Mockito.when(icebergTable.name()).thenReturn("db.tbl"); + + CatalogProperty catalogProperty = Mockito.mock(CatalogProperty.class); + Mockito.when(catalogProperty.getMetastoreProperties()).thenReturn(null); + Mockito.when(catalogProperty.getStoragePropertiesMap()).thenReturn(Collections.emptyMap()); + + IcebergExternalCatalog catalog = Mockito.mock(IcebergExternalCatalog.class); + Mockito.when(catalog.getCatalogProperty()).thenReturn(catalogProperty); + + DatabaseIf database = Mockito.mock(DatabaseIf.class); + IcebergExternalTable table = Mockito.mock(IcebergExternalTable.class); + Mockito.when(table.isView()).thenReturn(false); + Mockito.when(table.getCatalog()).thenReturn(catalog); + Mockito.when(table.getDatabase()).thenReturn(database); + Mockito.when(table.getDbName()).thenReturn("db"); + Mockito.when(table.getName()).thenReturn("tbl"); + Mockito.when(table.getIcebergTable()).thenReturn(icebergTable); + return table; + } +} diff --git a/regression-test/data/external_table_p0/tvf/iceberg_variant_binary_typed.parquet b/regression-test/data/external_table_p0/tvf/iceberg_variant_binary_typed.parquet new file mode 100644 index 0000000000000000000000000000000000000000..fae29b093cd58eefb1aa91a64b7c39a588c438ac GIT binary patch literal 743 zcmYjP&1%~~5T0ELSE$BGsCSWy4!YDRhZ-yXkqe@bODV;NP;xAomAxqfTTK;NkXy-9 z1)x^CD zPw1N2Vr2b=<1c}Gz+J~AA1B&@;I2f6!q%Pkj=8hUk?k0$QpFBmAlU@hla`pF0$Hab zCoSodbHhYdc{b%(z4X$BPIUfuUaCA*Sus!2Oy!sJ;_@7;F%k3x9G)&=%pFz=^G(}- zMEewNj9Y76sHG|ttv;+MsI%H=!3ltj1f=PDZrlS8(+S%pErCHfle3-`*VWdgO9-mk 
zYY#Vz&9qS|29Q4t(UqUM=mROWUD0s^R^I2Y6myyPPUEIujfKkhU2LLth7M)G-hljV zh{tlv#q%b|l$E<>v7}0r$*gz7;t2%QcLz^kSIYi>1|Y9Y;mMDN7=ir6rP;G^>H-9> z&Q<5O+%I*0k-p3PupdVKfgdEBbhY%))MBuRMw9XINQc8=qLXJwQx%;}RHVmI ZG7bmHD2h~g=*AvE{lYgqmlM3={{W?AeFp#l literal 0 HcmV?d00001 diff --git a/regression-test/data/external_table_p0/tvf/iceberg_variant_binary_unshredded.parquet b/regression-test/data/external_table_p0/tvf/iceberg_variant_binary_unshredded.parquet new file mode 100644 index 0000000000000000000000000000000000000000..3e33f256e13baf802f012f620344da6672eb2036 GIT binary patch literal 764 zcmY*Xzi-n(6n=NU1Y?Cr6?rGSiclDc+<_=g8XF`p-4M!9p$-UDiq7#BEVVAUNhG%Z z0Sp}(`4?bhECWBp1PgP-Kfwg=?4&e)w%@&X-}~O(celUyB)~eh$lfl$FY99z2=qDh zhb@3}COGFn51~S`ughP5YHL)}pb2GJ1_zG8nUQnTj2Hu!P}x(mpP%3V=Kss;Rhd(w z!Cu#Gmvx*0*s(>)zp!^BHr+q@-WK;y`(0+X(lo!0)^M8i;BF}Sxjm`mJF z2q3^~OB=^TXGRB@8)Dfq!6m~BR)z!9@$k{zY^oDoJfAF7k*Rz(NwQoOuP5{T>_wr| zRHrBic@JEkr7+?yTc9{>bZ(oaV}tVciJqxc&6EkbTU)5(xdth`wikDyOB!nqrf}zgN4sP ze&bSfZF1DrfBaGvujYDo|D@*hQmZB j4+jrabQG#ckD_E0^po8vQo)WFw*jgbzR*$Z)4l%($mDZA literal 0 HcmV?d00001 diff --git a/regression-test/data/external_table_p0/tvf/iceberg_variant_shredded.parquet b/regression-test/data/external_table_p0/tvf/iceberg_variant_shredded.parquet new file mode 100644 index 0000000000000000000000000000000000000000..d1a75e2526bf8df21b5b9196ba2ba76b7a7119af GIT binary patch literal 1865 zcmb7FL1+^}6#cu|Y`19(rr>TW-`qDnLqFU_vioF z(NpJ=1c;8UeDB5WaFj$)REPo~0Dy=+gk8cO#!e)10o%>>4|ipghy>SlTo#%@lv@uS zIogr63s#=xS<%U63T(E>?Agq;!zOdql;t=GeGuEnpWmJ<+>%6OZppQB3_`el0Nc)u zx9>xqBkneLAaq{vcRxQ_eX49R8O5@rOVD z(}qJFvT)~RxuxEQjCh=UPqI4etIqnXvw`Z&(Aq>zmAGjf-0+GTMS(BtjmGppREZBS zXewF+E=IZrA;L}WSvv#XBKo0JLWk38lsHZoDxub&Ps&~}G}W6JuJy{PC{@T9as@3) zw*_*N)27QIF0P8ibVYY_bC0wVB1?jQY)uO4ZAqlecQk5pVy}nQY=)(oGc{4NoUBzS zPRw3%m~Ar~la`2`po&==3{e$IM84P>JIt?vwBovg*=yNW(c+@0hD%m%j&aSF;aa%UpcGAW}g-2 z*FLOV$U06~sbwL)h0vh;uY>L1+^}6#ct9*)BEORx`t{EG3sLE!5DoP14FD2%-{NDS8nMlua|0z@~}Y?W*9# zixl)Ah&>hbD1siOpcD^16mNpyp|L6br-pv2I)90_| 
zD9~OUzqgjgO-j(xs0JVbfQ%1%k~m(y+k36pRHh_yOVCRD`0UB!Zjnc*D)K~*qJVb}wOF2^Yi)%cL9r?fiiL%wF_OtNYNISLOcu~{LLk$G zu_}fBP_oH(C0c4lt;w|9S}rTm5)oG}_t$25g?l$=0>`U4^Zty_>&=GexHJCUCa+>y zshj~_u2mr~>rz0~ax#5N+z(BnU#oN9sXD$ROj%C^j=RW3rZItv zC04R_B|Nd{DV>N06DQuc4iHR4-3IJI7iH`Rp=U#Uax}gl3N5p;wI zy6-)TZ$s~MOZvRHGg={aY1 zc(yoJ$d8Wj+~{b9S0+Zro#I@CP&k-fQ4Ce5rC`eRMg(qoT; zd(ew6d-UXg;7$AkdJqqL^RkGD=+(1t60})Z*i4d{_ws%3d-L98=gwo7(zmxyMA^ zM{i%fF0R=5lx-xW6@4qAQ$>ed>B^1bHo#r^;>TANwfW)a_NU*M^YG^pddv3rmxl5_ z8O!@oGY{q(H7hzn7fvd1Moo@}avQ6R&KMbpb&;xB$w1CeAqlt92GGH*@&fCizFO_F zxypn)%2*P1LK1U9RE~QOuJwB|kkQlbOhkP#io3BK9*iR~=uStIXORp;88RKQ*P%lD zA+%_P%$UAcwAR&jSrPq(p^QZ+Vxdf0$<4%IDuL@})%0zsq7E}m#b)T_9p14Eep;rq zW*nTPXsLyHV&@D-!3W?UbF9qYX{-h+Qqg23(;>6}ib=#dpWC~Bw$Rx%M$fb$`z2K> zDJ3iq@Gm)bm4DXQy)#389v+D3`BcU?hmkdt(WE~fS#HhsYIVyAru{)^?TcQ$=QZ1{#-?-|jX(ys iH*X1Vzb!o3@`9FI51O7Q+zr#O0;D%w!t=U=KjdFmBA*%n literal 0 HcmV?d00001 diff --git a/regression-test/data/external_table_p0/tvf/iceberg_variant_typed_only.parquet b/regression-test/data/external_table_p0/tvf/iceberg_variant_typed_only.parquet new file mode 100644 index 0000000000000000000000000000000000000000..103b2f4b9769fe453e883add06f7119e8bebc335 GIT binary patch literal 1724 zcmZuyPiP!f7=JVSW;#uyHMQ?CQ|7Rj3|X*CHp%X$FqjA;QhF$!JcK4Q*|+V&?q=O= z))abZp-8Zof>3&DPoAtIQcygI2f<7AV6S?qBIqd>C7__c@4eZ{H0iwEo%j8Dzwgid zzS+xfyi*fQyn=P`%ieM*I3!Paj9H8^8((Oqu|ECk@6SrHumv}2i)F`tc=*xXDLTi6 zOXuiJsl-Y!zl?S3+o#_-23DI(WOnFq73;%4{`%?EQR`_o1t+@Bnj3j9VBPuo`=>7z zypakkooCF!J?RdvC8S8zD9OtF>SW{YBuf=u$H#I-SqxE*|8{F%#wP!a%f z)mq=`cdvs*!k%Tm-D$D9?OS8vu1%L;rIw0m=)E4EVk;RXB-*|+PP#jaJWm-6ErV@> z!|K9$*x4kr+gIBo)e`nGU)vyAyR+<4NMWXM(rS~R(;2DW_KVI4_fibsI`?_Evppgg z=ATeq)l+>f9Z3j5Y=Z4G$%8Ps5wQ@hm_;|KQUR@uWgL02p^f2?WOJHCMnXqHK;SS$ zmbX1|R{qIFl@Zi*Q0JZ-%vs~SKN@o&2hlisGbG;87l_%81+VGJL}E0^0L7za^1dg| z$@`9oCa@mUu%^#>o{z1#0jlK*ft{McOn%{sP=4cxISs&rIsq(OlWWnw z6(4fl&j$d><`ZzJEkwOwCW6TyJt5`qj<{IN`rRWz4FQUISi3rf4Y?6LvEv51P2=R7 z;VQ(DXDXBFV)8dC@N-wJ>R3}REYND_wLL1KHZTPjU!Cj@R;ae)+Ct2{wwotW+rFFC 
z2F{?Ld(W39upY#-FVmIjxx{CQ12g&fg9W^@N8261s2D1$OMESslC(HKVC;kh zQ`u053NbJu2F8j7A@N6a0#OzgBqSJlXNS;A9$azf7vFt(`QG>L-Cfph-K|le73?ox ze|)aFw1`uK8UP9a5VQiCil(7amEObt^wsAVdO!(9yn?VY14K73UAZJJ{oA`C@{ z=Yf(1ut^kxM{H$G)eW3PPH-bjs&ox<2peZT%=zB!;wFyh_ZFI zUN(ZXE@_b%RFW#CGt`Gi217ysz@C}ZVtWRyf}}DXPgf16qKv0O6V4-r9Vz8wrF`Gb zU$y+6C@}Y4$j=JS65bUmCW_h0S0Jkc?9ilh?2SR+D}QqaJ;^y;*(71C-icN|Ai*kB z{(HY~{we$c<|PV9 literal 0 HcmV?d00001 diff --git a/regression-test/data/external_table_p0/tvf/test_local_tvf_iceberg_variant.out b/regression-test/data/external_table_p0/tvf/test_local_tvf_iceberg_variant.out new file mode 100644 index 00000000000000..bf0da1da64118f --- /dev/null +++ b/regression-test/data/external_table_p0/tvf/test_local_tvf_iceberg_variant.out @@ -0,0 +1,51 @@ +-- This file is automatically generated. You should know what you did if you want to edit this +-- !desc_unshredded -- +id int Yes false \N NONE +v variant Yes false \N NONE + +-- !desc_typed_only -- +id int Yes false \N NONE +v variant Yes false \N NONE + +-- !unshredded_complex -- +1 1 name-1 100 10 false 1 name-1 +2 2 name-2 200 20 true 2 name-2 +3 3 name-3 300 30 false 3 name-3 +4 4 name-4 400 40 true 4 name-4 +5 5 name-5 500 50 false 5 name-5 + +-- !shredded_fields -- +1 100 name-1 +2 200 name-2 +3 300 name-3 +4 400 name-4 +5 500 name-5 + +-- !shredded_full_variant_with_scalar -- +1 true true +2 true true +3 true true +4 true true +5 true true + +-- !typed_only_fields -- +1 10 alpha true [{"n":1},{"n":2}] +2 20 beta true [{"n":3}] + +-- !typed_only_missing_field -- +1 true +2 true + +-- !typed_only_nested_missing_field -- +1 true +2 true + +-- !temporal_parity -- +1 19724 19724 3723004005 3723004005 1704164645006007 1704164645006007 true true true +2 20214 20214 45296789012 45296789012 1746515289010011 1746515289010011 true true true + +-- !complex_join -- +2 name-2 200 +3 name-3 300 +4 name-4 400 +5 name-5 500 diff --git a/regression-test/suites/external_table_p0/iceberg/test_iceberg_variant_table_path.groovy 
b/regression-test/suites/external_table_p0/iceberg/test_iceberg_variant_table_path.groovy new file mode 100644 index 00000000000000..40ca6bb84bd27f --- /dev/null +++ b/regression-test/suites/external_table_p0/iceberg/test_iceberg_variant_table_path.groovy @@ -0,0 +1,137 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +import org.apache.doris.regression.action.ProfileAction + +suite("test_iceberg_variant_table_path", "p0,external,iceberg,external_docker,external_docker_iceberg") { + def enabled = context.config.otherConfigs.get("enableIcebergTest") + if (enabled == null || !enabled.equalsIgnoreCase("true")) { + logger.info("Iceberg test is disabled") + return + } + + def catalogName = "test_iceberg_variant_table_path" + def dbName = "test_iceberg_variant_table_path_db" + def restPort = context.config.otherConfigs.get("iceberg_rest_uri_port") + def minioPort = context.config.otherConfigs.get("iceberg_minio_port") + def externalEnvIp = context.config.otherConfigs.get("externalEnvIp") + + def profileAction = new ProfileAction(context) + def getProfileByToken = { token -> + for (int i = 0; i < 60; ++i) { + List profileData = profileAction.getProfileList() + for (final def profileItem in profileData) { + if (profileItem["Sql Statement"].toString().contains(token)) { + def profileText = profileAction.getProfile(profileItem["Profile ID"].toString()).toString() + if (profileText.contains("ParquetReadColumnPaths")) { + return profileText + } + } + } + Thread.sleep(1000) + } + assertTrue(false) + } + def getParquetReadColumnPathSet = { profileText -> + def parquetReadColumnPaths = profileText.readLines().find { it.contains("ParquetReadColumnPaths") } + assertTrue(parquetReadColumnPaths != null) + logger.info("Iceberg variant table path ${parquetReadColumnPaths}") + def separatorIndex = parquetReadColumnPaths.indexOf(":") + assertTrue(separatorIndex >= 0) + return parquetReadColumnPaths.substring(separatorIndex + 1) + .split(",") + .collect { it.trim() } + .findAll { !it.isEmpty() } as Set + } + + sql """drop catalog if exists ${catalogName}""" + spark_iceberg """CREATE DATABASE IF NOT EXISTS demo.${dbName}""" + spark_iceberg """DROP TABLE IF EXISTS demo.${dbName}.variant_table_path""" + + try { + spark_iceberg_multi """ + CREATE TABLE demo.${dbName}.variant_table_path ( + id INT, + v 
VARIANT + ) USING iceberg + TBLPROPERTIES ( + 'format-version' = '3', + 'write.format.default' = 'parquet' + ); + + INSERT INTO demo.${dbName}.variant_table_path + SELECT 1, parse_json('{"metric":10,"nested":{"x":"a"},"items":[1,2]}') + UNION ALL + SELECT 2, parse_json('{"metric":20,"nested":{"x":"b"},"items":[3,4]}') + UNION ALL + SELECT 3, parse_json('null'); + """, 300 + + sql """ + create catalog if not exists ${catalogName} properties ( + "type" = "iceberg", + "iceberg.catalog.type" = "rest", + "uri" = "http://${externalEnvIp}:${restPort}", + "s3.access_key" = "admin", + "s3.secret_key" = "password", + "s3.endpoint" = "http://${externalEnvIp}:${minioPort}", + "s3.region" = "us-east-1" + ) + """ + + sql """switch ${catalogName}""" + sql """use ${dbName}""" + + def rows = sql """ + select id, + cast(v['metric'] as bigint) as metric, + cast(v['nested']['x'] as string) as nested_x, + cast(v['missing'] as string) is null as missing_is_null + from variant_table_path + order by id + """ + assertEquals(3, rows.size()) + assertEquals("1", rows[0][0].toString()) + assertEquals("10", rows[0][1].toString()) + assertEquals("a", rows[0][2].toString()) + assertEquals("true", rows[0][3].toString()) + assertEquals("2", rows[1][0].toString()) + assertEquals("20", rows[1][1].toString()) + assertEquals("b", rows[1][2].toString()) + assertEquals("true", rows[1][3].toString()) + assertEquals("3", rows[2][0].toString()) + assertEquals(null, rows[2][1]) + assertEquals(null, rows[2][2]) + assertEquals("true", rows[2][3].toString()) + + sql """ set enable_profile = true """ + sql """ set profile_level = 2 """ + def profileToken = UUID.randomUUID().toString() + sql """ + select "${profileToken}", sum(cast(v['metric'] as bigint)) + from variant_table_path + """ + def profile = getProfileByToken(profileToken) + def columnPaths = getParquetReadColumnPathSet(profile) + assertTrue(columnPaths.contains("v.metadata")) + assertTrue(columnPaths.contains("v.value")) + } finally { + sql """ set 
enable_profile = false """ + sql """ set profile_level = 0 """ + sql """drop catalog if exists ${catalogName}""" + } +} diff --git a/regression-test/suites/external_table_p0/tvf/test_local_tvf_iceberg_variant.groovy b/regression-test/suites/external_table_p0/tvf/test_local_tvf_iceberg_variant.groovy new file mode 100644 index 00000000000000..52f5d551444e02 --- /dev/null +++ b/regression-test/suites/external_table_p0/tvf/test_local_tvf_iceberg_variant.groovy @@ -0,0 +1,448 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +import java.net.InetAddress +import java.net.NetworkInterface +import java.nio.file.Files +import java.nio.file.StandardCopyOption +import org.apache.doris.regression.action.ProfileAction + +suite("test_local_tvf_iceberg_variant", "p0,external") { + List> backends = sql """ show backends """ + assertTrue(backends.size() > 0) + + def dataFilePath = context.config.dataPath + "/external_table_p0/tvf/" + def beId = backends[0][0] + def outFilePath = "/" + def unshreddedData = "${dataFilePath}/iceberg_variant_unshredded.parquet" + def shreddedData = "${dataFilePath}/iceberg_variant_shredded.parquet" + def typedOnlyData = "${dataFilePath}/iceberg_variant_typed_only.parquet" + def temporalUnshreddedData = "${dataFilePath}/iceberg_variant_temporal_unshredded.parquet" + def temporalTypedData = "${dataFilePath}/iceberg_variant_temporal_typed.parquet" + def binaryUnshreddedData = "${dataFilePath}/iceberg_variant_binary_unshredded.parquet" + def binaryTypedData = "${dataFilePath}/iceberg_variant_binary_typed.parquet" + + def localHosts = ["localhost", "127.0.0.1", InetAddress.localHost.hostAddress, InetAddress.localHost.hostName] as Set + NetworkInterface.networkInterfaces.each { networkInterface -> + networkInterface.inetAddresses.each { inetAddress -> + localHosts.add(inetAddress.hostAddress) + localHosts.add(inetAddress.hostName) + } + } + + def dorisHome = new File(context.config.dataPath).parentFile.parentFile + def localBeHome = new File(dorisHome, "output/be") + def localJdbc = context.config.jdbcUrl.contains("127.0.0.1") || context.config.jdbcUrl.contains("localhost") + if (localJdbc && backends.size() == 1 && localHosts.contains(backends[0][1]) && localBeHome.exists()) { + outFilePath = "" + Files.copy(new File(unshreddedData).toPath(), new File(localBeHome, "iceberg_variant_unshredded.parquet").toPath(), + StandardCopyOption.REPLACE_EXISTING) + Files.copy(new File(shreddedData).toPath(), new File(localBeHome, "iceberg_variant_shredded.parquet").toPath(), + 
StandardCopyOption.REPLACE_EXISTING) + Files.copy(new File(typedOnlyData).toPath(), new File(localBeHome, "iceberg_variant_typed_only.parquet").toPath(), + StandardCopyOption.REPLACE_EXISTING) + Files.copy(new File(temporalUnshreddedData).toPath(), new File(localBeHome, "iceberg_variant_temporal_unshredded.parquet").toPath(), + StandardCopyOption.REPLACE_EXISTING) + Files.copy(new File(temporalTypedData).toPath(), new File(localBeHome, "iceberg_variant_temporal_typed.parquet").toPath(), + StandardCopyOption.REPLACE_EXISTING) + Files.copy(new File(binaryUnshreddedData).toPath(), new File(localBeHome, "iceberg_variant_binary_unshredded.parquet").toPath(), + StandardCopyOption.REPLACE_EXISTING) + Files.copy(new File(binaryTypedData).toPath(), new File(localBeHome, "iceberg_variant_binary_typed.parquet").toPath(), + StandardCopyOption.REPLACE_EXISTING) + } else { + for (List backend : backends) { + def beHost = backend[1] + scpFiles("root", beHost, unshreddedData, outFilePath, false) + scpFiles("root", beHost, shreddedData, outFilePath, false) + scpFiles("root", beHost, typedOnlyData, outFilePath, false) + scpFiles("root", beHost, temporalUnshreddedData, outFilePath, false) + scpFiles("root", beHost, temporalTypedData, outFilePath, false) + scpFiles("root", beHost, binaryUnshreddedData, outFilePath, false) + scpFiles("root", beHost, binaryTypedData, outFilePath, false) + } + } + + def unshredded = outFilePath + "iceberg_variant_unshredded.parquet" + def shredded = outFilePath + "iceberg_variant_shredded.parquet" + def typedOnly = outFilePath + "iceberg_variant_typed_only.parquet" + def temporalUnshredded = outFilePath + "iceberg_variant_temporal_unshredded.parquet" + def temporalTyped = outFilePath + "iceberg_variant_temporal_typed.parquet" + def binaryUnshredded = outFilePath + "iceberg_variant_binary_unshredded.parquet" + def binaryTyped = outFilePath + "iceberg_variant_binary_typed.parquet" + def profileAction = new ProfileAction(context) + def getProfileByToken = { 
token -> + for (int i = 0; i < 60; ++i) { + List profileData = profileAction.getProfileList() + for (final def profileItem in profileData) { + if (profileItem["Sql Statement"].toString().contains(token)) { + def profileText = profileAction.getProfile(profileItem["Profile ID"].toString()).toString() + if (profileText.contains("ParquetReadColumnPaths")) { + return profileText + } + } + } + Thread.sleep(1000) + } + assertTrue(false) + } + def getParquetReadColumnPathSet = { profileText -> + def parquetReadColumnPaths = profileText.readLines().find { it.contains("ParquetReadColumnPaths") } + assertTrue(parquetReadColumnPaths != null) + logger.info("Iceberg variant shredding ${parquetReadColumnPaths}") + def separatorIndex = parquetReadColumnPaths.indexOf(":") + assertTrue(separatorIndex >= 0) + return parquetReadColumnPaths.substring(separatorIndex + 1) + .split(",") + .collect { it.trim() } + .findAll { !it.isEmpty() } as Set + } + def getProfileCounter = { profileText, counterName -> + def counterLine = profileText.readLines().find { it.contains(counterName) } + assertTrue(counterLine != null) + def matcher = counterLine =~ /${counterName}:\s*([0-9,]+)/ + assertTrue(matcher.find()) + return matcher.group(1).replace(",", "").toLong() + } + + qt_desc_unshredded """ + desc function local( + "file_path" = "${unshredded}", + "backend_id" = "${beId}", + "format" = "parquet") + """ + + qt_desc_typed_only """ + desc function local( + "file_path" = "${typedOnly}", + "backend_id" = "${beId}", + "format" = "parquet") + """ + + order_qt_unshredded_complex """ + select id, + cast(v['id'] as int) as variant_id, + cast(v['name'] as string) as name, + cast(v['metric'] as bigint) as metric, + cast(v['nested']['score'] as int) as score, + cast(v['nested']['flag'] as boolean) as flag, + cast(v['arr'] as array)[1] as first_arr, + cast(v['arr'] as array)[2] as second_arr + from local( + "file_path" = "${unshredded}", + "backend_id" = "${beId}", + "format" = "parquet") + order by id + """ 
+ + order_qt_shredded_fields """ + select id, + cast(v['metric'] as bigint) as metric, + cast(v['name'] as string) as name + from local( + "file_path" = "${shredded}", + "backend_id" = "${beId}", + "format" = "parquet") + order by id + """ + + order_qt_shredded_full_variant_with_scalar """ + select id, + cast(v as string) like '%"name":"name-%' as has_name, + cast(v as string) like '%"metric":%' as has_metric + from local( + "file_path" = "${shredded}", + "backend_id" = "${beId}", + "format" = "parquet") + order by id + """ + + order_qt_typed_only_fields """ + select id, + cast(v['metric'] as bigint) as metric, + cast(v['nested']['x'] as string) as nested_x, + cast(v['f'] as string) is null as non_finite_float_is_null, + cast(v['items'] as string) as items + from local( + "file_path" = "${typedOnly}", + "backend_id" = "${beId}", + "format" = "parquet") + order by id + """ + + order_qt_typed_only_missing_field """ + select id, + cast(v['missing'] as string) is null as missing_is_null + from local( + "file_path" = "${typedOnly}", + "backend_id" = "${beId}", + "format" = "parquet") + order by id + """ + + order_qt_typed_only_nested_missing_field """ + select id, + cast(v['nested']['missing'] as string) is null as missing_is_null + from local( + "file_path" = "${typedOnly}", + "backend_id" = "${beId}", + "format" = "parquet") + order by id + """ + + order_qt_temporal_parity """ + select u.id, + cast(u.v['d'] as bigint) as unshredded_date, + cast(t.v['d'] as bigint) as typed_date, + cast(u.v['t'] as bigint) as unshredded_time, + cast(t.v['t'] as bigint) as typed_time, + cast(u.v['ts'] as bigint) as unshredded_ts, + cast(t.v['ts'] as bigint) as typed_ts, + cast(u.v['d'] as bigint) = cast(t.v['d'] as bigint) as same_date, + cast(u.v['t'] as bigint) = cast(t.v['t'] as bigint) as same_time, + cast(u.v['ts'] as bigint) = cast(t.v['ts'] as bigint) as same_ts + from local( + "file_path" = "${temporalUnshredded}", + "backend_id" = "${beId}", + "format" = "parquet") u + join 
local( + "file_path" = "${temporalTyped}", + "backend_id" = "${beId}", + "format" = "parquet") t + on u.id = t.id + order by u.id + """ + + def binaryUnshreddedRows = sql """ + select id, hex(cast(v['b'] as varbinary)) + from local( + "file_path" = "${binaryUnshredded}", + "backend_id" = "${beId}", + "format" = "parquet", + "enable_mapping_varbinary" = "true") + order by id + """ + assertEquals(2, binaryUnshreddedRows.size()) + assertEquals("1", binaryUnshreddedRows[0][0].toString()) + assertEquals("FF0041", binaryUnshreddedRows[0][1].toString()) + assertEquals("2", binaryUnshreddedRows[1][0].toString()) + assertEquals("C328", binaryUnshreddedRows[1][1].toString()) + + def binaryTypedRows = sql """ + select id, hex(cast(v['b'] as varbinary)) + from local( + "file_path" = "${binaryTyped}", + "backend_id" = "${beId}", + "format" = "parquet", + "enable_mapping_varbinary" = "true") + order by id + """ + assertEquals(2, binaryTypedRows.size()) + assertEquals("1", binaryTypedRows[0][0].toString()) + assertEquals("FF0041", binaryTypedRows[0][1].toString()) + assertEquals("2", binaryTypedRows[1][0].toString()) + assertEquals("C328", binaryTypedRows[1][1].toString()) + + try { + sql """ set enable_profile = true """ + sql """ set profile_level = 2 """ + def profileToken = UUID.randomUUID().toString() + sql """ + select "${profileToken}", sum(cast(v['metric'] as bigint)) + from local( + "file_path" = "${shredded}", + "backend_id" = "${beId}", + "format" = "parquet") + """ + def profile = getProfileByToken(profileToken) + def metricColumnPaths = getParquetReadColumnPathSet(profile) + assertTrue(metricColumnPaths.contains("v.metadata")) + // typed_value.metric.value is the field-level residual fallback for mixed-type metric rows. + // The top-level v.value stores non-shredded object fields and is not needed for this projection. 
+        assertTrue(metricColumnPaths.contains("v.typed_value.metric.value"))
+        assertTrue(metricColumnPaths.contains("v.typed_value.metric.typed_value"))
+        assertFalse(metricColumnPaths.contains("v.value"))
+        assertFalse(metricColumnPaths.contains("v.typed_value.name"))
+
+        def nestedProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${nestedProfileToken}", count(cast(v['metric']['x'] as string))
+            from local(
+                "file_path" = "${shredded}",
+                "backend_id" = "${beId}",
+                "format" = "parquet")
+        """
+        def nestedProfile = getProfileByToken(nestedProfileToken)
+        def nestedColumnPaths = getParquetReadColumnPathSet(nestedProfile)
+        assertTrue(nestedColumnPaths.contains("v.typed_value.metric.typed_value.x"))
+        // metric.value is required for rows that store metric as field-level residual instead of metric.typed_value.
+        assertTrue(nestedColumnPaths.contains("v.metadata"))
+        assertTrue(nestedColumnPaths.contains("v.typed_value.metric.value"))
+        assertFalse(nestedColumnPaths.contains("v.value"))
+        assertFalse(nestedColumnPaths.contains("v.typed_value.name"))
+        assertEquals(0, getProfileCounter(nestedProfile, "VariantDirectTypedValueReadRows"))
+        assertTrue(getProfileCounter(nestedProfile, "VariantRowWiseReadRows") > 0)
+
+        def typedOnlyProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${typedOnlyProfileToken}", sum(cast(v['metric'] as bigint))
+            from local(
+                "file_path" = "${typedOnly}",
+                "backend_id" = "${beId}",
+                "format" = "parquet")
+        """
+        def typedOnlyProfile = getProfileByToken(typedOnlyProfileToken)
+        def typedOnlyColumnPaths = getParquetReadColumnPathSet(typedOnlyProfile)
+        assertTrue(typedOnlyColumnPaths.contains("v.typed_value.metric"))
+        assertFalse(typedOnlyColumnPaths.contains("v.metadata"))
+        assertFalse(typedOnlyColumnPaths.contains("v.value"))
+        assertTrue(getProfileCounter(typedOnlyProfile, "VariantDirectTypedValueReadRows") > 0)
+        assertEquals(0, getProfileCounter(typedOnlyProfile, "VariantRowWiseReadRows"))
+
+        def typedOnlyNestedProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${typedOnlyNestedProfileToken}", count(cast(v['nested']['x'] as string))
+            from local(
+                "file_path" = "${typedOnly}",
+                "backend_id" = "${beId}",
+                "format" = "parquet")
+        """
+        def typedOnlyNestedProfile = getProfileByToken(typedOnlyNestedProfileToken)
+        def typedOnlyNestedColumnPaths = getParquetReadColumnPathSet(typedOnlyNestedProfile)
+        assertTrue(typedOnlyNestedColumnPaths.contains("v.typed_value.nested.typed_value.x"))
+        assertFalse(typedOnlyNestedColumnPaths.contains("v.metadata"))
+        assertFalse(typedOnlyNestedColumnPaths.contains("v.typed_value.nested.value"))
+        assertFalse(typedOnlyNestedColumnPaths.contains("v.value"))
+        assertTrue(getProfileCounter(typedOnlyNestedProfile, "VariantDirectTypedValueReadRows") > 0)
+        assertEquals(0, getProfileCounter(typedOnlyNestedProfile, "VariantRowWiseReadRows"))
+
+        def binaryTypedProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${binaryTypedProfileToken}", max(hex(cast(v['b'] as varbinary)))
+            from local(
+                "file_path" = "${binaryTyped}",
+                "backend_id" = "${beId}",
+                "format" = "parquet",
+                "enable_mapping_varbinary" = "true")
+        """
+        def binaryTypedProfile = getProfileByToken(binaryTypedProfileToken)
+        def binaryTypedColumnPaths = getParquetReadColumnPathSet(binaryTypedProfile)
+        assertTrue(binaryTypedColumnPaths.contains("v.typed_value.b"))
+        assertFalse(binaryTypedColumnPaths.contains("v.metadata"))
+        assertFalse(binaryTypedColumnPaths.contains("v.value"))
+        assertTrue(getProfileCounter(binaryTypedProfile, "VariantDirectTypedValueReadRows") > 0)
+        assertEquals(0, getProfileCounter(binaryTypedProfile, "VariantRowWiseReadRows"))
+
+        def typedOnlyMissingProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${typedOnlyMissingProfileToken}", count(cast(v['missing'] as string))
+            from local(
+                "file_path" = "${typedOnly}",
+                "backend_id" = "${beId}",
+                "format" = "parquet")
+        """
+        def typedOnlyMissingProfile = getProfileByToken(typedOnlyMissingProfileToken)
+        def typedOnlyMissingColumnPaths = getParquetReadColumnPathSet(typedOnlyMissingProfile)
+        assertTrue(typedOnlyMissingColumnPaths.contains("v.metadata"))
+        assertFalse(typedOnlyMissingColumnPaths.any { it.startsWith("v.typed_value") })
+        assertFalse(typedOnlyMissingColumnPaths.contains("v.value"))
+
+        def typedOnlyNestedMissingProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${typedOnlyNestedMissingProfileToken}", count(cast(v['nested']['missing'] as string))
+            from local(
+                "file_path" = "${typedOnly}",
+                "backend_id" = "${beId}",
+                "format" = "parquet")
+        """
+        def typedOnlyNestedMissingProfile = getProfileByToken(typedOnlyNestedMissingProfileToken)
+        def typedOnlyNestedMissingColumnPaths = getParquetReadColumnPathSet(typedOnlyNestedMissingProfile)
+        assertTrue(typedOnlyNestedMissingColumnPaths.contains("v.metadata"))
+        assertFalse(typedOnlyNestedMissingColumnPaths.any { it.startsWith("v.typed_value.nested.typed_value") })
+        assertFalse(typedOnlyNestedMissingColumnPaths.contains("v.typed_value.nested.value"))
+        assertFalse(typedOnlyNestedMissingColumnPaths.contains("v.value"))
+
+        def temporalProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${temporalProfileToken}",
+                   sum(cast(v['d'] as bigint) + cast(v['t'] as bigint) + cast(v['ts'] as bigint))
+            from local(
+                "file_path" = "${temporalTyped}",
+                "backend_id" = "${beId}",
+                "format" = "parquet")
+        """
+        def temporalProfile = getProfileByToken(temporalProfileToken)
+        def temporalColumnPaths = getParquetReadColumnPathSet(temporalProfile)
+        assertTrue(temporalColumnPaths.contains("v.typed_value.d"))
+        assertTrue(temporalColumnPaths.contains("v.typed_value.t"))
+        assertTrue(temporalColumnPaths.contains("v.typed_value.ts"))
+        assertFalse(temporalColumnPaths.contains("v.metadata"))
+        assertFalse(temporalColumnPaths.contains("v.value"))
+        assertTrue(getProfileCounter(temporalProfile, "VariantDirectTypedValueReadRows") > 0)
+        assertEquals(0, getProfileCounter(temporalProfile, "VariantRowWiseReadRows"))
+
+        def caseProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${caseProfileToken}", count(cast(v['Name'] as string))
+            from local(
+                "file_path" = "${shredded}",
+                "backend_id" = "${beId}",
+                "format" = "parquet")
+        """
+        def caseProfile = getProfileByToken(caseProfileToken)
+        def caseColumnPaths = getParquetReadColumnPathSet(caseProfile)
+        assertTrue(caseColumnPaths.contains("v.metadata"))
+        assertTrue(caseColumnPaths.contains("v.value"))
+        assertFalse(caseColumnPaths.contains("v.typed_value.name"))
+
+        def fullVariantWithPredicateProfileToken = UUID.randomUUID().toString()
+        sql """
+            select "${fullVariantWithPredicateProfileToken}", cast(v as string)
+            from local(
+                "file_path" = "${shredded}",
+                "backend_id" = "${beId}",
+                "format" = "parquet")
+            where cast(v['metric'] as bigint) >= 20
+        """
+        def fullVariantWithPredicateProfile = getProfileByToken(fullVariantWithPredicateProfileToken)
+        def fullVariantWithPredicateColumnPaths = getParquetReadColumnPathSet(fullVariantWithPredicateProfile)
+        assertTrue(fullVariantWithPredicateColumnPaths.contains("v.metadata"))
+        assertTrue(fullVariantWithPredicateColumnPaths.contains("v.value"))
+        assertTrue(fullVariantWithPredicateColumnPaths.contains("v.typed_value.metric.value"))
+        assertTrue(fullVariantWithPredicateColumnPaths.contains("v.typed_value.metric.typed_value"))
+        assertTrue(fullVariantWithPredicateColumnPaths.contains("v.typed_value.name"))
+
+        order_qt_complex_join """
+            select u.id,
+                   cast(u.v['name'] as string) as name,
+                   cast(s.v['metric'] as bigint) as metric
+            from local(
+                "file_path" = "${unshredded}",
+                "backend_id" = "${beId}",
+                "format" = "parquet") u
+            join local(
+                "file_path" = "${shredded}",
+                "backend_id" = "${beId}",
+                "format" = "parquet") s
+            on u.id = s.id
+            where cast(u.v['nested']['score'] as int) >= 20
+            order by u.id
+        """
+    } finally {
+        sql """ set enable_profile = false """
+        sql """ set profile_level = 0 """
+    }
+}