[feature](iceberg) Support reading Iceberg variant from Parquet #63192

Draft
eldenmoon wants to merge 1 commit into apache:master from eldenmoon:codex/iceberg-v3-variant

@eldenmoon
Member

eldenmoon commented May 12, 2026

What problem does this PR solve?

Issue Number: N/A

Related PR: #63192

Problem Summary: Doris could not read Iceberg v3 VARIANT columns from Parquet files. This change:

  • Maps Iceberg VARIANT to Doris VARIANT.
  • Validates the Parquet VARIANT wrapper shape against the VariantShredding spec.
  • Decodes the unshredded metadata/value encoding and reads shredded typed_value columns.
  • Prunes shredded Parquet leaf columns for accessed variant paths, with profile observability.
  • Keeps selected typed-only shredded projections on native Parquet typed columns when residual value columns are not selected. This covers top-level, nested, root-array, nested-array, typed-only, value-only residual, typed-map key lookup, binary, temporal, UUID, optional top-level, and nullable ARRAY/MAP layouts, as well as Iceberg field-id pruning.
  • Falls back to row-wise reconstruction only for complex or selected-residual layouts, while preserving the semantics of binary, binary-array, complex binary array, residual null, UUID, decimal-array, typed-array null element, empty-object, root-null, and out-of-order residual object values, and of user fields named value.
  • Keeps full VARIANT projection separate from predicate subpath pruning.
  • Supports element paths produced by variant array explode.
  • Forces a root read for dynamic VARIANT element access.
  • Explicitly rejects Iceberg VARIANT writes, because this PR only implements read support.

Release note

Support reading Iceberg v3 VARIANT Parquet columns, including shredded typed_value column pruning and binary/UUID VARIANT values. Non-finite floating-point VARIANT values remain unsupported and return explicit errors. Writing Iceberg VARIANT columns is rejected with an explicit unsupported error.
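
The non-finite restriction in the release note follows from JSON having no number form for NaN or infinity. A minimal sketch of such a guard, assuming a hypothetical helper named `double_to_json` (it is not part of the Doris reader API, and the real error path is not shown in this PR page):

```cpp
#include <cmath>
#include <optional>
#include <string>

// Hypothetical helper, not the Doris reader API.
// Returns JSON text for a finite double, or std::nullopt so the caller can
// surface an explicit "unsupported" error instead of emitting invalid JSON.
std::optional<std::string> double_to_json(double v) {
    if (!std::isfinite(v)) {
        return std::nullopt;  // NaN, +Inf and -Inf have no JSON number form
    }
    return std::to_string(v);  // fixed six-digit formatting; fine for a sketch
}
```

A caller would map the empty result to an error status rather than silently writing NULL, which matches the "explicit errors" wording above.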

Check List (For Author)

  • Test: Regression test / Unit Test / Manual test / Static analysis

    • Unit Test: ./run-be-ut.sh --run --filter='ParquetVariantReaderTest.*:IcebergReaderCreateColumnIdsTest.*:NestedColumnAccessHelperTest.*' (97 tests passed: 68 ParquetVariantReaderTest, 7 IcebergReaderCreateColumnIdsTest, 22 NestedColumnAccessHelperTest)

    • Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest,org.apache.doris.planner.IcebergTableSinkTest,org.apache.doris.planner.IcebergMergeSinkTest (55 tests passed: 51 PruneNestedColumnTest, 1 IcebergTableSinkTest, 3 IcebergMergeSinkTest; Maven reactor/checkstyle succeeded)

    • Regression test: performance regression coverage is included in regression-test/suites/external_table_p0/tvf/test_local_tvf_iceberg_variant.groovy; not run locally in this worktree because no local Doris cluster/output BE+FE runtime is available.

    • Manual test: env PATH=/tmp/codex-clang-format:$PATH build-support/clang-format.sh

    • Manual test: env PATH=/tmp/codex-clang-format:$PATH build-support/check-format.sh

    • Manual test: git diff --cached --check

    • Static analysis: CLANG_TIDY_BINARY=/mnt/disk1/claude-max/ldb_toolchain16/bin/clang-tidy build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (attempted earlier in this PR loop; local clang-tidy could not analyze the changed files because existing be/src/util/jni-util.h static_assert(false) diagnostics are promoted to errors before producing actionable changed-line diagnostics.)

  • Behavior changed: Yes. Doris can read Iceberg v3 VARIANT Parquet columns, supports spec-compliant binary and UUID VARIANT values, validates wrapper fields more strictly, avoids row-wise reconstruction for supported typed-only shredded projections, preserves residual binary/null values, typed-array null elements, nullable complex pruning, user value-only fields, and complex array elements with binary leaves, preserves full VARIANT projection when predicates read typed subpaths, handles dynamic root element access, optional top-level VARIANT, root/nested variant-array element paths, typed-map key lookup paths, and rejects Iceberg VARIANT writes explicitly.

  • Does this need documentation: No

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

@eldenmoon eldenmoon force-pushed the codex/iceberg-v3-variant branch from cab85b8 to e9e3bfd on May 12, 2026 19:31

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

@github-actions
Contributor

OpenCode automated review failed and did not complete.

Error: Review step was skipped (possibly timeout or cancelled)
Workflow run: https://github.com/apache/doris/actions/runs/25757447037

Please inspect the workflow logs and rerun the review after the underlying issue is resolved.

@eldenmoon
Member Author

/review

Contributor

@github-actions github-actions Bot left a comment

Summary:

I found one blocking issue in the added regression test: the local single-BE copy path does not match the file_path used by the local TVF, so the test can fail in the exact environment that branch is intended to support.

Critical checkpoint conclusions:

  • Goal: add Iceberg/Parquet VARIANT read support, including shredded projection. The implementation and regression coverage mostly target that goal, but the new regression test has a path setup bug.
  • Scope: the production changes are focused on Parquet schema parsing, variant reconstruction, column pruning, and Iceberg type mapping.
  • Concurrency/lifecycle: no new shared mutable concurrent state or non-obvious lifecycle ownership issue found in the reviewed PR diff.
  • Configuration/compatibility: no new config items or persisted storage-format changes found; FE/BE type mapping paths for Iceberg VARIANT are updated.
  • Parallel paths: Hive, Iceberg, and local Parquet pruning paths were considered; the test issue is distinct from production pruning logic.
  • Tests: regression coverage was added, but the local-file staging logic can make the new test fail before validating the feature.
  • Observability/performance: added ParquetReadColumnPaths profile string is useful for validating pruning; no blocking observability or hot-path issue found beyond the test blocker.

User focus: no additional user-provided review focus was present.

Contributor

@github-actions github-actions Bot left a comment

I found a correctness blocker in the new shredded VARIANT pruning logic. The implementation prunes the unshredded value leaf whenever a matching typed_value path exists, but Iceberg shredded VARIANT can still carry residual/unrepresentable values for that field in value, so queries can silently return NULL or partial objects for those rows.

Critical checkpoint conclusions:

  • Goal/test: the PR adds Iceberg v3 VARIANT reading and pruning tests, but the tests only cover fully typed shredded fields and do not prove residual fallback correctness.
  • Scope/focus: the change is mostly focused on Parquet/Iceberg VARIANT support.
  • Concurrency/lifecycle/config: no new concurrency, non-trivial lifecycle, or config behavior found in the reviewed PR diff.
  • Compatibility: adds new type mapping; no storage-format persistence changes found.
  • Parallel paths: the same pruning issue exists in both standalone/Hive Parquet and Iceberg Parquet helper paths.
  • Tests: missing mixed shredded/residual cases where a selected typed path is absent or has an incompatible type in typed_value but exists in value.
  • Observability/performance: profile string helps inspect selected leaves; no additional blocking observability issue found.
  • Data correctness: blocking issue below can cause incorrect query results after column pruning.

No additional user-provided focus points were present.
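
The residual problem raised in this review can be illustrated with a small model. This is a sketch under stated assumptions: `ShreddedCell`, `typed_cell`, `residual_cell` and `read_shredded_field` are hypothetical names, and the residual is shown as JSON text rather than the binary VARIANT encoding.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical per-row view of one shredded VARIANT field; not a Doris type.
struct ShreddedCell {
    std::optional<long long> typed_value;  // set when the row fits the shredded type
    std::optional<std::string> value;      // residual for rows that do not fit
};

ShreddedCell typed_cell(long long v) {
    ShreddedCell cell;
    cell.typed_value = v;
    return cell;
}

ShreddedCell residual_cell(std::string json) {
    ShreddedCell cell;
    cell.value = std::move(json);
    return cell;
}

// `value_column_selected` models whether pruning kept the residual `value` leaf.
std::optional<std::string> read_shredded_field(const ShreddedCell& cell,
                                               bool value_column_selected) {
    if (cell.typed_value.has_value()) {
        return std::to_string(*cell.typed_value);
    }
    if (value_column_selected && cell.value.has_value()) {
        return cell.value;
    }
    return std::nullopt;  // the row silently reads as NULL
}
```

In this model a row built with `residual_cell` reads as NULL whenever the `value` leaf was pruned, which is the silent data loss the review describes.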

Comment thread be/src/format/table/hive/hive_parquet_nested_column_utils.cpp Outdated
Comment thread be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp Outdated
@eldenmoon eldenmoon force-pushed the codex/iceberg-v3-variant branch from e9e3bfd to 5fe9ca5 on May 12, 2026 20:03

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

Contributor

@github-actions github-actions Bot left a comment

Reviewed the full PR with the Doris code-review checklist. I found a blocking correctness issue in nested VARIANT pruning for deeper accesses under shredded fields.

  • Goal/test: the PR adds Iceberg/Parquet VARIANT read support and includes a regression for top-level shredded field reads, but it does not cover deeper access through a shredded field residual.
  • Scope: the change is focused, but the duplicated Hive/Iceberg pruning helper needs the same fix.
  • Concurrency/lifecycle/config/transaction/persistence: no new concurrency, lifecycle, config, transaction, storage-format write, or persistence concerns found.
  • Parallel paths: the same issue exists in both Hive/local Parquet and Iceberg Parquet helpers.
  • Tests: existing tests cover unshredded reads and top-level shredded field pruning, but should add a mixed residual case such as v['metric']['x'] where metric is shredded and some rows store an object in typed_value.metric.value.
  • Observability: the new profile string is useful for validating selected leaves.

User focus: no additional user-provided review focus was supplied.

} else {
    std::vector<std::vector<std::string>> child_paths {
            std::vector<std::string>(path.begin() + 1, path.end())};
    found_typed_path = extract_nested_column_ids_by_name(*typed_child, child_paths,
Contributor

This handles v['metric'], but deeper accesses under a shredded field still prune the wrong residual. For a schema like typed_value.metric { value, typed_value int64 }, the access path for v['metric']['x'] reaches this branch with typed_child == metric and path.size() > 1. metric is parsed as a regular struct, not TYPE_VARIANT, so extract_nested_column_ids_by_name(*typed_child, {['x']}, ...) finds no child named x and returns false. The code then falls back to top-level v.value, even though the Parquet VARIANT shredding spec stores non-int/object residuals for metric in v.typed_value.metric.value and forbids duplicating shredded field keys in top-level v.value. That makes deeper projections/filters silently return missing/null for rows where metric is an object stored in the field-level residual. Please treat these unannotated shredded field groups (value/typed_value) as variant-like during recursive pruning, and add a regression that projects a subpath from a shredded field whose residual is stored in typed_value.<field>.value.

Member Author

Fixed in latest push. VARIANT user keys now use exact matching while structural Parquet fields still use structural lookup. The pruning helper now treats unannotated value/typed_value shredded field groups as variant-like, so v['metric']['x'] selects v.typed_value.metric.value instead of top-level v.value. Added profile assertions for the deeper residual path.
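
The behavior described in this reply can be sketched roughly as follows. `Node`, `find_child`, `is_variant_like`, `select_leaves` and `example_schema` are hypothetical names over a simplified schema tree; this is not the Doris `FieldSchema` or the actual pruning helper, only an illustration of resolving user keys through nested value/typed_value groups.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified schema node; not the Doris FieldSchema.
struct Node {
    std::string name;
    std::vector<Node> children;
};

const Node* find_child(const Node& node, const std::string& name) {
    for (const Node& child : node.children) {
        if (child.name == name) return &child;
    }
    return nullptr;
}

// A group counts as variant-like when it declares `value` and/or `typed_value`.
bool is_variant_like(const Node& node) {
    return find_child(node, "value") != nullptr || find_child(node, "typed_value") != nullptr;
}

// Returns the leaf paths to read for user keys path[pos..] under `group`.
std::vector<std::string> select_leaves(const Node& group, const std::string& prefix,
                                       const std::vector<std::string>& path, std::size_t pos) {
    const Node* residual = find_child(group, "value");
    const Node* typed = find_child(group, "typed_value");
    if (pos < path.size() && typed != nullptr) {
        const Node* field = find_child(*typed, path[pos]);  // exact key match
        if (field != nullptr && is_variant_like(*field)) {
            return select_leaves(*field, prefix + ".typed_value." + path[pos], path, pos + 1);
        }
    }
    std::vector<std::string> leaves;
    if (residual != nullptr) leaves.push_back(prefix + ".value");
    // The typed column is only useful once the whole path has been consumed.
    if (pos == path.size() && typed != nullptr) leaves.push_back(prefix + ".typed_value");
    return leaves;
}

// v { metadata, value, typed_value { metric { value, typed_value } } }
Node example_schema() {
    Node metric{"metric", {Node{"value", {}}, Node{"typed_value", {}}}};
    Node typed{"typed_value", {metric}};
    return Node{"v", {Node{"metadata", {}}, Node{"value", {}}, typed}};
}
```

With this model, the path for v['metric']['x'] resolves to the field-level residual v.typed_value.metric.value, while a key that is not shredded at all still resolves to top-level v.value.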

} else {
    std::vector<std::vector<std::string>> child_paths {
            std::vector<std::string>(path.begin() + 1, path.end())};
    found_typed_path = extract_nested_column_ids_by_name(*typed_child, child_paths,
Contributor

Same distinct deeper-path issue as the Hive/local helper, separate from the already-discussed top-level residual question. When Iceberg pruning sees an access like v['metric']['x'] and metric is a shredded field group with value plus typed_value, the recursive call treats metric as a plain struct and returns false because there is no child named x. The fallback then selects top-level v.value, but Iceberg/Parquet shredded field residuals for metric live in v.typed_value.metric.value; top-level v.value must not contain that shredded key. Iceberg scans can therefore lose residual object values for deeper projections or filters under a shredded field. Please make the recursive pruning recognize the unannotated value/typed_value shredded-field layout and cover this with an Iceberg/local regression case.

Member Author

Fixed in latest push. VARIANT user keys now use exact matching while structural Parquet fields still use structural lookup. The pruning helper now treats unannotated value/typed_value shredded field groups as variant-like, so v['metric']['x'] selects v.typed_value.metric.value instead of top-level v.value. Added profile assertions for the deeper residual path.

Contributor

@github-actions github-actions Bot left a comment

Review result: request changes.

Critical checkpoint conclusions:

  • Goal/test coverage: the PR adds Iceberg VARIANT Parquet reading, type mapping, pruning observability, and regression/unit coverage. The main scenario is covered, but a case-sensitive key path is not covered and currently regresses correctness.
  • Scope/focus: the change is focused on Iceberg/Parquet VARIANT support, though duplicated Hive/Iceberg pruning helpers carry the same issue.
  • Concurrency/lifecycle: no new shared mutable state, threads, locks, or static initialization hazards found in the reviewed paths.
  • Configuration/compatibility: no new configs or storage-format writes; this is a reader/type-mapping change. Mixed files with non-VARIANT types continue through existing paths.
  • Parallel paths: the Hive/local and Iceberg Parquet pruning paths were both reviewed; both have the same case-sensitivity bug and are commented separately.
  • Error handling/memory: Status returns in the new reader path are generally propagated; no ignored Status or untracked large persistent allocation issue found beyond the correctness issue raised.
  • Data correctness: blocking issue found: shredded VARIANT field lookup lowercases user path components, so distinct keys such as a and A can be pruned/read as the same field.
  • Observability/performance: the profile leaf-path observable is useful; no additional blocker found.

User focus: no additional user-provided review focus was specified.
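
The case-sensitivity finding in this review can be reproduced with a small sketch. The helper names `to_lower`, `find_case_insensitive` and `find_exact` are hypothetical, and treating the pre-fix lookup as a plain lowercase comparison is an assumption based on the description above, not a copy of the Doris code.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Lowercasing lookup (assumed pre-fix behavior): returns the first index whose
// lowercased name matches, or -1.
int find_case_insensitive(const std::vector<std::string>& names, const std::string& wanted) {
    const std::string key = to_lower(wanted);
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (to_lower(names[i]) == key) return static_cast<int>(i);
    }
    return -1;
}

// Exact lookup for VARIANT user keys: "a" and "A" stay distinct.
int find_exact(const std::vector<std::string>& names, const std::string& wanted) {
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (names[i] == wanted) return static_cast<int>(i);
    }
    return -1;
}
```

For shredded keys {"a", "A"}, the lowercasing lookup resolves "A" to index 0 (the wrong field), while the exact lookup resolves it to index 1.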

Comment thread be/src/format/table/hive/hive_parquet_nested_column_utils.cpp Outdated
Comment thread be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp Outdated
@eldenmoon eldenmoon force-pushed the codex/iceberg-v3-variant branch from 5fe9ca5 to fa098c0 on May 12, 2026 20:43

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

@hello-stephen
Contributor

TPC-H: Total hot run time: 29804 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e9e3bfd819ba9d8ccea3f9b57abb35147175a7e8, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17678	3922	3919	3919
q2	q3	10718	872	608	608
q4	4655	466	351	351
q5	7465	1357	1156	1156
q6	207	173	140	140
q7	940	940	750	750
q8	9516	1396	1309	1309
q9	6069	5362	5345	5345
q10	6307	2102	1891	1891
q11	486	266	259	259
q12	687	430	299	299
q13	18191	3334	2785	2785
q14	286	284	268	268
q15	q16	904	845	793	793
q17	958	995	742	742
q18	6501	5679	5525	5525
q19	1169	1192	1132	1132
q20	512	406	259	259
q21	4752	2425	1946	1946
q22	459	402	327	327
Total cold run time: 98460 ms
Total hot run time: 29804 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4662	4580	4561	4561
q2	q3	4703	4784	4197	4197
q4	2175	2189	1420	1420
q5	5212	4969	5227	4969
q6	202	176	143	143
q7	2063	1832	1618	1618
q8	3350	3117	3095	3095
q9	8461	8840	8379	8379
q10	4501	4555	4267	4267
q11	633	423	411	411
q12	738	748	524	524
q13	3188	3594	2917	2917
q14	302	300	290	290
q15	q16	755	778	723	723
q17	1379	1276	1282	1276
q18	8031	7128	7115	7115
q19	1158	1175	1154	1154
q20	2293	2287	1992	1992
q21	6276	5486	4800	4800
q22	531	473	400	400
Total cold run time: 60613 ms
Total hot run time: 54251 ms

@hello-stephen
Contributor

TPC-H: Total hot run time: 29785 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5fe9ca52dd2edab6a76b7083da4f51d88076c8c5, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17624	3891	3801	3801
q2	q3	10699	879	587	587
q4	4666	467	347	347
q5	7460	1325	1139	1139
q6	201	169	142	142
q7	918	948	769	769
q8	9690	1438	1262	1262
q9	6853	5515	5462	5462
q10	6327	2070	1808	1808
q11	481	273	258	258
q12	688	422	301	301
q13	18192	3299	2725	2725
q14	298	282	264	264
q15	q16	898	868	787	787
q17	1197	1046	735	735
q18	6458	5747	5601	5601
q19	1610	1245	1108	1108
q20	521	414	416	414
q21	4731	2362	1928	1928
q22	464	395	347	347
Total cold run time: 99976 ms
Total hot run time: 29785 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4759	4521	4635	4521
q2	q3	4621	4757	4188	4188
q4	2159	2234	1421	1421
q5	4977	4949	5181	4949
q6	207	174	135	135
q7	2087	1822	1621	1621
q8	3333	3077	3141	3077
q9	8595	8393	8429	8393
q10	4534	4533	4220	4220
q11	588	417	419	417
q12	718	809	511	511
q13	3281	3632	2864	2864
q14	310	312	281	281
q15	q16	773	798	713	713
q17	1309	1304	1293	1293
q18	7986	7053	7094	7053
q19	1161	1199	1145	1145
q20	2214	2201	1934	1934
q21	6042	5384	4784	4784
q22	530	475	405	405
Total cold run time: 60184 ms
Total hot run time: 53925 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 25.00% (1/4) 🎉
Increment coverage report
Complete coverage report

Contributor

@github-actions github-actions Bot left a comment

I found two additional correctness issues in the Parquet VARIANT reconstruction path.

Critical checkpoint conclusions:

  • Goal/test: The PR adds Iceberg v3 Parquet VARIANT reading and shredded column pruning, with regression and FE unit coverage for the common unshredded/shredded object paths. However, valid shredded array layouts and non-finite shredded floats are not covered and can decode incorrectly or fail.
  • Scope/focus: The implementation is mostly focused on the new reader/pruning behavior, though the duplicated Hive/Iceberg pruning helper logic remains a maintainability risk rather than a blocker.
  • Concurrency/lifecycle: The reviewed changes are per-reader/per-query state and do not introduce new shared mutable state, locks, background threads, or special static lifecycle dependencies.
  • Config/compatibility: No new config items or storage-format writes are introduced. The change reads a standard Parquet/Iceberg format; mixed-version compatibility concerns are limited to reader capability.
  • Parallel paths: Hive/local and Iceberg pruning paths are both updated. Standalone Parquet uses the Hive-style name pruning helper.
  • Conditional checks: The main conditional logic around shredded value/typed_value follows the Parquet Variant Shredding spec for object fields, but the array element check currently misses a spec-valid layout.
  • Tests/results: Existing tests cover top-level object shredding, deeper residual paths, case-sensitive keys, and profile observability. Missing coverage remains for arrays whose element group omits value, and for NaN/Inf in shredded float/double typed values.
  • Observability: The added ParquetReadColumnPaths profile string is useful for pruning verification and appears lightweight.
  • Transactions/persistence/data writes: Not applicable; this is read-path only.
  • FE/BE variable passing: Iceberg type mapping and access-path rewriting are updated for VARIANT; no additional thrift variable propagation issue found.
  • Performance: The JSON reconstruction path is inherently allocation-heavy but limited to VARIANT decoding. No additional performance blocker found beyond the correctness issues below.

No additional user-provided review focus was specified.

std::string element_json;
bool element_present = false;
if (element_schema != nullptr && find_child_idx(*element_schema, "value") >= 0) {
    RETURN_IF_ERROR(_variant_to_json(*element_schema, values[i], &metadata, &element_json,
Contributor

This only recognizes shredded array elements when the element group contains a value child, but the Parquet Variant Shredding spec allows array element to omit value when elements are fully shredded as a specific typed_value type. For a valid schema like typed_value (LIST) -> element { optional binary typed_value (STRING) }, this branch falls through to _field_to_json(*element_schema, ...) and reconstructs each element as an object such as {"typed_value":"x"} instead of "x". Please treat an element group with either value or typed_value as a shredded variant element and add coverage for the value-omitted array-element layout.
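
The difference between the two outputs mentioned in this comment can be sketched as follows. `ElementGroup`, `typed_only_element`, `quote` and `element_to_json` are hypothetical names, the residual is modeled as pre-rendered JSON text, and the `accept_typed_only` flag exists only to contrast the reported behavior with the requested one.

```cpp
#include <optional>
#include <string>

// Hypothetical flattened view of one array element group; not a Doris type.
struct ElementGroup {
    bool has_value_child = false;            // schema declares `value`
    bool has_typed_value_child = false;      // schema declares `typed_value`
    std::optional<std::string> value;        // residual, pre-rendered as JSON text
    std::optional<std::string> typed_value;  // typed STRING payload
};

// Builds an element whose schema declares only `typed_value`.
ElementGroup typed_only_element(const std::string& s) {
    ElementGroup e;
    e.has_typed_value_child = true;
    e.typed_value = s;
    return e;
}

// Minimal quoting without escaping; enough for this sketch.
std::string quote(const std::string& s) { return "\"" + s + "\""; }

std::string element_to_json(const ElementGroup& e, bool accept_typed_only) {
    const bool shredded =
            e.has_value_child || (accept_typed_only && e.has_typed_value_child);
    if (shredded) {
        if (e.typed_value) return quote(*e.typed_value);
        if (e.value) return *e.value;
        return "null";  // placeholder choice for this sketch
    }
    // Treated as an ordinary struct: the wrapper field name leaks into the JSON.
    std::string out = "{";
    if (e.typed_value) out += "\"typed_value\":" + quote(*e.typed_value);
    return out + "}";
}
```

With `accept_typed_only` set to false the element renders as {"typed_value":"x"}; with it set to true the same element renders as "x".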

Comment thread be/src/format/parquet/vparquet_column_reader.cpp Outdated
@hello-stephen
Contributor

TPC-DS: Total hot run time: 170353 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e9e3bfd819ba9d8ccea3f9b57abb35147175a7e8, data reload: false

query5	4327	649	519	519
query6	333	223	206	206
query7	4254	565	306	306
query8	323	252	224	224
query9	8875	4094	4024	4024
query10	445	351	314	314
query11	5803	2358	2258	2258
query12	186	132	123	123
query13	1273	625	441	441
query14	6085	5342	5029	5029
query14_1	4355	4361	4357	4357
query15	216	201	181	181
query16	1000	450	394	394
query17	1112	764	607	607
query18	2510	480	345	345
query19	209	203	163	163
query20	138	132	131	131
query21	209	137	118	118
query22	13639	13556	13372	13372
query23	17199	16360	16093	16093
query23_1	16079	16237	16181	16181
query24	7431	1793	1365	1365
query24_1	1355	1390	1388	1388
query25	603	540	489	489
query26	1352	324	181	181
query27	2678	608	359	359
query28	4478	2023	2012	2012
query29	994	657	554	554
query30	313	246	202	202
query31	1118	1087	924	924
query32	87	75	79	75
query33	552	376	306	306
query34	1183	1132	666	666
query35	776	778	686	686
query36	1328	1361	1199	1199
query37	157	108	95	95
query38	3190	3171	3051	3051
query39	962	931	904	904
query39_1	894	867	877	867
query40	258	167	145	145
query41	71	67	68	67
query42	117	115	113	113
query43	326	331	302	302
query44	
query45	217	208	202	202
query46	1077	1200	725	725
query47	2281	2334	2185	2185
query48	406	416	292	292
query49	652	557	470	470
query50	722	300	223	223
query51	4324	4278	4321	4278
query52	109	105	98	98
query53	253	282	209	209
query54	327	293	268	268
query55	96	92	87	87
query56	311	334	329	329
query57	1413	1401	1298	1298
query58	319	281	269	269
query59	1526	1640	1377	1377
query60	351	352	340	340
query61	203	152	156	152
query62	665	607	563	563
query63	254	199	205	199
query64	2406	813	677	677
query65	
query66	1716	512	389	389
query67	30267	29983	29905	29905
query68	
query69	461	336	301	301
query70	994	930	966	930
query71	299	273	271	271
query72	2936	2803	2490	2490
query73	858	765	404	404
query74	5055	4967	4727	4727
query75	2765	2670	2323	2323
query76	2291	1120	735	735
query77	421	435	348	348
query78	12873	13050	12248	12248
query79	1506	942	760	760
query80	1364	579	501	501
query81	532	280	244	244
query82	1020	159	128	128
query83	318	284	247	247
query84	266	139	109	109
query85	894	550	432	432
query86	449	340	305	305
query87	3422	3352	3232	3232
query88	3518	2667	2652	2652
query89	442	388	338	338
query90	1906	186	184	184
query91	178	168	137	137
query92	80	80	73	73
query93	1102	940	560	560
query94	710	348	294	294
query95	655	466	350	350
query96	1061	731	346	346
query97	2717	2699	2591	2591
query98	240	238	228	228
query99	1131	1084	979	979
Total cold run time: 253877 ms
Total hot run time: 170353 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169901 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5fe9ca52dd2edab6a76b7083da4f51d88076c8c5, data reload: false

query5	4329	655	519	519
query6	344	229	198	198
query7	4251	540	302	302
query8	334	234	206	206
query9	8828	4061	4005	4005
query10	447	353	316	316
query11	5828	2369	2201	2201
query12	178	128	128	128
query13	1284	654	423	423
query14	6612	5345	5024	5024
query14_1	4337	4351	4298	4298
query15	210	202	185	185
query16	1019	447	427	427
query17	1118	760	632	632
query18	2495	500	359	359
query19	218	208	159	159
query20	137	131	129	129
query21	217	138	115	115
query22	13679	13558	13470	13470
query23	17261	16299	15983	15983
query23_1	16164	16128	16226	16128
query24	7449	1767	1367	1367
query24_1	1373	1361	1361	1361
query25	593	546	492	492
query26	1294	333	175	175
query27	2716	614	341	341
query28	4469	1985	1985	1985
query29	1058	666	555	555
query30	311	250	203	203
query31	1131	1068	947	947
query32	93	81	78	78
query33	555	356	312	312
query34	1173	1145	626	626
query35	767	791	675	675
query36	1312	1361	1221	1221
query37	157	110	99	99
query38	3181	3151	3040	3040
query39	920	916	898	898
query39_1	885	881	874	874
query40	254	167	144	144
query41	76	69	69	69
query42	114	116	112	112
query43	320	329	287	287
query44	
query45	216	205	199	199
query46	1082	1209	710	710
query47	2332	2244	2129	2129
query48	408	421	324	324
query49	648	543	455	455
query50	694	284	219	219
query51	4281	4278	4175	4175
query52	104	108	97	97
query53	257	279	218	218
query54	334	285	280	280
query55	94	94	86	86
query56	321	317	340	317
query57	1405	1389	1338	1338
query58	314	284	280	280
query59	1540	1586	1379	1379
query60	362	358	345	345
query61	209	163	156	156
query62	675	623	559	559
query63	245	197	204	197
query64	2406	831	680	680
query65	
query66	1728	542	401	401
query67	30018	29869	29896	29869
query68	
query69	446	342	311	311
query70	1014	1021	971	971
query71	310	280	271	271
query72	3032	2707	2389	2389
query73	870	732	443	443
query74	5029	4890	4753	4753
query75	2763	2674	2344	2344
query76	2304	1109	742	742
query77	415	441	353	353
query78	12866	12893	12172	12172
query79	1554	932	750	750
query80	1362	566	529	529
query81	519	279	239	239
query82	963	162	119	119
query83	372	286	250	250
query84	268	138	113	113
query85	939	512	440	440
query86	491	337	346	337
query87	3477	3365	3223	3223
query88	3527	2653	2608	2608
query89	446	382	338	338
query90	1917	176	186	176
query91	181	170	138	138
query92	80	77	72	72
query93	1202	971	548	548
query94	715	345	298	298
query95	685	460	354	354
query96	1034	732	357	357
query97	2707	2674	2534	2534
query98	236	229	226	226
query99	1113	1129	1008	1008
Total cold run time: 254910 ms
Total hot run time: 169901 ms

@eldenmoon eldenmoon force-pushed the codex/iceberg-v3-variant branch from fa098c0 to bf0c548 on May 12, 2026 21:06

@eldenmoon
Member Author

/review

@eldenmoon
Member Author

run buildall

Contributor

@github-actions github-actions Bot left a comment

I found additional blocking compatibility issues for valid Parquet VARIANT shredding layouts that omit value when data is fully represented by typed_value. Existing review threads already cover non-finite typed floats, typed array elements, case-sensitive keys, and value/residual pruning, so I did not duplicate those.

Critical checkpoints:

  • Goal/test: the PR aims to read Iceberg/Parquet VARIANT, including shredded layouts, and adds local TVF regression coverage, but coverage does not include typed-value-only top-level or nested shredded field groups.
  • Scope/focus: the change is focused, but the schema/pruning logic is stricter than the Parquet shredding layout it is trying to support.
  • Concurrency/lifecycle/config/transactions/persistence: no new concurrency, lifecycle, config, transaction, or persistence concerns found in the reviewed paths.
  • Parallel paths: Hive/local and Iceberg pruning have duplicated logic; both need the typed-value-only fix.
  • Compatibility/data correctness: current code rejects or prunes away valid typed-value-only shredded data, causing scan failure or null/missing results.
  • Tests: existing tests cover unshredded/shredded happy paths and several pruning observables, but miss the typed-value-only layouts described in the inline comments.
  • Observability/performance: no additional observability or performance blocker found beyond the added profile string being used by tests.
  • User focus: no additional user-provided review focus was present.

Comment thread be/src/format/parquet/schema_desc.cpp Outdated
            has_value = child.physical_type == tparquet::Type::BYTE_ARRAY;
        }
    }
    if (!has_metadata || !has_value) {
Contributor

This rejects a valid fully-shredded VARIANT schema where the top-level value field is omitted and all data is represented by typed_value. The Parquet shredding layout makes the fallback value optional for fully shredded values, and the reader below already has code paths that can decode wrappers with typed_value but no value. With this check, such files fail schema parsing before reading. Please accept metadata plus either value or typed_value (validating their physical/logical shape), and add coverage for a top-level typed-value-only shredded variant.
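
One way to express the relaxed check requested here is sketched below. `PhysicalType`, `Child` and `is_valid_variant_wrapper` are hypothetical stand-ins for the Parquet schema types; the sketch validates only the physical type of metadata and value, and leaves typed_value shape validation out.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the Parquet schema types used by the reader.
enum class PhysicalType { BYTE_ARRAY, INT64, GROUP };

struct Child {
    std::string name;
    PhysicalType type;
};

// Accepts `metadata` plus at least one of `value` / `typed_value`.
// `metadata` and `value`, when present, must be BYTE_ARRAY.
bool is_valid_variant_wrapper(const std::vector<Child>& children) {
    bool has_metadata = false;
    bool has_value = false;
    bool has_typed_value = false;
    bool malformed = false;
    for (const Child& child : children) {
        if (child.name == "metadata") {
            has_metadata = true;
            malformed = malformed || child.type != PhysicalType::BYTE_ARRAY;
        } else if (child.name == "value") {
            has_value = true;
            malformed = malformed || child.type != PhysicalType::BYTE_ARRAY;
        } else if (child.name == "typed_value") {
            has_typed_value = true;  // shape validation omitted in this sketch
        }
    }
    return !malformed && has_metadata && (has_value || has_typed_value);
}
```

Under this check a wrapper with metadata and typed_value but no value passes, while a wrapper with metadata alone is still rejected.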

}

bool is_shredded_variant_field(const FieldSchema& field_schema) {
return find_child_by_structural_name(field_schema, "value") != nullptr &&
Contributor

This only treats a nested shredded field as variant-like when both value and typed_value are present. A fully shredded field may omit the fallback value, for example typed_value.metric { typed_value { x ... } }. For an access like v['metric']['x'], extract_variant_nested_column_ids() recurses into metric and this predicate returns false, so the generic struct path looks for a child named x next to typed_value, finds none, and returns false. Pruning then falls back to top-level v.value instead of selecting v.typed_value.metric.typed_value.x. Please also recognize typed_value-only shredded field groups and add a pruning/read regression for that layout.
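
To illustrate the expected pruning result for a typed_value-only layout, here is a minimal sketch under simplifying assumptions. `Node`, `node`, `find_child`, `is_shredded_group`, `resolve_shredded_path`, and `sample_typed_only_schema` are hypothetical names, the fallback column is simply written as `<root>.value`, and the sketch ignores the separate question of user fields that happen to be named `typed_value`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical schema node: a name plus child nodes (leaf types omitted).
struct Node {
    std::string name;
    std::vector<Node> children;
};

Node node(std::string name, std::vector<Node> children = {}) {
    return Node{std::move(name), std::move(children)};
}

const Node* find_child(const Node& parent, const std::string& name) {
    for (const Node& child : parent.children) {
        if (child.name == name) {
            return &child;
        }
    }
    return nullptr;
}

// A field group counts as shredded when it carries `typed_value`,
// whether or not the fallback `value` column is present.
bool is_shredded_group(const Node& group) {
    return find_child(group, "typed_value") != nullptr;
}

// Maps a variant access path such as ['metric']['x'] to a dotted column path.
// Falls back to the residual column when a key is not shredded.
std::string resolve_shredded_path(const Node& root, const std::vector<std::string>& path) {
    const Node* current = &root;
    std::string resolved = root.name;
    for (const std::string& key : path) {
        if (!is_shredded_group(*current)) {
            return root.name + ".value";
        }
        const Node* typed = find_child(*current, "typed_value");
        const Node* next = find_child(*typed, key);
        if (next == nullptr) {
            return root.name + ".value";
        }
        resolved += ".typed_value." + key;
        current = next;
    }
    return resolved;
}

// v { metadata, typed_value { metric { typed_value { x } } } } with no `value` columns.
Node sample_typed_only_schema() {
    return node("v", {node("metadata"),
                      node("typed_value",
                           {node("metric", {node("typed_value", {node("x")})})})});
}
```

For the sample schema, the path `['metric']['x']` resolves to `v.typed_value.metric.typed_value.x`, which is the column the comment above expects pruning to select.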

Comment thread be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp Outdated
@hello-stephen
Contributor

TPC-H: Total hot run time: 29314 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit fa098c01643af773e82d99b9046d891338e1145f, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17784	4016	3937	3937
q2	q3	10704	888	609	609
q4	4677	454	346	346
q5	7441	1317	1142	1142
q6	204	174	138	138
q7	917	948	757	757
q8	9628	1397	1270	1270
q9	6188	5397	5328	5328
q10	6323	2084	1806	1806
q11	477	262	256	256
q12	687	405	294	294
q13	18197	3296	2757	2757
q14	292	282	263	263
q15	q16	911	870	787	787
q17	1011	1038	727	727
q18	6402	5668	5519	5519
q19	1470	1164	958	958
q20	505	385	258	258
q21	4852	2276	1856	1856
q22	420	353	306	306
Total cold run time: 99090 ms
Total hot run time: 29314 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4269	4176	4162	4162
q2	q3	4627	4770	4167	4167
q4	2093	2152	1380	1380
q5	4961	4992	5266	4992
q6	193	163	133	133
q7	2028	1762	2018	1762
q8	3411	3196	3191	3191
q9	8519	8460	8332	8332
q10	4522	4465	4240	4240
q11	625	431	395	395
q12	732	743	535	535
q13	3224	3547	2922	2922
q14	306	314	290	290
q15	q16	763	781	720	720
q17	1356	1296	1272	1272
q18	8062	7007	7128	7007
q19	1173	1167	1127	1127
q20	2315	2241	1938	1938
q21	6202	5384	4915	4915
q22	567	533	438	438
Total cold run time: 59948 ms
Total hot run time: 53918 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169501 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 01137d15607e99a312eb4d0648a28b499cd2033f, data reload: false

query5	4331	649	533	533
query6	348	214	194	194
query7	4228	579	308	308
query8	335	232	216	216
query9	8837	4044	4039	4039
query10	464	338	296	296
query11	5711	2386	2269	2269
query12	191	130	131	130
query13	1279	616	413	413
query14	5994	5336	5018	5018
query14_1	4373	4353	4310	4310
query15	206	201	183	183
query16	975	445	421	421
query17	990	706	589	589
query18	2447	484	354	354
query19	208	200	164	164
query20	136	132	124	124
query21	207	138	115	115
query22	13606	13448	13470	13448
query23	17123	16402	16031	16031
query23_1	16153	16184	16125	16125
query24	7493	1759	1285	1285
query24_1	1309	1306	1325	1306
query25	535	464	415	415
query26	1319	327	171	171
query27	2690	548	341	341
query28	4445	1981	1929	1929
query29	972	613	488	488
query30	306	239	198	198
query31	1108	1064	931	931
query32	91	83	75	75
query33	524	348	297	297
query34	1155	1135	621	621
query35	768	787	674	674
query36	1317	1300	1183	1183
query37	155	103	93	93
query38	3214	3149	3005	3005
query39	923	928	902	902
query39_1	863	891	878	878
query40	231	153	127	127
query41	64	65	63	63
query42	108	108	109	108
query43	328	333	304	304
query44	
query45	211	208	199	199
query46	1054	1165	739	739
query47	2324	2321	2150	2150
query48	401	424	309	309
query49	653	506	406	406
query50	980	350	268	268
query51	4337	4282	4322	4282
query52	107	107	95	95
query53	253	293	210	210
query54	330	295	278	278
query55	96	94	87	87
query56	316	320	330	320
query57	1437	1426	1316	1316
query58	315	286	271	271
query59	1552	1640	1396	1396
query60	329	339	327	327
query61	179	181	184	181
query62	663	642	556	556
query63	242	204	206	204
query64	2485	866	702	702
query65	
query66	1737	483	372	372
query67	30126	30073	29821	29821
query68	
query69	473	351	316	316
query70	1049	952	1010	952
query71	309	283	268	268
query72	3258	2677	2402	2402
query73	831	783	399	399
query74	5047	4894	4713	4713
query75	2812	2608	2251	2251
query76	2303	1155	824	824
query77	389	406	339	339
query78	12155	12189	11713	11713
query79	1507	1035	731	731
query80	991	585	459	459
query81	498	281	240	240
query82	1373	162	123	123
query83	357	277	247	247
query84	302	138	108	108
query85	914	541	452	452
query86	431	345	362	345
query87	3417	3366	3195	3195
query88	3521	2670	2671	2670
query89	450	382	337	337
query90	1799	183	181	181
query91	179	172	141	141
query92	81	78	74	74
query93	1470	1501	824	824
query94	614	337	317	317
query95	659	464	351	351
query96	1006	787	348	348
query97	2682	2675	2558	2558
query98	254	227	228	227
query99	1081	1117	959	959
Total cold run time: 253469 ms
Total hot run time: 169501 ms

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

@hello-stephen
Contributor

TPC-DS: Total hot run time: 168970 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 81fcea94f01f769389c5f80754954b0057ad116c, data reload: false

query5	4325	661	497	497
query6	334	218	204	204
query7	4284	585	303	303
query8	324	230	218	218
query9	8781	4042	4067	4042
query10	456	327	297	297
query11	5808	2341	2251	2251
query12	180	126	122	122
query13	1281	622	446	446
query14	6076	5390	4990	4990
query14_1	4370	4363	4332	4332
query15	210	199	177	177
query16	1003	448	461	448
query17	1098	701	575	575
query18	2449	505	345	345
query19	208	194	159	159
query20	141	130	128	128
query21	223	143	118	118
query22	13561	13542	13315	13315
query23	17076	16364	15921	15921
query23_1	16019	16059	16075	16059
query24	7443	1803	1285	1285
query24_1	1312	1264	1305	1264
query25	550	502	444	444
query26	1318	322	174	174
query27	2703	569	361	361
query28	4482	1992	1924	1924
query29	1009	600	482	482
query30	302	232	193	193
query31	1108	1055	936	936
query32	91	73	70	70
query33	544	364	297	297
query34	1203	1137	640	640
query35	759	793	697	697
query36	1362	1349	1118	1118
query37	156	105	92	92
query38	3208	3152	3043	3043
query39	921	926	900	900
query39_1	850	862	879	862
query40	225	145	126	126
query41	65	64	62	62
query42	109	112	112	112
query43	328	332	290	290
query44	
query45	217	200	194	194
query46	1052	1176	724	724
query47	2303	2326	2171	2171
query48	386	405	293	293
query49	641	495	386	386
query50	969	343	250	250
query51	4340	4232	4265	4232
query52	104	108	96	96
query53	259	288	202	202
query54	307	270	247	247
query55	90	88	83	83
query56	297	302	310	302
query57	1456	1366	1302	1302
query58	292	270	266	266
query59	1579	1605	1377	1377
query60	323	323	336	323
query61	153	152	153	152
query62	675	625	577	577
query63	245	204	201	201
query64	2399	778	631	631
query65	
query66	1778	473	350	350
query67	30044	29891	29744	29744
query68	
query69	468	344	296	296
query70	1040	1013	1005	1005
query71	319	277	257	257
query72	2981	2753	2598	2598
query73	856	774	408	408
query74	5062	4860	4776	4776
query75	2698	2617	2293	2293
query76	2313	1175	827	827
query77	419	426	333	333
query78	12124	12209	11640	11640
query79	1385	1058	754	754
query80	681	584	492	492
query81	462	283	246	246
query82	642	157	122	122
query83	368	287	260	260
query84	284	142	118	118
query85	960	608	542	542
query86	441	330	347	330
query87	3389	3407	3202	3202
query88	3572	2670	2676	2670
query89	434	382	336	336
query90	1899	176	182	176
query91	175	168	137	137
query92	75	79	72	72
query93	1445	1450	806	806
query94	561	354	320	320
query95	672	379	429	379
query96	1011	770	340	340
query97	2703	2668	2565	2565
query98	236	226	228	226
query99	1096	1087	966	966
Total cold run time: 251576 ms
Total hot run time: 168970 ms

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

Contributor

@github-actions github-actions Bot left a comment

Found one distinct blocking correctness issue after reviewing the current PR and existing review threads.

Critical checkpoint conclusions:

  • Goal/test: The PR aims to add Iceberg/Parquet VARIANT reading, pruning, and related FE access-path propagation. There is broad unit/regression coverage, but the chained dynamic-subscript case below is not covered and can still produce wrong results.
  • Scope/clarity: The change is large and cross-cuts BE decoding, FE pruning, and external table plumbing; most additions are focused on VARIANT support, but this increases edge-case risk.
  • Concurrency/lifecycle: Existing cached Parquet metadata mutation and reader lifecycle risks are already covered by existing threads; I did not find a new distinct concurrency/lifecycle issue beyond those.
  • Config/compatibility: No new config items were introduced. Exposing VARIANT through Iceberg/Parquet has compatibility implications already discussed in existing threads; no additional distinct protocol/storage-format issue found.
  • Parallel paths: The issue below is in FE access-path collection and affects scan pruning paths that rely on collected VARIANT paths.
  • Conditional checks/error handling: No new distinct unchecked Status/exception issue found in the reviewed code.
  • Test coverage/results: Existing tests cover many VARIANT layouts, but the chained dynamic-subscript plus sibling predicate case is missing.
  • Observability: No additional observability requirement found for this bug; it is a deterministic pruning correctness issue.
  • Transactions/persistence/data writes: The reviewed changes are read/planning focused except Iceberg sink type plumbing; no new distinct transaction/persistence issue found.
  • Performance: Several performance/correctness pruning tradeoffs are already covered by existing threads; the issue below is correctness-first.
  • User focus: No additional user-provided review focus was specified.

Please fix the inline issue before merge.

return visit(elementAt, context);
} else if (first.getDataType().isVariantType() && arguments.size() >= 2) {
CollectorContext variantRootContext = context.copy();
variantRootContext.setCollectVariantRoot(true);
Contributor

This still does not force a full VARIANT read when the dynamic subscript has an outer suffix. For v[cast(id AS string)]['x'], the outer literal access leaves accessPathBuilder = [x]; this branch sets collectVariantRoot, but visitSlotReference() checks the non-empty builder first and records [v, x] instead of a root [v] demand. Since the dynamic key can select any top-level field before applying ['x'], pruning to only x can drop required data and return missing/null values when combined with a sibling predicate such as v['k'] IS NOT NULL. Please clear/override the suffix when a dynamic VARIANT subscript requires a root read, or make collectVariantRoot take precedence over a non-empty builder, and add coverage for a chained dynamic subscript plus a sibling nested predicate.
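
The intended precedence can be modeled in a few lines. The real collector is Java in the FE; the sketch below is written in C++ only to stay in one language with the other examples, and `Subscript` and `collect_variant_access_path` are hypothetical names rather than anything in the PR. The chain is assumed to be ordered from the innermost subscript outward.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One subscript applied to a VARIANT slot (hypothetical model).
struct Subscript {
    bool is_literal;  // false for a dynamic key such as cast(id AS string)
    std::string key;  // only meaningful when is_literal is true
};

// Builds the access path for `slot` from its chain of subscripts.
// Any dynamic subscript forces a root-only demand, regardless of literal
// suffixes that come after it, because the dynamic key can select any field.
std::vector<std::string> collect_variant_access_path(const std::string& slot,
                                                     const std::vector<Subscript>& chain) {
    std::vector<std::string> path{slot};
    for (const Subscript& subscript : chain) {
        if (!subscript.is_literal) {
            return {slot};  // root read takes precedence over any collected suffix
        }
        path.push_back(subscript.key);
    }
    return path;
}
```

Under this model, `v[cast(id AS string)]['x']` yields the root demand `[v]` instead of `[v, x]`, while a purely literal chain such as `v['k']` still yields `[v, k]`.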

@hello-stephen
Contributor

TPC-H: Total hot run time: 30790 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 013742cd4f2c58ebb182cb3db9472867f4e5e860, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17858	3852	3824	3824
q2	q3	10758	1350	816	816
q4	4687	469	343	343
q5	7550	2230	2096	2096
q6	265	176	141	141
q7	938	787	624	624
q8	9390	1754	1558	1558
q9	6517	4914	4857	4857
q10	6431	2114	1785	1785
q11	461	277	257	257
q12	694	428	298	298
q13	18201	3383	2743	2743
q14	276	259	237	237
q15	q16	824	772	704	704
q17	992	932	1009	932
q18	6857	5692	5496	5496
q19	1305	1207	1045	1045
q20	525	409	269	269
q21	5889	2603	2449	2449
q22	438	357	316	316
Total cold run time: 100856 ms
Total hot run time: 30790 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4205	4266	4071	4071
q2	q3	4463	4881	4320	4320
q4	2120	2201	1380	1380
q5	4425	4227	4284	4227
q6	233	175	131	131
q7	2191	1887	1642	1642
q8	2499	2140	2055	2055
q9	7795	7777	7628	7628
q10	4534	4478	4054	4054
q11	575	410	462	410
q12	764	735	519	519
q13	3291	3674	3007	3007
q14	291	296	290	290
q15	q16	745	751	667	667
q17	1327	1322	1376	1322
q18	7925	7331	7058	7058
q19	1153	1117	1094	1094
q20	2200	2217	1934	1934
q21	5326	4637	4478	4478
q22	534	463	409	409
Total cold run time: 56596 ms
Total hot run time: 50696 ms

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found one additional correctness issue that is distinct from the existing review threads.

Critical checkpoint conclusions:

  • Goal/test coverage: the PR adds Iceberg/Parquet VARIANT read support, nested access-path pruning, and blocks unsupported Iceberg VARIANT writes. Many positive tests were added, but nullable top-level VARIANT null-check pruning for OLAP is not covered.
  • Scope/focus: the change is broad but mostly localized to VARIANT access-path collection and Parquet readers; the flagged issue is in the new VARIANT access propagation behavior.
  • Concurrency/lifecycle: existing cached Parquet metadata mutation concurrency is already covered by an existing thread; I found no additional distinct concurrency/lifecycle issue.
  • Config/compatibility: no new config item. Iceberg write rejection is explicit; mixed-version/storage-format concerns are otherwise around read support.
  • Parallel paths: file/TVF scans normalize NULL/OFFSET pseudo paths, but OLAP scans do not, which creates the distinct issue below.
  • Transaction/persistence/data writes: Iceberg write paths reject VARIANT; no FE EditLog or Doris transaction persistence change found.
  • FE-BE variable passing: access paths are passed through the existing slot access-path mechanism; the issue is an invalid pseudo VARIANT subpath being passed to OLAP BE.
  • Performance/observability: no additional distinct performance or observability blocker beyond already-known review threads.
  • User focus: no additional user-provided review focus was supplied.

public Void visit(Expression expr, CollectorContext context) {
for (Expression child : expr.children()) {
child.accept(this, new CollectorContext(context.statementContext, context.bottomFilter));
if (child.getDataType().isVariantType()
Contributor

This propagation also turns top-level nullable VARIANT null checks into a fake nested subpath on OLAP scans. For SELECT 1 FROM olap_variant_tbl WHERE v IS NULL, visitIsNull() seeds the builder with NULL; this branch preserves that non-empty builder into the VARIANT child, and visitSlotReference() records [v, NULL] as a VARIANT DATA path. NestedColumnPruning handles VARIANT slots outside the DataTypeAccessTree null-only logic, and visitLogicalOlapScan() does not call normalizeDataSkippingOnlyAccessPaths(), so BE receives NULL as if it were a VARIANT field name rather than a top-level null-map/root requirement. File/TVF scans normalize this pseudo suffix away, which is why this is a distinct OLAP path issue. Please special-case VARIANT IS NULL/IS NOT NULL to record a valid root/null access for OLAP (or normalize these pseudo suffixes before building VARIANT AccessPathInfo) and add coverage for nullable top-level VARIANT null predicates.
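
The second option (normalizing pseudo suffixes before building the VARIANT access path) can be sketched as follows. The real logic lives in the Java FE; this C++ model is only illustrative, and `ElementKind`, `PathElement`, and `normalize_variant_path` are hypothetical names. The sketch assumes path elements carry a kind tag, so a user key spelled `NULL` is not confused with the null-check marker.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Distinguishes real field names from pseudo markers (hypothetical model).
enum class ElementKind { Field, NullMarker, OffsetMarker };

struct PathElement {
    ElementKind kind;
    std::string name;
};

// Strips trailing pseudo markers so that a top-level `v IS NULL` check is
// reported as a root demand [v] rather than a nested subpath [v, NULL].
// The root element itself is always kept.
std::vector<PathElement> normalize_variant_path(std::vector<PathElement> path) {
    while (path.size() > 1 && path.back().kind != ElementKind::Field) {
        path.pop_back();
    }
    return path;
}
```

A path `[v, <null marker>]` collapses to `[v]`, whereas a genuine user field access `v['NULL']` keeps both elements because its second element is tagged as a field.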

Contributor

@github-actions github-actions Bot left a comment

I found one additional data-correctness issue in the current head.

Review checkpoint conclusions:

  • Goal: the PR adds Iceberg/Parquet VARIANT read support and access-path pruning, but the current wrapper detection still misclassifies a valid typed-only user-object shape.
  • Scope: the change is mostly focused, but this ambiguity needs tightening.
  • Concurrency/lifecycle: no new concurrency issue beyond the already-known cached-footer mutation thread; this comment is local row-wise decode correctness.
  • Config/compatibility: no new config; write-side VARIANT is intentionally rejected, but read compatibility depends on preserving valid typed object field names.
  • Parallel paths: the standalone VariantMap decoder was tightened for unannotated residual values, but the row-wise reader still has a distinct false-positive wrapper path.
  • Tests: current tests cover value-only and typed_value-only cases discussed in prior threads, but not the combined annotated value plus typed_value user-object case.
  • Observability: existing corruption messages would not make this ambiguity obvious.
  • User focus: no additional user-provided focus was supplied.

}
if (child.lower_case_name == "value") {
if (child.physical_type != tparquet::Type::BYTE_ARRAY) {
return false;
Contributor

This wrapper check still accepts an ordinary annotated user field named value when the same object also has a typed_value child. A valid typed-only VARIANT object such as {"obj": {"value": "abc", "typed_value": {"x": 1}}} is represented under v.typed_value.obj with a UTF8/STRING value leaf plus a struct typed_value leaf. Row-wise reconstruction calls is_variant_wrapper_field(obj, ...), this branch only checks that value is BYTE_ARRAY, and line 215 returns true because typed_value != nullptr; variant_to_variant_map() then tries to decode the UTF8 bytes from obj.value as Parquet VARIANT residual bytes with inherited metadata, causing corruption or wrong output instead of producing user fields obj.value and obj.typed_value.x. This is distinct from the existing value-only and typed_value-only false-positive threads because the misclassification requires both user children together. Please require value to satisfy is_unannotated_variant_value_field() for wrapper classification (as the pruning helper already does) and add coverage for a typed-only object containing both annotated value and typed_value user fields.
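
A minimal sketch of the tightened classification, assuming simplified stand-in types: `Field`, `PhysicalType`, `Annotation`, `leaf`, `group`, `find_child`, `is_unannotated_binary`, and `is_variant_wrapper` here are hypothetical and do not mirror the PR's signatures. It covers only the combined `value` plus `typed_value` case from this comment, not the value-only or typed_value-only ambiguities from the earlier threads.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

enum class PhysicalType { BYTE_ARRAY, INT32, GROUP };
enum class Annotation { NONE, STRING };  // logical/converted type, reduced to two cases

struct Field {
    std::string name;
    PhysicalType physical_type;
    Annotation annotation;
    std::vector<Field> children;
};

Field leaf(std::string name, PhysicalType type, Annotation annotation) {
    return Field{std::move(name), type, annotation, {}};
}

Field group(std::string name, std::vector<Field> children) {
    return Field{std::move(name), PhysicalType::GROUP, Annotation::NONE, std::move(children)};
}

const Field* find_child(const Field& parent, const std::string& name) {
    for (const Field& child : parent.children) {
        if (child.name == name) {
            return &child;
        }
    }
    return nullptr;
}

// Residual VARIANT bytes are stored as plain binary without a string annotation.
bool is_unannotated_binary(const Field& field) {
    return field.physical_type == PhysicalType::BYTE_ARRAY &&
           field.annotation == Annotation::NONE;
}

// Classifies a group as a VARIANT wrapper only when a present `value` child
// is unannotated binary; an annotated `value` is treated as a user field.
bool is_variant_wrapper(const Field& field) {
    const Field* value = find_child(field, "value");
    const Field* typed_value = find_child(field, "typed_value");
    if (value != nullptr && !is_unannotated_binary(*value)) {
        return false;
    }
    return value != nullptr || typed_value != nullptr;
}
```

With this rule, the `obj` group from the example above (a STRING-annotated `value` next to a struct `typed_value`) is kept as an ordinary typed object, so its children surface as user fields instead of being decoded as residual bytes.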

@hello-stephen
Contributor

TPC-H: Total hot run time: 31051 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5c8d26e78b19d9103df693de698dbb57f3a65ab1, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17645	3817	3804	3804
q2	q3	10802	1353	781	781
q4	4685	474	341	341
q5	7559	2233	2098	2098
q6	234	177	140	140
q7	923	758	642	642
q8	9443	1814	1577	1577
q9	5137	4829	4841	4829
q10	6381	2088	1811	1811
q11	444	269	251	251
q12	659	423	296	296
q13	18176	3433	2743	2743
q14	261	258	236	236
q15	q16	826	777	710	710
q17	899	954	1015	954
q18	6963	5923	5478	5478
q19	1351	1424	1040	1040
q20	543	396	302	302
q21	6326	2821	2712	2712
q22	469	365	306	306
Total cold run time: 99726 ms
Total hot run time: 31051 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4845	4613	4491	4491
q2	q3	4845	5343	4716	4716
q4	2106	2179	1396	1396
q5	4499	4225	4201	4201
q6	220	166	119	119
q7	1717	1550	1379	1379
q8	2174	1879	1863	1863
q9	7185	7148	7105	7105
q10	4505	4397	3973	3973
q11	523	385	383	383
q12	712	707	500	500
q13	3028	3419	2775	2775
q14	269	280	245	245
q15	q16	678	699	604	604
q17	1245	1215	1218	1215
q18	7113	6685	6805	6685
q19	1117	1057	1095	1057
q20	2230	2204	1908	1908
q21	5291	4593	4424	4424
q22	510	457	400	400
Total cold run time: 54812 ms
Total hot run time: 49439 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 64.91% (74/114) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

TPC-DS: Total hot run time: 168589 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 013742cd4f2c58ebb182cb3db9472867f4e5e860, data reload: false

query5	4321	640	503	503
query6	330	225	196	196
query7	4237	564	295	295
query8	334	232	215	215
query9	8820	3981	3959	3959
query10	446	357	298	298
query11	5760	2319	2158	2158
query12	199	131	123	123
query13	1273	625	433	433
query14	5970	5387	5060	5060
query14_1	4368	4364	4369	4364
query15	216	207	188	188
query16	998	395	450	395
query17	986	763	610	610
query18	2462	506	361	361
query19	220	208	175	175
query20	136	136	130	130
query21	217	142	118	118
query22	13629	13592	13337	13337
query23	17256	16377	15890	15890
query23_1	16148	16154	16161	16154
query24	7476	1774	1318	1318
query24_1	1367	1278	1314	1278
query25	573	535	418	418
query26	1324	320	165	165
query27	2721	532	336	336
query28	4459	1955	1942	1942
query29	957	619	487	487
query30	306	239	199	199
query31	1107	1068	928	928
query32	83	79	73	73
query33	557	342	291	291
query34	1179	1098	645	645
query35	775	778	672	672
query36	1370	1340	1175	1175
query37	152	102	94	94
query38	3255	3125	3066	3066
query39	926	922	908	908
query39_1	888	882	872	872
query40	224	147	125	125
query41	67	64	63	63
query42	112	111	110	110
query43	328	331	290	290
query44	
query45	211	201	197	197
query46	1061	1170	728	728
query47	2339	2362	2253	2253
query48	403	407	290	290
query49	662	531	386	386
query50	1047	352	250	250
query51	4401	4314	4353	4314
query52	105	107	98	98
query53	249	283	209	209
query54	318	265	248	248
query55	93	89	88	88
query56	294	308	301	301
query57	1417	1421	1347	1347
query58	300	277	272	272
query59	1564	1613	1385	1385
query60	317	317	311	311
query61	160	165	158	158
query62	665	628	563	563
query63	239	204	206	204
query64	2412	831	660	660
query65	
query66	1741	487	352	352
query67	30043	30029	29788	29788
query68	
query69	463	348	305	305
query70	999	1009	992	992
query71	306	278	265	265
query72	3013	2717	2378	2378
query73	842	733	448	448
query74	5096	4945	4728	4728
query75	2699	2618	2280	2280
query76	2283	1141	755	755
query77	401	419	329	329
query78	12182	12220	11499	11499
query79	1486	1034	721	721
query80	838	559	460	460
query81	477	278	241	241
query82	1391	159	121	121
query83	369	279	245	245
query84	311	142	110	110
query85	969	540	457	457
query86	458	335	331	331
query87	3431	3363	3221	3221
query88	3545	2649	2649	2649
query89	457	384	336	336
query90	1803	179	170	170
query91	201	164	140	140
query92	77	80	69	69
query93	1575	1374	911	911
query94	645	347	271	271
query95	653	477	352	352
query96	1082	745	329	329
query97	2697	2751	2552	2552
query98	235	226	229	226
query99	1155	1089	975	975
Total cold run time: 253354 ms
Total hot run time: 168589 ms

@hello-stephen
Contributor

TPC-H: Total hot run time: 30861 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3853dff1ae1dd95a975ee64df294feab76fab9db, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17613	3895	3898	3895
q2	q3	10853	1441	806	806
q4	4686	462	344	344
q5	7555	2255	2091	2091
q6	250	182	139	139
q7	936	792	621	621
q8	9335	1717	1599	1599
q9	6665	4832	4903	4832
q10	6447	2096	1762	1762
q11	434	268	248	248
q12	697	423	305	305
q13	18180	3370	2771	2771
q14	264	254	231	231
q15	q16	827	770	705	705
q17	1032	915	1039	915
q18	6741	5794	5465	5465
q19	1274	1278	1159	1159
q20	518	407	268	268
q21	5703	2546	2396	2396
q22	419	354	309	309
Total cold run time: 100429 ms
Total hot run time: 30861 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4246	4092	4104	4092
q2	q3	4484	4921	4280	4280
q4	2124	2215	1385	1385
q5	4378	4332	4315	4315
q6	230	180	134	134
q7	2184	1886	1707	1707
q8	2501	2171	2049	2049
q9	7812	7758	7716	7716
q10	4468	4421	4049	4049
q11	599	418	555	418
q12	741	722	513	513
q13	3313	3586	2973	2973
q14	319	304	303	303
q15	q16	734	738	665	665
q17	1396	1348	1316	1316
q18	7781	7406	6688	6688
q19	1123	1050	1088	1050
q20	2183	2206	1905	1905
q21	5365	4715	4479	4479
q22	520	468	413	413
Total cold run time: 56501 ms
Total hot run time: 50450 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 168942 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5c8d26e78b19d9103df693de698dbb57f3a65ab1, data reload: false

query5	4302	646	501	501
query6	336	223	199	199
query7	4258	590	289	289
query8	320	222	216	216
query9	8856	4076	4058	4058
query10	456	337	293	293
query11	5768	2353	2273	2273
query12	184	126	123	123
query13	1253	603	432	432
query14	5995	5322	5019	5019
query14_1	4367	4352	4371	4352
query15	221	203	180	180
query16	1031	456	451	451
query17	1156	748	602	602
query18	2631	494	371	371
query19	219	209	178	178
query20	149	131	129	129
query21	217	155	117	117
query22	13593	13500	13424	13424
query23	17228	16470	16035	16035
query23_1	16140	16180	16179	16179
query24	7451	1762	1294	1294
query24_1	1284	1306	1283	1283
query25	545	465	416	416
query26	1303	303	174	174
query27	2696	547	339	339
query28	4429	1936	1934	1934
query29	1027	616	488	488
query30	297	236	199	199
query31	1143	1063	933	933
query32	89	77	73	73
query33	532	349	285	285
query34	1169	1108	636	636
query35	774	785	658	658
query36	1354	1368	1202	1202
query37	158	100	87	87
query38	3242	3120	3065	3065
query39	935	922	903	903
query39_1	886	865	861	861
query40	228	151	128	128
query41	68	63	63	63
query42	112	110	108	108
query43	321	321	288	288
query44	
query45	215	196	191	191
query46	1025	1191	722	722
query47	2357	2330	2227	2227
query48	395	391	294	294
query49	621	515	400	400
query50	967	360	244	244
query51	4309	4270	4208	4208
query52	104	104	91	91
query53	259	276	200	200
query54	309	261	246	246
query55	91	92	89	89
query56	301	297	289	289
query57	1422	1401	1325	1325
query58	331	272	271	271
query59	1565	1625	1392	1392
query60	323	314	305	305
query61	153	155	157	155
query62	663	616	579	579
query63	244	204	205	204
query64	2409	813	631	631
query65	
query66	1675	466	341	341
query67	29325	29893	29705	29705
query68	
query69	456	383	297	297
query70	1050	966	992	966
query71	308	272	263	263
query72	3062	2636	2368	2368
query73	806	735	411	411
query74	5060	4870	4702	4702
query75	2685	2582	2253	2253
query76	2290	1146	746	746
query77	402	400	324	324
query78	12023	12171	11622	11622
query79	1276	1027	735	735
query80	584	549	479	479
query81	475	278	245	245
query82	235	153	116	116
query83	271	282	249	249
query84	262	140	109	109
query85	926	579	503	503
query86	382	333	325	325
query87	3366	3344	3252	3252
query88	3480	2646	2617	2617
query89	428	388	336	336
query90	2179	179	178	178
query91	175	167	134	134
query92	78	75	72	72
query93	1497	1398	843	843
query94	551	363	299	299
query95	725	372	348	348
query96	1060	779	342	342
query97	2703	2681	2546	2546
query98	233	233	242	233
query99	1130	1109	957	957
Total cold run time: 250984 ms
Total hot run time: 168942 ms

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169711 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3853dff1ae1dd95a975ee64df294feab76fab9db, data reload: false

query5	4311	639	522	522
query6	353	220	200	200
query7	4245	545	308	308
query8	333	241	228	228
query9	8880	4008	3990	3990
query10	466	353	294	294
query11	5792	2394	2183	2183
query12	195	127	135	127
query13	1291	615	432	432
query14	5967	5331	5040	5040
query14_1	4391	4332	4367	4332
query15	206	205	183	183
query16	1011	461	434	434
query17	1159	738	618	618
query18	2727	485	370	370
query19	224	212	175	175
query20	147	135	128	128
query21	218	137	120	120
query22	13525	13536	13286	13286
query23	17260	16313	16054	16054
query23_1	16132	16137	16175	16137
query24	7435	1781	1298	1298
query24_1	1266	1308	1305	1305
query25	590	504	453	453
query26	1326	316	182	182
query27	2667	540	324	324
query28	4457	1961	1942	1942
query29	1030	641	520	520
query30	308	243	203	203
query31	1109	1066	948	948
query32	89	85	75	75
query33	563	360	300	300
query34	1184	1106	657	657
query35	765	777	738	738
query36	1359	1330	1191	1191
query37	149	103	92	92
query38	3186	3114	3048	3048
query39	930	918	913	913
query39_1	899	876	871	871
query40	227	144	132	132
query41	65	62	62	62
query42	120	109	109	109
query43	343	332	290	290
query44	
query45	206	199	192	192
query46	1077	1216	727	727
query47	2284	2351	2123	2123
query48	387	426	298	298
query49	626	485	382	382
query50	1005	350	250	250
query51	4341	4283	4233	4233
query52	106	104	94	94
query53	251	285	199	199
query54	313	267	256	256
query55	93	86	84	84
query56	295	299	285	285
query57	1436	1417	1353	1353
query58	303	277	269	269
query59	1556	1679	1435	1435
query60	326	326	308	308
query61	161	157	154	154
query62	689	649	568	568
query63	244	203	210	203
query64	2261	799	621	621
query65	
query66	1658	481	365	365
query67	30083	29981	29736	29736
query68	
query69	469	339	303	303
query70	1066	985	988	985
query71	306	278	265	265
query72	2967	2716	2386	2386
query73	851	756	420	420
query74	5034	4942	4739	4739
query75	2642	2565	2240	2240
query76	2298	1141	771	771
query77	391	409	329	329
query78	12169	12062	11633	11633
query79	1601	1028	751	751
query80	1305	537	446	446
query81	537	278	233	233
query82	945	151	123	123
query83	335	270	257	257
query84	262	138	114	114
query85	904	532	439	439
query86	454	316	327	316
query87	3417	3363	3188	3188
query88	3554	2685	2653	2653
query89	449	384	338	338
query90	1878	174	184	174
query91	179	171	153	153
query92	78	79	73	73
query93	1658	1493	915	915
query94	707	373	323	323
query95	688	377	349	349
query96	1036	818	344	344
query97	2762	2681	2560	2560
query98	246	228	226	226
query99	1141	1103	991	991
Total cold run time: 253938 ms
Total hot run time: 169711 ms

@hello-stephen
Contributor

TPC-H: Total hot run time: 31234 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 35f02662f9c320859ede78361f2d1e6474a1261a, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17681	3898	3937	3898
q2	q3	10832	1384	791	791
q4	4676	472	336	336
q5	7716	2252	2115	2115
q6	239	178	150	150
q7	951	744	640	640
q8	9343	1729	1682	1682
q9	6151	4936	4948	4936
q10	6365	2078	1780	1780
q11	439	275	249	249
q12	691	442	298	298
q13	18189	3389	2747	2747
q14	264	257	241	241
q15	q16	815	767	704	704
q17	979	913	979	913
q18	6902	5691	5551	5551
q19	1220	1181	1096	1096
q20	655	432	284	284
q21	6142	2869	2498	2498
q22	431	385	325	325
Total cold run time: 100681 ms
Total hot run time: 31234 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4741	4651	4777	4651
q2	q3	4833	5272	4690	4690
q4	2133	2223	1416	1416
q5	4860	4811	4694	4694
q6	237	182	130	130
q7	1899	1776	1518	1518
q8	2439	2118	1894	1894
q9	7288	7298	7334	7298
q10	4478	4393	4007	4007
q11	552	403	357	357
q12	711	723	515	515
q13	3018	3433	2744	2744
q14	283	278	246	246
q15	q16	685	702	629	629
q17	1296	1274	1275	1274
q18	7520	6757	6964	6757
q19	1117	1092	1113	1092
q20	2224	2221	1954	1954
q21	5334	4690	4559	4559
q22	536	490	414	414
Total cold run time: 56184 ms
Total hot run time: 50839 ms

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found one additional blocking issue beyond the already-open inline threads. The PR goal appears to be Iceberg/Parquet VARIANT read support with nested access-path pruning while keeping VARIANT writes unsupported. The read-side implementation is broad and has tests, but this write-side guard now rejects a delete-only MERGE that should not write VARIANT values.

Critical checkpoint conclusions:

  • Goal/test coverage: read support is covered by new regression and unit tests, but there is no coverage for delete-only MERGE on Iceberg tables containing VARIANT columns.
  • Scope/focus: most changes are focused on VARIANT read/pruning; the unconditional MERGE sink write guard is broader than necessary.
  • Concurrency/lifecycle/config/compatibility: no new concern found in this additional issue.
  • Parallel paths: Iceberg DELETE uses a separate delete sink and remains allowed, while delete-only MERGE is newly rejected; these paths should be consistent.
  • Data correctness/transactions: the issue is a behavioral regression at planning/binding time rather than persisted-data corruption.
  • Performance/observability: no additional concern found.
  • User focus: no additional user-provided review focus was specified.

if (formatVersion >= 3) {
    schema = IcebergUtils.appendRowLineageFieldsForV3(schema);
}
IcebergUtils.validateVariantWriteUnsupported(schema);
Contributor

This rejects delete-only MERGE statements on Iceberg tables that contain a VARIANT column, even though that path only needs to write delete files and does not write VARIANT values. For example, MERGE INTO iceberg_variant_tbl t USING src s ON ... WHEN MATCHED THEN DELETE now fails here because the table schema contains VARIANT, while the semantically similar Iceberg DELETE FROM path still uses IcebergDeleteSink and is not rejected. Please scope the VARIANT write validation to MERGE plans that actually produce data-file rows (UPDATE/INSERT branches), or route delete-only MERGE through the delete-only behavior, and add coverage for a delete-only MERGE against a table with a VARIANT column.
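
One way to scope the guard as suggested is to check whether any MERGE branch produces data-file rows before validating the schema. The sketch below is a simplified, hypothetical model: `MergeVariantGuardSketch`, `MergeAction`, `writesDataFiles`, and `validate` are illustrative names rather than Doris APIs, and column types are reduced to plain strings.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical model of the suggested guard; none of these names come from the Doris code base. */
final class MergeVariantGuardSketch {
    enum MergeAction { DELETE, UPDATE, INSERT }

    /** A MERGE writes data files only if some branch produces rows (UPDATE or INSERT). */
    static boolean writesDataFiles(Set<MergeAction> actions) {
        return actions.contains(MergeAction.UPDATE) || actions.contains(MergeAction.INSERT);
    }

    /** Rejects the plan only when VARIANT values would have to be written into data files. */
    static void validate(List<String> columnTypes, Set<MergeAction> actions) {
        boolean hasVariant = columnTypes.stream().anyMatch("VARIANT"::equalsIgnoreCase);
        if (hasVariant && writesDataFiles(actions)) {
            throw new UnsupportedOperationException(
                    "Writing Iceberg VARIANT columns is not supported");
        }
        // Delete-only plans fall through: they only emit delete files.
    }
}
```

Under this model a delete-only MERGE on a table with a VARIANT column passes validation, while any plan with an UPDATE or INSERT branch is still rejected.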

Contributor

@github-actions github-actions Bot left a comment

Reviewed the PR against the Doris code-review checklist, existing inline threads, and the stated focus file. I found one additional blocking correctness issue not covered by the existing review threads.

Critical checkpoint conclusions:

  • Goal/test coverage: the PR adds Iceberg/Parquet VARIANT reading and nested access pruning support with extensive tests, but the new map OFFSET-only pruning path is not covered and can crash.
  • Scope/clarity: the implementation is broad but generally focused on Parquet VARIANT and nested access paths.
  • Concurrency/lifecycle: no new issue beyond the already-known cached Parquet schema mutation thread.
  • Configuration/compatibility: no new configuration or storage-format compatibility issue found.
  • Parallel paths: the map OFFSET-only issue affects standalone Parquet plus Hive/Iceberg because they delegate to the shared Parquet nested-column helper.
  • Special checks/error handling: the failure path reaches SkipReadingReader where get_rep_level()/get_def_level() is fatal instead of returning a query error.
  • Tests/results: missing coverage for map_size/cardinality on Parquet MAP columns with access-path pruning enabled.
  • Observability/performance: no additional observability concern found; the issue is correctness/crash, not only performance.
  • Transactions/persistence/data writes/FE-BE variables: not applicable to the reviewed read-path changes.

User focus: no additional user-provided review focus was specified.

continue;
}

if (field_type == PrimitiveType::TYPE_MAP && i == 0) {
Contributor

This path does not handle the OFFSET-only access emitted for map_size(m) / cardinality(m). For a Parquet MAP slot the FE sends [m, OFFSET]; after the root is stripped, child_paths_by_key contains only OFFSET, so neither KEYS nor VALUES is selected and the resulting column_ids contains only the map root. ParquetColumnReader::create() then builds a MapColumnReader with SkipReadingReader for both key and value children. During MapColumnReader::read_column_data(), the key skip reader is used as the reference reader and get_rep_level() / get_def_level() hits the fatal path, so a simple SELECT map_size(m) FROM parquet_map_tbl can crash instead of reading just the map levels. Please make OFFSET-only MAP access select a real key child reader (similar to the values-only case needing keys) or add a dedicated map-offset reader, and cover root and nested MAP cardinality/map_size for standalone/Hive/Iceberg Parquet.
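
The first suggested fix (always keeping a real key reader) can be modeled as a small selection function. This is a language-neutral sketch written in Java, not the BE's C++ helper: `MapChildSelectionSketch`, `MapChild`, and `selectChildren` are hypothetical names, and treating an empty path set as a whole-map projection is an assumption made for the example. Only the `KEYS`, `VALUES`, and `OFFSET` path tokens are taken from the comment above.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical model of MAP child selection; not the actual Doris helper. */
final class MapChildSelectionSketch {
    enum MapChild { KEYS, VALUES }

    /** Returns the map children that need a real (non-skip) reader. */
    static EnumSet<MapChild> selectChildren(Set<String> childPaths) {
        // Assumption: an empty path set means the whole map is projected.
        if (childPaths.isEmpty()) {
            return EnumSet.allOf(MapChild.class);
        }
        EnumSet<MapChild> selected = EnumSet.noneOf(MapChild.class);
        if (childPaths.contains("VALUES")) {
            selected.add(MapChild.VALUES);
        }
        // KEYS is always selected because the key reader supplies the
        // repetition/definition levels, including for OFFSET-only access.
        selected.add(MapChild.KEYS);
        return selected;
    }
}
```

With this rule, an access path of `[m, OFFSET]` selects the key child only, so the map levels can be read without touching the value column.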

@hello-stephen
Contributor

TPC-H: Total hot run time: 30862 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1fe5df123ff286dc34a2f9ca8fef57c187bb95c4, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17622	3820	3825	3820
q2	q3	10775	1352	814	814
q4	4701	474	350	350
q5	7541	2263	2094	2094
q6	227	174	135	135
q7	944	770	642	642
q8	9343	1669	1560	1560
q9	5189	4873	4864	4864
q10	6383	2065	1796	1796
q11	452	273	248	248
q12	632	436	292	292
q13	18181	3439	2764	2764
q14	260	258	243	243
q15	q16	837	784	708	708
q17	1012	993	933	933
q18	6777	5669	5481	5481
q19	1465	1262	1080	1080
q20	522	391	275	275
q21	5875	2653	2455	2455
q22	443	358	308	308
Total cold run time: 99181 ms
Total hot run time: 30862 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4159	4059	4058	4058
q2	q3	4441	4921	4277	4277
q4	2138	2230	1370	1370
q5	4385	4227	4234	4227
q6	224	172	129	129
q7	1752	1939	1823	1823
q8	2845	2147	2342	2147
q9	7915	8103	7553	7553
q10	4599	4441	4013	4013
q11	558	409	412	409
q12	735	746	510	510
q13	3386	3645	2940	2940
q14	295	309	274	274
q15	q16	735	761	649	649
q17	1307	1312	1264	1264
q18	7512	6735	6707	6707
q19	1074	1098	1092	1092
q20	2221	2215	1923	1923
q21	5233	4631	4396	4396
q22	523	462	409	409
Total cold run time: 56037 ms
Total hot run time: 50170 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169684 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 35f02662f9c320859ede78361f2d1e6474a1261a, data reload: false

query5	4331	673	525	525
query6	331	215	209	209
query7	4268	545	286	286
query8	331	239	222	222
query9	8832	4082	4056	4056
query10	472	345	305	305
query11	5786	2348	2165	2165
query12	184	129	125	125
query13	1300	564	445	445
query14	5938	5369	5096	5096
query14_1	4435	4372	4389	4372
query15	216	203	186	186
query16	1022	435	443	435
query17	1017	734	584	584
query18	2437	479	339	339
query19	216	195	160	160
query20	133	132	128	128
query21	213	138	116	116
query22	13580	13526	13357	13357
query23	17101	16390	16011	16011
query23_1	16130	16210	16157	16157
query24	7515	1768	1294	1294
query24_1	1284	1251	1324	1251
query25	550	465	407	407
query26	1318	319	167	167
query27	2704	562	342	342
query28	4432	1924	1957	1924
query29	991	621	489	489
query30	305	233	198	198
query31	1126	1077	947	947
query32	91	80	71	71
query33	537	356	287	287
query34	1159	1130	649	649
query35	766	797	675	675
query36	1316	1348	1128	1128
query37	152	106	95	95
query38	3211	3146	3059	3059
query39	933	949	900	900
query39_1	903	878	875	875
query40	234	154	147	147
query41	89	71	67	67
query42	114	121	114	114
query43	329	332	309	309
query44	
query45	216	206	207	206
query46	1086	1185	753	753
query47	2285	2329	2232	2232
query48	407	395	305	305
query49	657	520	419	419
query50	939	357	266	266
query51	4374	4303	4310	4303
query52	109	109	99	99
query53	259	283	205	205
query54	328	310	271	271
query55	95	97	87	87
query56	313	327	311	311
query57	1417	1410	1325	1325
query58	313	290	287	287
query59	1573	1628	1485	1485
query60	332	331	305	305
query61	181	178	181	178
query62	673	640	579	579
query63	251	210	213	210
query64	2450	866	716	716
query65	
query66	1726	480	362	362
query67	29844	29972	29804	29804
query68	
query69	468	346	304	304
query70	1026	963	1028	963
query71	298	275	272	272
query72	2998	2721	2598	2598
query73	823	766	410	410
query74	5020	4894	4764	4764
query75	2662	2578	2245	2245
query76	2307	1138	761	761
query77	407	404	336	336
query78	12169	12224	11554	11554
query79	1455	1048	771	771
query80	918	567	464	464
query81	506	282	237	237
query82	1330	161	123	123
query83	353	281	242	242
query84	295	140	116	116
query85	941	547	440	440
query86	427	319	312	312
query87	3442	3450	3205	3205
query88	3489	2661	2633	2633
query89	455	390	334	334
query90	1776	171	180	171
query91	181	172	141	141
query92	81	77	70	70
query93	1489	1466	806	806
query94	620	371	313	313
query95	682	461	342	342
query96	1092	787	353	353
query97	2702	2737	2557	2557
query98	243	224	228	224
query99	1111	1122	988	988
Total cold run time: 252972 ms
Total hot run time: 169684 ms

### What problem does this PR solve?

Issue Number: N/A

Related PR: apache#63192

Problem Summary: Doris could not read Iceberg v3 VARIANT columns from Parquet files. This change maps Iceberg VARIANT to Doris VARIANT, validates the Parquet VARIANT wrapper shape from the VariantShredding spec, decodes unshredded metadata/value encoding, reads shredded typed_value columns, and prunes shredded Parquet leaf columns for accessed variant paths with profile observability. It keeps selected typed-only shredded projections on native Parquet typed columns when residual value columns are not selected, including Iceberg field-id pruning below VARIANT typed_value trees, and falls back to row-wise reconstruction only for complex or selected-residual layouts. It also keeps full VARIANT projection separate from predicate subpath pruning, keeps field-level residual values for mixed typed/residual shredded subpaths, supports element paths produced by variant array explode, forces root reads for dynamic and non-slot VARIANT element access, and preserves VARIANT generator root reads. It preserves residual primitive VARIANT scalar types without a JSON round trip and avoids mutating cached Parquet footer schemas while assigning reader-local column ids. Because this PR only implements read support, Iceberg VARIANT writes are explicitly rejected. For Iceberg MERGE, that rejection is limited to plans that write data files, so delete-only MERGE can still write position deletes.

### Release note

Support reading Iceberg v3 VARIANT Parquet columns, including shredded typed_value column pruning and binary/UUID/primitive residual VARIANT values. Non-finite floating-point VARIANT values remain unsupported and return explicit errors. Writing Iceberg VARIANT columns is rejected with an explicit unsupported error, while delete-only Iceberg MERGE on tables that contain VARIANT columns remains allowed because it only writes delete files.

### Check List (For Author)

- Test: Regression test / Unit Test / Manual test

    - Unit Test: ./run-be-ut.sh --run --filter='ParquetVariantReaderTest.DecodeResidualPrimitiveToVariantMapPreservesScalarTypes:ParquetVariantReaderTest.DecodePrimitiveCoverageExtras:ParquetVariantReaderTest.DecodeResidualBinaryToVariantMap:ParquetVariantReaderTest.RejectNonFiniteDoublePrimitive' (4 tests passed)

    - Unit Test: ./run-be-ut.sh --run --filter='NestedColumnAccessHelperTest.*' (26 tests passed)

    - Unit Test: ./run-be-ut.sh --run --filter='ParquetVariantReaderTest.*:NestedColumnAccessHelperTest.*:IcebergReaderCreateColumnIdsTest.*' (104 tests passed)

    - Unit Test: ./run-be-ut.sh --run --filter=ParquetVariantReaderTest.* (74 tests passed)

    - Unit Test: ./run-be-ut.sh --run --filter='ParquetVariantReaderTest.RejectVariantSchemaWithAnnotatedMetadataChild:ParquetVariantReaderTest.RejectVariantSchemaWithAnnotatedValueChild:ParquetVariantReaderTest.RejectVariantSchemaWithNonBinaryValueChild:ParquetVariantReaderTest.ParseTypedOnlyVariantSchemaWithoutTopLevelValue' (4 tests passed)

    - Unit Test: ./run-be-ut.sh --run --filter='HiveReaderCreateColumnIdsTest.*' (7 tests passed)

    - Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest#testNestedVariantInsideStructCollectsSubPath (1 test passed)

    - Unit Test: ./run-fe-ut.sh --run 'org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest#testVariantWholeColumnWithSiblingSubPathAccessPath' (1 test passed)

    - Unit Test: ./run-fe-ut.sh --run 'org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest#testVariantWholeNonSlotExpressionWithPredicateAccessPath,org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest#testVariantDynamicElementAtNonSlotExpressionWithPredicateAccessPath,org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest#testVariantLiteralElementAtNonSlotExpressionKeepsSubPath,org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest#testExplodeVariantWholeOutputWithPredicateAccessPath,org.apache.doris.nereids.rules.rewrite.SlotTypeReplacerTest' (8 tests passed; Maven reactor succeeded)

    - Unit Test: ./run-fe-ut.sh --run PruneNestedColumnTest focused top-level-null and dynamic element access tests (3 tests passed; Maven reactor succeeded)

    - Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest (61 tests passed; Maven reactor succeeded)

    - Unit Test: ./run-fe-ut.sh --run 'org.apache.doris.planner.IcebergMergeSinkTest,org.apache.doris.nereids.trees.plans.commands.IcebergMergeCommandTest' (7 tests passed; Maven reactor succeeded)

    - Regression test: performance regression coverage is included in regression-test/suites/external_table_p0/tvf/test_local_tvf_iceberg_variant.groovy, including profile assertions that typed-only projections increment VariantDirectTypedValueReadRows and keep VariantRowWiseReadRows at 0. Not run locally in this worktree because no local Doris cluster/output BE+FE runtime is available.

    - Manual test: PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/clang-format.sh

    - Manual test: PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh

    - Manual test: git diff --check

    - Static analysis: PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/run-clang-tidy.sh was attempted after the focused fixes. The newly added residual primitive helper no longer reports cognitive-complexity/function-size warnings, but the script still fails in this local worktree on pre-existing warnings in touched large reader functions and analyzer errors from included be/src/util/jni-util.h static_assert(false) branches.

- Behavior changed: Yes. Doris can now read Iceberg v3 VARIANT Parquet columns, including spec-compliant binary, UUID, and residual primitive VARIANT values. It validates wrapper fields more strictly and avoids row-wise reconstruction for supported typed-only shredded projections. It preserves residual binary/null/primitive values, typed-array null elements, nullable complex pruning, user value-only fields, field-level residual values for mixed typed/residual shredded subpaths, complex array elements with binary leaves, and full VARIANT projection when predicates read typed subpaths. It handles dynamic and non-slot root element access, optional top-level VARIANT, root/nested variant-array element paths, typed-map key lookup paths, and Iceberg field-id paths below VARIANT typed_value trees. It rejects Iceberg VARIANT data-file writes explicitly and keeps delete-only Iceberg MERGE on VARIANT tables on the delete-file path.

- Does this need documentation: No
@eldenmoon
Member Author

/review

@eldenmoon
Member Author

run buildall
