[phase-31 3/4] Writer + pipeline wiring #6244
Open
g-talbot wants to merge 1 commit into gtt/phase-31-compaction-metadata from
Base automatically changed from gtt/phase-31-compaction-metadata to gtt/phase-31-sort-schema, March 31, 2026 21:31
Wire TableConfig-driven sort order into ParquetWriter and add self-describing Parquet file metadata for compaction:

- ParquetWriter::new() takes &TableConfig, resolves sort fields at construction via parse_sort_fields() + ParquetField::from_name()
- sort_batch() uses resolved fields with per-column direction (ASC/DESC)
- SS-1 debug_assert verification: re-sort and check identity permutation
- build_compaction_key_value_metadata(): embeds sort_fields, window_start, window_duration, num_merge_ops, row_keys (base64) in Parquet kv_metadata
- SS-5 verify_ss5_kv_consistency(): kv_metadata matches source struct
- write_to_file_with_metadata() replaces write_to_file()
- prepare_write() shared method for bytes and file paths
- ParquetWriterConfig gains to_writer_properties_with_metadata()
- ParquetSplitWriter passes TableConfig through
- All callers in quickwit-indexing updated with TableConfig::default()
- 23 storage tests pass including META-07 self-describing roundtrip

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
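For reviewers, the per-column direction sort and the SS-1 identity-permutation check can be illustrated with a minimal, std-only sketch. This is not the code in storage/writer.rs: `SortField`, `Row`, `sort_permutation`, and `sort_rows` are hypothetical names, and rows are modelled as plain `Vec<i64>` instead of Arrow record batches.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a resolved sort field: column index plus direction.
#[derive(Clone, Copy, Debug)]
pub struct SortField {
    pub column: usize,
    pub descending: bool,
}

/// Illustrative row model: one i64 cell per column.
pub type Row = Vec<i64>;

/// Compare two rows field by field, reversing the ordering for DESC columns.
fn compare_rows(a: &Row, b: &Row, fields: &[SortField]) -> Ordering {
    for f in fields {
        let ord = a[f.column].cmp(&b[f.column]);
        let ord = if f.descending { ord.reverse() } else { ord };
        if ord != Ordering::Equal {
            return ord;
        }
    }
    Ordering::Equal
}

/// Return the permutation of row indices that sorts `rows`.
/// `sort_by` is stable, so tied rows keep their original relative order.
pub fn sort_permutation(rows: &[Row], fields: &[SortField]) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..rows.len()).collect();
    indices.sort_by(|&i, &j| compare_rows(&rows[i], &rows[j], fields));
    indices
}

/// Sort the rows, then (debug builds only) re-sort the output and check
/// that the second pass yields the identity permutation.
pub fn sort_rows(rows: &[Row], fields: &[SortField]) -> Vec<Row> {
    let sorted: Vec<Row> = sort_permutation(rows, fields)
        .into_iter()
        .map(|i| rows[i].clone())
        .collect();
    debug_assert!(
        sort_permutation(&sorted, fields)
            .iter()
            .enumerate()
            .all(|(pos, &idx)| pos == idx),
        "re-sorting sorted output must be the identity permutation"
    );
    sorted
}
```

Because the sort is stable, re-sorting already sorted output cannot move any row, so a non-identity permutation in the check would indicate that the comparator and the produced order disagree.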
Summary
Wire TableConfig into ParquetWriter sort path and add self-describing Parquet file metadata for compaction (Phase 31 Metadata Foundation, PR 3 of 4).
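As a rough illustration of what "self-describing" means here, the sketch below builds key/value entries from a metadata struct and then checks that the entries agree with the struct, in the spirit of the SS-5 consistency check. It is a std-only approximation under stated assumptions: `CompactionMetadata`, `build_kv_metadata`, `verify_kv_consistency`, the exact key strings, the field types, and the hand-rolled `base64_encode` are all hypothetical, and the real writer stores entries in Parquet kv_metadata, not in a `Vec` of tuples.

```rust
/// Standard base64 alphabet (RFC 4648), used by the illustrative encoder below.
const B64: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Minimal base64 encoder with '=' padding (hypothetical helper, std only).
pub fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(B64[(n >> 18) as usize & 63] as char);
        out.push(B64[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { B64[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { B64[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Hypothetical source struct; field types are assumptions for illustration.
pub struct CompactionMetadata {
    pub sort_fields: String,
    pub window_start: i64,
    pub window_duration: u32,
    pub num_merge_ops: u32,
    pub row_keys: Vec<u8>,
}

/// Build the key/value entries; binary row keys are base64-encoded.
pub fn build_kv_metadata(meta: &CompactionMetadata) -> Vec<(String, String)> {
    vec![
        ("sort_fields".to_string(), meta.sort_fields.clone()),
        ("window_start".to_string(), meta.window_start.to_string()),
        ("window_duration".to_string(), meta.window_duration.to_string()),
        ("num_merge_ops".to_string(), meta.num_merge_ops.to_string()),
        ("row_keys".to_string(), base64_encode(&meta.row_keys)),
    ]
}

/// Check that every expected entry is present in `kv` with the value
/// derived from the source struct.
pub fn verify_kv_consistency(
    meta: &CompactionMetadata,
    kv: &[(String, String)],
) -> Result<(), String> {
    for (key, want) in &build_kv_metadata(meta) {
        match kv.iter().find(|(k, _)| k == key) {
            Some((_, got)) if got == want => {}
            Some((_, got)) => return Err(format!("{key}: expected {want:?}, got {got:?}")),
            None => return Err(format!("missing key {key}")),
        }
    }
    Ok(())
}
```

Deriving the expected entries from the struct and comparing them against what is about to be written keeps the file footer and the in-memory metadata from drifting apart silently.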
Stacks on gtt/phase-31-compaction-metadata (PR #6243).

What's included

storage/writer.rs (rewritten):
- ParquetWriter::new() takes &TableConfig, resolves sort field names to physical columns
- sort_batch() uses resolved fields with per-column ASC/DESC direction
- debug_assert verification: re-sort output and check identity permutation
- build_compaction_key_value_metadata(): embeds sort_fields, window_start, window_duration, num_merge_ops, row_keys (base64+JSON) in Parquet kv_metadata
- verify_ss5_kv_consistency(): kv entries must match source struct
- write_to_file_with_metadata() replaces write_to_file()
- prepare_write() shared prep for both bytes and file write paths
- resolve_sort_fields(): parse sort schema, map to ParquetField, skip missing columns

storage/config.rs:
- to_writer_properties_with_metadata(sorting_cols, kv_metadata) accepts dynamic sort columns and optional KV metadata
- to_writer_properties() delegates with empty defaults
- sorting_columns() method (now in writer)

storage/split_writer.rs:
- ParquetSplitWriter::new() takes &TableConfig parameter

quickwit-indexing (5 files):
- ParquetSplitWriter::new() callers updated with &TableConfig::default()

Verification
- cargo build -p quickwit-parquet-engine -p quickwit-indexing ✅
- cargo test -p quickwit-parquet-engine -- storage:: ✅ (23 tests)
- cargo clippy -p quickwit-parquet-engine --all-features --tests ✅

Test plan
🤖 Generated with Claude Code