Commit 00a4023

feat(write): add write pipeline with DataFusion INSERT INTO/OVERWRITE support
Add TableWrite for writing Arrow RecordBatches to Paimon append-only tables. Each (partition, bucket) pair gets its own DataFileWriter with direct writes (matching the delta-rs DeltaWriter pattern). File rolling uses tokio::spawn to close files in the background, and prepare_commit uses try_join_all to finalize partition writers in parallel.

Key components:
- TableWrite: routes batches by partition/bucket, holds DataFileWriters
- DataFileWriter: manages parquet file lifecycle with rolling support
- WriteBuilder: creates TableWrite and TableCommit instances
- PaimonDataSink: DataFusion DataSink integration for INSERT/OVERWRITE
- FormatFileWriter: extended with flush() and in_progress_size()

Configurable options via CoreOptions:
- file.compression (default: zstd)
- target-file-size (default: 256MB)
- write.parquet-buffer-size (default: 256MB)

Includes E2E integration tests for unpartitioned, partitioned, fixed-bucket, multi-commit, column projection, and bucket filtering.
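The routing and rolling behavior described in the commit message can be pictured with a small, synchronous sketch. This is not the code from this commit: SketchTableWrite and SketchFileWriter are hypothetical stand-ins, byte counts replace Arrow RecordBatches, and the background close (tokio::spawn) and parallel finalization (try_join_all) are reduced to plain sequential calls. It only illustrates the idea of one writer per (partition, bucket) key that rolls to a new file once a target size is reached.

```rust
use std::collections::HashMap;

/// Hypothetical routing key: a partition label plus a bucket id.
type WriterKey = (String, u32);

/// Stand-in for a per-(partition, bucket) file writer with size-based rolling.
/// Only byte counts are tracked; no parquet data is written in this sketch.
#[derive(Debug)]
struct SketchFileWriter {
    target_file_size: usize,
    in_progress: usize,       // bytes in the currently open file
    closed_files: Vec<usize>, // sizes of files that were already rolled
}

impl SketchFileWriter {
    fn new(target_file_size: usize) -> Self {
        Self { target_file_size, in_progress: 0, closed_files: Vec::new() }
    }

    /// Append to the open file and roll once the target size is reached.
    fn write(&mut self, bytes: usize) {
        self.in_progress += bytes;
        if self.in_progress >= self.target_file_size {
            self.roll();
        }
    }

    /// Close the open file, if it holds any data.
    fn roll(&mut self) {
        if self.in_progress > 0 {
            self.closed_files.push(self.in_progress);
            self.in_progress = 0;
        }
    }
}

/// Stand-in for the table-level writer that routes data to per-key writers.
struct SketchTableWrite {
    target_file_size: usize,
    writers: HashMap<WriterKey, SketchFileWriter>,
}

impl SketchTableWrite {
    fn new(target_file_size: usize) -> Self {
        Self { target_file_size, writers: HashMap::new() }
    }

    /// Route a write to the writer for (partition, bucket), creating it lazily.
    fn write(&mut self, partition: &str, bucket: u32, bytes: usize) {
        let target = self.target_file_size;
        self.writers
            .entry((partition.to_string(), bucket))
            .or_insert_with(|| SketchFileWriter::new(target))
            .write(bytes);
    }

    /// Finalize every writer and return the closed file sizes per key,
    /// sorted by key so the result is deterministic.
    fn prepare_commit(self) -> Vec<(WriterKey, Vec<usize>)> {
        let mut out: Vec<(WriterKey, Vec<usize>)> = self
            .writers
            .into_iter()
            .map(|(key, mut writer)| {
                writer.roll();
                (key, writer.closed_files)
            })
            .collect();
        out.sort_by(|a, b| a.0.cmp(&b.0));
        out
    }
}
```

With a target size of 100, two writes of 60 bytes to the same (partition, bucket) key produce a single rolled file of 120 bytes, while writes to other keys stay in their own writers until prepare_commit closes them.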
1 parent db71611 commit 00a4023

File tree

20 files changed: +2357 -70 lines changed

Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -34,7 +34,9 @@ arrow-buffer = "57.0"
 arrow-schema = "57.0"
 arrow-cast = "57.0"
 arrow-ord = "57.0"
+arrow-select = "57.0"
 datafusion = "52.3.0"
 datafusion-ffi = "52.3.0"
 parquet = "57.0"
 tokio = "1.39.2"
+tokio-util = "0.7"

crates/integration_tests/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ homepage.workspace = true
 [dependencies]
 paimon = { path = "../paimon" }
 arrow-array = { workspace = true }
+arrow-schema = { workspace = true }
 tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
 futures = "0.3"
