
feat(datafusion): upgrade to DataFusion 53 and use VERSION AS OF #236

Open
JingsongLi wants to merge 1 commit into apache:main from JingsongLi:databricks

@JingsongLi
Contributor

Purpose

Upgrade DataFusion from 52.3 to 53.0 (arrow/parquet 57→58, sqlparser 0.59→0.61, orc-rust 0.7→0.8, pyo3 0.26→0.28) and replace the old `FOR SYSTEM_TIME AS OF` time travel syntax with the new `VERSION AS OF` and `TIMESTAMP AS OF` syntax supported by sqlparser 0.61.
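
For illustration, queries using the new clauses might look like the strings built below. This is only a sketch: the helper names, the table name, and the exact literal forms (whether a version is quoted, which timestamp format is accepted) are assumptions, not taken from this PR.

```rust
/// Hypothetical helper: build a query that reads a table at a given version.
/// The version is passed through verbatim; quoting rules are an assumption.
pub fn version_as_of(table: &str, version: &str) -> String {
    format!("SELECT * FROM {table} VERSION AS OF {version}")
}

/// Hypothetical helper: build a query that reads a table as of a timestamp.
/// The timestamp is wrapped in single quotes as a string literal (assumed form).
pub fn timestamp_as_of(table: &str, timestamp: &str) -> String {
    format!("SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'")
}
```

Queries that previously used `FOR SYSTEM_TIME AS OF` would be rewritten to one of these two forms.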

Introduce `scan.version` option to unify snapshot id and tag name resolution: at scan time, the version value is resolved by first checking if a tag with that name exists, then trying to parse it as a snapshot id, otherwise returning an error. Remove the now-redundant `scan.snapshot-id` and `scan.tag-name` options.
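
The resolution order described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `ResolvedVersion`, `resolve_version`, the in-memory tag set, and the error text are all hypothetical, and the snapshot id is assumed to fit in an `i64`.

```rust
use std::collections::HashSet;

/// Hypothetical result type for a resolved `scan.version` value.
#[derive(Debug, PartialEq)]
pub enum ResolvedVersion {
    Tag(String),
    SnapshotId(i64),
}

/// Resolve `version` in the order described above:
/// 1. an existing tag with that name,
/// 2. otherwise a value that parses as a snapshot id,
/// 3. otherwise an error.
pub fn resolve_version(
    version: &str,
    tags: &HashSet<String>,
) -> Result<ResolvedVersion, String> {
    if tags.contains(version) {
        return Ok(ResolvedVersion::Tag(version.to_string()));
    }
    match version.parse::<i64>() {
        Ok(id) => Ok(ResolvedVersion::SnapshotId(id)),
        Err(_) => Err(format!(
            "version '{version}' is neither an existing tag nor a snapshot id"
        )),
    }
}
```

With a tag set containing only `release-1`, the value `release-1` resolves to a tag, `42` resolves to a snapshot id, and `nope` is rejected.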

Brief change log

Tests

API and Format

Documentation

@littlecoder04
Contributor

+1


@leaves12138 left a comment


Overall looks good. The DataFusion 53 upgrade is well-contained and the unified scan.version design is cleaner than the previous dual-option approach.

Minor notes (non-blocking):

  1. `scan.version` tag-first resolution: A tag named `1` would shadow snapshot ID 1. Worth a short doc note as a known edge case, though practically unlikely since most tag names are descriptive.

  2. Error message in `resolve_to_snapshot`: The `Tag {v} doesn't exist.` branch inside the `tag_exists` true path is technically unreachable (since we already confirmed the tag exists). Could be tightened up but not a blocker.

  3. Python test skip: Correctly skipped since the Python datafusion package has not shipped 53 yet.
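
Notes 1 and 2 can be illustrated with a small sketch. It is not the PR's `resolve_to_snapshot`: the `Tags` map and `resolve_to_snapshot_id` are hypothetical stand-ins. Looking the tag up once, instead of checking existence and then fetching it, is one way to avoid a separate "tag doesn't exist" branch after the existence check.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory tag table: tag name -> snapshot id.
pub type Tags = HashMap<String, i64>;

/// Resolve a version string to a snapshot id with a single tag lookup.
pub fn resolve_to_snapshot_id(version: &str, tags: &Tags) -> Result<i64, String> {
    // Tag-first: a tag wins even when its name looks like a number.
    if let Some(&id) = tags.get(version) {
        return Ok(id);
    }
    // No such tag: fall back to interpreting the value as a snapshot id.
    version
        .parse::<i64>()
        .map_err(|_| format!("'{version}' is neither a tag nor a snapshot id"))
}
```

If a tag named `1` points at snapshot 7, then `resolve_to_snapshot_id("1", &tags)` yields 7 rather than snapshot 1, which is the shadowing edge case from note 1.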

+1, good to merge.

@JingsongLi
Contributor Author

Waiting for datafusion-python to release 53: https://lists.apache.org/thread/nps36jh2frp8wfh71564vo2vfozyo7vh

3 participants