Support for 'time' type in Iceberg #1761

Open
ianton-ru wants to merge 8 commits into antalya-26.3 from bugfix/antalya-26.3/1535_time_type_write_support

@ianton-ru ianton-ru commented May 8, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Support for 'time' type in Iceberg, read and write.

Documentation entry for user-facing changes

Solved #1535

This changes how time values are displayed.
Before, a value was shown as seconds since midnight:

SELECT * FROM datalake.`namespace.table`

43200

Now it is shown as a time of day with microsecond precision:

SELECT * FROM datalake.`namespace.table`

12:00:00.000000
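
The relationship between the two outputs can be sketched in Python. This is only an illustration of the arithmetic, not code from the PR; the helper name `format_time_micros` is hypothetical, and the sketch assumes a non-negative time of day.

```python
def format_time_micros(micros: int) -> str:
    """Render microseconds since midnight as HH:MM:SS.ffffff."""
    seconds, frac = divmod(micros, 1_000_000)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}.{frac:06d}"


# Old output: seconds since midnight.
old_value = 43200
# New output: the same instant, carried as microseconds and formatted.
print(format_time_micros(old_value * 1_000_000))  # 12:00:00.000000
```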

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

github-actions Bot commented May 8, 2026

Workflow [PR], commit [810c6b3]

@ianton-ru ianton-ru added the antalya, port-antalya (PRs to be ported to all new Antalya releases), and antalya-26.3 labels May 8, 2026
@ianton-ru ianton-ru changed the title from "Bugfix/antalya 26.3/1535 time type write support" to "Support for 'time' type in Iceberg" May 8, 2026
@ianton-ru
Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Hooray!

@ianton-ru
Author

Audit: PR #1761 — Iceberg time type (read and write)

AI audit note: This review comment was generated by AI (audit-review skill).

Audit update for PR #1761 (Support for time type in Iceberg, read and write)

Diff scope: altinity/antalya-26.3 (69f51ce20c5e060e2dbad3ff67c318469d4ec010) → altinity/bugfix/antalya-26.3/1535_time_type_write_support (810c6b37a9591e1b23a73491d9a189765cbd2b8c) — IcebergWrites.cpp, Utils.cpp, SchemaProcessor.cpp, Constant.h, AvroRowInputFormat.cpp, Parquet/PrepareForWrite.cpp, Parquet/Write.cpp, tests/integration/test_database_iceberg/test.py, tests/integration/test_storage_iceberg_no_spark/test_write_time.py.

Confirmed defects

No confirmed defects in reviewed scope.

Coverage summary

  • Scope reviewed: Manifest partition Avro typing with logicalType time-micros, partition datum values and statistics as microsecond long; Iceberg schema mapping time → DataTypeTime64(6); Parquet logical TIME (micros/nanos) with datetime_multiplier; Avro logical TIME_MILLIS / TIME_MICROS mapped to Time64; Iceberg export rejection of Time64 scale above 6 in metadata path.

  • Categories failed: (none).

  • Categories passed: multiplier/scale alignment (CH Time seconds → Iceberg/Parquet µs); Time physical Int32 vs logical TypeIndex::Time dispatch in Parquet writer; nullable partition/stats reasoning; overflow sanity for time-of-day µs range; static-only review of format paths (no new shared-state concurrency in diff).

  • Assumptions/limits: Static audit of the fetched branch diff only; no local build or CI run recorded here. Display change (seconds integer → HH:MM:SS.ffffff) is a documented user-visible behavior change, not logged as a defect.
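
The multiplier/scale alignment and the scale limit named in the coverage summary can be illustrated with a short Python sketch. It is not the PR's C++ code: `to_iceberg_micros` is a hypothetical helper, and the range check assumes a plain time of day between 00:00:00 and 24:00:00, which may be stricter than what the actual implementation enforces.

```python
MICROS_SCALE = 6  # Iceberg time is stored in microseconds
MICROS_PER_DAY = 24 * 60 * 60 * 1_000_000


def to_iceberg_micros(value: int, scale: int) -> int:
    """Convert a time value with `scale` fractional digits to microseconds.

    scale=0 means whole seconds, scale=3 milliseconds, scale=6 microseconds.
    """
    if not 0 <= scale <= MICROS_SCALE:
        # Assumption: finer-than-microsecond scales are rejected rather
        # than silently truncated.
        raise ValueError(f"scale {scale} cannot be represented in microseconds")
    micros = value * 10 ** (MICROS_SCALE - scale)
    if not 0 <= micros < MICROS_PER_DAY:
        # Assumption: only time-of-day values are accepted in this sketch.
        raise ValueError(f"{micros} is outside the time-of-day range")
    return micros


print(to_iceberg_micros(43200, 0))       # seconds in, microseconds out
print(to_iceberg_micros(43_200_000, 3))  # milliseconds in, microseconds out
```

Both calls describe noon, so they yield the same microsecond count; a scale of 9 (nanoseconds) raises instead of losing precision.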

Expanded review notes (methodology snapshot)

Call graph (in scope)

  1. Read: Iceberg type time → SchemaProcessor::getSimpleType → DataTypeTime64(6); Avro schema reader maps time-micros / time-millis to DataTypeTime64.
  2. Write Parquet: preparePrimitiveColumn emits INT64 + TIME logical type; Write.cpp applies ConverterTime / ConverterTime64WithMultiplier with multipliers from column scale.
  3. Write Iceberg manifests: extendSchemaForPartitions embeds Avro field type object with logicalType when applicable; generateManifestFile writes partition time values via getTimeValueInMicroseconds before the generic Field switch; dumpFieldToBytes aligns stats with microseconds.
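
As a rough picture of what an Avro field type object with a logicalType looks like, the Python sketch below builds one. The physical types follow the Avro specification (time-micros annotates long, time-millis annotates int); the helper names, the field name `event_time`, and the nullable-union layout are illustrative assumptions, not the output of extendSchemaForPartitions.

```python
import json


def avro_time_type(unit: str) -> dict:
    """Return the Avro type object for a time-of-day logical type."""
    if unit == "micros":
        return {"type": "long", "logicalType": "time-micros"}
    if unit == "millis":
        return {"type": "int", "logicalType": "time-millis"}
    raise ValueError(f"unsupported time unit: {unit}")


def nullable_field(name: str, avro_type: dict) -> dict:
    """Wrap a type in a ["null", T] union so the field can hold nulls."""
    return {"name": name, "type": ["null", avro_type], "default": None}


# Hypothetical partition field named "event_time".
field = nullable_field("event_time", avro_time_type("micros"))
print(json.dumps(field, indent=2))
```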

Invariant

Iceberg time is microseconds since midnight end-to-end for partition metadata, bounds, and parquet logical time for the tested paths; CH Time internal seconds are converted at boundaries.

Fault categories exercised (logical)

Malformed non-spec Avro pairs, scale > 6 for Iceberg, nanos Parquet branch vs Iceberg limits, null partition values — no code defect confirmed for spec-conformant inputs.
