
DuckLake: partitioned_by on INCREMENTAL_BY_UNIQUE_KEY results in unpartitioned files on initial build #5742

@nathantapsas

Description


When using SQLMesh with DuckDB + DuckLake, models configured with partitioned_by (including INCREMENTAL_BY_UNIQUE_KEY) are not physically partitioned on initial table creation.

SQLMesh logs show:

    CREATE OR REPLACE TABLE ... AS SELECT ...
    ALTER TABLE ... SET PARTITIONED BY (...)

After this sequence, the initial data file remains unpartitioned (partition_id = NULL in DuckLake metadata). Only subsequent writes (INSERT / MERGE) produce partitioned files and partition paths.
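
One way to check this is to query the DuckLake metadata tables directly. The query below is a minimal sketch, not something SQLMesh runs itself. It assumes the catalog was attached under the hypothetical alias `my_lake` (so the metadata is exposed as `__ducklake_metadata_my_lake`), that the metadata follows the `ducklake_data_file` / `ducklake_table` layout from the DuckLake specification, and that the physical table name contains `accounts_snapshot`.

```sql
-- Assumption: the DuckLake catalog is attached as my_lake (hypothetical alias),
-- so its metadata tables live in the __ducklake_metadata_my_lake database.
SELECT
    t.table_name,
    f.path,
    f.record_count,
    f.partition_id          -- NULL means the file was written without a partition spec
FROM __ducklake_metadata_my_lake.ducklake_data_file AS f
JOIN __ducklake_metadata_my_lake.ducklake_table AS t
    ON t.table_id = f.table_id
WHERE f.end_snapshot IS NULL                      -- only files that are still live
  AND t.end_snapshot IS NULL                      -- only the current table version
  AND t.table_name LIKE '%accounts_snapshot%'     -- assumed physical table name pattern
ORDER BY f.data_file_id;
```

If the behavior described above holds, the file written by the initial CTAS shows a NULL `partition_id`, while files written by later INSERT / MERGE statements carry a non-NULL value.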

SQLMesh / environment

SQLMesh: 0.231.1
Engine: DuckDB 1.4.4 with DuckLake extension
Connection: DuckDB catalog attached as DuckLake (TYPE ducklake, DATA_PATH set)

Model config example

    MODEL (
      name silver.accounts_snapshot,
      kind INCREMENTAL_BY_UNIQUE_KEY (
        unique_key (account_number, __data_snapshot_date),
      ),
      partitioned_by __data_snapshot_date,
    );

Expected behavior
If partitioned_by is configured, initial table build should physically write partitioned files (or docs should clearly state initial CTAS files will be unpartitioned and only future writes will partition).

Actual behavior
Initial build creates unpartitioned file(s).

Why this matters
On full rebuilds / first deployments, large snapshot tables are not physically partitioned as configured, which can impact pruning, storage layout, and performance until later incremental writes occur.

Possible fix direction
For DuckLake targets with partitioned_by, consider creating table schema + partition spec first, then inserting data (instead of CTAS then ALTER TABLE ... SET PARTITIONED BY), or perform a post-create rewrite when needed.
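
The first option could look roughly like the following sequence. This is only a sketch of the statement ordering, not SQLMesh's actual implementation. The catalog alias `my_lake`, the column types, and the source table `bronze.accounts_raw` are hypothetical. SQLMesh would also target its own versioned physical table name rather than `silver.accounts_snapshot` directly.

```sql
-- Hypothetical names throughout; only the order of the statements matters.
CREATE SCHEMA IF NOT EXISTS my_lake.silver;

-- 1. Create the empty table with the target schema (column types are assumed).
CREATE TABLE my_lake.silver.accounts_snapshot (
    account_number       VARCHAR,
    __data_snapshot_date DATE
);

-- 2. Declare the partition spec before any data file exists.
ALTER TABLE my_lake.silver.accounts_snapshot
    SET PARTITIONED BY (__data_snapshot_date);

-- 3. Load the data; this write happens with the partition spec already in place.
INSERT INTO my_lake.silver.accounts_snapshot
SELECT account_number, __data_snapshot_date
FROM my_lake.bronze.accounts_raw;   -- hypothetical source table
```

Because the partition spec exists before the first write, the initial load should go through the same partitioned write path that later INSERT / MERGE statements already use.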
