Pushdown string and date formatting functions #174
Draft
iskakaushik wants to merge 1 commit into main from
serprex
reviewed
Apr 2, 2026
Remote SQL: SELECT COALESCE(CAST(a AS Nullable(String)), CAST(b AS Nullable(String)), CAST(c AS String)), a, b, c FROM functions_test.t1 GROUP BY a, b, c
(4 rows)

SELECT coalesce(a::text, b::text, c::text) FROM t1 GROUP BY a, b, c;
serprex
reviewed
Apr 2, 2026
Member
Collaborator
Author
hence in draft.
2f3af6c to 756985b
Collaborator
Author
Both review comments from the first push are now resolved:
Add query pushdown support for five PostgreSQL functions that previously required local evaluation:

  split_part(str, delim, n) → splitByString(delim, str)[n]
  regexp_replace(str, pat, rep [, flags]) → replaceRegexpOne/All(...)
  array_to_string(arr, sep) → arrayStringConcat(arr, sep)
  concat_ws(sep, a, b, ...) → arrayStringConcat(arrayFilter(...), sep)
  to_char(ts, fmt) → formatDateTime(ts, translated_fmt)

The regexp_replace translation inspects the flags argument: without 'g' it maps to replaceRegexpOne, with 'g' to replaceRegexpAll.

The concat_ws translation wraps each argument in ifNull and filters empty strings, matching PostgreSQL's NULL-skipping semantics.

The to_char translation converts PG format tokens (YYYY, MM, DD, HH24, HH12, MI, SS) to ClickHouse strftime equivalents, with care to check MI before MM since both start with 'M', and to use %i (not %M) for minutes since ClickHouse's %M means month name.

These functions appear in roughly 25% of surveyed dbt models that currently fall back to local evaluation.

Closes PG-132, PG-133, PG-134, PG-135.
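The to_char token translation described in the commit message can be sketched roughly as below. This is an illustrative Python model, not the code in this PR: the names PG_TO_CH_TOKENS and translate_to_char_format are hypothetical, only the seven uppercase tokens listed above are handled, and escaping a literal % as %% is an added assumption about how pass-through text should be treated.

```python
# Illustrative sketch only; names are hypothetical and not taken from the PR.
# Maps the PG to_char tokens named in the commit message to ClickHouse
# formatDateTime replacement fields.
PG_TO_CH_TOKENS = [
    ("YYYY", "%Y"),  # four-digit year
    ("HH24", "%H"),  # hour, 24-hour clock
    ("HH12", "%I"),  # hour, 12-hour clock
    ("MI", "%i"),    # minutes; listed before MM, as the commit message describes
    ("MM", "%m"),    # month number
    ("DD", "%d"),    # day of month
    ("SS", "%S"),    # seconds
]


def translate_to_char_format(pg_fmt: str) -> str:
    """Translate a PG to_char format string into a formatDateTime format."""
    out = []
    i = 0
    while i < len(pg_fmt):
        for pg_tok, ch_tok in PG_TO_CH_TOKENS:
            if pg_fmt.startswith(pg_tok, i):
                out.append(ch_tok)
                i += len(pg_tok)
                break
        else:
            # No token matched here: copy the character through, escaping a
            # literal percent sign (assumption, not stated in the PR).
            ch = pg_fmt[i]
            out.append("%%" if ch == "%" else ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(translate_to_char_format("YYYY-MM-DD HH24:MI:SS"))
```

Scanning left to right and matching whole tokens at each position keeps MI and MM from being confused, and makes it explicit that minutes become %i rather than %M.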
756985b to 6dfca48
Member
order by should still be there, any query with multiple results which aren't all the same needs
Summary
Add query pushdown for five PostgreSQL functions, eliminating local evaluation for common string manipulation and date formatting patterns:
- split_part() → splitByString() with array indexing (arg reorder)
- regexp_replace() → replaceRegexpOne() / replaceRegexpAll() (flag-dependent)
- array_to_string() → arrayStringConcat()
- concat_ws() → arrayStringConcat(arrayFilter(...)) with NULL filtering
- to_char() → formatDateTime() with PG→CH format string translation

These functions appear in roughly 25% of surveyed dbt models that currently fall back to local evaluation.
Closes PG-132, PG-133, PG-134, PG-135.
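The argument reorder for split_part and the flag-dependent choice for regexp_replace can be illustrated with a small Python model that builds the remote expression text. This is a sketch under assumptions, not the PR's implementation: the helper names are hypothetical, arguments are taken as already-deparsed SQL fragments, and flags other than 'g' are simply ignored here.

```python
# Illustrative sketch only; helper names are hypothetical, not from the PR.
from typing import Optional


def translate_split_part(str_sql: str, delim_sql: str, n_sql: str) -> str:
    """PG split_part(str, delim, n) -> ClickHouse splitByString(delim, str)[n].

    Note the swapped argument order: ClickHouse takes the separator first.
    """
    return f"splitByString({delim_sql}, {str_sql})[{n_sql}]"


def translate_regexp_replace(
    str_sql: str, pat_sql: str, rep_sql: str, flags: Optional[str] = None
) -> str:
    """Pick replaceRegexpAll when the PG flags contain 'g', else replaceRegexpOne."""
    func = "replaceRegexpAll" if flags and "g" in flags else "replaceRegexpOne"
    return f"{func}({str_sql}, {pat_sql}, {rep_sql})"


if __name__ == "__main__":
    print(translate_split_part("a", "','", "2"))
    print(translate_regexp_replace("a", "'x+'", "'y'", "g"))
```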
Test plan
- make tempcheck passes on local PG 18 + ClickHouse latest
- EXPLAIN (VERBOSE, COSTS OFF) for each function
- concat_ws, MI vs MM ordering in to_char, flag variants in regexp_replace
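For the concat_ws cases in the test plan, a tiny reference model of PostgreSQL's behavior may help when writing expected outputs. The function name pg_concat_ws is hypothetical and the model covers text arguments only. It encodes two documented PostgreSQL behaviors: NULL arguments are skipped, and a NULL separator yields NULL. Empty-string arguments are kept by PostgreSQL, so inputs containing '' are a case where a translation that filters empty strings could diverge and may deserve an explicit test.

```python
# Illustrative reference model of PostgreSQL concat_ws for text arguments.
# The name pg_concat_ws is hypothetical; None stands in for SQL NULL.
from typing import Optional


def pg_concat_ws(sep: Optional[str], *args: Optional[str]) -> Optional[str]:
    """Join non-NULL arguments with sep; a NULL separator gives NULL."""
    if sep is None:
        return None
    # NULL arguments are skipped; empty strings are kept as-is.
    return sep.join(a for a in args if a is not None)


if __name__ == "__main__":
    print(pg_concat_ws(",", "a", None, "b"))
    print(pg_concat_ws(",", "a", "", "b"))
```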