build: Upgrade ClickHouse version bounds and fix migrations for CH 25.8 compatibility #7800
Upgrade the default Altinity ClickHouse stable build from 25.3.6.10034 to 25.3.8.10041 across all services and CI. Add 25.8.16.10001 to the clickhouse-versions CI test matrix. Update supported version bounds accordingly (min: 25.3.8.10041, max: 25.8.16.10001).

Co-Authored-By: Claude <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/R_jLZnLeL52xO3nc1UR7h2HK1pty7WvSy3GzXZNuVv0
Lower minimum ClickHouse version to 25.3.6.10034 to match sentry devservices. Add explicit `AS item_type` alias to bare `'span'` string literal in migrations 0023 and 0029 for ClickHouse 25.8 compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/TiTgZLRUxaV6TuqJ9_IDWBVD0AOdsJunbq0WIAIPHmQ
# Note: SaaS, self-hosted, and sentry dev
# environments should all be on 23.8.11.29
CLICKHOUSE_SERVER_MAX_VERSION = "25.3.6.10034"
CLICKHOUSE_SERVER_MIN_VERSION = "25.3.6.10034"
Min version set to old value instead of new default
Medium Severity
CLICKHOUSE_SERVER_MIN_VERSION is set to "25.3.6.10034" (the old default image version) instead of "25.3.8.10041" (the new default). The PR description explicitly states the min bound is 25.3.8.10041, and the supported-versions doc lists 25.3.8.10041 as the lowest supported version. Yet the version check in check_clickhouse will still accept 25.3.6.10034, which is no longer listed as a supported version.
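The bounds check being flagged can be sketched in a few lines. This is a hypothetical simplification, not the real `check_clickhouse` implementation; the constants mirror the values discussed above (the stale min bound, and the max bound from the PR description):

```python
# Hypothetical sketch of the version-bounds check described in the review
# comment. With MIN left at the old default, 25.3.6.10034 still passes even
# though the supported-versions doc now starts at 25.3.8.10041.

CLICKHOUSE_SERVER_MIN_VERSION = "25.3.6.10034"  # flagged: should be 25.3.8.10041
CLICKHOUSE_SERVER_MAX_VERSION = "25.8.16.10001"


def _parse(version: str) -> tuple:
    # Compare versions numerically, component by component.
    return tuple(int(part) for part in version.split("."))


def version_supported(server_version: str) -> bool:
    return (
        _parse(CLICKHOUSE_SERVER_MIN_VERSION)
        <= _parse(server_version)
        <= _parse(CLICKHOUSE_SERVER_MAX_VERSION)
    )


print(version_supported("25.3.6.10034"))  # True: accepted despite no longer being listed
```

Raising the MIN constant to "25.3.8.10041" would make the same call return False, which is the fix the comment is asking for.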
This PR has a migration; here is the generated SQL:
-- start migrations
-- forward migration events : 0012_errors_make_level_nullable
Local op: ALTER TABLE errors_local ON CLUSTER 'cluster_one_sh' MODIFY COLUMN level LowCardinality(Nullable(String));
Distributed op: ALTER TABLE errors_dist ON CLUSTER 'cluster_one_sh' MODIFY COLUMN level LowCardinality(Nullable(String));
-- end forward migration events : 0012_errors_make_level_nullable
-- backward migration events : 0012_errors_make_level_nullable
Distributed op: ALTER TABLE errors_dist ON CLUSTER 'cluster_one_sh' MODIFY COLUMN level LowCardinality(String) DEFAULT '';
Local op: ALTER TABLE errors_local ON CLUSTER 'cluster_one_sh' MODIFY COLUMN level LowCardinality(String) DEFAULT '';
-- end backward migration events : 0012_errors_make_level_nullable
-- forward migration events_analytics_platform : 0023_smart_autocomplete_mv
Local op: CREATE TABLE IF NOT EXISTS eap_trace_item_attrs_local ON CLUSTER 'cluster_one_sh' (project_id UInt64, item_type String, date Date CODEC (DoubleDelta, ZSTD(1)), retention_days UInt16, attrs_string Map(String, String), attrs_bool Array(String), attrs_int64 Array(String), attrs_float64 Array(String), key_val_hash UInt64) ENGINE ReplicatedReplacingMergeTree('/clickhouse/tables/events_analytics_platform/{shard}/default/eap_trace_item_attrs_local', '{replica}') PRIMARY KEY (project_id, date, key_val_hash) ORDER BY (project_id, date, key_val_hash) PARTITION BY (retention_days, toMonday(date)) TTL date + toIntervalDay(retention_days);
Distributed op: CREATE TABLE IF NOT EXISTS eap_trace_item_attrs_dist ON CLUSTER 'cluster_one_sh' (project_id UInt64, item_type String, date Date CODEC (DoubleDelta, ZSTD(1)), retention_days UInt16, attrs_string Map(String, String), attrs_bool Array(String), attrs_int64 Array(String), attrs_float64 Array(String), key_val_hash UInt64) ENGINE Distributed(`cluster_one_sh`, default, eap_trace_item_attrs_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS eap_trace_item_attrs_mv ON CLUSTER 'cluster_one_sh' TO eap_trace_item_attrs_local (project_id UInt64, item_type String, date Date CODEC (DoubleDelta, ZSTD(1)), retention_days UInt16, attrs_string Map(String, String), attrs_bool Array(String), attrs_int64 Array(String), attrs_float64 Array(String), key_val_hash UInt64) AS
SELECT
project_id,
'span' AS item_type,
toDate(_sort_timestamp) AS date,
retention_days as retention_days,
mapConcat(attr_str_0, attr_str_1, attr_str_2, attr_str_3, attr_str_4, attr_str_5, attr_str_6, attr_str_7, attr_str_8, attr_str_9, attr_str_10, attr_str_11, attr_str_12, attr_str_13, attr_str_14, attr_str_15, attr_str_16, attr_str_17, attr_str_18, attr_str_19) AS attrs_string, -- `attrs_string` Map(String, String),
array() AS attrs_bool, -- bool
array() AS attrs_int64, -- int64
arrayConcat(mapKeys(attr_num_0), mapKeys(attr_num_1), mapKeys(attr_num_2), mapKeys(attr_num_3), mapKeys(attr_num_4), mapKeys(attr_num_5), mapKeys(attr_num_6), mapKeys(attr_num_7), mapKeys(attr_num_8), mapKeys(attr_num_9), mapKeys(attr_num_10), mapKeys(attr_num_11), mapKeys(attr_num_12), mapKeys(attr_num_13), mapKeys(attr_num_14), mapKeys(attr_num_15), mapKeys(attr_num_16), mapKeys(attr_num_17), mapKeys(attr_num_18), mapKeys(attr_num_19)) AS attrs_float64, -- float
-- a hash of all the attribute key,val pairs of the item in sorted order
-- this lets us deduplicate rows with merges
cityHash64(mapSort(
mapConcat(
mapApply((k, v) -> (k, ''), attr_num_0),
mapApply((k, v) -> (k, ''), attr_num_1),
mapApply((k, v) -> (k, ''), attr_num_2),
mapApply((k, v) -> (k, ''), attr_num_3),
mapApply((k, v) -> (k, ''), attr_num_4),
mapApply((k, v) -> (k, ''), attr_num_5),
mapApply((k, v) -> (k, ''), attr_num_6),
mapApply((k, v) -> (k, ''), attr_num_7),
mapApply((k, v) -> (k, ''), attr_num_8),
mapApply((k, v) -> (k, ''), attr_num_9),
mapApply((k, v) -> (k, ''), attr_num_10),
mapApply((k, v) -> (k, ''), attr_num_11),
mapApply((k, v) -> (k, ''), attr_num_12),
mapApply((k, v) -> (k, ''), attr_num_13),
mapApply((k, v) -> (k, ''), attr_num_14),
mapApply((k, v) -> (k, ''), attr_num_15),
mapApply((k, v) -> (k, ''), attr_num_16),
mapApply((k, v) -> (k, ''), attr_num_17),
mapApply((k, v) -> (k, ''), attr_num_18),
mapApply((k, v) -> (k, ''), attr_num_19),
attr_str_0,
attr_str_1,
attr_str_2,
attr_str_3,
attr_str_4,
attr_str_5,
attr_str_6,
attr_str_7,
attr_str_8,
attr_str_9,
attr_str_10,
attr_str_11,
attr_str_12,
attr_str_13,
attr_str_14,
attr_str_15,
attr_str_16,
attr_str_17,
attr_str_18,
attr_str_19
)
)) AS key_val_hash
FROM eap_spans_2_local
;
-- end forward migration events_analytics_platform : 0023_smart_autocomplete_mv
-- backward migration events_analytics_platform : 0023_smart_autocomplete_mv
Local op: DROP TABLE IF EXISTS eap_trace_item_attrs_mv ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS eap_trace_item_attrs_local ON CLUSTER 'cluster_one_sh' SYNC;
Distributed op: DROP TABLE IF EXISTS eap_trace_item_attrs_dist ON CLUSTER 'cluster_one_sh' SYNC;
-- end backward migration events_analytics_platform : 0023_smart_autocomplete_mv
-- forward migration events_analytics_platform : 0029_remove_smart_autocomplete_experimental
Local op: DROP TABLE IF EXISTS eap_trace_item_attrs_mv ON CLUSTER 'cluster_one_sh' SYNC;
Distributed op: DROP TABLE IF EXISTS eap_trace_item_attrs_dist ON CLUSTER 'cluster_one_sh' SYNC;
-- end forward migration events_analytics_platform : 0029_remove_smart_autocomplete_experimental
-- backward migration events_analytics_platform : 0029_remove_smart_autocomplete_experimental
Distributed op: CREATE TABLE IF NOT EXISTS eap_trace_item_attrs_dist ON CLUSTER 'cluster_one_sh' (project_id UInt64, item_type String, date Date CODEC (DoubleDelta, ZSTD(1)), retention_days UInt16, attrs_string Map(String, String), attrs_bool Array(String), attrs_int64 Array(String), attrs_float64 Array(String), key_val_hash UInt64) ENGINE Distributed(`cluster_one_sh`, default, eap_trace_item_attrs_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS eap_trace_item_attrs_mv ON CLUSTER 'cluster_one_sh' TO eap_trace_item_attrs_local (project_id UInt64, item_type String, date Date CODEC (DoubleDelta, ZSTD(1)), retention_days UInt16, attrs_string Map(String, String), attrs_bool Array(String), attrs_int64 Array(String), attrs_float64 Array(String), key_val_hash UInt64) AS
SELECT
project_id,
'span' AS item_type,
toDate(_sort_timestamp) AS date,
retention_days as retention_days,
mapConcat(attr_str_0, attr_str_1, attr_str_2, attr_str_3, attr_str_4, attr_str_5, attr_str_6, attr_str_7, attr_str_8, attr_str_9, attr_str_10, attr_str_11, attr_str_12, attr_str_13, attr_str_14, attr_str_15, attr_str_16, attr_str_17, attr_str_18, attr_str_19) AS attrs_string, -- `attrs_string` Map(String, String),
array() AS attrs_bool, -- bool
array() AS attrs_int64, -- int64
arrayConcat(mapKeys(attr_num_0), mapKeys(attr_num_1), mapKeys(attr_num_2), mapKeys(attr_num_3), mapKeys(attr_num_4), mapKeys(attr_num_5), mapKeys(attr_num_6), mapKeys(attr_num_7), mapKeys(attr_num_8), mapKeys(attr_num_9), mapKeys(attr_num_10), mapKeys(attr_num_11), mapKeys(attr_num_12), mapKeys(attr_num_13), mapKeys(attr_num_14), mapKeys(attr_num_15), mapKeys(attr_num_16), mapKeys(attr_num_17), mapKeys(attr_num_18), mapKeys(attr_num_19)) AS attrs_float64, -- float
-- a hash of all the attribute key,val pairs of the item in sorted order
-- this lets us deduplicate rows with merges
cityHash64(mapSort(
mapConcat(
mapApply((k, v) -> (k, ''), attr_num_0),
mapApply((k, v) -> (k, ''), attr_num_1),
mapApply((k, v) -> (k, ''), attr_num_2),
mapApply((k, v) -> (k, ''), attr_num_3),
mapApply((k, v) -> (k, ''), attr_num_4),
mapApply((k, v) -> (k, ''), attr_num_5),
mapApply((k, v) -> (k, ''), attr_num_6),
mapApply((k, v) -> (k, ''), attr_num_7),
mapApply((k, v) -> (k, ''), attr_num_8),
mapApply((k, v) -> (k, ''), attr_num_9),
mapApply((k, v) -> (k, ''), attr_num_10),
mapApply((k, v) -> (k, ''), attr_num_11),
mapApply((k, v) -> (k, ''), attr_num_12),
mapApply((k, v) -> (k, ''), attr_num_13),
mapApply((k, v) -> (k, ''), attr_num_14),
mapApply((k, v) -> (k, ''), attr_num_15),
mapApply((k, v) -> (k, ''), attr_num_16),
mapApply((k, v) -> (k, ''), attr_num_17),
mapApply((k, v) -> (k, ''), attr_num_18),
mapApply((k, v) -> (k, ''), attr_num_19),
attr_str_0,
attr_str_1,
attr_str_2,
attr_str_3,
attr_str_4,
attr_str_5,
attr_str_6,
attr_str_7,
attr_str_8,
attr_str_9,
attr_str_10,
attr_str_11,
attr_str_12,
attr_str_13,
attr_str_14,
attr_str_15,
attr_str_16,
attr_str_17,
attr_str_18,
attr_str_19
)
)) AS key_val_hash
FROM eap_spans_2_local
;
-- end backward migration events_analytics_platform : 0029_remove_smart_autocomplete_experimental
-- forward migration events_analytics_platform : 0053_alter_deletes_workload_max_threads
Local op: SELECT 1
-- end forward migration events_analytics_platform : 0053_alter_deletes_workload_max_threads
-- backward migration events_analytics_platform : 0053_alter_deletes_workload_max_threads
Local op:
CREATE OR REPLACE WORKLOAD low_priority_deletes
IN all
SETTINGS
priority = 100,
max_requests = 2;
-- end backward migration events_analytics_platform : 0053_alter_deletes_workload_max_threads
-- forward migration generic_metrics : 0024_gauges_mv
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_gauges_aggregation_mv ON CLUSTER 'cluster_one_sh' TO generic_metric_gauges_aggregated_local (org_id UInt64, project_id UInt64, metric_id UInt64, granularity UInt8, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tags Nested(key UInt64, indexed_value UInt64, raw_value String), use_case_id LowCardinality(String)) AS
SELECT
use_case_id,
org_id,
project_id,
metric_id,
arrayJoin(granularities) as granularity,
tags.key,
tags.indexed_value,
tags.raw_value,
maxState(timestamp) as last_timestamp,
toDateTime(multiIf(granularity=0,10,granularity=1,60,granularity=2,3600,granularity=3,86400,-1) *
intDiv(toUnixTimestamp(timestamp),
multiIf(granularity=0,10,granularity=1,60,granularity=2,3600,granularity=3,86400,-1))) as rounded_timestamp,
least(retention_days,
multiIf(granularity=0,decasecond_retention_days,
granularity=1,min_retention_days,
granularity=2,hr_retention_days,
granularity=3,day_retention_days,
0)) as retention_days,
minState(arrayJoin(gauges_values.min)) as min,
maxState(arrayJoin(gauges_values.max)) as max,
sumState(arrayJoin(gauges_values.sum)) as sum,
sumState(arrayJoin(gauges_values.count)) as count,
argMaxState(arrayJoin(gauges_values.last), timestamp) as last
FROM generic_metric_gauges_raw_local
WHERE materialization_version = 2
AND metric_type = 'gauge'
GROUP BY
use_case_id,
org_id,
project_id,
metric_id,
tags.key,
tags.indexed_value,
tags.raw_value,
timestamp,
granularity,
retention_days
;
-- end forward migration generic_metrics : 0024_gauges_mv
-- backward migration generic_metrics : 0024_gauges_mv
Local op: DROP TABLE IF EXISTS generic_metric_gauges_aggregation_mv ON CLUSTER 'cluster_one_sh' SYNC;
-- end backward migration generic_metrics : 0024_gauges_mv
-- forward migration generic_metrics : 0032_counters_meta_table_mv
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_counters_meta_aggregation_mv ON CLUSTER 'cluster_one_sh' TO generic_metric_counters_meta_aggregated_local (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tag_values AggregateFunction(groupUniqArray, String), count AggregateFunction(sum, Float64)) AS
SELECT
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
toStartOfWeek(timestamp) as timestamp,
retention_days,
groupUniqArrayState(tag_value) as `tag_values`,
sumState(count_value) as count
FROM generic_metric_counters_raw_local
ARRAY JOIN
tags.key AS tag_key, tags.raw_value AS tag_value
WHERE record_meta = 1
GROUP BY
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
timestamp,
retention_days
;
-- end forward migration generic_metrics : 0032_counters_meta_table_mv
-- backward migration generic_metrics : 0032_counters_meta_table_mv
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_aggregation_mv ON CLUSTER 'cluster_one_sh' SYNC;
-- end backward migration generic_metrics : 0032_counters_meta_table_mv
-- forward migration generic_metrics : 0040_remove_counters_meta_tables
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_value_aggregation_mv ON CLUSTER 'cluster_one_sh' SYNC;
Distributed op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_value_aggregated_dist ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_value_aggregated_local ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_aggregation_mv ON CLUSTER 'cluster_one_sh' SYNC;
Distributed op: DROP TABLE IF EXISTS generic_metric_counters_meta_aggregated_dist ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_aggregated_local ON CLUSTER 'cluster_one_sh' SYNC;
-- end forward migration generic_metrics : 0040_remove_counters_meta_tables
-- backward migration generic_metrics : 0040_remove_counters_meta_tables
Local op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_tag_value_aggregated_local ON CLUSTER 'cluster_one_sh' (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_counters/{shard}/default/generic_metric_counters_meta_tag_value_aggregated_local', '{replica}') PRIMARY KEY (project_id, metric_id, tag_key, tag_value, timestamp) ORDER BY (project_id, metric_id, tag_key, tag_value, timestamp) PARTITION BY (retention_days, toMonday(timestamp)) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=2048;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_tag_value_aggregated_dist ON CLUSTER 'cluster_one_sh' (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_counters_meta_tag_value_aggregated_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_counters_meta_tag_value_aggregation_mv ON CLUSTER 'cluster_one_sh' TO generic_metric_counters_meta_tag_value_aggregated_local (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS
SELECT
project_id,
metric_id,
tag_key,
tag_value,
toStartOfWeek(timestamp) as timestamp,
retention_days,
sumState(count_value) as count
FROM generic_metric_counters_raw_local
ARRAY JOIN
tags.key AS tag_key, tags.raw_value AS tag_value
WHERE record_meta = 1
GROUP BY
project_id,
metric_id,
tag_key,
tag_value,
timestamp,
retention_days
;
Local op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_aggregated_local ON CLUSTER 'cluster_one_sh' (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tag_values AggregateFunction(groupUniqArray, String), count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_counters/{shard}/default/generic_metric_counters_meta_aggregated_local', '{replica}') PRIMARY KEY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) ORDER BY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) PARTITION BY (retention_days, toMonday(timestamp)) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=2048;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_aggregated_dist ON CLUSTER 'cluster_one_sh' (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tag_values AggregateFunction(groupUniqArray, String), count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_counters_meta_aggregated_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_counters_meta_aggregation_mv ON CLUSTER 'cluster_one_sh' TO generic_metric_counters_meta_aggregated_local (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tag_values AggregateFunction(groupUniqArray, String), count AggregateFunction(sum, Float64)) AS
SELECT
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
toStartOfWeek(timestamp) as timestamp,
retention_days,
groupUniqArrayState(tag_value) as `tag_values`,
sumState(count_value) as count
FROM generic_metric_counters_raw_local
ARRAY JOIN
tags.key AS tag_key, tags.raw_value AS tag_value
WHERE record_meta = 1
GROUP BY
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
timestamp,
retention_days
;
-- end backward migration generic_metrics : 0040_remove_counters_meta_tables
-- forward migration generic_metrics : 0041_adjust_partitioning_meta_tables
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_values_mv ON CLUSTER 'cluster_one_sh' SYNC;
Distributed op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_values_dist ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_values_local ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_mv ON CLUSTER 'cluster_one_sh' SYNC;
Distributed op: DROP TABLE IF EXISTS generic_metric_counters_meta_dist ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_local ON CLUSTER 'cluster_one_sh' SYNC;
Local op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_v2_local ON CLUSTER 'cluster_one_sh' (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_counters/{shard}/default/generic_metric_counters_meta_v2_local', '{replica}') PRIMARY KEY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) ORDER BY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) PARTITION BY toMonday(timestamp) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192, ttl_only_drop_parts=0;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_v2_dist ON CLUSTER 'cluster_one_sh' (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_counters_meta_v2_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_counters_meta_v2_mv ON CLUSTER 'cluster_one_sh' TO generic_metric_counters_meta_v2_local (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS
SELECT
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
toMonday(timestamp) as timestamp,
retention_days,
sumState(count_value) as count
FROM generic_metric_counters_raw_local
ARRAY JOIN tags.key AS tag_key
WHERE record_meta = 1
GROUP BY
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
timestamp,
retention_days
;
Local op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_tag_values_v2_local ON CLUSTER 'cluster_one_sh' (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_counters/{shard}/default/generic_metric_counters_meta_tag_values_v2_local', '{replica}') PRIMARY KEY (project_id, metric_id, tag_key, tag_value, timestamp) ORDER BY (project_id, metric_id, tag_key, tag_value, timestamp) PARTITION BY toMonday(timestamp) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192, ttl_only_drop_parts=0;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_tag_values_v2_dist ON CLUSTER 'cluster_one_sh' (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_counters_meta_tag_values_v2_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_counters_meta_tag_values_v2_mv ON CLUSTER 'cluster_one_sh' TO generic_metric_counters_meta_tag_values_v2_local (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS
SELECT
project_id,
metric_id,
tag_key,
tag_value,
toMonday(timestamp) as timestamp,
retention_days,
sumState(count_value) as count
FROM generic_metric_counters_raw_local
ARRAY JOIN
tags.key AS tag_key, tags.raw_value AS tag_value
WHERE record_meta = 1
GROUP BY
project_id,
metric_id,
tag_key,
tag_value,
timestamp,
retention_days
;
-- end forward migration generic_metrics : 0041_adjust_partitioning_meta_tables
-- backward migration generic_metrics : 0041_adjust_partitioning_meta_tables
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_values_v2_mv ON CLUSTER 'cluster_one_sh' SYNC;
Distributed op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_values_v2_dist ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_tag_values_v2_local ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_v2_mv ON CLUSTER 'cluster_one_sh' SYNC;
Distributed op: DROP TABLE IF EXISTS generic_metric_counters_meta_v2_dist ON CLUSTER 'cluster_one_sh' SYNC;
Local op: DROP TABLE IF EXISTS generic_metric_counters_meta_v2_local ON CLUSTER 'cluster_one_sh' SYNC;
Local op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_local ON CLUSTER 'cluster_one_sh' (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_counters/{shard}/default/generic_metric_counters_meta_local', '{replica}') PRIMARY KEY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) ORDER BY (org_id, project_id, use_case_id, metric_id, tag_key, timestamp) PARTITION BY (retention_days, toMonday(timestamp)) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_dist ON CLUSTER 'cluster_one_sh' (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_counters_meta_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_counters_meta_mv ON CLUSTER 'cluster_one_sh' TO generic_metric_counters_meta_local (org_id UInt64, project_id UInt64, use_case_id LowCardinality(String), metric_id UInt64, tag_key UInt64, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS
SELECT
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
toStartOfWeek(timestamp) as timestamp,
retention_days,
sumState(count_value) as count
FROM generic_metric_counters_raw_local
ARRAY JOIN tags.key AS tag_key
WHERE record_meta = 1
GROUP BY
org_id,
project_id,
use_case_id,
metric_id,
tag_key,
timestamp,
retention_days
;
Local op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_tag_values_local ON CLUSTER 'cluster_one_sh' (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE ReplicatedAggregatingMergeTree('/clickhouse/tables/generic_metrics_counters/{shard}/default/generic_metric_counters_meta_tag_values_local', '{replica}') PRIMARY KEY (project_id, metric_id, tag_key, tag_value, timestamp) ORDER BY (project_id, metric_id, tag_key, tag_value, timestamp) PARTITION BY (retention_days, toMonday(timestamp)) TTL timestamp + toIntervalDay(retention_days) SETTINGS index_granularity=8192;
Distributed op: CREATE TABLE IF NOT EXISTS generic_metric_counters_meta_tag_values_dist ON CLUSTER 'cluster_one_sh' (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) ENGINE Distributed(`cluster_one_sh`, default, generic_metric_counters_meta_tag_values_local);
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_counters_meta_tag_values_mv ON CLUSTER 'cluster_one_sh' TO generic_metric_counters_meta_tag_values_local (project_id UInt64, metric_id UInt64, tag_key UInt64, tag_value String, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, count AggregateFunction(sum, Float64)) AS
SELECT
project_id,
metric_id,
tag_key,
tag_value,
toStartOfWeek(timestamp) as timestamp,
retention_days,
sumState(count_value) as count
FROM generic_metric_counters_raw_local
ARRAY JOIN
tags.key AS tag_key, tags.raw_value AS tag_value
WHERE record_meta = 1
GROUP BY
project_id,
metric_id,
tag_key,
tag_value,
timestamp,
retention_days
;
-- end backward migration generic_metrics : 0041_adjust_partitioning_meta_tables
-- forward migration generic_metrics : 0057_gauges_mv3
Local op: CREATE MATERIALIZED VIEW IF NOT EXISTS generic_metric_gauges_aggregation_mv3 ON CLUSTER 'cluster_one_sh' TO generic_metric_gauges_aggregated_local (org_id UInt64, project_id UInt64, metric_id UInt64, granularity UInt8, timestamp DateTime CODEC (DoubleDelta), retention_days UInt16, tags Nested(key UInt64, indexed_value UInt64, raw_value String), use_case_id LowCardinality(String)) AS
SELECT
use_case_id,
org_id,
project_id,
metric_id,
arrayJoin(granularities) as granularity,
tags.key,
tags.indexed_value,
tags.raw_value,
maxState(timestamp) as last_timestamp,
toDateTime(multiIf(granularity=0,10,granularity=1,60,granularity=2,3600,granularity=3,86400,-1) *
intDiv(toUnixTimestamp(timestamp),
multiIf(granularity=0,10,granularity=1,60,granularity=2,3600,granularity=3,86400,-1))) as rounded_timestamp,
least(retention_days,
multiIf(granularity=0,decasecond_retention_days,
granularity=1,min_retention_days,
granularity=2,hr_retention_days,
granularity=3,day_retention_days,
0)) as retention_days,
minState(arrayJoin(gauges_values.min)) as min,
maxState(arrayJoin(gauges_values.max)) as max,
sumState(arrayJoin(gauges_values.sum)) as sum,
sumState(sampling_weight * arrayJoin(gauges_values.sum)) AS sum_weighted,
sumState(arrayJoin(gauges_values.count)) as count,
sumState(sampling_weight * arrayJoin(gauges_values.count)) AS count_weighted,
argMaxState(arrayJoin(gauges_values.last), timestamp) as last
FROM generic_metric_gauges_raw_local
WHERE materialization_version = 3
AND metric_type = 'gauge'
GROUP BY
use_case_id,
org_id,
project_id,
metric_id,
tags.key,
tags.indexed_value,
tags.raw_value,
timestamp,
granularity,
retention_days
;
-- end forward migration generic_metrics : 0057_gauges_mv3
-- backward migration generic_metrics : 0057_gauges_mv3
Local op: DROP TABLE IF EXISTS generic_metric_gauges_aggregation_mv3 ON CLUSTER 'cluster_one_sh' SYNC;
-- end backward migration generic_metrics : 0057_gauges_mv3
-- forward migration transactions : 0010_transactions_nullable_trace_id
Local op: ALTER TABLE transactions_local ON CLUSTER 'cluster_one_sh' MODIFY COLUMN trace_id Nullable(UUID);
Distributed op: ALTER TABLE transactions_dist ON CLUSTER 'cluster_one_sh' MODIFY COLUMN trace_id Nullable(UUID);
-- end forward migration transactions : 0010_transactions_nullable_trace_id
-- backward migration transactions : 0010_transactions_nullable_trace_id
Distributed op: ALTER TABLE transactions_dist ON CLUSTER 'cluster_one_sh' MODIFY COLUMN trace_id UUID DEFAULT toUUID('00000000-0000-0000-0000-000000000000');
Local op: ALTER TABLE transactions_local ON CLUSTER 'cluster_one_sh' MODIFY COLUMN trace_id UUID DEFAULT toUUID('00000000-0000-0000-0000-000000000000');
-- end backward migration transactions : 0010_transactions_nullable_trace_id
The max_threads workload setting was renamed to max_concurrent_threads in ClickHouse 25.8. Detect the version and use the appropriate setting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/BRUWltJoPKJSLEGANsQodgovxovBnspJrsY37ttR4xs
snuba/snuba_migrations/events_analytics_platform/0053_alter_deletes_workload_max_threads.py
Use a custom SqlOperation subclass that detects the ClickHouse version at execution time (not during forwards_ops) to pick the correct thread setting name. This avoids connecting to ClickHouse during validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/WYWZ-FbLeQudZ8bq4os7KOj1pABHKKn0u5E7Ty6K-vQ
Cursor Bugbot has reviewed your changes and found 1 potential issue.
storage_set=StorageSetKey.EVENTS_ANALYTICS_PLATFORM,
statement="SELECT 1",
target=OperationTarget.LOCAL,
)
Dry-run outputs placeholder "SELECT 1" instead of actual SQL
Medium Severity
_AlterWorkloadOp passes statement="SELECT 1" to the RunSql parent __init__. Because RunSql stores the statement via name-mangled self.__statement (becoming _RunSql__statement), the inherited format_sql() method always returns "SELECT 1". The migration framework's dry-run mode calls op.format_sql() to display what SQL will be executed, so operators previewing this migration see "SELECT 1" instead of the actual CREATE OR REPLACE WORKLOAD statement. The SqlOperation.__eq__ method also relies on format_sql(), making equality checks unreliable.
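The name-mangling pitfall the bot describes is easy to reproduce. `RunSqlLike` and the two op classes below are hypothetical stand-ins for snuba's `RunSql` and `_AlterWorkloadOp`, not the real code; they show why the placeholder leaks out of dry-run and how overriding `format_sql()` fixes it:

```python
# A double-underscore attribute in a parent class is name-mangled to
# _ParentName__attr, so a subclass cannot replace it by assigning
# self.__statement. The subclass must override format_sql() instead.

class RunSqlLike:
    """Stand-in for a RunSql-style operation that stores its SQL privately."""

    def __init__(self, statement: str) -> None:
        # Mangled to self._RunSqlLike__statement.
        self.__statement = statement

    def format_sql(self) -> str:
        return self.__statement


class AlterWorkloadOpBroken(RunSqlLike):
    """Passes a placeholder up; dry-run output shows the placeholder."""

    def __init__(self) -> None:
        super().__init__("SELECT 1")


class AlterWorkloadOpFixed(RunSqlLike):
    """Overrides format_sql() so dry runs show the real statement."""

    def __init__(self) -> None:
        super().__init__("SELECT 1")  # placeholder, never displayed

    def format_sql(self) -> str:
        # Build the real statement at formatting/execution time.
        return (
            "CREATE OR REPLACE WORKLOAD low_priority_deletes "
            "IN all SETTINGS priority = 100, max_requests = 2"
        )


print(AlterWorkloadOpBroken().format_sql())  # SELECT 1
print(AlterWorkloadOpFixed().format_sql())
```

Since `SqlOperation.__eq__` reportedly compares `format_sql()` output, the override also restores meaningful equality checks between operations.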
The backwards_ops recreated old tables but referenced v2 table names (which were just dropped) for distributed table local references and materialized view destinations. Use the correct old_* table names. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Agent transcript: https://claudescope.sentry.dev/share/buphKHEhV6JuKr-X86kHUKK2tNnQWAIR2lkORRVSjOM
CH 25.8 enforces that MV SELECT column names must exactly match destination table column names. Fix mismatched aliases:
- 0032/0040: `as count` → `as value` (dest column is `value`)
- 0024/0057: `as rounded_timestamp` → `as timestamp` (dest column is `timestamp`)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Agent transcript: https://claudescope.sentry.dev/share/NlntCeHhhZx1EWO3_EmM2bf82PKOZOKefEGyq318Yxg
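As a toy illustration of the name-matching rule in this commit (a hypothetical checker, not Snuba code; ClickHouse performs this validation server-side):

```python
# Toy model of the CH 25.8 rule: the output column names of a materialized
# view's SELECT must equal the destination table's column names. This only
# handles trailing "AS alias" clauses, which is enough to show why a bare
# 'span' literal fails against a dest column named item_type.
import re


def select_output_names(select_exprs: list[str]) -> list[str]:
    names = []
    for expr in select_exprs:
        match = re.search(r"\s+as\s+(\w+)\s*$", expr, re.IGNORECASE)
        # Without an alias, ClickHouse derives the name from the expression
        # text itself (e.g. "'span'"), which cannot match a real column name.
        names.append(match.group(1) if match else expr.strip())
    return names


def mv_columns_match(select_exprs: list[str], dest_columns: list[str]) -> bool:
    return select_output_names(select_exprs) == dest_columns


# Bare literal: rejected under CH 25.8-style name matching.
assert not mv_columns_match(["'span'", "organization_id"],
                            ["item_type", "organization_id"])
# With the explicit alias added in migrations 0023/0029: accepted.
assert mv_columns_match(["'span' AS item_type", "organization_id"],
                        ["item_type", "organization_id"])
```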
CH 25.8 no longer allows referencing inline aliases defined inside aggregate functions from other SELECT expressions. Replace the `timestamp AS raw_timestamp` pattern inside maxState() with direct `timestamp` references throughout the query. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Agent transcript: https://claudescope.sentry.dev/share/DZpD0DJ5YzHLKJQ4rpwYNejio0bZSbJUbD6GA2kR-3c
The gauges aggregated table (created in migration 0022) uses `rounded_timestamp` as the column name, not `timestamp`. The MV SELECT alias must match the destination table column name for CH 25.8. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Agent transcript: https://claudescope.sentry.dev/share/0p8LnISRmrGJ5Orc9ru-PfPrrqWzRbepbtEesWJVDGE
…h 0031 The destination table (created in 0031) uses `count` as the column name, not `value`. Migration 0040's backwards_ops meta_table_columns incorrectly defined it as `value`, causing a mismatch when the table is recreated during reverse. Fix both the column definition and keep the MV SELECT alias as `as count`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Agent transcript: https://claudescope.sentry.dev/share/GuXZo5ObeYaVJBCuFuDJ7bTcgLLs1XquvclHsWifcj0
CH 25.8 requires an explicit DEFAULT expression when converting a column from Nullable to non-Nullable via ALTER MODIFY COLUMN (to handle existing NULL values). Add defaults to backwards_ops in:
- transactions/0010: trace_id UUID DEFAULT '00000000-...'
- events/0012: level LowCardinality(String) DEFAULT ''
Also fix 0040 meta_table_columns to use `count` (matching the actual table created in 0031) and revert the incorrect count→value alias change in 0032/0040 MV queries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Agent transcript: https://claudescope.sentry.dev/share/meTvnRxFd6ykY44mFxncT96l7TDjl1b0ZXXVZSnNtm4
      tags.key,
      tags.indexed_value,
      tags.raw_value,
-     maxState(timestamp as raw_timestamp) as last_timestamp,
+     maxState(timestamp) as last_timestamp,
      toDateTime(multiIf(granularity=0,10,granularity=1,60,granularity=2,3600,granularity=3,86400,-1) *
-     intDiv(toUnixTimestamp(raw_timestamp),
+     intDiv(toUnixTimestamp(timestamp),
      multiIf(granularity=0,10,granularity=1,60,granularity=2,3600,granularity=3,86400,-1))) as rounded_timestamp,
      least(retention_days,
      multiIf(granularity=0,decasecond_retention_days,
Bug: In migration 0057_gauges_mv3.py, the dest_table_columns defines a timestamp column, but the view's query aliases it as rounded_timestamp, causing a mismatch that will fail the migration.
Severity: HIGH
Suggested Fix
Update the dest_table_columns definition in snuba/snuba_migrations/generic_metrics/0057_gauges_mv3.py. Change the Column("timestamp", ...) to Column("rounded_timestamp", ...) to match the alias in the SELECT query and the schema of the destination table.
Location: snuba/snuba_migrations/generic_metrics/0057_gauges_mv3.py#L52-L60
Potential issue: In the Snuba migration `0057_gauges_mv3.py`, the materialized view is
configured with a `dest_table_columns` list that specifies a column named `timestamp`.
However, the `SELECT` query for this view aliases the corresponding computed time column
as `rounded_timestamp`. The actual destination table,
`generic_metric_gauges_aggregated_local` (created in migration 0022), expects a column
named `rounded_timestamp`. This mismatch between the view's column definition
(`timestamp`) and the alias used in the query (`rounded_timestamp`) will cause the
migration to fail on ClickHouse 25.8, which enforces strict column name matching between
a materialized view's `SELECT` statement and its destination table.
CH can't find a supertype between UUID and String types, so the DEFAULT must be a UUID-typed expression, not a string literal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Agent transcript: https://claudescope.sentry.dev/share/qvrPSI8VElO4sM5JtOJaN8CITVeUqFRjne67jy2s6E4
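Concretely, the fix swaps the string-literal DEFAULT for a typed expression. A sketch of the two statements, written as the Python strings the migration would emit (table and column names taken from the transactions/0010 discussion above):

```python
# The string-literal DEFAULT is rejected by CH 25.8: it cannot find a
# supertype between the UUID column type and the String literal.
rejected = (
    "ALTER TABLE transactions_local MODIFY COLUMN trace_id UUID "
    "DEFAULT '00000000-0000-0000-0000-000000000000'"
)

# Wrapping the literal in toUUID() makes the DEFAULT a UUID-typed
# expression, which CH accepts.
accepted = (
    "ALTER TABLE transactions_local MODIFY COLUMN trace_id UUID "
    "DEFAULT toUUID('00000000-0000-0000-0000-000000000000')"
)
```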


Summary
Upgrade ClickHouse version bounds and add CH 25.8 to the CI test matrix, along with all necessary migration fixes for CH 25.8 compatibility.
Version changes:
- Min: 25.3.6.10034 (matches sentry devservices)
- Max: 25.8.16.10001
- clickhouse-versions matrix: added 25.8.16.10001.altinitystable

CH 25.8 migration compatibility fixes
CH 25.8 introduces several stricter validation rules that required fixes across multiple migrations:
1. Strict MV column name matching

CH 25.8 enforces that SELECT column names in materialized views must exactly match destination table column names. Previously, ClickHouse would match columns by position; now it matches by name.

Fixed migrations:
- generic_metrics/0040 backwards_ops: meta_table_columns defined column as value instead of count, not matching the original table created in migration 0031
- generic_metrics/0041 backwards_ops: referenced just-dropped v2 table names instead of old table names
- events_analytics_platform/0023, 0029: added AS item_type alias for bare 'span' literal

2. Inline alias scoping in aggregate functions
CH 25.8 no longer allows referencing inline aliases defined inside aggregate functions from other SELECT expressions. For example, maxState(timestamp AS raw_timestamp) would define raw_timestamp, but other expressions like toUnixTimestamp(raw_timestamp) can no longer see it.

Fixed migrations:
- generic_metrics/0024, 0057: replaced the timestamp AS raw_timestamp pattern inside maxState() with direct timestamp references throughout gauges MV queries

3. Nullable→non-nullable ALTER MODIFY COLUMN requires DEFAULT

CH 25.8 requires an explicit DEFAULT expression when converting a column from Nullable(T) to T via ALTER MODIFY COLUMN. In older versions, NULL values were silently replaced with the type's zero-value (empty string, zero UUID, etc.); CH 25.8 now requires the user to be explicit about what NULL values should become. This prevents silent data loss.

Fixed migrations:
- transactions/0010 backwards_ops: trace_id UUID → added DEFAULT toUUID('00000000-...')
- events/0012 backwards_ops: level String → added DEFAULT ''

4. Workload setting rename
CH 25.8 renamed max_threads to max_concurrent_threads in workload definitions.

Fixed migrations:
- events_analytics_platform/0053: uses a deferred version check pattern (_AlterWorkloadOp subclass) to select the correct setting name at migration execution time, avoiding CH connection during validation tests

Test plan
- test_run_and_reverse_all passes
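The deferred setting-name selection described in item 4 can be sketched as follows. The helper names and the "deletes" workload name are illustrative assumptions, not Snuba's actual code:

```python
# Sketch of a version-gated setting name: CH 25.8 renamed max_threads to
# max_concurrent_threads in workload definitions, so the DDL must be built
# after the server version is known (i.e. at execution time, not validation).

def thread_setting_name(server_version: tuple[int, int]) -> str:
    # Tuple comparison: (25, 8) and later use the new name.
    return "max_concurrent_threads" if server_version >= (25, 8) else "max_threads"


def workload_sql(server_version: tuple[int, int], threads: int) -> str:
    # Workload name "deletes" is assumed from the migration filename
    # (0053_alter_deletes_workload_max_threads).
    setting = thread_setting_name(server_version)
    return f"CREATE OR REPLACE WORKLOAD deletes SETTINGS {setting} = {threads}"


print(workload_sql((25, 3), 10))  # uses max_threads
print(workload_sql((25, 8), 10))  # uses max_concurrent_threads
```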