[feat][pip] PIP-475: Regular-to-Scalable Topic Migration #25721
merlimat wants to merge 2 commits into apache:master
Sub-PIP of PIP-460 defining the migration path from regular (partitioned or non-partitioned) topics to scalable topics, including the V5 SDK resolution rule, lookup-session-based discovery with synthetic layouts and special segments, the metadata-flip migration command, and the broker-side guard that prevents v4 clients from writing to a topic that has already been migrated.
The number 474 was already taken; this PIP is now PIP-475. Also adds a paragraph in Motivation noting that scalable topics are intended to fully replace partitioned/non-partitioned topics over time, and that this migration tooling is what enables incremental adoption.
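As a rough illustration of the metadata flip and the broker-side guard, the sketch below models a topic's metadata as a single record that is swapped with one compare-and-set. Every name here (`MigrationSketch`, `flipToScalable`, `lookupV4`, `TopicMigratedException`) is hypothetical, and an in-memory `AtomicReference` stands in for the real metadata store; it is not the implementation proposed by the PIP.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical in-memory model of a topic's metadata record; not Pulsar code. */
final class MigrationSketch {
    enum Kind { REGULAR, SCALABLE }

    /** Immutable metadata value; each update replaces the whole record. */
    static final class TopicMeta {
        final Kind kind;
        final int partitions;
        final String migratedFrom; // informational only, null until migrated

        TopicMeta(Kind kind, int partitions, String migratedFrom) {
            this.kind = kind;
            this.partitions = partitions;
            this.migratedFrom = migratedFrom;
        }
    }

    /** Stand-in for the error a v4 lookup would receive after migration. */
    static final class TopicMigratedException extends RuntimeException {
        TopicMigratedException(String message) {
            super(message);
        }
    }

    private final AtomicReference<TopicMeta> meta;

    MigrationSketch(int partitions) {
        this.meta = new AtomicReference<>(new TopicMeta(Kind.REGULAR, partitions, null));
    }

    /**
     * Flips the record from REGULAR to SCALABLE with a single compare-and-set.
     * Returns false if the topic is already scalable or a concurrent update won.
     * No data is copied; only the metadata record changes.
     */
    boolean flipToScalable(String regularName) {
        TopicMeta current = meta.get();
        if (current.kind == Kind.SCALABLE) {
            return false; // once scalable, always scalable
        }
        TopicMeta next = new TopicMeta(Kind.SCALABLE, current.partitions, regularName);
        return meta.compareAndSet(current, next);
    }

    /** Guard for legacy (v4) lookups: rejects them once the topic has been migrated. */
    void lookupV4() {
        TopicMeta current = meta.get();
        if (current.kind == Kind.SCALABLE) {
            throw new TopicMigratedException("topic migrated from " + current.migratedFrom);
        }
    }

    TopicMeta current() {
        return meta.get();
    }
}
```

Because the flip is one atomic swap, a second migration attempt or a stale v4 lookup sees either the old record or the new one, never a partial state.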
lhotari left a comment:
LGTM, really nice migration path from v4 topics -> v5 scalable topics!
A Pulsar topic name encodes its domain in a URI scheme:

- `persistent://t/n/x`: durable topic backed by a managed ledger.
- `non-persistent://t/n/x`: in-memory topic, no durability.
- `topic://t/n/x`: scalable topic introduced by PIP-460. Backed by a DAG of segments; each segment is itself a `segment://...` topic with its own managed ledger.
One detail that we missed in Pulsar 4.2.0 is the migration from v1 topics to v2 topics. Since users might be upgrading directly from 4.0.x to 5.0.x, I'd assume that v1 topics would need to be handled in some way.
The znode paths differ between v2 and v1 topics:
Managed ledger
• v2: /managed-ledgers/tenant/ns/persistent/topic
• v1: /managed-ledgers/tenant/cluster/ns/persistent/topic
Partitioned topic metadata
• v2: /admin/partitioned-topics/tenant/ns/persistent/topic
• v1: /admin/partitioned-topics/tenant/cluster/ns/persistent/topic
Namespace policies
• v2: /admin/policies/tenant/ns
• v1: /admin/policies/tenant/cluster/ns
A common reason why v1 topics exist in 4.0.x production deployments is that adding a slash to a topic name silently makes it a v1 topic.
In 4.1.0, the configuration setting allowAutoTopicCreationWithLegacyNamingScheme was added to prevent creating v1 topics accidentally:
#23620
How are we going to address the possible existence of v1 topics in a 4.0.x -> 5.0.x migration?
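To make the path layouts and the extra-slash pitfall in the comment above concrete, here is a minimal sketch that classifies a fully qualified topic name as v1 or v2 by counting its name parts and builds the corresponding managed-ledger znode path. `LegacyTopicPaths` and its methods are hypothetical helpers written for this illustration, not Pulsar's own `TopicName` class, and the sketch assumes no encoding of the topic's local name.

```java
/** Hypothetical helper for illustration; not part of Pulsar. */
final class LegacyTopicPaths {

    /** Splits the part after "://" into at most four pieces. */
    private static String[] parts(String topic) {
        int sep = topic.indexOf("://");
        if (sep < 0) {
            throw new IllegalArgumentException("expected a fully qualified name: " + topic);
        }
        // Limit 4: anything after the third slash stays in the last piece.
        String[] p = topic.substring(sep + 3).split("/", 4);
        if (p.length < 3) {
            throw new IllegalArgumentException("too few name parts: " + topic);
        }
        return p;
    }

    /** Four parts (tenant/cluster/namespace/topic) means the legacy v1 format. */
    static boolean isV1(String topic) {
        return parts(topic).length == 4;
    }

    /** Builds the managed-ledger znode path following the v1 and v2 layouts listed above. */
    static String managedLedgerPath(String topic) {
        String[] p = parts(topic);
        String domain = topic.substring(0, topic.indexOf("://"));
        if (p.length == 4) {
            // v1: /managed-ledgers/tenant/cluster/ns/persistent/topic
            return "/managed-ledgers/" + p[0] + "/" + p[1] + "/" + p[2] + "/" + domain + "/" + p[3];
        }
        // v2: /managed-ledgers/tenant/ns/persistent/topic
        return "/managed-ledgers/" + p[0] + "/" + p[1] + "/" + domain + "/" + p[2];
    }
}
```

Under this sketch, `persistent://tenant/ns/orders/eu` is read as tenant `tenant`, cluster `ns`, namespace `orders`, topic `eu`, which is how one extra slash turns an intended v2 name into a v1 topic.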
Summary
Adds PIP-475, a sub-PIP of PIP-460: Scalable Topics, describing the migration path from regular (partitioned or non-partitioned) topics to scalable topics. The Motivation also notes the longer-term direction: scalable topics are intended to fully replace partitioned/non-partitioned topics over time, and this migration tooling is what makes that transition incremental.
The document covers:
- [...] `topic://`, `persistent://`, or short name); the broker responds with either a real DAG layout or a synthetic layout that wraps the existing partitions as special segments. No probe call, no client-side TTL, no separate v4-wrapper code path.
- [...] `pulsar-admin scalable-topics migrate-to-scalable`) that flips a topic from regular to scalable in a single metadata-store CAS, with no data copy and no cursor migration. Each old partition becomes a sealed parent segment in the new DAG; new active children with range-based routing are created alongside, and the existing subscription-controller drain-before-assign protocol preserves per-key ordering across the migration boundary.
- [...] `TopicMigrated` to v4 lookups for the equivalent `persistent://` name, making "once scalable, always scalable" robust against stale v4 clients and tooling.
- [...] `ScalableTopics.migrateToScalable(...)` admin client method, `pulsar-admin scalable-topics migrate-to-scalable` CLI, broker config kill switch, and the `migratedFrom` informational field on `ScalableTopicMetadata`.

This PIP only changes documentation under `pip/`. Implementation will land in subsequent PRs against the parent PIP-460 tracking issue.

Test plan