[feat][pip] PIP-475: Regular-to-Scalable Topic Migration #25721

Open
merlimat wants to merge 2 commits into apache:master from merlimat:pip-474

@merlimat merlimat commented May 8, 2026

Summary

Adds PIP-475, a sub-PIP of PIP-460: Scalable Topics, describing the migration path from regular (partitioned or non-partitioned) topics to scalable topics. The Motivation also notes the longer-term direction: scalable topics are intended to fully replace partitioned/non-partitioned topics over time, and this migration tooling is what makes that transition incremental.

The document covers:

  • V5 SDK resolution rule: the V5 SDK opens a single scalable-topic lookup session for any input form (`topic://`, `persistent://`, or short name); the broker responds with either a real DAG layout or a synthetic layout that wraps the existing partitions as special segments. No probe call, no client-side TTL, no separate v4-wrapper code path.
  • Migration protocol: an admin command (`pulsar-admin scalable-topics migrate-to-scalable`) that flips a topic from regular to scalable in a single metadata-store CAS, with no data copy and no cursor migration. Each old partition becomes a sealed parent segment in the new DAG; new active children with range-based routing are created alongside, and the existing subscription-controller drain-before-assign protocol preserves per-key ordering across the migration boundary.
  • Migration safety: the migration command is API-enforced: it rejects with HTTP 409 if any v4 producer/consumer connections are still attached, ensuring all clients have been upgraded before the metadata flip.
  • Broker-side v4 guard: once a topic is scalable, the broker returns `TopicMigrated` to v4 lookups for the equivalent `persistent://` name, making "once scalable, always scalable" robust against stale v4 clients and tooling.
  • Public-facing changes: new admin REST endpoint, `ScalableTopics.migrateToScalable(...)` admin client method, `pulsar-admin scalable-topics migrate-to-scalable` CLI, broker config kill switch, and the `migratedFrom` informational field on `ScalableTopicMetadata`.
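
As a rough illustration of the migration protocol and safety check summarized in the list above, here is a minimal Python sketch. It is not the PIP's implementation: the `MetadataStore` class, the 16-bit hash space, the segment naming, the number of children, and the shape of the stored layout are all assumptions made for the example. Only the overall flow comes from the description: reject with 409 while v4 connections are attached, turn each old partition into a sealed parent, create active children with range-based routing, and commit everything in one compare-and-set.

```python
from dataclasses import dataclass, field
from typing import Optional

HASH_SPACE = 1 << 16  # assumed size of the routing hash space (illustrative)


class MigrationConflict(Exception):
    """Stands in for the HTTP 409 response described in the PR summary."""
    status = 409


@dataclass
class Segment:
    name: str
    sealed: bool
    hash_range: Optional[tuple] = None  # (start, end), end exclusive
    parents: list = field(default_factory=list)


@dataclass
class MetadataStore:
    """Toy versioned key-value store exposing a compare-and-set (hypothetical)."""
    data: dict = field(default_factory=dict)

    def get(self, key):
        # Returns (value, version); version -1 means "key does not exist".
        return self.data.get(key, (None, -1))

    def compare_and_set(self, key, value, expected_version):
        _, version = self.get(key)
        if version != expected_version:
            return False
        self.data[key] = (value, version + 1)
        return True


def migrate_to_scalable(store, topic, num_partitions, v4_connections, num_children=2):
    # Safety check: refuse while any v4 producer/consumer is still attached.
    if v4_connections > 0:
        raise MigrationConflict(f"{v4_connections} v4 connection(s) still attached")

    _, version = store.get(topic)

    # Each old partition (or the single non-partitioned topic) becomes a
    # sealed parent segment; no data is copied.
    if num_partitions > 0:
        parent_names = [f"{topic}-partition-{i}" for i in range(num_partitions)]
    else:
        parent_names = [topic]
    parents = [Segment(name=n, sealed=True) for n in parent_names]

    # New active children split the hash space into contiguous ranges.
    step = HASH_SPACE // num_children
    children = []
    for i in range(num_children):
        end = HASH_SPACE if i == num_children - 1 else (i + 1) * step
        children.append(Segment(name=f"{topic}/segment-{i}", sealed=False,
                                hash_range=(i * step, end), parents=parent_names))

    # Assumed layout shape; migratedFrom is the informational field from the summary.
    layout = {"migratedFrom": topic, "segments": parents + children}

    # The whole flip is a single compare-and-set on the metadata entry.
    if not store.compare_and_set(topic, layout, version):
        raise RuntimeError("metadata changed concurrently; retry the migration")
    return layout
```

Calling `migrate_to_scalable(MetadataStore(), "persistent://t/n/x", 3, 0)` in this sketch yields three sealed parents and two active children covering the whole hash range, while passing a non-zero `v4_connections` raises the 409-style error before any metadata is touched.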

This PIP only changes documentation under pip/. Implementation will land in subsequent PRs against the parent PIP-460 tracking issue.

Test plan

  • Review the document on the PIP mailing list thread (link to be added once the thread is open).

Sub-PIP of PIP-460 defining the migration path from regular
(partitioned or non-partitioned) topics to scalable topics, including
the V5 SDK resolution rule, lookup-session-based discovery with
synthetic layouts and special segments, the metadata-flip migration
command, and the broker-side guard that prevents v4 clients from
writing to a topic that has already been migrated.
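
The lookup behaviour named in this commit message can be pictured with a small Python sketch. Everything here is hypothetical and only meant to make the idea concrete: the `topics` dictionary, the `canonical` helper, the response dictionaries, and the `"OK"` return value are assumptions, as is resolving short names to `public/default`. The two behaviours taken from the description are that a V5 lookup always gets a layout (real or synthetic) and that a v4 lookup on a migrated topic gets `TopicMigrated`.

```python
def canonical(name):
    """Map topic://, persistent://, or a short name to a tenant/ns/topic key."""
    for scheme in ("topic://", "persistent://"):
        if name.startswith(scheme):
            return name[len(scheme):]
    # Assumption: short names resolve to the public/default namespace.
    return f"public/default/{name}"


def v5_lookup(topics, name):
    """Answer a V5 lookup session with a real or synthetic layout."""
    key = canonical(name)
    entry = topics[key]
    if entry["kind"] == "scalable":
        return {"synthetic": False, "segments": entry["segments"]}
    # Regular topic: wrap the existing partitions as special segments.
    n = entry["partitions"]
    wrapped = [f"persistent://{key}-partition-{i}" for i in range(n)]
    if not wrapped:  # non-partitioned topic
        wrapped = [f"persistent://{key}"]
    return {"synthetic": True, "segments": wrapped}


def v4_lookup(topics, name):
    """Broker-side guard: v4 clients cannot use a topic that was migrated."""
    entry = topics[canonical(name)]
    return "TopicMigrated" if entry["kind"] == "scalable" else "OK"


topics = {
    "t/n/orders": {"kind": "regular", "partitions": 2},
    "t/n/events": {"kind": "scalable", "segments": ["segment-0", "segment-1"]},
}
```

With this toy state, `v5_lookup(topics, "persistent://t/n/orders")` returns a synthetic layout over the two partitions, and `v4_lookup(topics, "persistent://t/n/events")` returns `"TopicMigrated"`.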
@github-actions github-actions Bot added the PIP label May 8, 2026
The number 474 was already taken; this PIP is now PIP-475. Also
adds a paragraph in Motivation noting that scalable topics are
intended to fully replace partitioned/non-partitioned topics over
time, and that this migration tooling is what enables incremental
adoption.
@merlimat merlimat changed the title from "[feat][pip] PIP-474: Regular-to-Scalable Topic Migration" to "[feat][pip] PIP-475: Regular-to-Scalable Topic Migration" May 8, 2026
@lhotari lhotari self-requested a review May 8, 2026 17:23

@lhotari lhotari left a comment

LGTM, really nice migration path from v4 topics -> v5 scalable topics!

Comment thread pip/pip-475.md
Comment on lines +26 to +31
A Pulsar topic name encodes its domain in a URI scheme:

- `persistent://t/n/x` — durable topic backed by a managed ledger.
- `non-persistent://t/n/x` — in-memory topic, no durability.
- `topic://t/n/x` — scalable topic introduced by PIP-460. Backed by a DAG of segments; each segment is itself a `segment://...` topic with its own managed ledger.

One detail that we missed in Pulsar 4.2.0 is the migration from v1 topics to v2 topics. Since users might be upgrading directly from 4.0.x to 5.0.x, I'd assume that v1 topics would need to be handled in some way.

The znodes differ between v2 and v1 topics:

Managed ledger
• v2: /managed-ledgers/tenant/ns/persistent/topic
• v1: /managed-ledgers/tenant/cluster/ns/persistent/topic

Partitioned topic metadata
• v2: /admin/partitioned-topics/tenant/ns/persistent/topic
• v1: /admin/partitioned-topics/tenant/cluster/ns/persistent/topic

Namespace policies
• v2: /admin/policies/tenant/ns
• v1: /admin/policies/tenant/cluster/ns

A common reason why v1 topics exist in 4.0.x production deployments is that adding a slash to a topic name silently makes it a v1 topic.
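
To make the slash effect concrete, here is a minimal Python sketch of how a topic name with three path components reads as v2 and one with four reads as v1, and how that changes the managed-ledger znode path listed above. The function names are hypothetical, and the sketch ignores details of the real broker such as local-name encoding; it only mirrors the path shapes given in this comment.

```python
def parse_topic(name):
    """Classify a topic name as v1 or v2 by its number of path components."""
    domain, rest = name.split("://", 1)
    # Split into at most four parts; any further slashes stay in the local name.
    parts = rest.split("/", 3)
    if len(parts) == 3:
        tenant, ns, local = parts
        return {"version": "v2", "domain": domain,
                "namespace": f"{tenant}/{ns}", "local": local}
    if len(parts) == 4:
        tenant, cluster, ns, local = parts
        return {"version": "v1", "domain": domain,
                "namespace": f"{tenant}/{cluster}/{ns}", "local": local}
    raise ValueError(f"invalid topic name: {name}")


def managed_ledger_path(name):
    """Build the managed-ledger znode path in the shape listed above."""
    t = parse_topic(name)
    return f"/managed-ledgers/{t['namespace']}/{t['domain']}/{t['local']}"


# Intended as a v2 topic called "a/b", but parsed as v1 with cluster "ns".
print(parse_topic("persistent://t/ns/a/b")["version"])   # v1
print(managed_ledger_path("persistent://t/ns/x"))        # /managed-ledgers/t/ns/persistent/x
print(managed_ledger_path("persistent://t/ns/a/b"))      # /managed-ledgers/t/ns/a/persistent/b
```

In this sketch the extra slash shifts `ns` into the cluster position, so the topic's metadata lands under a different znode subtree than its v2 neighbours.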

In 4.1.0, a configuration setting `allowAutoTopicCreationWithLegacyNamingScheme` was added to prevent creating v1 topics accidentally:
#23620

How are we going to address the possible existence of v1 topics in a 4.0.x -> 5.0.x migration?
