
feat(replication-restore): operator-side orchestration for cross-region replication restore (PR 1/2)#6861

Open
RidRisR wants to merge 28 commits into pingcap:master from RidRisR:docs/replication-restore-design

Conversation

Contributor

RidRisR commented Apr 27, 2026

What problem does this PR solve?

Adds operator-side orchestration for replication restore, a cross-region disaster-recovery flow that defers log compaction from upstream to the downstream PITR phase, avoiding a double cross-region transfer of log data. The sharded `CompactBackup` mode shipped earlier provides parallel sharded compaction; this PR is the restore-side companion that gates BR phase-2 (log restore) on the CompactBackup reaching a terminal state.

This is PR 1 of 2. PR 2 (backup-manager CLI + BR --replication-storage-phase wiring) will follow once internal BR is ready.

Spec: docs/superpowers/specs/2026-04-23-replication-restore-design.md
Plan: docs/superpowers/plans/2026-04-23-replication-restore-pr1.md

What is changed and how does it work?

The user triggers a replication restore by setting `spec.mode: pitr` and a new `spec.replicationConfig` field on a Restore CR (referencing a CompactBackup CR by name). The controller dispatches a dedicated handler (`pkg/backup/restore/replication_handler.go`) that drives a two-phase state machine keyed on a new `status.replicationStep` string field (see the dispatch sketch after this list):

  • "" / Scheduled → cross-CR consistency check → create phase-1 BR Job (snapshot restore) → + "status.phase = SnapshotRestore" + , + "replicationStep = snapshot-restore" +
  • snapshot-restore → observe phase-1 Job + CompactBackup → write + "SnapshotRestored" + and + "CompactSettled" + markers (without touching Phase) → when both markers True, transition to + "LogRestore" + and create phase-2 Job
  • log-restore → backup-manager drives + "Running/Complete/Failed" + via the standard updater (PR 2 wires the actual + "--replicationPhase=2" + arg)
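
For orientation, a minimal sketch of the `replicationStep`-keyed dispatch. The types are simplified stand-ins, not the operator's API; `syncInitial` and `syncSnapshotRestore` appear in the diff, while `syncLogRestore` and the terminal-phase strings are assumed here.

```go
package main

import "fmt"

// Sketch only: simplified stand-ins for the Restore CR fields named above
// (status.phase, status.replicationStep); not the operator's real API types.
type Restore struct {
	Phase           string // e.g. SnapshotRestore, LogRestore, Complete, Failed
	ReplicationStep string // "", "snapshot-restore", "log-restore"
}

type replicationHandler struct{}

// Sync dispatches on ReplicationStep; terminal restores are no-ops.
func (h *replicationHandler) Sync(r *Restore) error {
	switch r.Phase {
	case "Complete", "Failed":
		return nil // terminal: nothing to do
	}
	switch r.ReplicationStep {
	case "":
		return h.syncInitial(r) // consistency check, create phase-1 Job
	case "snapshot-restore":
		return h.syncSnapshotRestore(r) // observe Job + CompactBackup, gate on markers
	case "log-restore":
		return h.syncLogRestore(r) // only guard against Job-level failures
	default:
		return fmt.Errorf("unknown replicationStep %q", r.ReplicationStep)
	}
}

func (h *replicationHandler) syncInitial(r *Restore) error         { return nil }
func (h *replicationHandler) syncSnapshotRestore(r *Restore) error { return nil }
func (h *replicationHandler) syncLogRestore(r *Restore) error      { return nil }

func main() {
	h := &replicationHandler{}
	fmt.Println(h.Sync(&Restore{}))
}
```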

Status write ownership uses Option B: the controller is the sole writer of `status.phase` during phase-1; backup-manager status writes are no-op'd via a wrapper (`NewReplicationRestoreStatusUpdater`, activated by PR 2). This eliminates a v2-design hole where backup-manager's `Running` write would clobber the controller-set Phase. The pattern mirrors `ShardedCompactStatusUpdater` from the sharded CompactBackup PR.
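
A minimal sketch of the Option B wrapper, assuming the single-method `Update(restore, condition, newStatus)` interface shape noted in the spec updates below; the parameter types here are placeholders, not the real ones.

```go
package backup

// Placeholder types standing in for the operator's Restore status types.
type Restore struct{}
type RestoreCondition struct{}
type RestoreUpdateStatus struct{}

type RestoreConditionUpdaterInterface interface {
	Update(restore *Restore, condition *RestoreCondition, newStatus *RestoreUpdateStatus) error
}

// replicationRestoreStatusUpdater swallows every status write: during
// phase-1 the controller is the sole writer of status.phase, so
// backup-manager's Running write can never clobber SnapshotRestore.
type replicationRestoreStatusUpdater struct {
	delegate RestoreConditionUpdaterInterface // real updater, unused in phase-1
}

func NewReplicationRestoreStatusUpdater(delegate RestoreConditionUpdaterInterface) RestoreConditionUpdaterInterface {
	return &replicationRestoreStatusUpdater{delegate: delegate}
}

func (u *replicationRestoreStatusUpdater) Update(*Restore, *RestoreCondition, *RestoreUpdateStatus) error {
	return nil // dropped on purpose; phase-2 uses the unwrapped updater
}
```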

Markers ( + "SnapshotRestored" + / + "CompactSettled" + ) bypass + "UpdateRestoreCondition" + (which unconditionally writes + "status.Phase = condition.Type" + ) via a private + "appendRestoreMarker" + helper. + "UpdateRestoreCondition" + itself is untouched.

A CompactBackup informer is wired into the Restore controller so that CompactBackup state changes wake referencing Restores (reverse lookup by `spec.replicationConfig.compactBackupName`).
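
A sketch of the reverse lookup the event handler performs, with placeholder types; in the controller the candidate Restores come from a namespace-scoped lister and the resulting keys go onto the workqueue.

```go
package controller

// Simplified stand-ins for the CRs involved in the reverse lookup.
type Restore struct {
	Namespace, Name   string
	CompactBackupName string // spec.replicationConfig.compactBackupName ("" if unset)
}

type CompactBackup struct {
	Namespace, Name string
}

// restoresToEnqueue returns namespace/name keys of Restores referencing the
// changed CompactBackup, so a late-binding Restore wakes as soon as its
// CompactBackup appears or reaches a terminal state.
func restoresToEnqueue(cb *CompactBackup, candidates []*Restore) []string {
	var keys []string
	for _, r := range candidates {
		if r.Namespace == cb.Namespace && r.CompactBackupName == cb.Name {
			keys = append(keys, r.Namespace+"/"+r.Name)
		}
	}
	return keys
}
```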

Backward compatibility

Existing PiTR / snapshot / volume-snapshot users are completely unaffected. The interception in `syncRestoreJob` triggers only when `spec.mode == pitr && spec.replicationConfig != nil`; a nil config falls through to the original code path. The new CRD fields are all optional, and their zero values preserve the old behavior.
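
A sketch of the interception guard under simplified types (the legacy path name below is illustrative):

```go
package restore

// Simplified stand-ins for the Restore spec fields involved in routing.
type ReplicationConfig struct{ CompactBackupName string }

type Restore struct {
	Mode              string // spec.mode
	ReplicationConfig *ReplicationConfig
}

type replicationHandler struct{}

func (h *replicationHandler) Sync(r *Restore) error { return nil }

type restoreManager struct {
	replicationHandler *replicationHandler
}

// syncRestoreJob routes only mode==pitr with a non-nil replicationConfig to
// the replication handler; everything else takes the pre-existing path.
func (rm *restoreManager) syncRestoreJob(r *Restore) error {
	if r.Mode == "pitr" && r.ReplicationConfig != nil {
		return rm.replicationHandler.Sync(r) // two-phase replication restore
	}
	// nil config / other modes: original single-Job machinery, unchanged
	return rm.syncLegacyRestoreJob(r)
}

func (rm *restoreManager) syncLegacyRestoreJob(r *Restore) error { return nil }
```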

Known follow-ups (out of scope for this PR)

  • + "buildPiTRBaseArgs" + is a minimal stub — the full PiTR arg set (env vars, TLS volumes, secrets, resources) lives in + "restoreManager.makeRestoreJobWithMode" + and is not replicated yet. PR 2 (or a follow-up) must integrate, either by calling + "rm.makeRestoreJobWithMode" + and post-processing or by extracting a shared helper.
  • backup-manager CLI flag + "--replicationPhase" + , BR command + "--replication-storage-phase=N" + , + "--replication-status-sub-prefix=ccr" + , and + "--pitr-concurrency=1024" + — PR 2.
  • End-to-end integration tests requiring real internal BR — PR 2.

Code changes

  • Has Go code change
  • Has CI related scripts change

Tests

  • Unit test (handler state machine, gate, consistency check, no-op wrapper, label-based Job lookup, idempotence, controller informer reverse lookup, restore manager interception)
  • E2E test (deferred to PR 2 — gated on internal BR readiness)
  • Manual test
  • No code

Side effects

  • Breaking backward compatibility
  • Other side effects

RidRisR added 15 commits April 23, 2026 14:03
Continuation of the CompactBackup sharded PR. Defines operator-side
orchestration for replication restore: two-phase BR (snapshot + log)
with CompactBackup terminal state as the gate.

Differs from 2026-04-15 v2 design in five localized decisions:
- Phase/Condition names are intent-based: SnapshotRestore / LogRestore
  (status.phase), SnapshotRestored / CompactSettled (markers)
- ReplicationStep is string ("snapshot-restore" / "log-restore") not int
- No intermediate ReadyForLogRestore phase; two markers True is the gate
- statusSubPrefix removed from CRD, hardcoded to "ccr" constant
- Status write ownership: Option B (controller owns all Phase writes;
  backup-manager in phase-1 wrapped by ReplicationRestoreStatusUpdater),
  mirroring the ShardedCompactStatusUpdater pattern from the CompactBackup
  sharded PR. Eliminates the v2 transformation-layer hole where backup-
  manager's Running write during phase-1 would overwrite controller's
  Phase value.

Includes PR breakdown (types -> operator logic -> backup-manager CLI)
aligned with the "single spec / multi PR" strategy.
The original 3-PR breakdown split operator-side work into a types-only
PR and a logic PR. Types without consumers make for awkward review
("where does this field get used?"), and the logic PR is tightly
coupled to the types. Merge them: PR 1 is now the full operator-side
implementation (types + label + wrapper + handler + informer wire +
unit tests); PR 2 stays as backup-manager CLI + BR flags, which must
remain independent because it depends on kernel BR readiness and
integration testing.
…e path

Two post-implementation-exploration corrections to the spec:

1. The Restore status updater interface has a single Update(restore,
   condition, newStatus) method, not the 6-method shape of the Compact
   status updater I originally mirrored. Rewrote the wrapper code block
   as a simple no-op updater and spelled out the tradeoffs of dropping
   TimeStarted/Progress/CommitTs writes during phase-1.

2. UpdateRestoreCondition at pkg/apis/pingcap/v1alpha1/restore.go:73
   unconditionally assigns status.Phase = condition.Type. So the
   controller cannot use it to write SnapshotRestored / CompactSettled
   markers without overwriting Phase. Documented that markers go through
   a handler-internal appendRestoreMarker helper which touches only
   status.Conditions[], preserving the "UpdateRestoreCondition is
   untouched" guarantee.
10 bite-sized tasks covering the operator-side work: CRD types and
constants, label additions, no-op status wrapper, handler skeleton
with ReplicationStep-keyed dispatch, helpers (appendRestoreMarker
that preserves Phase, CompactBackup lookup with WaitTimeout,
cross-CR consistency check), the three state-machine cases (initial /
snapshot-restore / log-restore), restoreManager interception, and
Restore controller CompactBackup informer wire-up. Each task follows
TDD: failing test -> implementation -> passing test -> commit.

PR 2 (backup-manager CLI + BR --replication-storage-phase) is
covered by a separate plan because it gates on internal BR
readiness and runs against a different test harness.
Adds ReplicationConfig struct (CompactBackupName, WaitTimeout) under
RestoreSpec.ReplicationConfig, RestoreStatus.ReplicationStep string,
and four new RestoreConditionType constants: SnapshotRestore and
LogRestore (phase values) plus SnapshotRestored and CompactSettled
(markers). No behavior change in this commit; consumed by subsequent
handler and wrapper commits.
Labels are used to identify phase-1 vs phase-2 BR Jobs by selector
rather than name pattern, so renaming the Job doesn't break the
handler's idempotence checks.
In replication restore phase-1, backup-manager must not write
status.phase or conditions — the controller is the sole writer.
This wrapper drops all Update calls. Activated from the
backup-manager CLI in PR 2 via --replicationPhase=1; phase-2 uses
the real updater, matching the wrap-pattern shipped for CompactBackup
sharded mode.
Skeleton dispatches by Status.ReplicationStep (empty / snapshot-restore /
log-restore). Terminal restores are no-ops. Per-step logic arrives in
subsequent commits.
appendRestoreMarker writes markers to Conditions without touching Phase
(preserves SnapshotRestore / LogRestore phase values during marker
observation). lookupCompactBackup encapsulates late-binding with
WaitTimeout. checkCrossCRConsistency validates br.cluster/namespace and
storage location, ignoring secretName differences.
On empty ReplicationStep, the handler validates cross-CR consistency,
writes CompactSettled if CompactBackup is already terminal, advances
Phase to SnapshotRestore, sets ReplicationStep=snapshot-restore, and
creates the phase-1 BR Job with replication-step label. Late-binding
and waitTimeout are handled via lookupCompactBackup.
Handler observes phase-1 Job (writes SnapshotRestored marker or fails
the Restore on JobFailed) and CompactBackup (writes CompactSettled on
terminal state). When both markers are True, transitions Phase to
LogRestore, sets ReplicationStep=log-restore, and creates the phase-2
BR Job.
In log-restore, backup-manager drives status.phase via the real
updater (no wrap in phase-2). Handler only guards against Job-level
failures that terminate before backup-manager writes a terminal
condition. makeReplicationBRJob builds phase-specific BR Jobs with
the correct labels and passes --replicationPhase={1,2} to
backup-manager (consumed in PR 2).

buildPiTRBaseArgs is intentionally a minimal stub: full PiTR arg
parity (env vars, TLS volumes, secrets) lives in
restoreManager.makeRestoreJobWithMode and integration with that
helper is deferred to a follow-up after Task 9 wires the handler
to the restoreManager.
When Spec.Mode==pitr and Spec.ReplicationConfig!=nil, syncRestoreJob
delegates to replicationHandler.Sync. All legacy single-Job machinery
below the interception point is skipped; standard PiTR, snapshot, and
volume-snapshot paths are unchanged.
Registers a CompactBackup EventHandler on the Restore controller that
enqueues Restores whose Spec.ReplicationConfig.CompactBackupName
matches the changed CompactBackup. Closes the observation loop so that
a late-binding Restore unblocks as soon as its referenced CompactBackup
is created, and so that a Restore in SnapshotRestore gates promptly
when CompactBackup reaches terminal state.
…nsistency check

- makeReplicationBRJob now uses label.NewRestore().Instance().RestoreJob().Restore()
  + merges restore.Labels, matching the standard restore Job builder pattern.
  External tooling that filters Jobs by app.kubernetes.io/managed-by=restore-operator
  or instance label will now find replication Jobs.
- Added a doc comment in syncSnapshotRestore explaining that cross-CR
  consistency is validated only once (at syncInitial), per spec section 5.
ti-chi-bot requested a review from shonge April 27, 2026 08:26
Contributor

ti-chi-bot Bot commented Apr 27, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jlerche for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot added the size/XXL label Apr 27, 2026

codecov-commenter commented Apr 27, 2026

Codecov Report

❌ Patch coverage is 75.59682% with 92 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.77%. Comparing base (a16b2e8) to head (b6531ab).
⚠️ Report is 72 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6861      +/-   ##
==========================================
+ Coverage   57.19%   57.77%   +0.57%     
==========================================
  Files         259      263       +4     
  Lines       33233    38602    +5369     
==========================================
+ Hits        19008    22302    +3294     
- Misses      12291    14199    +1908     
- Partials     1934     2101     +167     
| Flag | Coverage Δ |
| --- | --- |
| unittest | 57.77% <75.59%> (+0.57%) ⬆️ |

RidRisR force-pushed the docs/replication-restore-design branch from f6be1ed to e00a8ce April 29, 2026 05:23
RidRisR added 7 commits April 29, 2026 07:53
…ally

Phase and ReplicationStep were written in two separate reconciles, leaving
a transient (Phase=LogRestore, Step=snapshot-restore) window observable to
external watchers and requiring two catch-up branches plus a bespoke
setReplicationStep with its own lister-staleness workaround.

Extend RestoreUpdateStatus with ReplicationStep so enterSnapshotRestore and
enterLogRestore can write Phase + condition + Step in a single Update.
Delete setReplicationStep and the two protocol-only catch-up branches; the
two step handlers now dispatch on Phase with no transient skew to absorb.

Align FakeRestoreConditionUpdater with the production path so future tests
that rely on RestoreUpdateStatus patches see the same semantics.
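
A rough sketch of the single-Update transition this commit describes, with a reduced updater signature and placeholder types:

```go
package restore

// Simplified stand-ins; the real Update also takes the Restore itself.
type RestoreCondition struct {
	Type   string
	Status string
}

type RestoreUpdateStatus struct {
	ReplicationStep *string // new field carried by this commit
}

type conditionUpdater interface {
	Update(condition *RestoreCondition, newStatus *RestoreUpdateStatus) error
}

// enterLogRestore writes Phase (via the condition), the marker state, and
// ReplicationStep in one status update, so watchers never observe
// Phase=LogRestore together with Step=snapshot-restore.
func enterLogRestore(u conditionUpdater) error {
	step := "log-restore"
	return u.Update(
		&RestoreCondition{Type: "LogRestore", Status: "True"}, // also drives status.phase
		&RestoreUpdateStatus{ReplicationStep: &step},
	)
}
```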
Job lookup in the replication handler used a label selector keyed on
restore.Name, which could mistake a leftover Job from a deleted same-name
predecessor (still GC'ing) for the current Restore's Job, and the lister-
List + Create pattern surfaced spurious AlreadyExists errors during fast
back-to-back reconciles.

makeReplicationBRJob now stamps the owning Restore's UID on each Job; both
ensureJobForStep and findJobForStep switch to Get-by-name and verify the
UID label. UID mismatch is treated as not-found by findJobForStep and as
an explicit "awaiting GC" error by ensureJobForStep, so Phase is never
advanced based on a stale predecessor's Job conditions.
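
A minimal sketch of the UID check, with a simplified Job type; the label key shown is illustrative, and the real code Gets the Job by name from a batch/v1 lister.

```go
package restore

import "fmt"

const labelRestoreUID = "tidb.pingcap.com/restore-uid" // illustrative key

type Job struct {
	Labels map[string]string
}

// findJobForStep treats a UID mismatch as not-found, so a leftover Job from
// a deleted same-name Restore can never advance the current Restore's Phase.
func findJobForStep(job *Job, restoreUID string) (*Job, bool) {
	if job == nil || job.Labels[labelRestoreUID] != restoreUID {
		return nil, false
	}
	return job, true
}

// ensureJobForStep surfaces the mismatch instead of creating a duplicate;
// the caller requeues until the predecessor's Job is garbage-collected.
func ensureJobForStep(existing *Job, restoreUID string) error {
	if existing != nil && existing.Labels[labelRestoreUID] != restoreUID {
		return fmt.Errorf("job owned by previous restore (uid %q) is awaiting GC",
			existing.Labels[labelRestoreUID])
	}
	return nil // safe to adopt the existing Job or create a new one
}
```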
The cross-CR consistency check between Restore and CompactBackup compared
only bucket/prefix/region for S3 (and bucket/prefix for GCS, container/
prefix for Azblob). Two CRs could share those fields while pointing at
different physical backends — e.g. same bucket name on AWS vs an internal
Ceph/MinIO cluster, or the same path under different GCS projects — which
would silently let phase-2 BR consume compacted metadata produced against
a different storage system.

Add the missing location-defining fields (S3 Provider/Endpoint/Path,
GCS ProjectId/Location/Path, Azblob StorageAccount/Path) to the compare,
keep credentials and upload-only attributes (SecretName, SasToken,
StorageClass, Acl, SSE, AccessTier) ignored.
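
A sketch of the S3 branch of the tightened comparison, using a simplified location struct rather than the real storage provider types:

```go
package restore

import "fmt"

// s3Location holds only the location-defining fields; credentials and
// upload-only attributes (SecretName, StorageClass, Acl, SSE) stay ignored.
type s3Location struct {
	Provider string
	Endpoint string
	Bucket   string
	Prefix   string
	Region   string
	Path     string
}

// checkS3Consistency rejects CRs whose storage fields differ in any
// location-defining dimension, not just bucket/prefix/region.
func checkS3Consistency(restoreLoc, compactLoc s3Location) error {
	if restoreLoc != compactLoc {
		return fmt.Errorf("Restore and CompactBackup reference different S3 locations: %+v vs %+v",
			restoreLoc, compactLoc)
	}
	return nil
}
```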
- compactIsTerminal: warn on unrecognized CompactBackup state and
  explicitly enumerate known non-terminal states, so a future kernel
  state surfaces in logs instead of silently stalling the gate.
- lookupCompactBackup: document that WaitTimeout is measured from
  Restore.CreationTimestamp, not from the first NotFound observation.
- failRestore: document that ReplicationStep is intentionally preserved
  on terminal failure so triage can tell phase-1 vs phase-2 failures
  apart at a glance.
- Drop the B<n>_ cascade-branch prefix from test names and the matching
  section headers; cross-references to spec branches inside test bodies
  remain as documentation anchors.
Sentinel 0 means "not a replication restore" (CLI flag unset);
non-zero values 1/2 are the only other legal values. Range
validation lives in the cmd package (Task 2).

Also adds the PR 2 implementation plan.
…dation

Sentinel 0 = "not a replication restore" (cobra default when flag is
absent). Reject {-N, >=3} early in runRestore so the operator gets a
clean error instead of silently mishandling the value downstream.
Symmetric to compact.go's ShardedCompactStatusUpdater wrap, but driven
by the CLI flag (not CR.Spec) because the same Restore CR runs through
backup-manager twice with different phases. selectRestoreStatusUpdater
returns the no-op wrapper for phase 1; phase 0 / 2 pass the real
updater through unchanged.
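
A sketch of the selection and validation, with a one-method stand-in for the real updater interface:

```go
package backup

import "fmt"

// statusUpdater is a simplified stand-in for the restore status updater.
type statusUpdater interface {
	Update() error
}

type noopRestoreStatusUpdater struct{}

func (noopRestoreStatusUpdater) Update() error { return nil } // controller owns status.phase in phase 1

// validateReplicationPhase mirrors the early check in runRestore:
// 0 = not a replication restore, 1 = snapshot phase, 2 = log phase.
func validateReplicationPhase(phase int) error {
	if phase < 0 || phase > 2 {
		return fmt.Errorf("invalid --replicationPhase %d (must be 0, 1 or 2)", phase)
	}
	return nil
}

// selectRestoreStatusUpdater wraps only phase 1; phase 0 and phase 2 pass
// the real updater through unchanged.
func selectRestoreStatusUpdater(phase int, real statusUpdater) statusUpdater {
	if phase == 1 {
		return noopRestoreStatusUpdater{}
	}
	return real
}
```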
RidRisR added 3 commits April 30, 2026 09:54
Three BR flags are appended when --replicationPhase > 0:
--replication-storage-phase, --replication-status-sub-prefix=ccr,
--pitr-concurrency=1024. Constants live in restore.go (BR call
details; not exposed via API per spec §6).

restoreData itself takes only a one-line edit to call the new
replicationBRFlags helper; phase 0 returns nil so append is a no-op
for standard PiTR. The 200-line restoreData body is otherwise
unchanged.
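
A sketch of the helper; the flag strings come from this commit message, while the constant names are illustrative:

```go
package backup

import "fmt"

const (
	replicationStatusSubPrefix = "ccr"  // hardcoded per spec §6, not exposed via API
	replicationPiTRConcurrency = 1024
)

// replicationBRFlags returns the extra BR flags for replication restore.
// Phase 0 returns nil, so appending to the standard PiTR arg list is a no-op.
func replicationBRFlags(phase int) []string {
	if phase <= 0 {
		return nil
	}
	return []string{
		fmt.Sprintf("--replication-storage-phase=%d", phase),
		"--replication-status-sub-prefix=" + replicationStatusSubPrefix,
		fmt.Sprintf("--pitr-concurrency=%d", replicationPiTRConcurrency),
	}
}
```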
Reorder so pm.Enable runs before the replication interception. Both
PiTR variants (standard PiTR and replication restore) now wait for
TiKV's gc.ratio-threshold = -1 override to propagate via ConfigMap
before any BR Job is launched.

The override itself is written by tikvMemberManager.applyPiTRConfigOverride
in the TiKV reconcile loop, which already covers replication restores
because they satisfy Spec.Mode == PiTR. No TiKV-side change is needed.

The replication routing test is updated to seed a -1 ConfigMap up
front; a new test exercises the gating behavior end-to-end (closed
gate => handler not called; open gate => handler called once).
Replace the stub buildPiTRBaseArgs with an injected job builder
(production = rm.makeRestoreJobWithMode) plus a small
applyReplicationPhase post-processor. Phase-1 and phase-2 Jobs now
inherit the full PiTR Job setup: TiDBBackupManagerImage main
container, BR-binary init container with br-bin shared volume, TLS
volume mounts, password / storage env, owner reference. Replication
adds three mutations on top: Job name override, replication-step +
restore-UID labels, --replicationPhase arg.

NewRestoreManager is two-stepped so it can pass a function reference
to rm.makeRestoreJobWithMode into newReplicationHandler. The handler
takes a jobBuilderFunc as a constructor param; tests inject a small
fake builder via newHandlerForTest.
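
A sketch of the builder injection and post-processing, with a simplified Job type; the label keys and name override below are illustrative:

```go
package restore

import "fmt"

// Simplified Job; in production this is a batch/v1 Job built by
// rm.makeRestoreJobWithMode, and tests inject a fake via newHandlerForTest.
type Job struct {
	Name   string
	Labels map[string]string
	Args   []string
}

type jobBuilderFunc func(restoreName string) (*Job, error)

type replicationHandler struct {
	buildJob jobBuilderFunc
}

// applyReplicationPhase layers the replication-specific mutations on top of
// the fully built PiTR Job: name override, step + UID labels, phase arg.
func applyReplicationPhase(job *Job, step, restoreUID string, phase int) {
	job.Name = fmt.Sprintf("%s-%s", job.Name, step)
	if job.Labels == nil {
		job.Labels = map[string]string{}
	}
	job.Labels["tidb.pingcap.com/replication-step"] = step
	job.Labels["tidb.pingcap.com/restore-uid"] = restoreUID
	job.Args = append(job.Args, fmt.Sprintf("--replicationPhase=%d", phase))
}

func (h *replicationHandler) makeReplicationBRJob(restoreName, step, uid string, phase int) (*Job, error) {
	job, err := h.buildJob(restoreName) // full PiTR setup: image, init container, TLS, env
	if err != nil {
		return nil, err
	}
	applyReplicationPhase(job, step, uid, phase)
	return job, nil
}
```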
