Search before asking
Version
Source Doris: 2.1.11-x64
Target Doris: 2.1.11-arm64
CCR: ccr-syncer-3.0.6-rc05-arm64
What's Wrong?
After creating a database-level CCR replication task, CCR starts full synchronization normally and then enters the incremental synchronization phase, with everything working properly.
However, when the upstream Doris FE master node switches to another node, CCR triggers a fullsync and pulls data again from scratch.
Due to the large volume of data, the synchronization takes a long time and has a significant impact on the production environment.
The CCR log around the time of the switchover shows:
[2026-05-14 09:33:51.786] WARN call [:0] error: GetBinlog error: remote or network error: get connection error: dial tcp :0: connection has been closed by peer, req: TGetBinlogRequest({Cluster: User:0x40001a8378 Passwd:0x40001a8388 Db:0x40001a83a8 Table: TableId: UserIp: Token: PrevCommitSeq:0x400082e928 NumAcquired:0x400082e930}): [rpc] remote or network error: get connection error: dial tcp :0: connection has been closed by peer, try next addr job=CCR_PROD_ZHBB line=rpc/fe.go:259
...
[2026-05-14 09:33:52.149] WARN job sync failed, job: CCR_PROD_DW, err: [meta] index ids is empty
...
[2026-05-14 09:33:53.597] INFO fullsync status: create snapshot with prefix ccrs_CCR_PROD_DW_1778668141 job=CCR_PROD_DW line=ccr/job.go:973
[2026-05-14 09:33:53.694] INFO fullsync status: create snapshot ccrs_CCR_PROD_DW_1778668141_1778722433 job=CCR_PROD_DW line=ccr/job.go:1019
[2026-05-14 09:33:53.694] INFO create snapshot PROD_DW.ccrs_CCR_PROD_DW_1778668141_1778722433, backup snapshot sql: BACKUP SNAPSHOT PROD_DW.ccrs_CCR_PROD_DW_1778668141_1778722433 TO keep_on_local PROPERTIES ("type" = "full") job=CCR_PROD_DW line=base/spec.go:771
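To make it easier to line up the failover with the fullsync in a longer log, here is a minimal Python sketch that pulls out the three kinds of events seen in the excerpt above (the GetBinlog connection error, the "job sync failed" warning, and the fullsync snapshot creation). The helper names `classify` and `scan` are hypothetical and not part of ccr-syncer; the patterns are based only on the lines quoted in this report, so other versions may word these messages differently. It only orders events by position in the log and does not prove that one caused the other.

```python
import re

# Patterns derived from the log excerpt in this report only (assumption:
# other ccr-syncer versions may format these messages differently).
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>\w+)\s+(?P<msg>.*)$")
JOB_RE = re.compile(r"\bjob[=:]\s*(?P<job>\w+)")


def classify(msg):
    """Return a short event tag for a log message, or None if not of interest."""
    if "GetBinlog error" in msg and "get connection error" in msg:
        return "rpc_error"
    if "job sync failed" in msg:
        return "sync_failed"
    if "fullsync status: create snapshot" in msg:
        return "fullsync"
    return None


def scan(lines):
    """Collect (timestamp, job, event) tuples in the order they appear."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip "..." separators and anything without a timestamp
        event = classify(m.group("msg"))
        if event is None:
            continue
        job = JOB_RE.search(m.group("msg"))
        events.append((m.group("ts"), job.group("job") if job else None, event))
    return events
```

It can be run over a log file with something like `scan(open(path, encoding="utf-8"))`, where `path` is wherever the syncer log is stored, and the result filtered by job name to see whether a fullsync for a job is preceded by RPC or sync errors.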
What You Expected?
CCR should keep running incremental synchronization normally after the Doris FE master node fails over to another node, without triggering a fullsync.
How to Reproduce?
Run a database-level CCR synchronization job while the upstream cluster receives continuous writes to a large number of tables. If the upstream FE master node goes down and mastership switches to another node, CCR triggers a fullsync again.
Anything Else?
No response
Are you willing to submit PR?
Code of Conduct