[test] fix flaky testRebalanceWithRemoteLog #3409

Open
wattt3 wants to merge 1 commit into apache:main from wattt3:3385-flaky-testRebalanceWithRemoteLog

Conversation

wattt3 commented May 31, 2026

Purpose

Linked issue: close #3385

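The test captures the remote log segment count before the rebalance, then asserts that the count is unchanged afterwards.
However, produceRecordsAndWaitRemoteLogCopy() does not ensure that all log segments have been copied to remote storage, so the captured count (remoteLogSize) is non-deterministic: segments that are still being copied when the count is captured can show up later and break the assertion.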

Brief change log

  • Add FlussClusterExtension#waitUntilAllLogSegmentsCopyToRemote(), which waits until the remote log end offset reaches the active segment's base offset (i.e. all non-active segments have been copied to remote).
  • Use waitUntilAllLogSegmentsCopyToRemote() instead of waitUntilSomeLogSegmentsCopyToRemote() in the test, so the captured segment count is stable.
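For reviewers, the wait condition described above can be sketched roughly as follows. This is a standalone illustration, not the code in this PR: `FakeLog`, `allNonActiveSegmentsCopied()` and the `waitUntil` helper are hypothetical stand-ins, and the real change lives in FlussClusterExtension and may differ in detail.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

class RemoteCopyWaitSketch {

    /** Hypothetical stand-in for a log tablet; not a Fluss class. */
    static final class FakeLog {
        // Base offset of the active segment (the one still being written).
        final AtomicLong activeSegmentBaseOffset = new AtomicLong();
        // Offset up to which segments have been copied to remote storage.
        final AtomicLong remoteLogEndOffset = new AtomicLong();

        // True once every non-active segment has been copied to remote.
        boolean allNonActiveSegmentsCopied() {
            return remoteLogEndOffset.get() >= activeSegmentBaseOffset.get();
        }
    }

    /** Hypothetical polling helper: re-checks the condition until it holds or the timeout expires. */
    static void waitUntil(BooleanSupplier condition, Duration timeout, String errorMsg) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline > 0) {
                throw new AssertionError(errorMsg);
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        FakeLog log = new FakeLog();
        // Closed segments cover [0,10), [10,20), [20,30); the active segment starts at 30.
        log.activeSegmentBaseOffset.set(30);

        // Simulated background copier: uploads one closed segment every 20 ms.
        Thread copier = new Thread(() -> {
            try {
                for (long end = 10; end <= 30; end += 10) {
                    Thread.sleep(20);
                    log.remoteLogEndOffset.set(end);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        copier.start();

        // Waiting for "all" rather than "some" means the count captured afterwards cannot grow.
        waitUntil(log::allNonActiveSegmentsCopied, Duration.ofSeconds(5),
                "Not all non-active segments were copied to remote in time");
        System.out.println("remote log end offset = " + log.remoteLogEndOffset.get());
        copier.join();
    }
}
```

In this sketch, a "some segments copied" condition would already pass after the first upload (offset 10), while two more segments are still pending. That is the window in which the captured count can later change.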

Tests

Ran RebalanceManagerITCase.testRebalanceWithRemoteLog 100 times

  • Before fix: 9 / 100 failed
  • After fix: 0 / 100 failed

wattt3 commented May 31, 2026

Hi @swuferhong / @wuchong,
I saw that you've worked on this area before, so I'd appreciate it if you could take a look.
Thanks!

Development

Successfully merging this pull request may close these issues.

[test] Unstable test RebalanceManagerITCase.testRebalanceWithRemoteLog