Summary
Four distinct test failures were observed against committed code (Timer and Post Merge Action builds on main) in the 24 hours ending 2026-05-17. None reproduced locally with the original seed, which suggests they are timing- or environment-sensitive flakes rather than deterministic bugs.
Failing Tests
1. MixedClusterClientYamlTestSuiteIT.test {p0=cluster.health/10_basic/cluster health with closed index}

| Field | Value |
| --- | --- |
| Build | 77225 |
| Trigger | Timer (main) |
| Seed | D13505BF7CE3BF5B:59613A65D21FD2A3 |
| Reproduced locally | No |
| First failure | 2024-03-25 |
| Total unique builds affected | 212 |
| Module | qa/mixed-cluster |

Error: cluster.health API returned 408 Request Timeout; cluster status was red with 51 unassigned shards during BWC rolling upgrade.
Pattern: Chronic flake since March 2024. Peaked at 58 builds in Sep 2024, then stabilized at 1-14 builds/month. Recent uptick: 10 builds in Apr 2026, 12 in May 2026 (partial month). Likely worsened by the mid-April 2026 runner migration to m7a.8xlarge (faster CPUs may cause the rolling upgrade to proceed before shards finish allocating). Stable/slightly worsening.
2. SearchRestCancellationIT.testAutomaticCancellationMultiSearchDuringFetchPhase

| Field | Value |
| --- | --- |
| Build | 77182 |
| Trigger | Post Merge Action (main) |
| Seed | 220320CBD3478DB4 |
| Reproduced locally | No |
| First failure | 2024-03-26 |
| Total unique builds affected | 166 |
| Module | qa/smoke-test-http |

Error: AssertionError in ensureSearchTaskIsCancelled — assertBusy timed out waiting for the search task to be marked cancelled.
Pattern: Chronic flake since March 2024. Had a major spike in Nov 2025 (42 builds). After calming down in early 2026, it's rising again: 14 builds in Apr 2026, 22 in May 2026 (partial month). Worsening.
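
To illustrate why a timed-out busy-wait assertion behaves as a timing flake, here is a minimal sketch of a poll-until-deadline helper. It is not the test framework's `assertBusy` implementation; `BusyWaitSketch`, `awaitCondition`, the backoff values, and the 200 ms delay are all hypothetical names and numbers chosen for illustration. The point is only that the same asynchronous event passes or fails depending on whether it lands inside the polling window.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not the framework's assertBusy.
final class BusyWaitSketch {

    // Polls the condition with exponential backoff until it holds or the timeout elapses.
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        long sleepMillis = 1;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false; // the caller would turn this into an assertion failure
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the search task gets marked cancelled" happening asynchronously.
        AtomicBoolean cancelled = new AtomicBoolean(false);
        Thread canceller = new Thread(() -> {
            try {
                Thread.sleep(200); // assumed delay; on a loaded runner this can be much longer
            } catch (InterruptedException ignored) {
                return;
            }
            cancelled.set(true);
        });
        canceller.start();

        System.out.println("50 ms window:  " + awaitCondition(cancelled::get, 50));
        System.out.println("2000 ms window: " + awaitCondition(cancelled::get, 2_000));
    }
}
```

A longer window makes the failure rarer without removing the underlying race, which is consistent with the failure not reproducing locally from the seed alone.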
3. WarmIndexSegmentReplicationIT.testIndexReopenClose

| Field | Value |
| --- | --- |
| Build | 77216 |
| Trigger | Post Merge Action (main) |
| Seed | 27483C47C34980DA:C0927DDC1B8C8A14 |
| Reproduced locally | No |
| First failure | 2025-03-11 |
| Total unique builds affected | 15 |
| Module | server (internalClusterTest) |

Error: Expected: a value equal to or greater than <4L> but: <0L> was less than <4L> — waitForDocs timed out; 0 docs indexed when 4 were expected after index reopen.
Pattern: Low-frequency flake since March 2025. Sporadic: 4 builds in Mar 2025, 4 in Aug 2025, 3 in Feb 2026, 2 in May 2026. Never more than 4 builds in a single month. Stable (low-rate).
4. FlightClientChannelTests.testErrorInInterimBatchFromServer

| Field | Value |
| --- | --- |
| Build | 77196 |
| Trigger | Timer (main) |
| Seed | 7612269473A008A2:D4B6774589E81D0D |
| Reproduced locally | No |
| First failure | 2025-07-03 |
| Total unique builds affected | 13 |
| Module | plugins/arrow-flight-rpc |

Error: BindTransportException: Failed to bind to port 31001 (Address already in use) — hardcoded port conflict in test setup.
Pattern: Low-frequency flake since July 2025. Peaked at 6 builds in its first month, then 0-2 builds/month. Purely environmental (port conflict on CI runner). Stable (low-rate).
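
One common way to avoid this class of failure is to bind to port 0 and let the operating system assign a free ephemeral port. The sketch below shows that technique with plain `java.net.ServerSocket`. It is not taken from the arrow-flight-rpc plugin or its test setup; `EphemeralPortSketch` and `bindTwoFreePorts` are hypothetical names, and whether the real test can accept a dynamically assigned port is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch; not code from the plugin under test.
final class EphemeralPortSketch {

    // Binds a listener on loopback port 0, so the OS picks an unused port.
    static ServerSocket bindToFreePort() throws IOException {
        ServerSocket socket = new ServerSocket();
        socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        return socket;
    }

    // Binds two listeners at the same time and returns the ports the OS assigned.
    static int[] bindTwoFreePorts() {
        try (ServerSocket first = bindToFreePort(); ServerSocket second = bindToFreePort()) {
            return new int[] { first.getLocalPort(), second.getLocalPort() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] ports = bindTwoFreePorts();
        System.out.println("first=" + ports[0] + " second=" + ports[1]);
    }
}
```

Because both sockets are open at once, the OS hands out two different ports, so concurrent jobs on the same runner would not contend for a fixed number such as 31001.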
Summary Table

| Test | Builds Affected | First Seen | Trend | Reproduced |
| --- | --- | --- | --- | --- |
| MixedClusterClientYamlTestSuiteIT (cluster.health/closed index) | 212 | 2024-03-25 | Stable/slightly worsening | No |
| SearchRestCancellationIT.testAutomaticCancellationMultiSearchDuringFetchPhase | 166 | 2024-03-26 | Worsening | No |
| WarmIndexSegmentReplicationIT.testIndexReopenClose | 15 | 2025-03-11 | Stable (low-rate) | No |
| FlightClientChannelTests.testErrorInInterimBatchFromServer | 13 | 2025-07-03 | Stable (low-rate) | No |

Reproduction Details
All tests were run locally with their CI seeds on the current main branch. None failed, confirming these are non-deterministic (timing/environment-dependent) failures. The seeds control randomized parameters but not thread scheduling, network timing, port availability, or shard allocation timing — all of which are the actual failure triggers here.
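
The distinction between what a seed fixes and what it leaves open can be sketched as follows. This is an illustration only: it uses `java.util.Random` as a stand-in for the test framework's seeded randomness, and `SeedDeterminismSketch` and `randomizedParameters` are hypothetical names. The same seed always yields the same parameter values, while the measured thread start-and-join latency is decided by the scheduler and can differ on every run.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch; java.util.Random stands in for the framework's seeded randomness.
final class SeedDeterminismSketch {

    // Derives a fixed sequence of "randomized test parameters" from a hex seed.
    static int[] randomizedParameters(String hexSeed, int count) {
        long seed = Long.parseUnsignedLong(hexSeed, 16);
        Random random = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = random.nextInt(10);
        }
        return values;
    }

    // Measures how long it takes to start and join a no-op thread; not controlled by any seed.
    static long threadRoundTripNanos() throws InterruptedException {
        long start = System.nanoTime();
        Thread worker = new Thread(() -> { });
        worker.start();
        worker.join();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        String seed = "D13505BF7CE3BF5B"; // first half of a seed from this report
        System.out.println("run 1 parameters: " + Arrays.toString(randomizedParameters(seed, 5)));
        System.out.println("run 2 parameters: " + Arrays.toString(randomizedParameters(seed, 5)));
        System.out.println("run 1 thread latency (ns): " + threadRoundTripNanos());
        System.out.println("run 2 thread latency (ns): " + threadRoundTripNanos());
    }
}
```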
Notes
- The April 2026 CI runner migration from m5.8xlarge to m7a.8xlarge may be amplifying timing-sensitive failures (particularly the MixedClusterClientYamlTestSuiteIT and SearchRestCancellationIT tests).
- The FlightClientChannelTests failure is purely a port-binding conflict (hardcoded port 31001) and is unrelated to test logic.