test: mark fee_settings.test.ts teardown segfault as flake #23378
Draft
AztecBot wants to merge 1 commit into
Why
PR #23344 (merge-train/spartan) was dequeued from the merge queue at 2026-05-18 17:08:03Z after CI run 26047392849 failed on `ci/x8-full` (1 of 10 `merge-queue-heavy` grinds). The PR branch CI on the same head (`8caa1d336a`) passed cleanly.

The only failure was `src/e2e_fees/fee_settings.test.ts` exiting with code 139 (SIGSEGV) at 355s. Log: http://ci.aztec-labs.com/14142e6c59162a95

What's happening
The segfault occurs in `afterAll` teardown, not in the test body. From the stack:

1. `Sequencer.stop` awaits the in-flight checkpoint L1 submission, which is interrupted (`Transaction sending is interrupted`, a clean abort signal from "fix: interrupt prover jobs in stop" #23358).
2. `EpochProvingJob` calls into the native world-state DB to `Create fork at 2` / `Insert 0 L1 to L2 messages in fork`.
3. The native call fails with `GET_TREE_INFO failed: Fork not found` and segfaults the Jest process.

"fix: interrupt prover jobs in stop" (#23358), already on the train, interrupts prover jobs at stop but doesn't fully serialise prover-node shutdown against in-flight native fork operations, so the segfault wins the race intermittently. This is the same class of teardown-time native crash that the existing `e2e_fees/gas_estimation.test.ts` flake entry covers (different surface: `timeout: sending signal TERM to command 'bash'`).

Fix
Mark `fee_settings.test.ts` as a flake only when the error matches the segfault signature (`Segmentation fault.*core dumped|code: 139`). Real test-body assertion failures still fail CI. Assigned to `*alex` (PR author, owns the related gas_estimation flake entry).

A proper fix is to either (a) cancel and await in-flight epoch-proving jobs before the world-state synchronizer stops, or (b) make the native world-state DB return JS errors for `Fork not found` on a stopped store rather than segfaulting. That is out of scope here and left as a follow-up for the prover-node / world-state team.

Full analysis: https://gist.github.com/AztecBot/704d54fc69850b1b9ceb1aeaeae64667
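
For illustration only, the two follow-up options can be sketched with toy classes. Everything here is a hypothetical stand-in (`ToyWorldState`, `ToyProvingJob`, `orderedShutdown`, `racyShutdown`); none of it is the real prover-node or world-state API, and the 10 ms delay merely simulates in-flight work. The sketch assumes option (a) means "cancel, then await the job before stopping the store" and option (b) means "a stopped store throws a catchable error".

```typescript
// Hypothetical stand-ins; not the real Aztec classes or APIs.
class ToyWorldState {
  private stopped = false;

  stop(): void {
    this.stopped = true;
  }

  // Option (b): a stopped store raises a catchable JS error instead of crashing.
  createFork(blockNumber: number): string {
    if (this.stopped) {
      throw new Error(`Fork not found: store already stopped (block ${blockNumber})`);
    }
    return `fork@${blockNumber}`;
  }
}

class ToyProvingJob {
  private readonly worldState: ToyWorldState;
  private cancelled = false;
  private done: Promise<string> | undefined;

  constructor(worldState: ToyWorldState) {
    this.worldState = worldState;
  }

  start(): Promise<string> {
    this.done = this.run();
    return this.done;
  }

  private async run(): Promise<string> {
    // Simulated work before the job touches world state.
    await new Promise<void>(resolve => setTimeout(resolve, 10));
    if (this.cancelled) {
      return "cancelled";
    }
    return this.worldState.createFork(2);
  }

  // Option (a): signal cancellation, then wait for in-flight work to settle.
  async stop(): Promise<string> {
    this.cancelled = true;
    return (await this.done) ?? "not-started";
  }
}

// Job is cancelled and awaited before the store stops.
async function orderedShutdown(): Promise<string> {
  const worldState = new ToyWorldState();
  const job = new ToyProvingJob(worldState);
  job.start();
  const outcome = await job.stop();
  worldState.stop();
  return outcome;
}

// Store stops while the job is still in flight; option (b) turns the
// late fork request into an error the caller can handle.
async function racyShutdown(): Promise<string> {
  const worldState = new ToyWorldState();
  const job = new ToyProvingJob(worldState);
  const pending = job.start();
  worldState.stop();
  try {
    return await pending;
  } catch (e) {
    return `error: ${(e as Error).message}`;
  }
}

orderedShutdown().then(r => console.log("ordered:", r));
racyShutdown().then(r => console.log("racy:", r));
```

In this toy model the ordered path resolves to `cancelled` because the job never reaches the store, while the racy path resolves to an `error: Fork not found ...` string rather than taking the process down. A real fix would need to apply the same ordering or error handling at the native boundary.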
Note on local CI
`./bootstrap.sh ci` was not run locally: the change is metadata only (`.test_patterns.yml` is consumed by `ci3/filter_test_cmds` and `ci3/get_test_entry`; no compiled artifact depends on it). YAML validated with `yaml.safe_load`; both `regex` and `error_regex` matched against the actual failure string.

ClaudeBox log: https://claudebox.work/s/16f3aaf1a7b118c7?run=1
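
As a rough illustration of what the segfault signature does and does not match, here is a minimal TypeScript sketch. It assumes JavaScript `RegExp` semantics, which may differ from the engine the CI scripts actually use. The sample strings are hypothetical, not the real failure output, and `isTeardownSegfault` is an illustrative name rather than anything in the repository.

```typescript
// Pattern quoted from the PR description's segfault signature.
// Assumption: evaluated here with JavaScript RegExp semantics.
const segfaultSignature = /Segmentation fault.*core dumped|code: 139/;

// Hypothetical helper: true when the output looks like the teardown segfault.
function isTeardownSegfault(output: string): boolean {
  return segfaultSignature.test(output);
}

// Hypothetical sample lines, not taken from the actual CI log.
const samples: string[] = [
  "Segmentation fault (core dumped)",
  "Command exited with code: 139",
  "expect(received).toBe(expected) // Object.is equality",
];

for (const line of samples) {
  console.log(isTeardownSegfault(line), "<-", line);
}
```

The first two samples match and the assertion-style line does not, which is the property the flake entry relies on: ordinary test-body failures are not swallowed. Note that in JavaScript `.` does not cross newlines, so the first alternative only matches when both phrases appear on one line.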