test(e2e): batch 3 of pipelining e2e test migration #23328
Open
spalladino wants to merge 1 commit into
Enables proposer pipelining on additional e2e tests where source-side B-bug fixes (B1, B5, B6) are now landed on merge-train/spartan, and sharpens TODOs for tests still blocked on the remaining source bugs (B2, B7, publisher-funder).
Flakey Tests🤖 says: This CI run detected 1 tests that failed, but were tolerated due to a .test_patterns.yml entry. |
Motivation
Continues the pipelining e2e migration started in #23275. Several tests previously skipped or held back on suspected pipelining-code bugs are now ready to re-enable after the relevant fixes landed on merge-train/spartan. This batch also sharpens the remaining TODO(kill-non-pipelined) markers against the source bugs that aren't yet fixed on this branch.
Approach
For each skipped/held-back test, work out whether it can pass purely with test-side adjustments under the current source. Test motivation was preserved throughout — assertions over-pinned to specific slot numbers or committee identity were rewritten to assert the actual invariant. Where a test still fails because of a source-level bug that has not landed on this branch, the opt-in was held back and the TODO sharpened with the concrete symptom and code pointer.
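As a rough illustration of the difference between an over-pinned assertion and an invariant assertion, here is a minimal sketch in plain TypeScript. The helper name `findWithdrawnSigners` and the addresses are hypothetical and do not come from the test suite; the point is only that the check passes for any sampled committee as long as no signer belongs to the withdrawn set.

```typescript
// Hypothetical helper, not part of the Aztec test suite.
// Returns every signer that belongs to the withdrawn set (should be empty).
function findWithdrawnSigners(signers: string[], withdrawn: string[]): string[] {
  const withdrawnSet = new Set(withdrawn.map(a => a.toLowerCase()));
  return signers.filter(s => withdrawnSet.has(s.toLowerCase()));
}

// Illustrative addresses only.
const withdrawn = ['0xD4', '0xE5'];
const signersA = ['0xa1', '0xb2', '0xc3'];
const signersB = ['0xc3', '0xf6', '0xa1']; // a different sample, equally valid
const badSigners = ['0xa1', '0xd4']; // a withdrawn validator attested

// An over-pinned check would require signersA exactly and reject signersB.
// The invariant check accepts both and only flags the real violation.
console.log(findWithdrawnSigners(signersA, withdrawn)); // []
console.log(findWithdrawnSigners(signersB, withdrawn)); // []
console.log(findWithdrawnSigners(badSigners, withdrawn)); // [ '0xd4' ]
```

In a Jest test the same idea would typically be written as an `expect(...).toEqual([])` over the offending signers rather than an `arrayContaining` over a fixed committee.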
Changes
Now passing under pipelining
- e2e_block_building "clears up all nullifiers if tx processing fails": un-skipped. Switched Promise.race → Promise.any: now that the world-state fork-close fix has landed, the failed tx is correctly dropped from the pool, and race was propagating the rejection of the dropped tx before the surviving tx mined. Replaced getBlockData('latest') (a pipelining anti-pattern, since empty pipelined checkpoints can interleave) with the receipt's blockNumber.
- e2e_block_building > reorgs > detects an upcoming reorg: un-skipped. Added explicit cheatCodes.rollup.markAsProven() to drive the proven tip forward (the AnvilTestWatcher auto-prove loop is dormant under interval mining; this is unrelated to pipelining, just a footgun this test exposes), bounded the open-ended while + sleep into a retryUntil, and set minTxsPerBlock: 1 to keep the sequential-tx block-number assertions tight under pipelining's empty-checkpoint cadence.
- e2e_multi_validator/e2e_multi_validator_node test 2 "should attest ONLY with the correct validator keys": un-skipped. Rewrote the over-pinned expect.arrayContaining(validators[0..2]) assertion. The original assumption was deterministic committee identity, but the committee is RNG-sampled over the active validator set, and lagInEpochsForValidatorSet=2 means initiated-withdraw validators (3, 4) often still appear in the committee at attestation time. The new assertion preserves the real motivation ("validators who initiated withdraw don't attest") by checking that no signer is in the withdrawn set. Also bumped jest.setTimeout(15 * 60 * 1000) so waitForProven has wall-clock budget under pipelining's 12s slot cadence.
Assertion rewrite only, pipelining opt-in held back on a separate source bug
- composed/ha/e2e_ha_full "should coordinate governance voting across HA nodes": replaced the strict l1VoteCount === uniqueSlots.size invariant (broken by design: HA signing intentionally suppresses duplicate duty signatures across nodes, and under pipelining a vote signed in build slot N mines in target slot N+1, so the equality never holds) with an outcome assertion: poll until signalCount >= VALIDATOR_COUNT for our payload, then assert payloadWithMostSignals matches, plus unconditional duty checks (no (slot, validator) double-signs, every duty SIGNED). Pipelining opt-in held back behind a sharpened TODO because the pipelined HA + governance path hits a canProposeAtTime / InvalidProposer cascade that exhausts the publisher. The fix (which threads lastArchiveRoot into the canProposeAt simulation plan and overrides the pending-tip slot number so canPruneAtTime can't bypass the pending override) is on a separate sequencer-side branch and has not yet been forward-ported to merge-train/spartan.
Still blocked on source bugs: TODOs sharpened, no opt-in
- e2e_blacklist_token_contract/* (7 suites: burn, minting, shielding, transfer_private, transfer_public, unshielding, access_control): the huge-warp problem itself is now solvable under pipelining (working recipe: call cheatCodes.rollup.markAsProven() before the warp so L1's canPruneAtTime doesn't wipe to checkpoint 0; use the L1-only cheatCodes.eth.warp({ resetBlockInterval: true }) rather than warpL2TimeAtLeastTo; retry aztecNode.mineBlock up to 3 times to absorb the one pre-warp in-flight publish failure). However, every suite hits a separate blocker: under pipelining with inboxLag=2, the first simulate() after the warp queries getL1ToL2Messages(proposedCheckpoint+1) and throws L1ToL2MessagesNotReadyError: inbox tree in progress is N, messages not yet sealed. This is the same simulator + inboxLag mismatch in AztecNodeService.simulatePublicCalls (see aztec-node/src/server.ts + archiver/.../message_store.ts) that's blocking the simulator-heavy tests being handled separately. Sharpened TODO on all 7 files pointing to the recipe (preserved in working notes for when the simulator bug is fixed).
- e2e_publisher_funding_multi: tried opting into pipelining. First funding round fires correctly (Funded 2 publishers logged at ~T+120s after both publishers' balances are forced below threshold). Second round (waiting for organic depletion to fall below threshold) never triggers: publisher:manager is silent from T+120s through to teardown at T+360s, even though balances objectively dropped below the threshold. Two independent agents reproduced. Reverted to non-pipelined with a sharpened TODO. Source-level investigation needed in PublisherManager's RunningPromise cycle or in the L1 balance read path (publisher_manager.ts / l1_tx_utils.ts), which is out of scope for a tests-only PR.
Out-of-scope source bugs surfaced
- canProposeAt simulation under pipelined parent invalidation: blocks e2e_contract_updates private-ctor, composed/web3signer/e2e_multi_validator_node_key_store, and the e2e_ha_full pipelining opt-in above. Fix exists on a sibling branch but is not on merge-train/spartan.
- AztecNodeService.simulatePublicCalls queries L1→L2 messages from a checkpoint that hasn't been sealed yet: blocks the 7 blacklist suites in this PR plus the simulator-heavy tests being handled separately.
- RunningPromise cycle in PublisherManager.
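For readers unfamiliar with why the Promise.race → Promise.any switch in e2e_block_building matters, the following is a minimal standalone sketch. The helpers `delay` and `failAfter` are hypothetical stand-ins for the two tx wait promises and are not the Aztec API; the sketch assumes an ES2021 or newer target, where `Promise.any` is available.

```typescript
// Hypothetical stand-ins for two pending txs; not the real Aztec API.
function delay<T>(ms: number, value: T): Promise<T> {
  return new Promise(resolve => setTimeout(() => resolve(value), ms));
}
function failAfter(ms: number, reason: string): Promise<never> {
  return new Promise((_, reject) => setTimeout(() => reject(new Error(reason)), ms));
}

async function demo(): Promise<{ race: string; any: string }> {
  // The dropped tx rejects early; the surviving tx fulfils later.
  const makeDropped = () => failAfter(10, 'tx dropped');
  const makeMined = () => delay(50, 'mined');

  // Promise.race settles with whichever promise settles first, rejections included.
  const race = await Promise.race([makeDropped(), makeMined()]).catch(
    (e: Error) => `rejected: ${e.message}`,
  );
  // Promise.any skips rejections and resolves with the first fulfilment.
  const any = await Promise.any([makeDropped(), makeMined()]);
  return { race, any };
}

demo().then(result => console.log(result));
// prints { race: 'rejected: tx dropped', any: 'mined' }
```

`Promise.any` only rejects (with an `AggregateError`) when every input rejects, so a test that needs at least one tx to survive still fails loudly if both are dropped.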