feat: merge-train/spartan#23253
AztecBot wants to merge 33 commits into
…al pipeline (#23245)

## Summary

- Removes `FastTxCollection` as a separate class and absorbs all its logic directly into `TxCollection`
- Replaces the old parallel file-store delay with a single sequential pipeline: node RPC → reqresp → file store, where each phase blocks on the previous (cancellation-aware)
- File store collection is now driven by `IRequestTracker` — the same synchronization primitive used by the node and reqresp paths. The tracker is the single source of truth for "is this tx still missing?" and "is this request still alive?"
- `FileStoreTxCollection` simplified: dropped `start()`/`stop()`, the persistent worker pool, and `wakeSignal`. `startCollecting(requestTracker, context)` returns `Promise<void>`, spins up its own per-call worker pool, and workers self-terminate when the tracker is cancelled (all-fetched / deadline / external)

## Collection flow inside `collectFast`

1. Start node RPC collection in the background
2. Wait `txCollectionFastNodesTimeoutBeforeReqRespMs` — interruptible by cancellation **or by node exhaustion** (so when no nodes are configured, reqresp starts immediately)
3. Start reqresp in the background (parallel with nodes)
4. Wait `txCollectionFileStoreFastDelayMs` — interruptible by cancellation or reqresp completion
5. Start file store collection in the background (its workers self-terminate)
6. `Promise.allSettled` on node + reqresp + file store

The `txCollectionFileStoreFastDelayMs` description was updated to reflect that it is now anchored to reqresp start, not collection start.
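The interruptible waits in the flow above can be sketched as a race between a timer and a set of wake-up signals. This is a minimal illustrative helper, not the actual aztec-packages implementation; the name `interruptibleWait` and its shape are assumptions.

```typescript
// Hypothetical sketch: wait a fixed delay, but resolve early if any interrupt
// promise (cancellation, node exhaustion, reqresp completion) settles first.
function interruptibleWait(
  delayMs: number,
  ...interrupts: Promise<unknown>[]
): Promise<'timeout' | 'interrupted'> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const sleep = new Promise<'timeout'>(resolve => {
    timer = setTimeout(() => resolve('timeout'), delayMs);
  });
  return Promise.race([
    sleep,
    ...interrupts.map(p => p.then(() => 'interrupted' as const)),
  ]).finally(() => clearTimeout(timer));
}
```

With this shape, the "no nodes configured" shortcut falls out naturally: the node-exhaustion promise is already resolved, so step 2's wait returns immediately and reqresp starts without delay.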
## File store / tracker integration

- `FileStoreTxCollection.startCollecting` no longer takes `(txHashes, context, deadline)`; it takes `(requestTracker, context)` and reads the missing txs + deadline from the tracker
- Workers check `requestTracker.isMissing(hash)` on each scan — if the tx was found via another path (node/reqresp/gossipsub), the entry is dropped without an extra fetch
- Workers race their backoff sleeps against `requestTracker.cancellationToken` — cancelling a request (deadline, `stopCollectingForBlocksUpTo/After`, or `stop()`) propagates to file store workers immediately
- Removed the `foundTxs`/`clearPending` plumbing on `FileStoreTxCollection` — the tracker handles both implicitly
- `startCollecting` yields once after building its entry set, so a synchronous follow-up call (e.g. `markFetched` in tests, or the gossipsub-found path in production) lands before workers begin scanning

## Tests

- `tx_collection.test.ts`: collapsed the `TestFastTxCollection` subclass; all accesses go directly through `TxCollection`. Added "starts reqresp immediately when no nodes are configured" covering the node-exhaustion shortcut
- `file_store_tx_collection.test.ts`: rewritten for the new shape — no `start()`/`stop()`, lifecycle driven by the tracker (cancel to terminate workers). The new "workers exit when tracker is cancelled" test covers the per-call worker-pool teardown

Closes https://linear.app/aztec-labs/issue/A-933/tx-collection-dont-retrieve-transactions-that-have-already-been via the new synchronization with the request tracker.
…ims (#23165)

## Context

`SequencerPublisher` simulates each enqueued L1 action individually at enqueue time, then sends them bundled through Multicall3. The `propose` checkpoint action is validated at enqueue and send time (the latter via a `preCheck` mechanism), but in isolation and relying on overrides. There is no simulation of the multicall payload before sending it, so a reverting tx is most likely not caught.

This refactor:

- Replaces the per-request `preCheck` mechanism with a **single bundle-level `eth_simulateV1`** of the assembled `aggregate3` payload, run right before send. If any entry reverts in simulation it is dropped from the bundle, the reduced bundle is re-simulated to get an honest `gasUsed`, and the survivors are sent. Extracted to a `SequencerBundleSimulator`.
- Drops the entire propose simulate at enqueue (`simulateProposeTx`, `validateCheckpointForSubmission`). The bundle simulate covers it.
- Adds a new pre-broadcast `validateBlockHeader` call (calling `validateHeaderWithAttestations` with empty attestations + `ignoreSignatures: true`) that catches header-level bugs before we gossip the proposal to peers. Emits a new `header-validation-failed` event on failure.
- Drops every per-action simulate at enqueue (governance signal **and** slashing votes/executes). The bundle simulate at send time is the single decision point for every per-action revert. `simulateAndEnqueueRequest` is deleted. We were enqueuing votes even if the simulation failed, after all.
- Rewrites `sendRequestsAt` so it takes an L2 `SlotNumber`, derives the timestamp for the start of that slot, and sleeps until one L1 slot before that boundary, so we can land on the first L1 slot of the target L2 slot.
- Centralises `SimulationOverridesPlan` construction into a single `buildCheckpointSimulationOverridesPlan` helper.
  The plan **always** pins both the `pending` and `proven` chain tips (to the pipelined parent / invalidation target, or to the current snapshot when neither applies), so `STFLib.canPruneAtTime` cannot reintroduce a phantom prune during simulation.
- Makes `SimulationOverridesBuilder.merge` undefined-safe: explicit `undefined` fields in an incoming plan no longer erase previously-set values. `withPendingTempCheckpointLogFields` now accepts a partial subset of fields.
- Moves the payload-empty cache onto `GovernanceProposerContract`, next to its concern. Only `isPayloadEmpty=false` is cached (a CREATE2 redeploy could go empty → populated).
- Drops the old Multicall3 revert-recovery and per-request re-simulation machinery, since with `allowFailure: true` the top-level multicall is expected to land successfully. `Multicall3.forward` now throws `MulticallForwarderRevertedError` if the receipt reports a reverted status; the publisher does **not** rotate to a new publisher on that error (it is an on-chain failure, not a send failure). Adds a `Multicall3.hasCode` helper and a `simulateAggregate3` entrypoint used by the bundle simulator.
- `L1TxUtils.sendTransaction` fails fast if `txTimeoutAt` has already elapsed when called. `SequencerPublisher.forwardWithPublisherRotation` re-checks the deadline at the head of each rotation iteration so it doesn't keep cycling through publishers after the L2 slot's submission window has closed.
- The sequencer escape-hatch (`voteInSlotWithoutSyncing`) and full-escape-hatch (`voteOnSlotWithEscapeHatch`) vote-only paths now submit via `sendRequestsAt(slot)` rather than `sendRequests()`, so the bundle-simulate `block.timestamp` override matches the slot the EIP-712 vote signatures were generated for.

The intended outcome is a publisher with one explicit re-validation point (the bundle simulate), measurable bundle gas (from the bundle simulate's `gasUsed`), and dead/duplicated state-override plumbing removed.
## Resulting simulations after this refactor

The full list of simulation / gas-estimation steps that remain in a pipelined proposer slot, in execution order.

### Pre-build, in `Sequencer.doWork`

1. **`publisher.canProposeAt`** — rollup view call simulated with the centralised override plan. A cheap pre-check gate before any block-build work.
2. **`publisher.simulateInvalidateCheckpoint`** (conditional) — runs **only** if `syncedTo.pendingChainValidationStatus.valid === false` AND `!syncedTo.hasProposedCheckpoint`. Simulates the invalidate call against the rollup; the result becomes the `invalidateCheckpoint` package passed into `CheckpointProposalJob`. The previous code called this even when there was a proposed parent and discarded the result; this refactor adds the `!hasProposedCheckpoint` gate so we skip the wasted RPC.

### Per-slot, in `CheckpointProposalJob.proposeCheckpoint`

3. **CheckpointVoter votes** — `CheckpointVoter.enqueueVotes()` runs at the top of `execute()`, returning two promises that are awaited in parallel with block-build. It enqueues two kinds of votes via the publisher, **neither of which simulates at enqueue time** after this refactor:
   - **`enqueueGovernanceCastSignal`** — does an `isPayloadEmpty` pre-flight check (now on `GovernanceProposerContract`), then enqueues. No `eth_simulateV1`.
   - **`enqueueSlashingActions`** (one call per slashing action, type `vote-offenses` or `execute-slash`) — builds the request and enqueues. No `eth_simulateV1`.

   Real reverts on any of these are caught by the bundle simulate at send time, which drops the failing entry and proceeds with the survivors.
4. **`publisher.validateBlockHeader` (NEW: pre-broadcast)** — replaces the old `simulateProposeTx`-at-enqueue. Calls `validateHeaderWithAttestations` with empty attestations and `ignoreSignatures: true` so the rollup runs the header checks (archive match, slot match, timestamp, mana-min-fee, …) without needing real attestations.
   Runs **before** we gossip the proposal to peers. If it fails, abort the slot — log an error, emit `header-validation-failed`, don't broadcast, don't enqueue.
5. **`prepareProposeTx → validateBlobs estimateGas`** — kept as the blob-commitment **consistency check** (detects locally-built commitments not matching the blob sidecars). Returns `blobEvaluationGas`, which we stash on the propose `RequestWithExpiry` for use by the bundle gasLimit later. The simulate step that previously paired with this (`simulateProposeTx`) is removed.

### Background pipeline, in `waitForAttestationsAndEnqueueSubmissionAsync`

6. **`publisher.simulateInvalidateCheckpoint` (conditional)** — runs **only** in the fallback path where attestation collection failed AND the pending chain turned out to be invalid. Triggered from `CheckpointProposalJob.enqueueInvalidation`. This is the second, late trigger for invalidation simulation — distinct from step 2's pre-build trigger.

### Send time, in `sendRequestsAt(targetSlot)`

7. **Bundle simulate (NEW)** — a single `eth_simulateV1` of the assembled `aggregate3` payload, with `block.timestamp` overridden to the start of `targetSlot`, and state overrides = `[disableBlobCheck]` iff `propose` is in the bundle, `[]` otherwise. The per-entry result is decoded from the returned `Result[]`. This is the **only** post-pipeline-sleep re-validation; it replaces the per-request `preCheck` mechanism entirely.
8. **Bundle re-simulate (NEW, conditional)** — runs **only** when step 7 dropped at least one entry. Re-runs the bundle simulate on the reduced payload to get an honest `gasUsed`, and applies the same per-entry decode so additional drops are caught. If the re-simulate falls back (node doesn't support `eth_simulateV1`), the publisher sends the **first-pass survivors only** with `MAX_L1_TX_LIMIT`; the entries the first pass already proved would revert stay dropped and are reported as failed actions.

### Post-send

No diagnostic-only simulate paths remain.
`Multicall3.forward` throws `MulticallForwarderRevertedError` on a reverted receipt and re-throws on a send error; per-request revert re-simulation has been removed.

## Known caveats

- **`sendRequestsAt` early lead**: sleeps until `startOfTargetSlot - ethereumSlotDuration` to maximise inclusion in the first L1 block of the L2 slot. There is a known correctness risk: a tx mined in the L1 block immediately preceding the L2-slot boundary would revert via `ProposeLib.validateHeader`'s `slot == block.timestamp.slotFromTimestamp()` check. In practice the prior L1 block is usually already committed before this send wakes; if this proves unreliable in production, tune the lead down, especially in tests.
- **`validateBlockHeader` pre-broadcast coverage**: covers the `validateHeader` checks (archive, slot/timestamp, mana-min-fee, …) and the empty-attestation path of `validateHeaderWithAttestations`, but does NOT cover proposer-signature verification, inbox consumption (`Rollup__InvalidInHash`), or the `header.inHash` match. Those still execute inside the full `propose` and are caught by the bundle simulate at send time. The cost of a rare miss is one wasted broadcast.
- **Top-level `aggregate3` revert diagnostics removed**: the previous `Multicall3.forward` code decoded receipt-reverted reasons via `tryGetErrorFromRevertedTx` and did a per-request re-simulation on send-throw. Both paths are gone. With `allowFailure: true` and `Multicall3.hasCode` covering the no-bytecode case, a reverted forwarder receipt is genuinely unexpected (OOG, forwarder bug). The `MulticallForwarderRevertedError` throw is the only diagnostic surface — operators will need the transaction hash from the log to investigate.
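The drop-and-resimulate flow described in steps 7 and 8 above can be sketched as a small pure function. This is an illustrative model only: `Call3`, `SimResult`, and `simulateAndTrim` are assumed shapes, not the real `SequencerBundleSimulator` API, and the injected `simulate` callback stands in for the `eth_simulateV1` round-trip.

```typescript
// Illustrative sketch: simulate the bundle once, drop entries whose per-entry
// result reverted, then re-simulate the survivors so the reported gasUsed
// reflects the bundle that will actually be sent.
interface Call3 { target: string; callData: string; }
interface SimResult { success: boolean; gasUsed: bigint; }
type Simulate = (bundle: Call3[]) => SimResult[];

function simulateAndTrim(
  bundle: Call3[],
  simulate: Simulate,
): { survivors: Call3[]; gasUsed: bigint } {
  const firstPass = simulate(bundle);
  let survivors = bundle.filter((_, i) => firstPass[i].success);
  if (survivors.length === bundle.length) {
    // Nothing dropped: the first-pass gasUsed is already honest.
    return { survivors, gasUsed: firstPass.reduce((acc, r) => acc + r.gasUsed, 0n) };
  }
  // At least one entry was dropped: re-simulate the reduced payload and apply
  // the same per-entry decode so additional drops are caught.
  const secondPass = simulate(survivors);
  survivors = survivors.filter((_, i) => secondPass[i].success);
  return { survivors, gasUsed: secondPass.reduce((acc, r) => acc + r.gasUsed, 0n) };
}
```

The second pass matters because dropping an entry can change the gas (and, in principle, the success) of the entries that remain, so the first-pass `gasUsed` would be a lie about the bundle actually sent.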
…ses (#23249)

## Motivation

The `RevertCode` and `TxExecutionResult` types each carried three deprecated aliases (`APP_LOGIC_REVERTED`, `TEARDOWN_REVERTED`, `BOTH_REVERTED`) that all collapse to the same `REVERTED` value. Keeping them around adds noise, requires `no-duplicate-enum-values` eslint suppressions, and lets new code keep reaching for the old names.

## Approach

Removed the deprecated members from both enums and rewrote every call site to use `REVERTED` directly. Tests, fixtures, and a stale doc reference were updated to match.

## Changes

- **stdlib**: Drop the deprecated `APP_LOGIC_REVERTED`/`TEARDOWN_REVERTED`/`BOTH_REVERTED` members from `RevertCode` and `TxExecutionResult`.
- **simulator, pxe, aztec.js, end-to-end (tests)**: Replace remaining references with `REVERTED`.
- **simulator/docs**: Update a stale `APP_LOGIC_REVERTED` reference in the public-tx-simulation doc.
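For context on why the `no-duplicate-enum-values` suppressions were needed: TypeScript allows several enum members to share one value, which is exactly the alias pattern being removed. The numeric values below are illustrative, not the real `RevertCode` encoding.

```typescript
// Before: deprecated aliases all collapse to the same value as REVERTED
// (legal TypeScript, but it trips eslint's no-duplicate-enum-values rule).
enum RevertCodeBefore {
  OK = 0,
  REVERTED = 1,
  /** @deprecated */ APP_LOGIC_REVERTED = 1,
  /** @deprecated */ TEARDOWN_REVERTED = 1,
  /** @deprecated */ BOTH_REVERTED = 1,
}

// After the cleanup, only the canonical members remain:
enum RevertCodeAfter {
  OK = 0,
  REVERTED = 1,
}
```

Since the aliases were value-identical, rewriting call sites to `REVERTED` is behavior-preserving by construction.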
🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass.
…23259) Addresses a config-timing race in `epochs_invalidate_block.parallel.test.ts > "proposer invalidates multiple checkpoints"` that caused intermittent CI failures with `expect(validCount).toBeLessThan(quorum)` (e.g. 5/6 attestations when quorum=5).

## The race

The test reads `currentSlot` via `monitor.run()` right after waiting for the first checkpoint to land — that read can land anywhere within the current L2 slot, including near its end. It then computes `badSlot1 = currentSlot + 2` and races to push malicious config (`skipCollectingAttestations: true`, …) to that slot's proposer via `await node.setConfig({...})`.

`CheckpointProposalJob` is constructed with `this.config` passed by reference (`sequencer-client/src/sequencer/sequencer.ts:559`), and `Sequencer.updateConfig` reassigns `this.config = merge(...)` rather than mutating, so a job built before `setConfig` lands keeps the old config object. Under proposer pipelining (`PROPOSER_PIPELINING_SLOT_OFFSET = 1`, `epoch-cache/src/epoch_cache.ts:26`), the job for `badSlot1` is built during the last L1 slot of L2 slot `badSlot1 - 1`. With 32s L2 slots and 8s L1 slots, that's ~24s into the previous L2 slot — so if `currentSlot` was read late, badSlot1's proposer can snapshot the old config before our `setConfig` round-trip completes.

## Fix

- Wait for an L2 slot boundary (`monitor.waitUntilNextL2Slot()`) before reading `currentSlot`, so we start from the beginning of a slot rather than wherever we happened to land.
- Bump the gap from `+2/+3` to `+3/+4` for a second slot of margin. The cost is up to one additional L2 slot of test runtime in the worst case; the existing 8-slot wait window for both checkpoints still fits.
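The snapshot-by-reference behaviour at the heart of this race can be reproduced in miniature. The class and field names below are simplified stand-ins for the real `Sequencer`/`CheckpointProposalJob`, assumed for illustration: the job captures `this.config` by reference, and `updateConfig` reassigns rather than mutates, so a job built before `setConfig` lands keeps the old object.

```typescript
// Minimal reproduction of the config-timing race: reassignment means an
// already-built job never observes a later config update.
interface SeqConfig { skipCollectingAttestations: boolean; }

class MiniSequencer {
  private config: SeqConfig = { skipCollectingAttestations: false };

  buildJob(): { config: SeqConfig } {
    return { config: this.config }; // snapshot by reference at build time
  }

  updateConfig(patch: Partial<SeqConfig>): void {
    this.config = { ...this.config, ...patch }; // reassign, not mutate
  }
}
```

This is why the fix is timing-based (start from a slot boundary, widen the gap) rather than config-plumbing-based: the job's snapshot semantics are unchanged, the test just guarantees `setConfig` lands before the target job is built.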
`sendBatchRequest` became unused after removing the slow tx flow and the old tx reqresp method. This PR removes `sendBatchRequest` and cleans up code that becomes unused as a result. It does NOT remove subprotocol validator registration etc. from reqresp; that might be done in a follow-up depending on how https://linear.app/aztec-labs/issue/A-1014/block-txs-reqresp-validator-validaterequestedblocktxs-is-never-invoked is resolved.
`ProposalTxCollector` doesn't exist anymore. Cleans up unused files and renames the bench that now only tests `BatchTxRequester`.
Remove the single-checkpoint-proposal map in favour of the "by hash" variant.
Checks that inHash, archive, and sig ctx match. Should catch errors during construction.
…23257) Addresses [Phil's review comment](#23165 (comment)) on #23165: uses the injected `DateProvider` instead of `new Date()` for the pre-gas-estimation timeout check in `L1TxUtils.sendTransaction`, so that tests can drive the clock.
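The injected-clock pattern behind this change can be sketched as follows. `DateProvider` is the interface named in the PR, but the minimal shape below and the `hasDeadlinePassed` helper are assumptions for illustration, not the real `L1TxUtils` code.

```typescript
// Sketch of the injectable-clock pattern: the deadline check consults the
// provider rather than `new Date()`, so tests can pin or advance time.
interface DateProvider {
  now(): number; // epoch milliseconds
}

function hasDeadlinePassed(txTimeoutAt: Date | undefined, dateProvider: DateProvider): boolean {
  return txTimeoutAt !== undefined && dateProvider.now() >= txTimeoutAt.getTime();
}
```

In production the provider just wraps `Date.now()`; in tests a fixed or manually-advanced provider makes the fail-fast path deterministic.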
## Motivation

The local network sandbox (`aztec start --local-network`) historically ran without proposer pipelining, so the compose-routed e2e suite (`src/composed/*`, `src/guides/*`, cli-wallet flows, docs examples, playground) never exercised the pipelined sequencer path. Turning pipelining on revealed that each L2 slot took a full real-time slot (~72 s) before the L1 multicall fired, blowing up sandbox boot from ~30 s to ~5 min, because the existing `AnvilTestWatcher` triggers don't fire in the pipelined-publish window.

## Approach

The first commit flips `SEQ_ENABLE_PROPOSER_PIPELINING=true` on the three sandbox-test compose envs so every compose-routed test runs through the pipelined path. The second commit teaches `AnvilTestWatcher` about the proposer's target slot by hooking the sequencer's `block-proposed` event in `createLocalNetwork`; when the proposer has built a block destined for a slot beyond L1, the watcher warps L1 (and, via `cheatcodes.warp`, the injected date provider) forward, waking the pipelined publisher's `sendRequestsAt` sleep and the upstream `waitForValidParentCheckpointOnL1` wait.

`block-proposed` is used rather than the cleaner `state-changed → PUBLISHING_CHECKPOINT` because the latter only fires *after* `waitForValidParentCheckpointOnL1` unblocks — which is exactly what we are trying to break — so it would be circular.

## Changes

- **yarn-project/end-to-end, docs/examples/ts, playground (compose)**: add `SEQ_ENABLE_PROPOSER_PIPELINING=true` to the `local-network` / `aztec` service env so every compose-routed sandbox test runs pipelined.
- **yarn-project/aztec (`AnvilTestWatcher`)**: new `setProposedTargetSlot` setter and a `warpTimeIfNeeded` branch (gated on `isLocalNetwork`) that warps L1 to the target slot's timestamp when it's ahead of L1.
- **yarn-project/aztec (`createLocalNetwork`)**: subscribe to the sequencer's `block-proposed` event and forward `slot` to the watcher.
Verified locally: sandbox boot drops from ~5 min back to ~27 s under pipelining, and `e2e_local_network_example.test.ts` (both tests) passes in ~33 s.
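The watcher's warp decision reduces to simple slot arithmetic. The helper below is a hypothetical sketch of the `warpTimeIfNeeded` branch described above, with assumed parameter names; the real `AnvilTestWatcher` code may differ.

```typescript
// Sketch: given the proposer's target slot and the current L1 timestamp,
// compute the timestamp to warp L1 to, or undefined if no warp is needed
// (never warp backwards).
function warpTargetTimestamp(
  targetSlot: bigint,
  genesisTimestamp: bigint,   // L1 timestamp of slot 0 (seconds)
  aztecSlotDurationSec: bigint,
  currentL1Timestamp: bigint,
): bigint | undefined {
  const slotStart = genesisTimestamp + targetSlot * aztecSlotDurationSec;
  return slotStart > currentL1Timestamp ? slotStart : undefined;
}
```

Warping L1 to `slotStart` (and the injected date provider along with it) is what wakes the pipelined publisher's `sendRequestsAt` sleep without waiting out the real-time slot.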
… pipelining (#23302)

## Summary

Fixes the `e2e_p2p_broadcasted_invalid_block_proposal_slash` failure that has been blocking the `merge-train/spartan` train (run https://github.com/AztecProtocol/aztec-packages/actions/runs/25896899879, test log http://ci.aztec-labs.com/2bf4e2cd2d9e7944).

The test creates the malicious proposer first (auto-starting its sequencer) and only later creates the honest nodes and waits for P2P mesh. Under `enableProposerPipelining: true` (turned on for this test by #23070), the malicious proposer is selected for the very next slot, builds + broadcasts the invalid proposal one slot ahead, and lands the broadcast before the honest validators have joined the mesh. They then reject it at the gossipsub `checkpoint_proposal_validator` with `Penalizing peer for invalid slot number` (since their target slot has already moved past), so the `state_mismatch` slashing path never runs. The malicious sequencer then gets stuck on the failed publish (`Awaiting pending L1 payload submission`) and never proposes again before the test times out on `awaitOffenseDetected`.

This is the same race that #23070 fixed in `duplicate_proposal_slash.test.ts`; the same pattern is applied here:

- Create both the invalid proposer and the honest nodes with `dontStartSequencer: true`.
- After P2P mesh connectivity + committee formation, use `advanceToEpochBeforeProposer` to land one epoch before an epoch where the invalid proposer is scheduled.
- Start all sequencers, then `advanceToEpoch(targetEpoch, { offset: -AZTEC_SLOT_DURATION })` so the malicious slot fires while every node is online and at the same wall-clock slot.
- After `awaitOffenseDetected` on one node, poll `getSlashOffenses` across **all** nodes for `BROADCASTED_INVALID_BLOCK_PROPOSAL` — under pipelining a given receiver may have already advanced past the build slot when the proposal arrives, so we need to catch whichever node was still in the build slot.
The on-chain slash assertion (`rollup.listenToSlash`) is preserved unchanged. Full failure analysis: https://gist.github.com/AztecBot/39b69c1117f419145938ccd2c198f8e9

## Test plan

- CI: `e2e_p2p_broadcasted_invalid_block_proposal_slash` passes on `merge-train/spartan`.
- Local `./bootstrap.sh ci` / `fast` / `build` are not runnable in this container (no Docker socket and `$HOME` not writable for the container UID — `yarn install` fails on `corepack` mkdir, and parallel-bootstrap can't create `~/.parallel`). The fix is a direct port of a pattern already shipping green on `next` via the sibling `duplicate_proposal_slash.test.ts`. ClaudeBox log: https://claudebox.work/s/06a4929a1971beaf?run=1
Prevents the archiver from reporting invalid L2 tips by querying all chain tips within a single db transaction. Moves the responsibility for assembling the tip data to the block store itself, to minimize the number of queries to the db. Clamps the proven and finalized tips so that an incorrect L1 sync still results in finalized <= proven <= checkpoints, and adds explicit assertions that the tips are ordered.

Also adds a guard in the tips store that prevents deleting block hashes that are still alive by a given chain tip, instead of assuming that the finalized chain tip is always the oldest one. This should catch errors where the block stream breaks due to a finalized chain tip running ahead of a proven chain tip.

Note that this PR does NOT enforce ordering on the `L2Tips` struct itself, since consumers (i.e. the ones that report the "local" chain tips) may break this contract (see A-1061).

This PR is a simpler alternative to #22964. Fixes A-1018.
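The clamping invariant described above (finalized <= proven <= checkpoints) can be sketched as a pure function. The `L2Tips` field names below are illustrative simplifications (the real struct carries richer tip data), and `clampTips` is a hypothetical helper, not the archiver's actual code.

```typescript
// Sketch: clamp tips so an incorrect L1 sync can never report a proven tip
// ahead of the checkpoint tip, or a finalized tip ahead of the proven tip.
interface L2Tips {
  checkpoint: bigint;
  proven: bigint;
  finalized: bigint;
}

function clampTips(raw: L2Tips): L2Tips {
  const checkpoint = raw.checkpoint;
  const proven = raw.proven < checkpoint ? raw.proven : checkpoint;
  const finalized = raw.finalized < proven ? raw.finalized : proven;
  return { checkpoint, proven, finalized };
}
```

The clamp handles the bad-sync case defensively, while the separate explicit assertions catch genuine ordering bugs loudly instead of papering over them.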
…pelining (#23296)

## Problem

`Sequencer.tryVoteWhenEscapeHatchOpen` constructed `CheckpointVoter` with the wall-clock `slot` and called `publisher.sendRequestsAt(slot)`. Under proposer pipelining we are the elected proposer for `slot + 1` (`targetSlot`), and the multicall is expected to mine in `targetSlot`. `EmpireBase.sol::_internalSignal`:

- Verifies the EIP-712 digest against the **mining-slot** signature
- Checks `msg.sender == getCurrentProposer()` for the **mining slot**

Both fail under pipelining because we're the proposer for `targetSlot`, not `slot`. The multicall reverts silently inside Multicall3 and every governance/slashing entry is dropped.

## Fix

Thread `targetSlot` through `tryVoteWhenEscapeHatchOpen` and use it for both:

- `CheckpointVoter` (binds the EIP-712 signature to `targetSlot`)
- `publisher.sendRequestsAt(targetSlot)` (delays submission so the tx mines in `targetSlot`)

This mirrors `tryVoteWhenSyncFails` and `CheckpointProposalJob.execute`, which already use `targetSlot` correctly. When pipelining is disabled, `targetSlot == slot` (from `epochCache.getTargetEpochAndSlotInNextL1Slot()`), so `sendRequestsAt` resolves with no extra sleep and the legacy behaviour is preserved.

## Showcase

Re-enables `e2e_sequencer/escape_hatch_vote_only.test.ts` with `enableProposerPipelining: true` and `inboxLag: 2`. The test asserts `finalStats.votes >= slotsPassed` over the escape-hatch window — this assertion fails without the fix because no votes ever land.

Test-side adjustments for the pipelined timing model:

- Move event listener attachment to **after** the warp into the escape-hatch epoch. Checkpoint proposals in flight at warp time fail their L1 propose tx and are setup-warp artifacts, not vote-only window failures.
- Snapshot `slotAtMeasurement` for the vote-count lower bound, then wait for the L1 slot to advance two more so the trailing vote (signed in build slot N for target slot N+1) has time to mine before counting.
## Motivation

Under proposer pipelining, the checkpoint job opens a world-state fork with `closeDelayMs: 12_000`. If a pending-chain unwind or historical prune destroys that fork on the C++ side before the delay fires, `DELETE_FORK` rejects with `"Fork not found"`, producing a stray warn log and leaking the per-fork queue entry in the JS instance — one dead entry per affected pipelined slot.

## Approach

Make `close()` idempotent via an in-flight `closePromise`, and treat `"Fork not found"` as benign on close (the same precedent as the existing `"Native instance is closed"` suppression — fork IDs are monotonic and never reused). Also wrap the per-fork queue cleanup in `try/finally` in both the native and IPC instances so the JS-side queue map cannot outlive the native fork on error.

## Changes

- **world-state**: `MerkleTreesForkFacade.close()` is now idempotent and swallows `"Fork not found"`; per-fork queue cleanup in `NativeWorldStateInstance` and `IpcWorldStateInstance` moved to `finally`.
- **world-state (tests)**: Regression test that disposes a `closeDelayMs` fork, triggers an unwind that destroys it on the C++ side, and asserts no warn is logged and the queue entry is cleaned up.

Fixes A-1055
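The idempotent-close pattern can be sketched in a few lines. This is a simplified stand-in for `MerkleTreesForkFacade` under assumed shapes (the injected `deleteFork` models the native `DELETE_FORK` call): the first `close()` stores an in-flight promise that later calls reuse, and `"Fork not found"` is swallowed because fork IDs are monotonic and never reused.

```typescript
// Sketch: idempotent close with an in-flight promise and a benign-error
// filter, so concurrent or repeated close() calls hit the native side once.
class ForkFacade {
  private closePromise?: Promise<void>;

  constructor(private readonly deleteFork: () => Promise<void>) {}

  close(): Promise<void> {
    this.closePromise ??= this.deleteFork().catch(err => {
      // The fork was already destroyed natively (unwind/prune): benign on close.
      if (!/Fork not found/.test(String(err))) {
        throw err;
      }
    });
    return this.closePromise;
  }
}
```

Caching the promise (rather than a boolean flag) also makes concurrent callers await the same in-flight native call instead of racing a second one.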
## Summary

> **Depends on PR #23296** -- this PR is rebased on top of `palla/fix-b5-escape-hatch-slot-targeting`, which forward-ports the §6 B5 escape-hatch slot-targeting fix onto the modern `buildCheckpointSimulationOverridesPlan` + flat `l1Contracts` API. With B5 in, `e2e_sequencer/escape_hatch_vote_only` and the `e2e_sequencer/gov_proposal.parallel` "should vote even when unable to build blocks" test are now re-enabled under pipelining on this PR.

Extracts the tests known to pass under proposer pipelining from PR #23150, without flipping the global default. Tests opt into pipelining explicitly via a new `PIPELINING_SETUP_OPTS` helper. The global `enableProposerPipelining` default stays `false` on `merge-train/spartan`; this PR migrates tests file-by-file so each one is opted in by name.

This PR is intentionally scoped: it only includes tests whose pipelining-ready status is reasonably well understood. Tests that depend on shared base-class fixtures (`FeesTest`, `BlacklistTokenContractTest`, `CrossChainMessagingTest`, `DeployTest`, `FullProverTest`, etc.) keep their branch changes but are not yet wired to pipelining via their base class -- those base classes are used by tests outside this batch, and a blanket opt-in would over-migrate. They will be migrated in follow-up PRs.

Two commits:

1. **`test(e2e): opt unchanged tests into proposer pipelining`** -- adds `PIPELINING_SETUP_OPTS` to `fixtures.ts`, the small deploy-phase `accountsDeployMinTxs` conditional to `setup.ts`, and the explicit opt-in to every §1 test that calls `setup()` directly.
2. **`test(e2e): migrate tests that needed fixes into proposer pipelining`** -- the §2 tests with their branch fixes plus the infrastructure they depend on (the sequencer.ts B5 fix, dummy_service.ts loopback, sequencer-publisher.ts error logging, the sequencer-client READMEs rewrite, and bootstrap.sh / test_simple.sh timeout bumps).

The global default flip and the migration of base-class-using tests are intentionally deferred.
They will land separately once each batch can be verified independently.

---

## §1 -- Pipelining enabled and passing (no code changes)

Tests that pick up `enableProposerPipelining=true` from the explicit opt-in and pass without any per-test fix. This is the majority of the suite -- too many to enumerate. Examples include the unmodified `e2e_authwit`, `e2e_nft`, `e2e_amm`, `e2e_partial_notes`, `e2e_token_contract/*` (non-overflow), `e2e_offchain_*`, `e2e_orderbook`, `e2e_event_*`, `e2e_keys`, `e2e_avm_simulator` (after the suite-level timeout bump only), `e2e_pending_note_hashes_contract`, etc. None of these required test-level pipelining adaptations.

Pre-existing `it.skip`s in this bucket are unrelated to pipelining (they predate the branch) and were not touched:

- `e2e_token_contract/{transfer,transfer_in_private,transfer_in_public}` "transfer into account to overflow"
- `e2e_blacklist_token_contract/{transfer_private,transfer_public}` "transfer into account to overflow"
- `e2e_synching` "replay history and then do a fresh sync" / "a wild prune appears"
- `e2e_p2p/reex` "validators re-execute transactions before attesting"

## §2 -- Pipelining enabled and needed fixes

Tests that needed test- or fixture-level changes to pass under pipelining. All are currently passing under PR #23150.

**Fixture-level (`src/fixtures/fixtures.ts` + `src/fixtures/setup.ts`)**

- New `PIPELINING_SETUP_OPTS` preset exporting `inboxLag=2`, `minTxsPerBlock=0`, `aztecSlotDuration=12s`, `ethereumSlotDuration=4s`, `walletMinFeePadding=PIPELINED_FEE_PADDING` (30x), and `enableProposerPipelining=true`.
- `setup.ts` gains a small conditional so the deploy-phase `minTxsPerBlock` override uses `0` instead of `1` under pipelining (otherwise the chain stalls on alternating slots).

**Cheat-codes (`src/testing/cheat_codes.ts`)** -- already on `merge-train/spartan` via a cherry-pick of #23213.
**P2P (`src/services/dummy_service.ts`)**

- `notifyOwnCheckpointProposal` now invokes the all-nodes callback synchronously, mirroring libp2p loopback. Without this, the in-process e2e sequencer never sees its own proposal and the pipelined parent verification blocks indefinitely.

**Sequencer-client**

- `sequencer.ts::tryVoteWhenEscapeHatchOpen` -- §6 B5 fix: takes `targetSlot`, signs the voter for `targetSlot`, and delays submission via `sendRequestsAt(getTimestampForSlot(targetSlot))` when pipelining is enabled. Mirrors the existing `tryVoteWhenSyncFails` and `CheckpointProposalJob.execute` patterns. Plus a refactor of the `canProposeAt` simulation overrides via `SimulationOverridesBuilder`.
- `sequencer-publisher.ts` -- the error log on publisher exhaustion now includes the underlying viem error and tried-addresses context.

**Per-suite test fixes**

- `e2e_lending_contract` -- predictable-time stub, longer hook windows.
- `e2e_fees/private_payments` "pays fees for tx that dont run public app logic".
- `e2e_blacklist_token_contract/{burn, minting, shielding, transfer_private, transfer_public, unshielding}` -- 6/7 suites re-enabled (`access_control` still skipped, see §5).
- `e2e_contract_updates` -- all 4 tests re-enabled (covered by the §1 opt-in in this PR).
- `e2e_expiration_timestamp` invalidates tests -- L1-only `eth.warp(target, { resetBlockInterval: true })`, no publisher cascade.
- `e2e_ordering` -- switched from "latest block" to receipt-block reads; helper renamed to `expectLogsFromBlockToBe(logMessages, fromBlock)`.
- `e2e_fees/failures` -- snapshot `provenCheckpointBefore/After`, use `waitForProven` with an extended timeout, account for newly-proven checkpoint deltas in the reward math, read committed fee headers via `getCommittedProverFee` / `getCommittedBurn`.
- `e2e_fees/gas_estimation` -- pad `maxFeesPerGas` via `getPaddedMaxFeesPerGas(aztecNode)` in `beforeEach` to absorb fee-asset price evolution between snapshot and submission. 3/3 passing.
- `e2e_crowdfunding_and_claim` "cannot donate after a deadline" -- L1-only `cheatCodes.eth.warp(deadline+1, { resetBlockInterval: true })`.
- `e2e_deploy_contract/contract_class_registration` private-ctor variants -- thread `receipt.blockNumber` through `deployFn`, read logs from that specific block instead of "latest". 21/21 passing.
- `e2e_state_vars` DelayedPublicMutable -- the root cause was a slot-duration mismatch (`delay(4)` assumed `aztecSlotDuration=72s` from `DefaultL1ContractsConfig`; the fixture forces `12s` under pipelining). Replaced `delay(4)` with a loop that pumps no-op txs until `timestamp >= timestamp_of_change`, and asserted exact equality against `tx.data.constants.anchorBlockHeader.globalVariables.timestamp + newDelay - 1n`. Tight `toEqual`, no widened bound.
- `e2e_pending_note_hashes_contract` -- squash helpers use the latest *non-empty* block.
- `e2e_expiration_timestamp` -- include-by computation bumped by 2x `aztecSlotDuration`.
- `e2e_p2p/*` and `e2e_epochs/*` -- explicit `enableProposerPipelining: true` + `inboxLag: 2` on every test that builds its own config (so the behavior is intentional rather than implicit).
- `e2e_block_building` "processes txs until hitting timetable" -- replaced the legacy `canStartNextBlock` mock + single-deadline timetable with the pipelined sub-slot budget (`blockDurationMs=2000`, `enforceTimeTable=true`, `fakeProcessingDelayPerTxMs=500`). 10 simultaneous txs must span at least 2 distinct blocks; this would fail if the proposer reverted to single-block-per-slot or stopped enforcing sub-slot deadlines.
- `e2e_block_building` "assembles a block with multiple txs" (x2) -- pre-publish the contract class once and pass `skipClassPublication: true` on each per-tx deploy so the deploys don't all share the same `ContractClassRegistry.publish` nullifier and get RBF-rejected against each other. Also reset `blockDurationMs` in `afterEach` so the multi-block-per-slot state from the previous test doesn't leak.
- `e2e_block_building` "publishes two empty blocks" -- `buildCheckpointIfEmpty: true` so the proposer doesn't skip empty sub-slots; retry budget bumped from 10s -> 60s because empty checkpoints land every `aztecSlotDuration` (12s) rather than every legacy block.
- `e2e_epochs/epochs_mbps.parallel` "builds multiple blocks per slot with L2 to L1 messages" -- pipelined timing loses one sub-slot to attestation propagation; expectation dropped from `EXPECTED_BLOCKS_PER_CHECKPOINT=3` to `>= 2`, mirroring the sibling MBPS tests.
- `e2e_l1_with_wall_time` -- the test was explicitly passing `ethereumSlotDuration` from env (=12s), defeating the fixture's pipelining override (=4s). With `aztec=eth=12s`, pipelined timing can't fit propose+attest+publish in one Aztec slot. Removed the explicit `ethereumSlotDuration`; also wrapped `teardown` in `afterEach` so setup failures surface their real error.
- `e2e_p2p/add_rollup` re-enabled (entire describe; 1 test, passes in ~9:14 locally). AttestationTimeoutError still fires in some slots, but the bundled-multicall governance-signal preCheck is independent of the propose preCheck -- signals accumulate and reach quorum even when checkpoint proposals fail to attest.
- `e2e_pruned_blocks` "can discover and use notes created in both pruned and available blocks" -- restored the explicit `markAsProven` call (as it had pre-#21156) + a 2-block buffer for Anvil's `finalized = latest - 2` heuristic; test re-enabled and passes.
- `e2e_sequencer/escape_hatch_vote_only` re-enabled. Source fix at `sequencer.ts::tryVoteWhenEscapeHatchOpen` (see §B5 in PR #23150). Test-side: attach event listeners *after* the warp, and explicitly drain trailing in-flight votes before counting.
- `e2e_sequencer/gov_proposal.parallel` re-enabled (both tests). Two pipelining-aware adjustments: warp offset bumped to `nextRoundBeginsAtTimestamp - AZTEC_SLOT_DURATION - ETHEREUM_SLOT_DURATION`, and per-tx wait timeouts tuned for two slots of catch-up (proposer + L1 mine).
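The `e2e_state_vars` fix above replaces a hard-coded `delay(4)` with a timestamp-driven pump loop. The shape of that loop can be sketched as follows -- a minimal illustration only, where `ChainLike`, `pumpUntil`, `getTimestamp`, and `sendNoopTx` are hypothetical stand-ins for the real test helpers, and the 12s slot duration mirrors the fixture's pipelined config:

```typescript
// Sketch of the fix: instead of delay(4) (which baked in a 72s slot duration),
// pump no-op txs until the chain timestamp reaches the scheduled change.
interface ChainLike {
  getTimestamp(): Promise<bigint>;
  sendNoopTx(): Promise<void>; // mines a block, advancing the timestamp
}

async function pumpUntil(chain: ChainLike, timestampOfChange: bigint, maxIterations = 100): Promise<bigint> {
  let ts = await chain.getTimestamp();
  let iterations = 0;
  while (ts < timestampOfChange) {
    if (++iterations > maxIterations) {
      throw new Error(`Timestamp ${ts} never reached ${timestampOfChange}`);
    }
    await chain.sendNoopTx();
    ts = await chain.getTimestamp();
  }
  return ts;
}

// Minimal fake chain for demonstration: each no-op tx advances time by one 12s slot.
const SLOT = 12n;
let now = 0n;
const fakeChain: ChainLike = {
  getTimestamp: async () => now,
  sendNoopTx: async () => { now += SLOT; },
};

pumpUntil(fakeChain, 30n).then(ts => console.log(ts.toString())); // prints "36"
```

Because the loop is driven by the observed timestamp rather than a fixed block count, it is insensitive to the slot-duration mismatch that broke the original `delay(4)`, which is exactly why a tight `toEqual` assertion becomes possible afterwards.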
**Bash-level timeout adjustments (`end-to-end/bootstrap.sh`)** -- pipelined sequential dependent txs run at ~2x legacy latency:

- simple e2e default: 10m -> 20m
- `e2e_block_building`: 25m
- `e2e_avm_simulator`: 30m
- compose/web3signer: 20m
- HA: 30m
- `scripts/test_simple.sh` Jest `--testTimeout`: 5m -> 10m
- ~21 test files: per-file `const TIMEOUT` raised from 100/120/150/180s -> 300s.

---

## Out of scope

- **Global default flip**: PR #23150 flipped `enableProposerPipelining=true` everywhere. This PR keeps the default `false` and migrates per-test. The global flip will land in a follow-up.
- **§3 opt-outs** (`e2e_l1_publisher` "with attestations" describe, `epoch_cache.test.ts` non-pipelined branch coverage, demo `docker-compose.yml`): no change required while the default is `false`.
- **§5 still-skipped tests**: the tests in §5 of PR #23150's categorization (e.g. `e2e_blacklist_token_contract/access_control`, `e2e_publisher_funding_multi`, `e2e_fees/fee_settings`, etc.) remain at `merge-train/spartan` state.
- **Base-class fixtures** (`FeesTest`, `BlacklistTokenContractTest`, `CrossChainMessagingTest`, `DeployTest`, `FullProverTest`, `EpochesTest`, P2P fixtures): test files using these keep their branch-side changes but are not wired to pipelining via the base class -- those base classes are shared with tests not in this batch, and a blanket opt-in would over-migrate. Follow-up PRs will opt them in selectively.

Reference: PR #23150 (`palla/kill-non-pipelined-flow`) for full context on the categorization, the source-level bugs surfaced (§6 B1-B6), and per-suite investigation notes.
AztecBot added a commit that referenced this pull request on May 15, 2026
… composes

Both docs/examples/ts/docker-compose.yml and playground/docker-compose.yml ran with `SEQ_ENABLE_PROPOSER_PIPELINING=true` (added in #23277), but the sandbox is not yet configured to absorb pipelining's side effects:

- example_swap stalls on `wait for proven block N` because the proven tip stops advancing in an idle pipelined sandbox (the original PR #23253 dequeue, http://ci.aztec-labs.com/b08ac48286302949).
- aztecjs_advanced fails on `Cannot get L1 to L2 messages for checkpoint N: inbox tree in progress is N, messages not yet sealed` because under pipelining `AztecNodeService.simulatePublicCalls` reads L1->L2 messages from an in-progress checkpoint (http://ci.aztec-labs.com/419c4513023a1799). This is the same `simulator + inboxLag` mismatch already TODO'd in e2e_bot.test.ts and several e2e_fees tests.

Disable the flag in the two sandbox composes to unblock the spartan merge train; the aztec-up scripts (basic_install / bridge_and_claim / amm_flow) keep the flag and continue exercising pipelining in CI.
Builds on top of #23180
## Summary

Moves the data-withholding slash from the L1-prune path to a per-slot check at `slotStart(checkpoint.slot + slashDataWithholdingToleranceSlots)`, and removes the now-unnecessary `VALID_EPOCH_PRUNED` offense and `EpochPruneWatcher`. Per AZIP-7: validators are responsible for making tx data available, not for ensuring proofs land.

The new `DataWithholdingWatcher` ticks at quarter-eth-slot cadence and, for each published checkpoint older than `dataWithholdingToleranceSlots` (default 3), probes the local mempool for the txs in the checkpoint's blocks. Missing txs trigger a `DATA_WITHHOLDING` slash for the validators who actually attested to that checkpoint.

## Highlights

- **New** `DataWithholdingWatcher` (`yarn-project/slasher/src/watchers/data_withholding_watcher.ts`) with full unit-test coverage. Sentinel-style tick + restart floor (no KV).
- **Slot-keyed** `DATA_WITHHOLDING` -- moved from `'epoch'` to `'slot'` in `getTimeUnitForOffense`. Offense identity is now per-checkpoint, not per-epoch.
- **Single source of truth for tolerance**: `P2PClient.collectingMissingTxs` anchors its tx-collection deadline to `slotStart(block.slot + slashDataWithholdingToleranceSlots)`, so the collection effort runs to exactly the wall-clock instant at which the watcher renders its verdict. The ad-hoc `p2pMissingTxCollectionDeadlineMs` is removed.
- **A-525 deletions** bundled in: `OffenseType.VALID_EPOCH_PRUNED`, the `slashPrunePenalty` config + env var + spartan plumbing, the `EpochPruneWatcher` class + tests, and `valid_epoch_pruned_slash.test.ts`.
- **e2e test** rewritten in `data_withholding_slash.test.ts`: 4 validators with slashSelfAllowed; a tx is mined normally then stubbed missing on every node, the watcher fires, the slash executes on-chain, and the committee is kicked. Asserts slot-keyed offense identity + on-chain effect.

## Test plan

- `yarn workspace @aztec/slasher test` -- 76 tests pass (incl. 9 new `DataWithholdingWatcher` tests).
- `yarn workspace @aztec/stdlib test src/slashing` -- 55 tests pass after the keying flip.
- `yarn workspace @aztec/p2p test src/client/p2p_client.test.ts` -- 20 tests pass with the new slot-anchored deadline.
- `yarn build` clean across the monorepo.
- The e2e (`e2e_p2p_data_withholding_slash`) is in place but only runs in CI.

## Out of scope

- L1 contract changes -- none needed (the offense type code is purely off-chain).
- The `L2PruneUnproven` event/emitter is left in place; nothing subscribes to it after this PR, but the event itself stays available for future observers.
- Re-execution of checkpoints (covered by sibling A-1022).

Closes A-523. Closes A-525.
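The "single source of truth for tolerance" point above hinges on both the watcher and the tx collector computing the same wall-clock deadline from the checkpoint slot. A minimal sketch of that anchoring, where `GENESIS_TIME_SEC`, `SLOT_DURATION_SEC`, and the function names are illustrative assumptions (not the real `@aztec` APIs):

```typescript
// Sketch: both DataWithholdingWatcher's verdict time and P2PClient's
// tx-collection deadline derive from the same formula:
//   slotStart(checkpoint.slot + slashDataWithholdingToleranceSlots)
const SLOT_DURATION_SEC = 12n;          // assumed aztecSlotDuration
const GENESIS_TIME_SEC = 1_700_000_000n; // assumed L2 genesis timestamp

/** Wall-clock start (unix seconds) of a given L2 slot. */
function slotStart(slot: bigint): bigint {
  return GENESIS_TIME_SEC + slot * SLOT_DURATION_SEC;
}

const DATA_WITHHOLDING_TOLERANCE_SLOTS = 3n; // PR default

/** The instant the watcher renders its verdict for a checkpoint -- and the
 *  exact deadline to which tx collection for that checkpoint's blocks runs. */
function dataWithholdingDeadline(checkpointSlot: bigint): bigint {
  return slotStart(checkpointSlot + DATA_WITHHOLDING_TOLERANCE_SLOTS);
}

// A checkpoint at slot 100 gets its verdict 3 slots (36s) after the slot began.
console.log(Number(dataWithholdingDeadline(100n) - slotStart(100n))); // 36
```

Keying both sides to the same slot arithmetic is what makes the ad-hoc `p2pMissingTxCollectionDeadlineMs` removable: there is no separate millisecond budget that could drift from the watcher's verdict time.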
An agent claimed this test fails under pipelining. This shows that isn't the case. Fixes A-1058
The `advanceBlock` helper sent a noop tx every iteration but never proved anything. Under the pipelined slot cadence, the L1 contract's `aztecProofSubmissionEpochs=2` window expires mid-test, L1 auto-prunes the unproven epoch, and `InvalidTxsAfterReorgRule` drops the wallet's in-flight noop tx with "Tx dropped by P2P node". This is the failure documented in A-1056. The fix is to call `markAsProven()` after each successful `advanceBlock`, matching the precedent in `e2e_cross_chain_messaging/l1_to_l2.test.ts` test 2. Fixes A-1056
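The invariant behind the fix -- the proven tip must advance alongside the mined tip so the unproven backlog never outlives the proof-submission window -- can be sketched as follows. The `TestChain` interface and helper names here are illustrative stand-ins for the real e2e fixtures, not the actual test API:

```typescript
// Sketch of the A-1056 fix shape: prove after every advanceBlock so the
// aztecProofSubmissionEpochs=2 window never expires and L1 never auto-prunes
// the unproven epoch (which would drop the wallet's in-flight noop tx).
interface TestChain {
  advanceBlock(): Promise<void>; // sends a noop tx, mines a block
  markAsProven(): Promise<void>; // marks the chain proven up to the latest block
  unprovenBlocks(): number;
}

async function advanceBlocks(chain: TestChain, n: number): Promise<void> {
  for (let i = 0; i < n; i++) {
    await chain.advanceBlock();
    await chain.markAsProven(); // the fix: keep the proven tip moving every iteration
  }
}

// Minimal fake chain to demonstrate the invariant: the unproven backlog
// stays at zero after every iteration, instead of growing with each block.
let mined = 0;
let proven = 0;
const fake: TestChain = {
  advanceBlock: async () => { mined++; },
  markAsProven: async () => { proven = mined; },
  unprovenBlocks: () => mined - proven,
};

advanceBlocks(fake, 5).then(() => console.log(fake.unprovenBlocks())); // prints 0
```

Without the `markAsProven()` call inside the loop, `unprovenBlocks()` would grow by one per iteration, which is exactly the backlog that crossed the two-epoch window in the failing test.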
BEGIN_COMMIT_OVERRIDE
refactor(p2p): merge FastTxCollection into TxCollection with sequential pipeline (#23245)
refactor(publisher): bundle-level simulate; drop per-action enqueue sims (#23165)
refactor(stdlib): remove deprecated RevertCode/TxExecutionResult aliases (#23249)
test(e2e): fix race in 'proposer invalidates multiple checkpoints' (#23259)
fix: clean up old jobs regardless of pending status (#23260)
refactor(p2p): remove unused sendBatchRequest (#23273)
chore(p2p): remove proposal_tx_collector leftovers (#23276)
feat: slash truncated checkpoint proposals (#23250)
refactor: remove unused map in attestation pool (#23284)
chore(p2p): assert last block in checkpoint proposal is correct (#23274)
refactor(l1-tx-utils): use DateProvider for fail-fast timeout check (#23257)
feat(sandbox): support proposer pipelining in local network (#23277)
test(e2e): fix race in broadcasted_invalid_block_proposal_slash under pipelining (#23302)
fix(archiver): atomic getter for L2 tips (#23295)
fix(sequencer): use targetSlot in tryVoteWhenEscapeHatchOpen under pipelining (#23296)
fix(world-state): make fork close idempotent for pruned forks (#23298)
test(e2e): migrate passing tests to proposer pipelining (#23275)
chore: update dashboard (#23312)
chore: Revert "feat(sandbox): support proposer pipelining in local network" (#23313)
test: slash on bad attestation (#23184)
feat(slasher): per-slot data-withholding watcher (A-523, A-525) (#23116)
test(e2e): enable pipelining on e2e debug trace (#23301)
test(e2e): enable pipelining on l1-to-l2 test (#23300)
END_COMMIT_OVERRIDE