BAL-driven parallel commitment (PR #5 of the perf stack) #21416
Open
mh0lt wants to merge 4 commits into
08af255 to
50e1afd
Compare
…mitment

LoadFromBAL populates the commitment calculator's calcState from an EIP-7928 Block Access List instead of the per-tx VersionedWrites stream. The BAL declares the block's post-state up front, so the calculator can build the trie without waiting for execution to stream writes tx-by-tx. That is the prerequisite for running commitment fully parallel to execution.

For each touched account it takes the block-end value per field (the highest-tx-indexed change, via the generic finalChange helper) and feeds the existing ApplyWrites, reusing the SELFDESTRUCT / Deleted / EIP-161 routing rather than reimplementing it. Storage reads are ignored (commitment only needs the changed set).

Not yet modelled: the BAL carries no explicit SelfDestructPath or incarnation field, so account-deletion and fresh-contract-incarnation blocks diverge from the incremental path. Tracked as a Stage-1 follow-up.

TestLoadFromBAL_MatchesApplyWrites is the differential proof: loading calcState from a BAL produces byte-identical accumulated state to feeding the equivalent multi-write VersionedWrites stream through ApplyWrites. TestFinalChange covers the highest-index-wins helper.

This is Stage 1 of the parallel-commitment work; it adds no call site yet. LoadFromBAL is unused until the calculator wiring lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
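The highest-index-wins rule described in this commit can be sketched as follows. This is a minimal illustration, not the PR's code: the `change` type and its `TxIndex` / `Value` fields are hypothetical stand-ins for a BAL entry, and the real `finalChange` signature may differ.

```go
package main

import "fmt"

// change is a hypothetical stand-in for one BAL entry: the value a field
// holds after the transaction at TxIndex has run.
type change[T any] struct {
	TxIndex uint16
	Value   T
}

// finalChange returns the value of the change with the highest TxIndex,
// which is the block-end value for that field. ok is false when the field
// was not changed in the block. The slice is not assumed to be sorted;
// on equal indices the later slice entry wins.
func finalChange[T any](changes []change[T]) (value T, ok bool) {
	if len(changes) == 0 {
		return value, false
	}
	best := changes[0]
	for _, c := range changes[1:] {
		if c.TxIndex >= best.TxIndex {
			best = c
		}
	}
	return best.Value, true
}

func main() {
	// Three balance changes in one block, listed out of order.
	balance := []change[uint64]{
		{TxIndex: 3, Value: 70},
		{TxIndex: 7, Value: 55},
		{TxIndex: 1, Value: 100},
	}
	v, ok := finalChange(balance)
	fmt.Println(v, ok) // 55 true
}
```

Only the block-end value reaches the trie, so intermediate values (100 and 70 above) never need to be hashed.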
Default the engineapi test harness ethConfig to FcuBackgroundCommit=true. Async commit runs the post-FCU flush on a background goroutine, so a subsequent newPayload may read the parent SD either pre- or post-flush, which is the path functional tests must exercise. Sync commit makes every flush deterministic per-FCU and masks flush-timing bugs. A test needing sync commit can still opt out via EthConfigTweaker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…llel commitment Stage 2)

The commitment calculator inferred block boundaries from the txResult / blockResult stream. Give it an explicit per-block heads-up: a blockRequest carrying the block identity and the block's BAL, sent by the dispatch layer on its own channel. The channel is separate from the result fan-out so a request is never trapped behind a prior block's txResults.

The calculator's loop now multiplexes the result channel and the blockRequests channel; handleBlockRequest records the per-block mode (BAL-driven when the block has a BAL and BAL I/O is enabled, else incremental) into a pending map, cleared on the matching blockResult.

This stage is inert plumbing: the mode is recorded but not yet acted on; compute behaviour is unchanged. Verified: engineapi reorg test shows an identical pass rate with and without this change.

Also corrects the LoadFromBAL docstring: account deletion / incarnation need not be modelled. BALs exist only for Amsterdam+ blocks and post-EIP-6780 SELFDESTRUCT cannot delete a pre-existing account at block scope, so this is not a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
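The two-channel loop and pending map from this commit might look roughly like the sketch below. Only `blockRequest`, `blockResult` and `handleBlockRequest` are names taken from the commit message; every field, the `commitMode` type, `handleBlockResult` and `run` are assumptions made for illustration.

```go
package main

import "fmt"

type commitMode int

const (
	modeIncremental commitMode = iota
	modeBALDriven
)

// blockRequest and blockResult are simplified, hypothetical shapes.
type blockRequest struct {
	BlockNum uint64
	HasBAL   bool
}

type blockResult struct{ BlockNum uint64 }

type calculator struct {
	balEnabled bool                  // stands in for "BAL I/O is enabled"
	pending    map[uint64]commitMode // per-block mode, keyed by block number
}

// handleBlockRequest records the mode for a block; nothing acts on it yet.
func (c *calculator) handleBlockRequest(r blockRequest) {
	mode := modeIncremental
	if r.HasBAL && c.balEnabled {
		mode = modeBALDriven
	}
	c.pending[r.BlockNum] = mode
}

// handleBlockResult returns the recorded mode and clears the pending entry.
// A block with no recorded request falls back to incremental (the zero value).
func (c *calculator) handleBlockResult(r blockResult) commitMode {
	mode := c.pending[r.BlockNum]
	delete(c.pending, r.BlockNum)
	return mode
}

// run multiplexes both channels until each has been closed.
func (c *calculator) run(requests <-chan blockRequest, results <-chan blockResult) []commitMode {
	var modes []commitMode
	for requests != nil || results != nil {
		select {
		case r, ok := <-requests:
			if !ok {
				requests = nil // a nil channel is never selected again
				continue
			}
			c.handleBlockRequest(r)
		case r, ok := <-results:
			if !ok {
				results = nil
				continue
			}
			modes = append(modes, c.handleBlockResult(r))
		}
	}
	return modes
}

func main() {
	requests := make(chan blockRequest)
	results := make(chan blockResult)
	go func() {
		requests <- blockRequest{BlockNum: 10, HasBAL: true}
		results <- blockResult{BlockNum: 10}
		requests <- blockRequest{BlockNum: 11}
		results <- blockResult{BlockNum: 11}
		close(requests)
		close(results)
	}()
	c := &calculator{balEnabled: true, pending: map[uint64]commitMode{}}
	fmt.Println(c.run(requests, results), len(c.pending))
}
```

Keeping requests on their own channel means the heads-up for block N+1 can be read while results for block N are still queued, which is the property the commit message calls out.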
… Stages 3-5)

The commitment calculator can now fold a block from its BAL ahead of the per-tx result stream, overlapping the block's execution. This is the parallel commitment win (max(exec,trie) instead of exec+trie).

Stage 3: BAL-driven fold. handleBlockRequest selects BAL-driven mode for a block carrying a BAL (gated on BAL_DRIVEN_COMMITMENT). maybeFoldAhead folds it once the fold gate is open: block N-1's committed state must be in sd.mem, which blockResult(N-1) signals (the batch's first block has its baseline from the prior cycle). foldBlockFromBAL loads a fresh calcState from the BAL, computes the root via the shared computeRoot path, and verifies it against the block header's stateRoot.

Stage 4: calculator failure stops execution. fail() calls the executor's CancelFunc so the exec loop's ctx.Done branches fire eagerly instead of running ahead behind the 2048-deep result buffer; the error is also published to the stage loop.

Stage 5: incremental fallback + dual-compute shadow mode. With BAL_DRIVEN_COMMITMENT off (the default), every block stays incremental and the consensus path is byte-for-byte unchanged. BAL_SHADOW_COMPUTE recomputes each BAL-driven block the incremental way at blockResult(N) and asserts the two roots match before publishing. This is the consistency net; divergence fails the block.

blockRequest carries lastTxNum so the calculator can position asOfReader and ComputeCommitment when folding ahead of blockResult(N).

BAL-driven mode is off by default and stays off until the shadow-compute check has proven the BAL-driven root matches across a full validation window.

Verified: build + lint clean; make test-all green except the pre-existing async-commit engineapi flake (tracked separately).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
50e1afd to
71a46a4
Compare
Important
DRAFT: does NOT build on this base. The commits use the typed `VersionedWrite[T]` API (`ValU256`, `ValU64`, `ValBytes` fields), which lands as part of the versionedio refactor (PR #6 of the perf stack, currently unposted). Build error on the current base:

Pushed for visibility / review of the BAL-driven commitment design. CI will fail until the typed-VW prerequisite lands.
Merge order (when the dep lands): #21380 → #21386 → #21387 (already merged) → typed-VW PR → this.
Summary
Pipelines commitment computation alongside EVM execution by feeding the commitment calculator the BAL (block access list) at block-arrival, instead of waiting for execution to publish writes. Closes the structural piece of issue #19791 — "perf: pipeline commitment with execution in parallel path".
Today execution and commitment serialize: EVM finishes a block → publishes the writeset → commitment hashes it. With this PR, the calculator receives requests at block-arrival (BAL declares the post-block leaf set), runs in parallel with the EVM, and folds with the EVM's actual writes when both finish. Steady-state cost shifts from `exec + trie` to `max(exec, trie)`.

Mechanism: 4 stages

1. `calcState.LoadFromBAL` (8ead839656): the parallel commitment calculator's `calcState` learns to seed itself from a BAL-declared leaf set, instead of only the exec-published writeset. Lays the foundation for early-start commitment.
2. (48eb2b0173): flips the engineapi test harness to run with `FcuBackgroundCommit=true` by default, so the existing assertoor suite exercises the parallel-commitment path in CI.
3. (2a57d9cd66): wires the BAL-arrival hook to enqueue per-block commitment requests at the calculator. Calculator begins hashing in parallel with execution.
4. `BAL_SHADOW_COMPUTE` (08af2551f5): folds the parallel calculator's root with the EVM's actual writes when both finish; emits `BAL_SHADOW_COMPUTE` divergence metrics so the parallel path is observable against the serial reference until we trust it in steady state.

Files (7)

- `execution/stagedsync/calc_state.go`
- `execution/stagedsync/committer.go`
- `execution/stagedsync/exec3.go`, `exec3_parallel.go`
- `execution/stagedsync/bal_load_test.go`
- `execution/engineapi/engineapitester/engine_api_tester.go`
- `common/dbg/experiments.go`: `BAL_SHADOW_COMPUTE` env knob

7 files, +221/-33.
Commits (4)
Gating + safety
`BAL_SHADOW_COMPUTE` (env knob, default off in production). When unset, the parallel calculator runs shadow-only: its root is computed but not consumed; the serial reference still drives correctness. Divergence is logged so the parallel path is observable against the reference before we cut over.

Out of scope

`BAL_SHADOW_COMPUTE`: stays off in this PR. Cutover lands once shadow-compute observability shows zero divergence in steady state.

Related
🤖 Generated with Claude Code