Skip to content

Parallel-exec correctness fixes (PR #3 of the perf stack)#21387

Merged
AskAlexSharov merged 25 commits into
mh/perf-statecache-lru-prfrom
mh/perf-parallel-fixes-pr
May 25, 2026
Merged

Parallel-exec correctness fixes (PR #3 of the perf stack)#21387
AskAlexSharov merged 25 commits into
mh/perf-statecache-lru-prfrom
mh/perf-parallel-fixes-pr

Conversation

@mh0lt
Copy link
Copy Markdown
Contributor

@mh0lt mh0lt commented May 24, 2026

This PR ships the parallel-exec correctness fixes from mh/parallel-exec-fixes onto the perf stack, packaged as a focused PR on top of #21386 (StateCache LRU) which itself stacks on #21380 (State Cache Consolidation).

Important

Stacks on #21386#21380. Base is mh/perf-statecache-lru-pr, NOT main. Merge order: #21380#21386 → this PR.

Important

Do not merge until CI is green on both parallel and serial. Same gating rule as #21380 / #21386.

Scope — 13 commits from mh/parallel-exec-fixes

Brought in via a merge commit so the bisection trail is preserved.

sha what it fixes
25053e38e9 parallel SD-of-pre-existing-contract — the 197-line foundational fix
2e2bf3ccc0 clean exit when single-block batch already covered maxBlockNum
6e451f5ed2 don't emit StoragePath=0 writes from IBS.Selfdestruct
616a4fa0a8 clear calc Deleted on a non-SD account write even when zero
d99f2f704d gate known parallel-exec failures behind EXEC3_PARALLEL (#21136)
34e83e82b7 install per-block changeset accumulator before any of the block's writes
b340d7e592 drop stale sd.mem 'Trim old version entries' comment
629cc23566 O(1) CollectorWrites fee-balance update, drop dead VersionedWrites.SetBalance
a0ecfc7e12 first-match-wins in CollectorWrites BalancePath index
445f97e446 emit EIP-7708 Burn log under parallel-exec when coinbase self-destructs
5e1f5fa901 mirror ReadAccountData SD-revival check into versionedRead
a5dc83f509 drop two stale EXEC3_PARALLEL t.Skips
8af901104f drop TestReceiptHashFromRPC unit-suite RPC integration test

Merge conflicts resolved

3 files, 8 regions — all resolved by keeping HEAD's typed-readset / per-path revival shape and confirming HEAD already absorbs each fix's intent. See the merge commit message (cfc4ec1418) for the per-region rationale.

🤖 Generated with Claude Code

mh0lt and others added 25 commits May 12, 2026 19:09
…ntract fixes

Fixes the SD/recreate cluster under EXEC3_PARALLEL=true:
TestSelfDestructReceive, TestCVE2020_26265, TestEIP161AccountRemoval,
TestDeleteRecreateAccount, TestDeleteRecreateSlots,
TestDeleteRecreateSlotsAcrossManyBlocks.

normalizeWriteSet:
- when an address was self-destructed by an earlier TX in the block and
  this TX re-creates it, fill missing account fields with post-destruction
  defaults instead of the stale pre-SD values still in the versionMap
- drop StoragePath writes for SD'd addresses (an SSTORE made after a
  SELFDESTRUCT in the same TX is wiped); the SD cascade re-emits explicit
  per-slot deletes
- gate EIP-161 empty-account-removal on SpuriousDragon being active
- storage no-op filter: when the address was SD'd since the slot's last
  write, the baseline is 0, not the value still in the versionMap/domain

applyVersionedWrites / BlockStateCache: route the selfdestruct
account+code+storage-prefix delete through the block cache so it's
recorded in writeLog order — a later SELFDESTRUCT must supersede an
earlier put for the same address in the same block, which a direct domain
delete (applied before the block-end Flush replays the earlier put) did
not.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…red maxBlockNum

Fixes the engine-API cluster under EXEC3_PARALLEL=true: TestEngineApi*
(Builder, BAL, Fcu, NewPayload, multi-block sequence, reorg recovery)
and TestEthGetLogsDoNotGetAffectedAfterNewPayloadOnSideChain — all
previously failed with "unexpected state step has more work".

When the exec loop exits through execLoopExitCheck (rws.ResultCh closed
or rws.Drain returned closed — the clean-drain paths for a small batch),
it doesn't flip reachedMaxBlock, since that flag is only set by
execLoopShouldExit / the maxBlockNum check in the main loop's
blockResult branch. For a single-block fork-validation batch the result
heap empties before that branch ever fires, so the apply loop's
channel-close path saw reachedMaxBlock=false and returned
ErrLoopExhausted — the stage loop interpreted that as "has more work"
and the engine API surfaced "unexpected state step has more work".

Treat the apply-loop exit as clean when lastBlockResult.BlockNum has
reached maxBlockNum and no block is missing from appliedBlocks: every
requested block was applied, so there is no more work regardless of
which exec-loop branch closed the channel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes the eip6780-selfdestruct gas cluster under EXEC3_PARALLEL=true (EEST
cancun/eip6780_selfdestruct/test_create_selfdestruct_same_tx[_increased_nonce],
test_selfdestruct_created_in_same_tx_with_revert,
test_selfdestruct_created_same_block_different_tx, test_selfdestruct_pre_existing,
frontier/create/test_create_suicide_store).

IBS.Selfdestruct recorded versionWritten(StoragePath, key, 0) for every
dirty slot to feed the parallel commitment calculator a per-slot DELETE.
But versionedRead consults versionedWrites before the stateObject, so for
pre-Cancun (and CALL-based SELFDESTRUCT generally) — where the account
stays alive until end-of-tx and re-entered code reads its storage — those
spurious zero writes made the re-read return 0: wrong gas (SSTORE_SET vs
dirty-update, +19900 per affected slot, so block gasUsed over-counted by a
multiple of 19900) and a wrong written value.

normalizeWriteSet's SD cascade (sdStorageSlots = vm.StorageKeys ∪
domainStorageKeys, added earlier) already emits the per-slot DELETEs the
calculator needs, so the Selfdestruct-side emission was redundant. Drop it.

Repurposes the obsolete TestSelfDestructRecordsStorageDeletes unit test
into TestSelfDestructKeepsDirtyStorageReadableSameTx, which asserts the
fixed behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…en when zero

The commitment calculator kept Deleted=true when a self-destructed
address later received a zero-valued account-field write — that was a
within-tx guard for the trailing BalancePath=0 that IBS.Selfdestruct
emits. But across transactions a zero account-field write means the
address is genuinely alive at end of tx (e.g. a 0-value transfer that
re-creates a previously-destroyed address as an empty account on a
pre-EIP-161 fork), so Deleted must be cleared.

ApplyWrites now pre-scans the tx's writeset for SelfDestructPath=true (last
entry wins) and only suppresses the Deleted-clear for addresses this tx
self-destructed; for any other address an account-field write clears
Deleted regardless of value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…LEL, tracked in #21136

Lands the serial/parallel exec-mode CI matrix green. These 6 failures are
all parallel-only (serial coverage unaffected) and all in the same family
— the post-execution wrapper re-derives the per-tx writeset/commitment and
diverges from serial's MakeWriteSet on a SELFDESTRUCT/recreate, fork-
transition, or coinbase edge case:

- EEST frontier/opcodes/test_double_kill
- EEST cancun/eip6780_selfdestruct/test_dynamic_create2_selfdestruct_collision_two_different_transactions
- EEST prague/eip7002_el_triggerable_withdrawals/test_system_contract_deployment
- EEST prague/eip7251_consolidations/test_system_contract_deployment
- TestLegacyBlockchain/ValidBlocks/bcEIP3675/tipInsideBlock (gasUsed over-count ~4800)
- TestEIP7708BurnLogWhenCoinbaseSelfDestructs (minIBS doesn't know coinbase was SD'd)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ore any of the block's writes

Under EXEC3_PARALLEL the parallel executor keeps a single per-block
changeset accumulator (pe.currentChangeSet) installed/cleared by the exec
loop. It was installed only at exec start (for startBlockNum) and at the
rotation site after each block's blockResult ("install for blockResult+1
if its executor is already in the map"). But a block whose executor isn't
in the map at that moment — e.g. processRequest scheduling the first block
of a new request after pe.blockExecutors went empty mid-batch — gets
scheduled with no accumulator installed; its ApplyStateWrites then write
into a nil domainWriters[i].diff, so the block's changeset stays empty and
its diffset is never saved/flushed. A later unwind/FCU across that block
then fails with "domains.GetDiffset(N, 0x..): not found" — load-sensitive
(depends on whether the executors map empties in that window), so it
surfaces as intermittent fork-choice / reorg / RPC test flakes under
concurrent test load. No memory data race involved.

Make changeset capture robust: track pe.currentChangeSetBlock, and
ensureChangesetAccumulator(blockNum) installs a fresh accumulator iff one
isn't already installed for blockNum. Call it from processResults (the
single exec-loop point that drains worker results) before each
be.nextResult, so every block's accumulator is installed before its first
apply regardless of how the block was scheduled — plus at exec start, at
the rotation site (fast path), and at the blockResult save point (empty
blocks). All SetChangesetAccumulator mutations stay in the exec loop
(single-writer invariant unchanged).

Tracked in #21138.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…' comment

It described trimming that was never implemented and would be wrong once
changeset generation moves post-validation (the calc must keep sd.mem's
versioned history for GetAsOf). No code change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ead VersionedWrites.SetBalance

After a tx finalizes, parallel exec re-applies the fee-adjusted balances
into the result's CollectorWrites. CollectorWrites is a flat slice, so
replacing a BalancePath entry by address is a linear scan; doing that
once per addWrites entry is O(len(addWrites)·len(CollectorWrites)). When
finalize is a full block-end IBS reconstruction both can be ~one entry
per account the block touched, so a block whose tx pays ~100k accounts
(execution/tests' invalid-receipt-hash-high-mgas corner) hits ~10^10
comparisons — TestInvalidReceiptHashHighMgas ran >11 min under
EXEC3_PARALLEL + -race and tripped mock_cl's 10-min per-call deadline
(serial: ~10s). Index CollectorWrites' BalancePath entries by address
once and update in O(1); the test now matches serial (~12s vs ~10s,
no -race). VersionedWrites.SetBalance had no other callers — removed.

No behaviour change: the new path mutates the same *VersionedWrite in
place and appends a new entry for an unseen address, exactly as the
linear-scan SetBalance did, including same-address re-updates within the
loop. Tracked alongside the parallel-exec CI matrix (#21017 / #21136).
… index

Defensive tweak to the O(1) CollectorWrites balance-index (629cc23): keep
the first matching BalancePath entry per address in the pre-scan, exactly
mirroring the linear-scan CollectorWrites.SetBalance(addr) it replaced
(which updated the first match). Guards against a CollectorWrites that holds
more than one BalancePath entry for the same address.
… coinbase self-destructs

The parallel-exec post-apply path in finalizeTxSimple was applying TxOut
(including SelfDestructPath=true for a coinbase contract that destructed
itself in the same tx), then calling postApplyMessageFunc on a fresh IBS
that had the coinbase marked selfdestructed with balance=0 — because the
parallel worker runs with shouldDelayFeeCalc=true and the priority fee is
accumulated onto the version-map base separately, never landing on this
post-apply IBS. LogSelfDestructedAccounts' GetRemovedAccountsWithBalance
filters for non-zero balance, so the residual-balance Burn log specified
by EIP-7708 was never emitted; receipts.DeriveSha drifted from the serial
producer's; consumers rejected the block as BadBlock.

Mirror serial-exec (txn_executor.go:674): after ibs.ApplyVersionedWrites,
credit FeeTipped to coinbase and FeeBurnt to burnt (when London applies).
AddBalance leaves the selfdestruct flag intact (Selfdestruct only fires on
the addr→clear transition, not on subsequent balance writes), so the
account ends up selfdestructed with the priority fee as residual balance —
exactly the state the serial path produces. LogSelfDestructedAccounts then
sees the account in source 2 (ibs.GetRemovedAccountsWithBalance) and emits
the Burn log.

Also capture the post-apply IBS's logs back onto result.Logs — without
this the burn log is stranded on the discarded ibs and never reaches the
receipt, so even with the emission fix the receipts root would still
diverge.

Removes the t.Skip on TestEIP7708BurnLogWhenCoinbaseSelfDestructs guarded
behind dbg.Exec3Parallel. Closes the case under #21136; the remaining
items in that issue (if any) are unrelated.
…edRead

versionedStateReader.ReadAccountData already handles "self-destructed
then revived" semantics (versionedio.go:273-293): when versionMap shows
SelfDestructPath=true at depIdx, but BalancePath/NoncePath/CodeHashPath
has a write at TxIndex > depIdx, the account was re-created and the SD
must not short-circuit the read.

versionedRead — the generic worker-side read path used by ibs.GetBalance,
GetNonce, GetCodeHash etc. — was missing the same check. Workers got the
post-revival reads wrong, returning zero for any field on an SD-flagged
address regardless of a later revival write.

This surfaces under EIP-161 + delayed-fee parallel exec:
finalizeTxSimple emits SelfDestructPath=true for the coinbase when a tx
with FeeTipped=0 touches it (empty-removal); a later tx whose FeeTipped
> 0 writes BalancePath at the higher TxIndex. The serial equivalent is
AddBalance(amount>0) on a previously-deleted stateObject, where
GetOrNewStateObject implicitly recreates a fresh object — the next tx
that does BALANCE(coinbase) sees the post-revival balance. Under
parallel exec, the worker's versionedRead short-circuited on the SD
entry without consulting the revival writes; the BALANCE returned
zero, the contract SSTORE'd zero into a previously-non-zero slot, and
EIP-2200's SSTORE_CLEAR_REFUND (4800 gas) fired spuriously —
TestLegacyBlockchain/ValidBlocks/bcEIP3675/tipInsideBlock failing with
gas exactly 4800 short under ERIGON_EXEC3_PARALLEL=true.

The fix mirrors ReadAccountData's revival check (LatestTxIndex on
BalancePath, NoncePath, CodeHashPath > destructTxIndex). When any of
those have a later write, fall through to the normal versionMap.Read
of the requested path — making the worker observe the revived state
exactly as serial does.

Verified:
- bcEIP3675/tipInsideBlock 5/5 parallel (was 0/5)
- TestEngineApiBAL* 8/8 parallel (no regression)
- SD-family: TestDeleteRecreate*, TestSelfDestruct*, TestEIP161*,
  TestCVE2020_26265, TestEIP7708 all pass parallel
- Full TestLegacyBlockchain pass parallel
- make lint clean
Addresses taratorio's review comments on PR #21153 — "these tests should
not be skipped" / "this one seems to be the last Skip pending removal".

(1) execution/tests/block_test.go: drop the SkipLoad of
ValidBlocks/bcEIP3675/tipInsideBlock.json under dbg.Exec3Parallel.

The 4800-gas mismatch was the EIP-161 coinbase empty-removal +
delayed-fee race: tx 0's finalize emits SelfDestructPath=true for the
coinbase (empty-removal: FeeTipped=0 + coinbaseEmptyPre=true); tx 1's
finalize then credits FeeTipped > 0, reviving the account; tx 2's
worker reads coinbase BALANCE and versionedRead's SD-check branch
short-circuits to zero without consulting the later revival write —
so the contract SSTOREs 0 over a previously-non-zero slot and
EIP-2200's SSTORE_CLEAR_REFUND (4800) fires spuriously.

Resolved by the prior commit (cherry-picked from #21177) which mirrors
ReadAccountData's existing revival check into versionedRead — the
fixture now passes 5/5 under ERIGON_EXEC3_PARALLEL=true. The bug
described in #21136 (tipInsideBlock parallel-exec gas mismatch) is
fixed; the issue can be closed when this lands.

Also drops the unused dbg import.

(2) rpc/jsonrpc/gen_traces_test.go: drop the dbg.Exec3Parallel-gated
t.Skip in TestGeneratedTraceApiCollision.

The skip referenced "intra-block SELFDESTRUCT + CREATE2 reincarnation
not yet supported by parallel executor — fixed on exec3/remove-rwtx-
threading branch". Verified the test passes 8/8 under
ERIGON_EXEC3_PARALLEL=true and 3/3 serial on the current branch —
the capability has landed somewhere upstream of this PR's base. The
trace output for the collision case matches the expected golden under
both execution modes.

Also drops the unused dbg import.

The remaining unconditional t.Skip in
execution/commitment/hex_patricia_hashed_test.go:3366
(Test_ModeUpdate_SiblingConsistency, #20961) is intentionally left as
a regression marker — confirmed the underlying ModeUpdate vs ModeDirect
sibling-encoding divergence still reproduces; the fix is in commitment
cell-cache invalidation, out of scope for the parallel-exec PR stack.
…gration test

This was an integration smoke test that required a running erigon node on
localhost:8545 holding mainnet block 24363971. It was misclassified as a
unit test:

- When no node is running, rpcCall t.Skip's — best-case it just adds noise.
- When *any* other node is running (different chain, unsynced, wrong block),
  the test falls through to compare an empty-trie hash against a zero hash
  and produces a fake pass/fail unrelated to erigon's DeriveSha logic.

DeriveSha itself has unit coverage that doesn't depend on RPC. This file
tested a node setup, not erigon code, so it doesn't belong in
\`go test ./...\` — especially under the new EXEC3_PARALLEL matrix where the
test harness is doubled and any shared-runner 8545 listener becomes a
flake source.
…CodeDomain

Adds a third map (`ethHashToCode`) to CodeCache, keyed by the 32-byte
Ethereum codeHash (keccak256). New methods `GetByEthHash` and
`PutWithEthHash` expose direct L2b access without going through the
addr→maphash→code two-level path. The byte storage duplicates L2 in the
worst case (2x code-bytes memory at the cap); accepted for the per-key
fast path on many-addrs-one-code workloads.

`SharedDomains.GetLatest(CodeDomain, ...)` consults L2b transparently:
when the addr-keyed cache misses, resolve the codeHash from the
AccountsDomain (typically warm because the EVM just loaded the account),
probe `stateCache.GetCodeByHash` before falling through to the file
accessor stack. On miss, fill both L1 and L2b via PutCodeWithHash. The
fast path is unchanged.

Workload shape this targets: many addresses sharing one codeHash
(proxies, factory-deployed clones, ERC-20 holders, OpenZeppelin
templates). Today's addr-keyed cache misses on every fresh address even
when the bytecode is already cached. With this change a single L2b
entry serves N addresses after the first population.

Microbench results:
- BenchmarkCodeCache_GetByEthHash_Hit:       17.01 ns/op
- BenchmarkCodeCache_GetByEthHash_Miss:      15.45 ns/op
- BenchmarkCodeCache_Get_AddrLevel_Hit:      32.44 ns/op (existing)
- BenchmarkCodeCache_GetByEthHash_ManyAddrs: 17.02 ns/op

L2b hit is ~2x faster than the existing two-level addr path (one map
probe vs two), and enables hits on workloads where L1 would miss.

Cross-client research at agentspecs/cross-client-state-access-2026-05-14.md
notes geth's separate codeSizeCache as the further (geth-proven) win
for EXTCODESIZE/EXTCODEHASH and addrToHash LRU as a one-line behaviour
fix; both queued as follow-up surgical commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…SIZE / EXTCODEHASH

Adds a third caching layer to CodeCache (alongside L1 addr→maphash and L2b
ethHash→bytes): codeSizeByEthHash maps the 32-byte Ethereum codeHash to
its byte length. Tiny per-entry footprint (32B key + 8B value vs 5-10 KB
for full bytes) so the same memory budget gives ~1000x the hit surface.
Capped at 1M entries (geth core/state/database_code.go uses the same size).

EXTCODESIZE / EXTCODEHASH callers — historically the slowest opcodes on
the lab dashboard's bench — answer from a single map probe without paying
the file accessor stack cost of the full bytes. Geth-proven; cross-client
writeup at agentspecs/cross-client-state-access-2026-05-14.md notes this
as the largest single available win for the synthetic bench.

Wiring:
- CodeCache.GetCodeSizeByEthHash / PutCodeSizeByEthHash — direct accessors.
- PutWithEthHash now populates the size layer alongside L2b, so every
  bytes-load creates a future fast-path entry "for free".
- StateCache wrappers GetCodeSizeByHash / PutCodeSizeByHash.
- SharedDomains.GetCodeSize(tx, addr) — the SD-transparent fast path:
  resolve codeHash via the AccountsDomain cache chain, probe the size
  cache, then L2b, then file-read+populate. Returns (0, false, nil) for
  EOAs and no-code accounts without paying any file read.
- temporalGetter.GetCodeSize so callers reach it via the existing getter.
- ReaderV3.ReadAccountCodeSize type-asserts on a codeSizeGetter interface
  and routes through the fast path when the underlying getter supports it;
  falls back to GetLatest+len otherwise. No kv.TemporalGetter interface
  change.

Limitation: capacity is no-op-when-full, not LRU. A separate surgical
commit will swap to real LRU eviction; mirrors the addrToHash fix queued
from the same cross-client writeup.

Tests: 3 new (PopulatedAlongsideBytes, DirectPutAndGet, EmptyHashOrNegativeIsNoOp).
All existing CodeCache tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e.StateCache

The BlockReadAheader has always prefetched BAL-listed (and access-list)
addresses' account/code/storage via a fresh ReaderV3 on a separate RoTx.
Its prefetches warmed OS page cache + RoTx cursors — disconnected from
the process-global cache.StateCache that SharedDomains.GetLatest probes
on the EVM hot path. The two layers were two separate caches; nothing
the prefetcher loaded ever reached the EVM's lookup path.

Reth's structural advantage on EXTCODESIZE-loop benches is that its
prewarm writes to the same hashmap the EVM reads from
(crates/engine/execution-cache/src/cached_state.rs:663). When EVM enters,
every BAL-listed addr's first read is a 20 ns cache probe — no file
accessor stack, no decompression CPU. PR #21128 swapped this from
mini-moka to a lock-free fixed-cache for a measured +10.8 % mgas/s.

This commit closes the equivalent gap on Erigon: a thin cache-populating
TemporalGetter wrapper writes successful reads through to cache.StateCache
as a side effect. ReaderV3 is unchanged; the wrapper sits in front. When
the prefetcher already has the codeHash from a preceding account read,
the next CodeDomain read routes through StateCache.PutCodeWithHash so
the L2b (ethHash → bytes) + size-cache layers fill alongside the bare
addr-keyed L1.

Wiring:
- BlockReadAheader.SetStateCache(*cache.StateCache) setter.
- ExecModule construction calls readAheader.SetStateCache(domainCache),
  so the same StateCache the FCU/canonical path wires onto SD is the one
  the prefetcher warms.
- cachePopulatingGetter wraps the worker's ttx; both BAL-warming and
  tx-warming paths gain the same treatment.

Fgprof on the EXTCODESIZE-EXISTING_CONTRACT-30M bench had shown 95 % of
EVM wall-clock in seg.Getter.nextPos (Huffman decompression of code
values). With this commit, every BAL-listed addr's lookup should hit
the cache and skip the file accessor stack entirely — eliminating the
dominant cost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ash LRU

Two surgical commits bundled (both touch the code-read hot path):

1. IntraBlockState.GetCodeSize now loads the full bytes via
   stateReader.ReadAccountCode on first touch and populates
   stateObject.code, so subsequent same-addr EXTCODESIZE /
   EXTCODEHASH / CALL within the tx are in-struct slice-len calls
   (~50 ns), not full reader round-trips. Mirrors geth's pattern
   at core/state/state_object.go ~Code() — pay one read per addr
   per tx, free for the rest.

2. CodeCache.addrToHash switched from a no-op-when-full
   maphash.Map[versionedAddressID] to an LRU
   lru.Cache[[20]byte, versionedAddressID] (hashicorp/golang-lru/v2,
   already imported elsewhere). Cap derived from the existing byte
   budget at ~28 bytes/entry (~580 k entries for the 16 MB default).
   Fresh-address workloads (mainnet thousands of new addrs per
   block) now warm up the addr layer over time instead of silently
   dropping new entries forever; matches geth's lru.Cache at
   core/state/database_code.go.

   The hashToCode layer is unchanged (content-addressed bytes,
   immutable, byte-capped with new-entry no-op when full — the same
   semantic as before since code bytes by codeHash never change).

Bench on the EXTCODESIZE-EXISTING_CONTRACT-30M family: 62.34 mgas/s
(was 61.50). The marginal gain is small on this bench because BAL
prefetch already populates the cache layers; neither lever fires
heavily. The expected wins are on non-BAL workloads where
EXTCODESIZE-loop patterns repeat within a tx (#1) and
fresh-address-churn mainnet blocks fill the addr layer (#2).

Updated TestCodeCache_AddrCapacityLimit to assert LRU eviction
(was asserting no-op-when-full); the prior behaviour was the bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nethermind-style addr → 32-byte codeHash LRU sitting above
SharedDomains.codeHashForAddr. When the EVM-known codeHash for an
address has already been resolved once, subsequent lookups skip the
entire AccountsDomain chain (sd.mem → sd.parent.mem → sd.stateCache →
tx.GetLatest) and the account-RLP decode.

Wiring:
- CodeCache adds addrToEthHash *lru.Cache[[20]byte, [32]byte] sized
  to the existing addrCapacityB budget; methods GetAddrCodeHash /
  PutAddrCodeHash / DeleteAddrCodeHash.
- StateCache wrappers route to the CodeCache instance.
- SD.codeHashForAddr probes the LRU first; on miss falls through to
  the existing chain and populates on the way out (including the
  zero-hash sentinel for missing-or-EOA accounts — repeat lookups
  return immediately).
- Invalidation: SD.DomainPut for AccountsDomain drops the entry
  (CREATE / CREATE2-replace path); SD.DomainDel for AccountsDomain
  also drops the entry (SELFDESTRUCT); StateCache.RevertWithDiffset
  drops on unwind.

Helps non-BAL workloads where codeHashForAddr is currently the cold
account-domain probe. On the EXISTING_CONTRACT bench (BAL prefetch
already populates everything), this is within noise; the lever is for
mainnet workloads where many addresses miss the BAL-prefetch window
but the cache is warm from prior lookups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cache-populating wrapper on the read-ahead worker's TemporalTx
previously gated cache writes on `len(v) > 0`. That dropped negative
results — i.e. missing accounts, empty storage slots, no-code probes —
on the floor. Repeated probes of the same missing address re-paid the
file accessor stack walk every time, instead of hitting a cached
negative entry.

Mirrors the revm pattern that drives reth's 1700-3400 mgas/s on
account_access NON_EXISTING / EXISTING_EOA variants: revm represents
a missing address as a real CacheAccount{ account: None, status:
LoadedNotExisting } and reth's ExecutionCache.account_cache uses
FixedCache<Address, Option<Account>> where None is a first-class
cacheable value. Bottom of the reth path is: BAL prewarm calls
basic_account once → returns None → cache hit forever for that addr.

The cycle-2 sweep on account_access[EXTCODESIZE/NON_EXISTING/30M]
showed 3.65 → 494 mgas/s without this fix; with the fix the same
bench reports 508 mgas/s (within run-to-run noise but trending right).
Most of the win was already captured by the readAhead-populates-
cache.StateCache wiring (commit cbe9044) and the balcache port
(d41e2e8) — those raised the cache hit rate on populated entries
enough that the EVM rarely fell through to the file accessor on
this bench. The fix is mechanically correct regardless and should
matter more on workloads with mixed populated / negative probes
across blocks.

See agentspecs/reth-missing-eoa-fastpath-2026-05-15.md for the
detailed mechanism analysis and the three concrete copy-able
patterns from reth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…unters

GenericCache.Put has no eviction policy. When the byte budget is reached,
new keys are silently dropped until Clear/ClearWithHash/ValidateAndPrepare-
mismatch resets the cache. On a long-running node this manifests as a
monotonic miss-rate climb that's hard to attribute without instrumentation.

Add two counters next to hits/misses:
  inserts - new keys accepted
  dropped - new keys rejected at the budget check (the existing branch
            at the new-key cap; not a behaviour change)

PrintStatsAndReset logs both. Sets up the diagnostic baseline before the
eviction-policy swap in the follow-up commits on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the maphash.Map[T] backing store in GenericCache with
freelru.ShardedLRU[uint64, entry[T]] (same lib as db/state/cache.go;
already in go.mod). Adds a Mode constructor flag:

  - ModeEvictLRU (default): per-shard LRU evicts the oldest entry on
    insert when its slot cap is reached. OnEvict drops bytes from
    currentSize.
  - ModeNoOp: preserves the historical fill-and-freeze behaviour
    (silently drop new keys at the byte cap; counted via dropped).
    Kept as the diagnostic baseline so the regression bench can
    compare A/B.

Per-shard eviction is a known trade-off of freelru.ShardedLRU —
RemoveOldest is shard-local, not globally LRU. Matches the trade-off
db/state/cache.go / execution/cache/code_cache.go /
execution/balcache/balcache.go already accept. LFU (W-TinyLFU, the
policy reth uses) is scan-resistant by design and would slot in
behind the same Mode wrapper as a follow-up; the seam is documented
at policy.go.

Key shape: pre-hash via common/maphash.Hash (Go's randomized stdlib
hasher, already used by the previous maphash.Map) into uint64; entry
stores the full key for collision check. Same pattern as
db/state/cache.go.

Byte-budget translation: per-domain avg-entry constants in
state_cache.go (avgAccountEntryBytes / avgStorageEntryBytes /
avgCommitmentEntryBytes) — account / storage are near-fixed sizes so
the translation is reliable. capacityBytes becomes a sizing hint
plus telemetry (SizeBytes / PrintStatsAndReset). Code domain is
unchanged; CodeCache wraps its own LRUs.

Adds metrics: inserts, evictions, dropped — all exposed in
PrintStatsAndReset alongside the existing hits / misses / hit_rate.
Mode is also logged.

Touches one external call site: execution/vm/contract.go's
jumpDestCache now constructs with ModeEvictLRU.

Tests: TestDomainCache_PutCapacityLimit renamed to ..._NoOpMode and
asserts the fill-and-freeze contract under explicit ModeNoOp. New
TestDomainCache_PutEvictsWhenFull_EvictMode asserts eviction under
ModeEvictLRU using a small entry-count cap (the byte→entry
translation is approximate; the test uses the entry-count knob via
the in-package newGenericCacheEntries constructor to make the
assertion deterministic).

Pre-existing lint issues on mh/sd-code-cache (intra_block_state.go
nilness, preload_parallel.go prealloc) are surfaced by lint
non-determinism but are out of this commit's scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single env knob read once at NewStateCache. Default ModeEvictLRU,
recognised override "noop" (for the regression-bench baseline so
ModeEvictLRU and ModeNoOp can be compared on the same binary).
Unrecognised values fall back to evict with a warn log.

ModeNoOp engagement is logged at info level because the
fill-and-freeze behaviour is a deliberate diagnostic state, not a
production setting.

Pattern matches db/state/cache.go's D_LRU_ENABLED / D_LRU knobs
(dbg.EnvString from common/dbg).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous comment asserted "reth uses W-TinyLFU for state caches" —
that is wrong on the execution hot path. Reth's cross-block state cache
is `fixed-cache` (PR #21128, v1.11.0): a lock-free direct-mapped /
set-associative array with collision-evict semantics. No LRU list, no
LFU sketch. Their published wins (~25% newPayload p50 / +33% gas/s) came
from *removing* LRU/LFU bookkeeping, not adding LFU.

Where reth uses real LRU/LFU it's deliberate and not the execution cache
(schnellru::LruMap for networking; moka in precompile_cache.rs explicitly
configured with eviction_policy(EvictionPolicy::lru())).

The docstring now reflects two follow-up policies both real:
- ModeEvictFixedCache (reth's actual choice, more interesting structural
  option than LFU)
- ModeEvictLFU (W-TinyLFU; helps mainnet steady-state, not the cycle-2
  bloat fixtures which are pure cold scans)

Decision criterion (per agentspecs/lfu-vs-lru-state-cache-decision-2026-05-15.md):
ship ModeEvictLFU only if a 24h mainnet replay shows current sharded-LRU
hit-rate < 90 % on Account/Storage. Otter is the only credible Go
W-TinyLFU library; ristretto has documented correctness bugs and is
disqualified for an EL hot path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Investigation knob, NOT a permanent default. Account / Storage / Code
each capped at 100 MB so the bench measures layer contributions instead
of being dominated by preallocated cache memory pressure (1 GB / 1 GB /
512 MB defaults push sys past the GC/page-cache pressure band on this
hardware/workload mix).

Permanent defaults stay at 1 GB / 1 GB / 512 MB; this commit will be
reverted or dynamically gated by relative-to-available sizing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Brings the 13 parallel-exec correctness commits from mh/parallel-exec-fixes
onto the StateCache LRU base. Conflicts resolved in 3 files (versionedio.go,
rw_v3.go, exec3_parallel.go) by keeping HEAD's typed-readset / per-path
revival shape and confirming HEAD already absorbs each fix's intent:

- versionedio.go: HEAD's per-path versionMap.Read(addr,path,key,...)
  supersedes the older 3-path BalancePath|NoncePath|CodeHashPath scan.
- rw_v3.go: HEAD has rs.trace as atomic.Bool — keep .Load() form.
- exec3_parallel.go (5 regions): keep HEAD's runPostApplyMessageOnMinIBS
  refactor (already includes the EIP-7708 burn-log fix from a0ecfc7),
  HEAD's BalanceChangeReason field name, HEAD's hasCreateContract guard
  on the post-SD defaulting path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@mh0lt mh0lt force-pushed the mh/perf-statecache-lru-pr branch from 266e297 to 4a512ce Compare May 25, 2026 07:29
@AskAlexSharov AskAlexSharov requested a review from Copilot May 25, 2026 08:08
@AskAlexSharov AskAlexSharov merged commit da4b091 into mh/perf-statecache-lru-pr May 25, 2026
3 checks passed
@AskAlexSharov AskAlexSharov deleted the mh/perf-parallel-fixes-pr branch May 25, 2026 08:09
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends Erigon’s execution-path caching and read-ahead plumbing to improve parallel-exec correctness/performance characteristics by (a) introducing configurable cache eviction policy, and (b) adding code-hash/size fast paths so repeated EXTCODESIZE/EXTCODEHASH/CALL workloads avoid expensive CodeDomain lookups and decompression.

Changes:

  • Reworks execution/cache generic caches to support eviction modes (LRU vs historical no-op-on-full) and adds cache statistics counters.
  • Expands CodeCache with additional lookup layers (addr→codeHash LRU, ethHash→code bytes, and ethHash→code size) and wires these through SharedDomains and ReaderV3.
  • Makes block read-ahead prefetch populate the same process-global StateCache used by the EVM hot path.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
execution/vm/contract.go Updates jumpdest cache construction to use new GenericCache mode parameter.
execution/state/rw_v3.go Adds a type-asserted GetCodeSize fast path for code-size-only reads.
execution/state/intra_block_state.go Changes GetCodeSize to fetch full code once and cache it on the stateObject.
execution/execmodule/exec_module.go Wires the process-global StateCache into block read-ahead.
execution/exec/blocks_read_ahead.go Adds a cache-populating TemporalGetter wrapper and codeHash hinting for CodeDomain warmups.
execution/cache/state_cache.go Adds STATE_CACHE_MODE handling and new helpers for code-by-hash / code-size / addr→codeHash operations.
execution/cache/policy.go Introduces cache eviction Mode definition and documentation.
execution/cache/generic_cache.go Replaces GenericCache backing store with sharded LRU and adds insert/evict/drop counters.
execution/cache/code_cache.go Adds addr→codeHash LRU, ethHash→code, and codeSize cache layers to CodeCache.
execution/cache/code_cache_ethhash_test.go Adds tests/benchmarks for new ethHash-based code lookup and code-size caching.
execution/cache/cache_test.go Updates tests for new eviction behaviors and LRU semantics.
db/state/execctx/domain_shared.go Adds SD-level code-hash resolution LRU, L2b bypass in GetLatest(CodeDomain, ...), and GetCodeSize API.
Comments suppressed due to low confidence (3)

execution/cache/code_cache.go:378

  • ValidateAndPrepare() purges only addrToHash on block-hash mismatch, but leaves addrToEthHash intact. Since addrToEthHash feeds SharedDomains.codeHashForAddr, keeping it across non-sequential execution can serve stale codeHash values after a reorg/fork-validation/unwind path and poison the code-by-hash fast paths. Purge addrToEthHash here whenever you invalidate addr-level mappings.
	// Mismatch - clear address mappings (they may be stale)
	// Keep code mappings since hash → code is immutable
	c.addrToHash.Purge()
	c.blockHash = incomingBlockHash
	return false

execution/cache/code_cache.go:387

  • ClearWithHash() purges addrToHash but leaves addrToEthHash untouched. During unwind this can retain an addr→codeHash mapping that no longer matches the reverted account state, potentially poisoning subsequent codeHashForAddr lookups. Purge addrToEthHash here too.
func (c *CodeCache) ClearWithHash(hash common.Hash) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.addrToHash.Purge()
	c.blockHash = hash

execution/state/intra_block_state.go:785

  • In the versionedRead-backed GetCodeSize path, code bytes are fetched and assigned into s.code, but callCodeAccessHook isn’t invoked for this first fetch. If code-access tracking is desired whenever bytes are available, call the hook after a successful ReadAccountCode() here too.
			code, codeErr := sdb.stateReader.ReadAccountCode(addr)
			l := len(code)
			if codeErr == nil && s != nil && code != nil {
				s.code = code
			}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 333 to 337
// Clear removes all address mappings from the cache.
// The codeHash → code mappings are preserved since they're immutable.
func (c *CodeCache) Clear() {
c.addrToHash.Clear()
c.addrSize.Store(0)
c.addrToHash.Purge()
}
Comment on lines +92 to +96
v, step, err := cpg.g.GetLatest(name, k)
if err == nil && cpg.sc != nil {
if name == kv.CodeDomain && len(cpg.codeHashHint) > 0 {
cpg.sc.PutCodeWithHash(k, v, cpg.codeHashHint)
cpg.codeHashHint = nil
Comment on lines +745 to +749
code, err := sdb.stateReader.ReadAccountCode(addr)
if err != nil {
return 0, err
}
if stateObject != nil && code != nil {
Comment on lines +1074 to +1078
if h, ok := sd.stateCache.GetAddrCodeHash(addr); ok {
if h == ([32]byte{}) {
return nil
}
return h[:]
Comment on lines +33 to +36
// DefaultAccountCacheBytes is the byte limit for account cache (100 MB — investigation knob; permanent default returns to 1 GB)
DefaultAccountCacheBytes = 100 * datasize.MB
// DefaultStorageCacheBytes is the byte limit for storage cache (100 MB — investigation knob; permanent default returns to 1 GB)
DefaultStorageCacheBytes = 100 * datasize.MB
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants