[wip] db/state: move refcnt to visibleFiles object#21397
Open
AskAlexSharov wants to merge 14 commits into
Open
[wip] db/state: move refcnt to visibleFiles object#21397AskAlexSharov wants to merge 14 commits into
refcnt to visibleFiles object#21397AskAlexSharov wants to merge 14 commits into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR refactors db/state snapshot-file deletion to avoid TOCTOU/double-free hazards by moving from per-FilesItem refcount/canDelete gating to generation-based (bundle) refcounting with oldest-first reclamation of retired files.
Changes:
- Add generation-chained
aggregatorVisiblebundles with a singlerefcntpin per reader, plus aretiredset reclaimed when the oldest drained generation advances. - Update merge/cleanup paths to “retire” files (remove from
dirtyFiles) and defer physical deletion to the bundle reclaimer. - Update debug pinning and GC tests to validate deferred deletion behavior under both normal readers and debug accessor-building readers.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/plans/20260525-lockfree-file-reclamation-spec.md | New design spec describing generation-based reclamation and invariants. |
| db/state/aggregator.go | Implements bundle refcnt pinning, generation chain, retired-file reclamation, and updated merge cleanup plumbing. |
| db/state/dirty_files.go | Replaces deleteMergeFile with retireMergeFiles and removes per-visible-file refcount increment/decrement helpers. |
| db/state/merge.go | Threads “retired files” up from domain/history/II cleanup so aggregator can publish+defer deletion. |
| db/state/domain.go | Removes per-file refcount pin/unpin in domain RO tx lifecycle (now covered by bundle pin). |
| db/state/history.go | Removes per-file refcount pin/unpin in history RO tx lifecycle (now covered by bundle pin). |
| db/state/inverted_index.go | Removes per-file refcount pin/unpin in II RO tx lifecycle (now covered by bundle pin). |
| db/state/aggregator_debug.go | Debug dirty-files RO tx now pins the current bundle generation instead of per-file refcounting. |
| db/state/gc_test.go | New tests verifying deferred deletion semantics for normal readers, debug pins, and concurrent reclaim. |
| db/state/aggregator_debug_test.go | Updates test to assert debug Close is not a deleter (reclaimer is sole deleter). |
| db/state/squeeze.go | Wraps recalcVisibleFiles(nil) with dirtyFilesLock where needed and updates signature calls. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
TestDirtyFilesRoTx_CloseIsNotADeleter leaked the seg.Decompressor FD, so on Windows t.TempDir cleanup could not unlink the still-open file. Close it via defer (the Close() under test must not delete the file, only release the FD). Also refresh a stale "via FilesItem.refcount" comment — the debug RoTx now pins the bundle generation.
refcnt to visibleFiles object
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context
Aggregator.BeginFilesRo()was made lock-free in #20462/#20490, but physicalfile deletion stayed gated by two per-
FilesItematomics (refcount+canDelete).Two atomics guarding one destructive action (
closeFilesAndRemove) is the TOCTOUdouble-free behind #21384, and a per-file refcount taken after the snapshot
pointer is loaded can't protect the load→pin window by itself.
What this does
Replaces per-file
refcount/canDeletewith MVCC reclamation gated by a refcounton the published bundle (
aggregatorVisible) — the MDBX freelist model (a pagefreed at txnid
Tis reclaimable once the oldest live reader's txnid> T),realized in Go by reference-counting the generation object instead of each file.
refcnt.refcntonly grows while a bundle is current, only shrinks once superseded.BeginFilesRodoes validate-after-pin (one atomic add + re-check), closing theload→pin window. One add instead of dozens of per-file increments.
dirtyFilesby a merge/prune are attached to the outgoinggeneration's
retiredset and physically deleted only once that generation(and every older one) drains — reclaimed oldest-first, single owner of
closeFilesAndRemove, no per-file flag, no double-free.DebugBeginDirtyFilesRo(BuildMissedAccessors) pins the generation the same way,so its captured dirty files — including unindexed ones absent from the visible
set — are protected for the duration of the accessor build.
FilesItem.refcount/canDeleteare now used only by the forkable subsystem(out of scope here).
Design + file lifecycle:
docs/plans/20260525-lockfree-file-reclamation-spec.md.Status
WIP. Validated locally:
db/state/...under-race(no data races),make lint,make erigon integration.