[wip] db/state: only last reader deletes merged files#21395

Closed
AskAlexSharov wants to merge 2 commits into main from alex/no_rm_merged_35


@AskAlexSharov
Collaborator

@AskAlexSharov AskAlexSharov commented May 25, 2026

Summary

  • deleteMergeFile no longer removes files eagerly. It drops the item from dirtyFiles and sets canDelete=true; physical removal (closeFilesAndRemove) is left to the last reader inside RoTx.Close.
  • Motivation: BeginFilesRo is lock-free, so eagerly closing a file in deleteMergeFile can race a reader that loaded a visible generation containing the file but has not yet bumped its refcount.
  • cleanAfterMerge now holds a dirty-files rotx (DebugBeginDirtyFilesRo) pinning the subsumed files, so its Close is a guaranteed last reader. Without this, files with no live reader (the final merge step's subsumed files, and RemoveOverlapsAfterMerge) would be marked canDelete but never unlinked. That is an FD/disk leak that POSIX hides (unlink-while-open) but Windows surfaces as "RemoveAll: Access is denied".
  • Adds a portable regression test (TestCleanAfterMerge_UnlinksSubsumedFiles) that catches the leak on all platforms via on-disk presence.

Open items (wip)

  • The forkable/snapshot-repo twin SnapshotRepo.DeleteFilesAfterMerge still uses the eager refcount==0 → closeFilesAndRemove branch. It is not broken (still unlinks), but for consistency it should eventually move to the same last-reader model.

deleteMergeFile no longer removes files eagerly. It removes the item from
dirtyFiles and marks canDelete=true; physical removal is left to the last
reader inside RoTx.Close. The merge holds a rotx pinning these files, so it
is itself such a reader.
…ks them

Since deleteMergeFile no longer unlinks files eagerly, a subsumed file with no
live reader (e.g. RemoveOverlapsAfterMerge, or files already non-visible by the
final merge step) would be marked canDelete but never closed - leaking the FD
and leaving the file on disk. POSIX hides this via unlink-while-open; Windows
fails RemoveAll with "Access is denied".

cleanAfterMerge now holds a dirty-files rotx (DebugBeginDirtyFilesRo) pinning the
subsumed files, so its Close is a guaranteed last reader that unlinks any file no
other reader still holds. Add a portable regression test asserting subsumed files
are unlinked from disk after RemoveOverlapsAfterMerge.
@AskAlexSharov
Collaborator Author

Wrong direction: this func also deletes non-visible (garbage) files.

It is better to avoid the frozen flag (which disables refcnt), for correctness, until we are done with #21397.
