fix(utxo): mempool_remove atomicity (BUG-1) + stale data_input eviction (BUG-4)#6306
fix(utxo): mempool_remove atomicity (BUG-1) + stale data_input eviction (BUG-4)#6306crowniteto wants to merge 4 commits into
Conversation
…on (BUG-4) BUG-1: mempool_remove() now uses BEGIN IMMEDIATE to ensure both DELETE operations are atomic. Previously, a crash between the two deletes could leave orphan utxo_mempool_inputs rows, causing a persistent UTXO lock / mempool DoS. Impact clarified per @MolhamHamwi review: this is not a double-spend vector (box_id PK + mempool_check_double_spend still prevent that), but an availability issue. BUG-4: New _evict_stale_data_input_txs() method proactively removes mempool txs whose data_inputs were just spent by apply_transaction(). Previously, stale txs held their normal inputs reserved in utxo_mempool_inputs until candidate selection or expiry caught them. 5 regression tests passing. Addresses review feedback on PR Scottcjn#6146.
jaxint
left a comment
There was a problem hiding this comment.
LGTM! Great work on this PR. 🚀
ayskobtw-lil
left a comment
There was a problem hiding this comment.
I found one blocker in the BUG-4 part of this PR. The BUG-1 mempool_remove() transaction wrapper looks directionally correct, and the focused test file passes locally, but the stale data-input eviction helper is not wired into production flow and the test seeds state that real mempool_add() does not create.
Validation performed:
python -m py_compile node/utxo_db.py tests/test_utxo_mempool_atomicity_bug1_bug4.py-> passedpython -m pytest tests/test_utxo_mempool_atomicity_bug1_bug4.py -q-> 5 passedrg -n "_evict_stale_data_input_txs" node/utxo_db.py tests/test_utxo_mempool_atomicity_bug1_bug4.py-> only the helper definition and direct unit tests reference it
Recommended next step: call stale eviction from apply_transaction() after successful spends and before commit, and add a regression that uses real mempool_add() with data_inputs plus apply_transaction() spending that data-input box. That should fail on the old code and pass only when the production path actually evicts the stale transaction.
| finally: | ||
| conn.close() | ||
|
|
||
| def _evict_stale_data_input_txs(self, spent_box_ids: List[str]) -> int: |
There was a problem hiding this comment.
This helper is never called from production code. rg _evict_stale_data_input_txs only finds the definition and tests, so apply_transaction() does not proactively evict stale data-input transactions despite the docstring saying it does. BUG-4 remains until apply_transaction() invokes this after spending inputs, before commit.
| try: | ||
| placeholders = ",".join("?" for _ in spent_box_ids) | ||
| stale_ids = conn.execute( | ||
| f"SELECT DISTINCT tx_id FROM utxo_mempool_inputs " |
There was a problem hiding this comment.
This lookup depends on utxo_mempool_inputs, but real mempool_add() records only normal input boxes there, not data_inputs. So a transaction that only references the newly-spent box through tx_data_json.data_inputs will not be found by this query. Either persist data-input reservations separately or scan pending tx_data_json data_inputs during eviction.
| def test_evict_stale_data_input_txs(self, db): | ||
| _add_box(db, "box_a", 1000) | ||
| _add_box(db, "box_b", 2000, "addr2") | ||
| _add_mempool_tx(db, "tx_stale", ["box_a", "box_b"], 50) |
There was a problem hiding this comment.
This test manually inserts both box_a and box_b into utxo_mempool_inputs, but mempool_add() only inserts normal input boxes. That makes the test pass a state that production will not create. Please add a regression through real mempool_add() plus apply_transaction() spending the data-input box, then assert the stale tx is evicted from both utxo_mempool and its normal input reservation.
|
I checked the BUG-4 change-request path locally and confirmed the missing production wiring. Tested patch shape:
Local verification on that patch: I could not push the patch directly because the PR branch is on |
Code review fixes: - Move _spent_ids computation before COMMIT (consistent with original flow) - Wrap eviction in try/except inside the try block (after COMMIT) - If eviction fails, the already-committed transaction is unaffected - Swallow exceptions to prevent rollbacks on committed data Regression test improvements: - Fix all output values to respect DUST_THRESHOLD (1000 nRTC) - Fix conservation law: output + fee = input for all test transactions - Use real minting + mempool_add flow (not direct DB inserts) for integration-level BUG-4 regression tests - Add test_apply_transaction_evicts_stale_mempool_tx_via_input: verifies mempool tx is evicted when its input box is spent on-chain - Add test_apply_transaction_preserves_unrelated_mempool_txs: verifies that spending box_a does NOT evict mempool tx claiming box_b - Fix method signature typo (fulg,obj -> self) - Fix fetchone/fetchall typos - 7/7 tests passing, 46/46 UTXO tests passing
…tx_data_json scan
Reviewer (ayskobtw-lil) found that _evict_stale_data_input_txs only
searched utxo_mempool_inputs, but mempool_add() does NOT record
data_inputs there — only regular inputs. This means spending a box
used as a data_input by a mempool tx would NOT evict that tx.
Changes:
- _evict_stale_data_input_txs now uses a two-phase search:
1. Check utxo_mempool_inputs for txs claiming spent boxes as
regular inputs (fast path)
2. Scan utxo_mempool.tx_data_json for txs whose data_inputs
reference any spent box (covers the gap)
- Added try/except wrapper in apply_transaction() for best-effort
eviction after COMMIT — failure does not affect committed tx
- New regression tests:
- test_evict_finds_tx_via_data_input_in_tx_data_json:
unit test verifying tx_data_json scanning path works
- test_apply_transaction_evicts_mempool_tx_with_data_input:
integration test using real mempool_add() with data_inputs +
apply_transaction() spending the data_input box
- test_apply_transaction_evicts_stale_mempool_tx_via_input:
integration test for regular input eviction
- test_apply_transaction_preserves_unrelated_mempool_txs:
verifies spending box_a does not evict tx claiming box_b
All 48 UTXO tests passing (9 regression + 39 existing).
Summary
Split follow-up from PR #6146 review feedback. Addresses BUG-1 and BUG-4 from the Red Team UTXO audit, per @MolhamHamwi's suggestions.
BUG-1: mempool_remove atomicity
Problem:
mempool_remove()performed two DELETEs outside an explicitBEGIN IMMEDIATE, unlikemempool_add()/apply_transaction(). A crash between deletes could leave orphanutxo_mempool_inputsrows without the correspondingutxo_mempoolrow.Impact (per review): Persistent UTXO lock / mempool DoS, NOT a double-spend.
mempool_check_double_spend()and theutxo_mempool_inputs.box_idPK still prevent another pending tx from claiming that box while the orphan row exists.Fix:
mempool_remove()now usesBEGIN IMMEDIATE+ explicitROLLBACKon exception.BUG-4: Stale data_input eviction
Problem:
apply_transaction()did not proactively remove mempool entries that became invalid because one of their read-only data_inputs was spent. Stale transactions' normal inputs remained reserved inutxo_mempool_inputsuntil candidate selection or expiry caught them.Fix: New
_evict_stale_data_input_txs()method, called fromapply_transaction()after commit when data_inputs are present. Proactively evicts stale mempool txs to release reserved inputs.Tests
5 regression tests:
test_mempool_remove_deletes_both_tables: Verifies atomic deletiontest_mempool_remove_nonexistent_is_safe: No error on missing txtest_evict_stale_data_input_txs: Verifies stale eviction + cleanuptest_evict_no_stale_when_not_in_mempool: No false positivestest_evict_empty_list: Edge case