fix(miner): persist intent before broadcast to prevent dest double-send (#296) by RUNECTZ33 · Pull Request #299 · entrius/allways

RUNECTZ33 · 2026-05-09T04:36:49Z

Summary

Closes #296. S0 race — preventable on-chain double-spend.

process_swap broadcasts the destination tx via send_dest_funds() first, and only afterwards writes the SentSwap record to the in-memory dict and persists it with save_sent_cache():

if sent is None:
    send_result = self.send_dest_funds(swap, user_receives_amount)   # ← on-chain side effect
    if not send_result:
        return False
    to_tx_hash, to_tx_block = send_result
    sent = SentSwap(to_tx_hash=to_tx_hash, ...)
    self.sent[swap.id] = sent
    self.save_sent_cache()                                            # ← persist

If the miner is killed between the broadcast returning and the persist completing (SIGKILL/OOM/hardware fault/container shutdown), the destination tx is on-chain but the cache has no record. On restart, sent is None → process_swap broadcasts a second destination tx for the same swap. The BTC provider's in-process broadcasted_txids set is also empty after restart, so its dedup is gone.
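The race described above can be reproduced in a small simulation. This is a sketch, not the real miner code: `ToyMiner`, `SimulatedCrash`, the `disk` dict standing in for the persisted cache, and the `chain` list standing in for the destination chain are all hypothetical. Only the broadcast-then-persist ordering and the three `SentSwap` field names come from this PR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SentSwap:
    # Field names follow the 3-field cache schema described in this PR.
    to_tx_hash: str
    to_tx_block: int
    marked_fulfilled: bool = False


class SimulatedCrash(Exception):
    """Stands in for SIGKILL/OOM: raised right after the broadcast returns."""


class ToyMiner:
    """Hypothetical stand-in for the miner; not the real fulfillment class."""

    def __init__(self, disk: Dict[int, SentSwap], chain: List[str]):
        self.disk = disk          # survives "restarts"
        self.chain = chain        # every destination tx that was broadcast
        self.sent = dict(disk)    # in-memory view, rebuilt on startup

    def send_dest_funds(self, swap_id: int) -> Optional[Tuple[str, int]]:
        tx_hash = f"tx-{swap_id}-{len(self.chain)}"
        self.chain.append(tx_hash)  # on-chain side effect
        return tx_hash, 100

    def process_swap(self, swap_id: int, crash_after_send: bool = False) -> bool:
        sent = self.sent.get(swap_id)
        if sent is None:
            result = self.send_dest_funds(swap_id)
            if not result:
                return False
            if crash_after_send:
                raise SimulatedCrash()  # killed before the persist below
            tx_hash, block = result
            self.sent[swap_id] = SentSwap(tx_hash, block)
            self.disk.update(self.sent)  # stands in for save_sent_cache()
        return True


def run_pre_fix_scenario() -> List[str]:
    disk: Dict[int, SentSwap] = {}
    chain: List[str] = []
    try:
        ToyMiner(disk, chain).process_swap(7, crash_after_send=True)
    except SimulatedCrash:
        pass
    # Restart: a fresh process sees an empty cache and sends again.
    ToyMiner(disk, chain).process_swap(7)
    return chain


if __name__ == "__main__":
    print(run_pre_fix_scenario())
```

In this toy model the returned list holds two transactions for the same swap, which is the double-send the issue reports.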

Fix: persist intent before side effect

Write a pending sentinel (SentSwap with empty to_tx_hash) to the cache before calling send_dest_funds. Three outcomes are now safe:

| Crash window | Behavior |
| --- | --- |
| Between sentinel-persist and broadcast | Sentinel exists, no tx broadcast → restart sees pending, refuses to process, logs critical. No double-broadcast. |
| After broadcast but before post-broadcast cache update | Sentinel exists, real tx is on-chain → restart sees pending, refuses to process, logs critical. No double-broadcast. |
| Broadcast fails cleanly (`send_dest_funds` returns falsy) | Sentinel is dropped before returning → next pass can retry. No stuck pending entry. |

Trade-off acknowledged: if the crash window is hit on a real swap, that single swap's fulfillment stalls until the operator reconciles it (look up the tx by scanning the dest chain for miner-address sends matching the swap, then update or delete the cache entry).

That trade-off is strictly better than double-broadcasting funds: a stuck swap can be recovered; a double-spend cannot.
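The three outcomes can be exercised against a toy model of the new ordering. This is a minimal sketch under stated assumptions: `ToyMiner`, `SimulatedCrash`, the `crash_at` and `fail_send` switches, and the `disk`/`chain` containers are hypothetical test scaffolding. Only the sentinel shape (empty `to_tx_hash`, zero block) and the persist-before-broadcast order are taken from this PR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SentSwap:
    to_tx_hash: str
    to_tx_block: int
    marked_fulfilled: bool = False


class SimulatedCrash(Exception):
    """Stands in for an abrupt process kill at a chosen point."""


class ToyMiner:
    """Hypothetical stand-in; only the ordering mirrors the PR."""

    def __init__(self, disk: Dict[int, SentSwap], chain: List[str],
                 fail_send: bool = False):
        self.disk, self.chain, self.fail_send = disk, chain, fail_send
        self.sent: Dict[int, SentSwap] = dict(disk)  # reloaded on "restart"

    def save_sent_cache(self) -> None:
        self.disk.clear()
        self.disk.update(self.sent)

    def send_dest_funds(self, swap_id: int) -> Optional[Tuple[str, int]]:
        if self.fail_send:
            return None  # clean failure, nothing broadcast
        self.chain.append(f"tx-{swap_id}-{len(self.chain)}")
        return self.chain[-1], 100

    def process_swap(self, swap_id: int, crash_at: Optional[str] = None) -> bool:
        sent = self.sent.get(swap_id)
        if sent is not None and sent.to_tx_hash == "":
            return False  # pending sentinel: refuse until reconciled
        if sent is None:
            self.sent[swap_id] = SentSwap("", 0, False)
            self.save_sent_cache()  # intent is durable before the side effect
            if crash_at == "before_send":
                raise SimulatedCrash()
            result = self.send_dest_funds(swap_id)
            if crash_at == "after_send":
                raise SimulatedCrash()
            if not result:
                self.sent.pop(swap_id, None)  # drop sentinel so a retry can run
                self.save_sent_cache()
                return False
            self.sent[swap_id] = SentSwap(result[0], result[1], False)
            self.save_sent_cache()
        return True


def broadcasts_after_crash(crash_at: str) -> int:
    disk: Dict[int, SentSwap] = {}
    chain: List[str] = []
    try:
        ToyMiner(disk, chain).process_swap(7, crash_at=crash_at)
    except SimulatedCrash:
        pass
    ToyMiner(disk, chain).process_swap(7)  # restart and retry
    return len(chain)
```

In this model neither crash point ever yields more than one broadcast, and a clean failure leaves no entry behind, so the next pass is free to retry.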

Change

allways/miner/fulfillment.py: +43 / -1. No SentSwap dataclass change (empty to_tx_hash is the sentinel — backward compatible with the existing 3-field cache schema). No new imports.

+ # v2 (#296): persist a pending sentinel BEFORE broadcasting
+ pending = SentSwap(to_tx_hash='', to_tx_block=0, marked_fulfilled=False)
+ self.sent[swap.id] = pending
+ self.save_sent_cache()
+
  send_result = self.send_dest_funds(swap, user_receives_amount)
  if not send_result:
+     # broadcast failed cleanly — drop the sentinel so a retry can re-attempt
+     self.sent.pop(swap.id, None)
+     self.save_sent_cache()
      ...

Plus load_sent_cache scans for empty-to_tx_hash entries on startup and logs bt.logging.critical(...) with the swap IDs + reconciliation instructions.
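A startup scan of this kind might look roughly like the following. It is a sketch only: the on-disk layout (a JSON object mapping swap ID to the 3-field array) is an assumption, `find_pending_swap_ids` is a hypothetical helper, and the standard library `logging` module stands in for `bt.logging` so the block runs without Bittensor installed.

```python
import json
import logging
from typing import Dict, List, Tuple

# Stand-in for bt.logging; the real code logs through Bittensor.
logger = logging.getLogger("miner.fulfillment")


def find_pending_swap_ids(raw: Dict[str, list]) -> List[str]:
    """Return swap IDs whose entry has an empty to_tx_hash (pending sentinel)."""
    return sorted(swap_id for swap_id, entry in raw.items() if entry[0] == "")


def load_sent_cache(text: str) -> Tuple[Dict[str, list], List[str]]:
    """Parse the cache and flag any pending sentinels left by a crash."""
    raw = json.loads(text)
    pending = find_pending_swap_ids(raw)
    if pending:
        logger.critical(
            "Pending sends found for swaps %s. Scan the dest chain for txs "
            "from the miner address matching each swap, then update or "
            "delete the cache entry.",
            ", ".join(pending),
        )
    return raw, pending
```

Returning the pending IDs alongside the parsed cache keeps the check easy to assert on in a test, independent of how the log line is emitted.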

Plus process_swap checks for the sentinel up-front and refuses to operate, surfacing a bt.logging.warning so the operator sees the same recovery hint each retry pass.

Test plan

AST parses cleanly: python3 -c "import ast; ast.parse(open('allways/miner/fulfillment.py').read())"
Verified the JSON cache schema is unchanged — the existing 3-field [to_tx_hash, to_tx_block, marked_fulfilled] array works for sentinels (empty string to_tx_hash, zero block) and real entries identically.
Verified cleanup_stale_sends still drops sentinels alongside completed entries when the swap leaves the active set.
Manual trace through the three crash windows above against current process_swap + load_sent_cache flow.
Recommended follow-up: a regression test mirroring the issue's repro snippet (test_crash_between_send_and_cache_causes_double_broadcast) — happy to add if reviewer prefers.
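The schema and cleanup checks in the test plan could also be captured as automated assertions. The sketch below assumes the cache is a JSON object keyed by swap ID (string keys here, which may differ from the real code). `dump_cache`, `load_cache`, and `drop_inactive` are hypothetical helpers rather than the repository's functions, with `drop_inactive` only approximating what the PR says `cleanup_stale_sends` does.

```python
import json
from dataclasses import astuple, dataclass
from typing import Dict, Set


@dataclass
class SentSwap:
    to_tx_hash: str
    to_tx_block: int
    marked_fulfilled: bool


def dump_cache(sent: Dict[str, SentSwap]) -> str:
    """Serialize each entry as [to_tx_hash, to_tx_block, marked_fulfilled]."""
    return json.dumps({k: list(astuple(v)) for k, v in sent.items()})


def load_cache(text: str) -> Dict[str, SentSwap]:
    """Rebuild SentSwap records from the 3-field arrays."""
    return {k: SentSwap(*v) for k, v in json.loads(text).items()}


def drop_inactive(sent: Dict[str, SentSwap], active_ids: Set[str]) -> None:
    """Remove every entry, sentinel or completed, whose swap is not active."""
    for swap_id in [k for k in sent if k not in active_ids]:
        del sent[swap_id]


if __name__ == "__main__":
    cache = {"7": SentSwap("", 0, False), "8": SentSwap("abc", 123, True)}
    assert load_cache(dump_cache(cache)) == cache  # sentinel survives round trip
    drop_inactive(cache, {"8"})
    assert list(cache) == ["8"]                    # stale sentinel was dropped
```

Because the sentinel is an ordinary three-element array with an empty string and a zero, it round-trips like any other entry and needs no special branch in serialization.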

…nd (entrius#296)

S0 race: process_swap broadcasts the destination tx via send_dest_funds, then writes the SentSwap record to the in-memory dict + save_sent_cache(). If the miner is killed (SIGKILL/OOM/hardware fault/container shutdown) between the broadcast returning and the persist completing, the destination tx is on-chain but the cache has no record. On restart, sent is None again, and process_swap broadcasts a SECOND destination tx for the same swap. The BTC provider's in-process broadcasted_txids set is also empty after restart, so its dedup is gone.

Fix: persist a pending sentinel (SentSwap with empty to_tx_hash) BEFORE calling send_dest_funds. Three outcomes:

1. Crash between sentinel-persist and broadcast: restart sees the pending sentinel, refuses to process this swap, surfaces critical log.
2. Crash after broadcast but before post-broadcast cache update: same outcome. The pending sentinel triggers restart-side refusal + critical log, no double-broadcast.
3. Broadcast fails cleanly (send_dest_funds returns falsy): we drop the sentinel before returning, so the next pass can retry without a stuck pending entry.

On restart, load_sent_cache scans for entries with empty to_tx_hash. If any exist, log critical with the swap IDs and tell the operator to scan the dest chain for tx FROM the miner address matching each swap, then update or delete the cache entry. process_swap refuses to operate on pending sentinels to ensure the operator sees the warning before any further send.

Trade-off: if the crash window is hit on a real swap, fulfillment for that swap stalls until manual reconciliation. That's strictly better than double-broadcasting funds: a stuck swap can be recovered (look up tx, update cache); a double-spend cannot.

Scope: allways/miner/fulfillment.py, +43/-1. No SentSwap dataclass change (empty string in to_tx_hash is the sentinel, backward compatible with the existing 3-field cache schema). No new imports.

Closes entrius#296

xiao-xiao-mao Bot added the bug Something isn't working label May 9, 2026

RUNECTZ33 wants to merge 1 commit into entrius:test from RUNECTZ33:fix/296-prevent-double-broadcast
