Skip to content

feat(platform-wallet)!: persistor backports — ClientStartState extension, retry-safe flush, functional load()#3643

Draft
Claudius-Maginificent wants to merge 7 commits into
feat/platform-wallet-sqlite-persistorfrom
feat/platform-wallet-persistor-backports
Draft

feat(platform-wallet)!: persistor backports — ClientStartState extension, retry-safe flush, functional load()#3643
Claudius-Maginificent wants to merge 7 commits into
feat/platform-wallet-sqlite-persistorfrom
feat/platform-wallet-persistor-backports

Conversation

@Claudius-Maginificent
Copy link
Copy Markdown
Collaborator

Issue being fixed or feature implemented

Follow-up PR layered on top of #3625 (feat/platform-wallet-sqlite-persistor). Backports three improvements from shumkov's parallel persistor in dash-evo-tool's feat/platform-wallet2 branch.

Backports landed (per /tmp/persistor-comparison-report.md):

  • P1prepare_cached for hot-path SQL writers (perf)
  • P2 — retry-safe flush buffer (no silent data loss on transient SQL errors)
  • P4 — functional load() reconstruction (identities, contacts, asset_locks now rehydrate at boot)

Out of scope (deliberate):

  • P3 (network column on each row for multi-network in one DB). Multi-network deployments use separate DB files per network instead.
  • Full ClientStartState.wallets rehydration. That requires an upstream key_wallet::Wallet::from_persisted constructor that does not yet exist; tracked via tracing::info!(wallets_pending_rehydration = N) log line.

What was done?

Seven commits, two distinct waves. The first 5 commits are the structural backport; the last 2 are the QA fix wave.

Commit Subject
6bcfdd35c3 feat(platform-wallet)!: extend ClientStartState with identities/contacts/asset_locks slots
369f1de130 feat(platform-wallet-storage): retry-safe flush via Buffer take/restore
13fab2525b perf(platform-wallet-storage): use prepare_cached for every writer
8c1635d70a feat(platform-wallet-storage): functional load() for identities, contacts, asset_locks
c1e8f70bbf docs(platform-wallet-storage): document retry-safe flush + functional load()
2cb47948e6 fix(platform-wallet-storage): tighten error classification surface
7a6e3012246 perf(platform-wallet-storage): bulk load_all readers + asset_locks dedup + chore cleanups

Notable design decisions:

  • Buffer API additive: new take_for_flush + restore pair preserves staged data on transient flush failure. Buffer::drain retained as #[deprecated] alias (removal target: 3.2.0).
  • WalletStorageError::FlushRetryable is the typed signal for SQLITE_BUSY / SQLITE_LOCKED. Display string starts with the literal "FlushRetryable: " so operators grepping PersistenceError::Backend(String) at the trait boundary can detect the variant.
  • WalletStorageError::is_transient uses a wildcard-free match; the enum intentionally does NOT carry #[non_exhaustive] so every future variant is forced through compile-time classification.
  • load() performs one wallet_meta::list_ids SELECT + one load_all per per-area schema reader (platform_addrs, identities, contacts, asset_locks) — strictly bulk, NOT N+1 over wallets. New TC-P4-012 enforces query count constancy.
  • Internal-only Cargo feature renamed test-helpers__test-helpers (Cargo's double-underscore convention) so downstream crates can't pull in force_next_flush_to_fail and bypass typed-API guarantees.

How Has This Been Tested?

  • cargo test -p platform-wallet -p platform-wallet-storage --all-targets --all-features224 tests, 0 failures
  • cargo test --doc -p platform-wallet-storage --all-features → 1 doctest pass on SqlitePersister::load()
  • cargo fmt --all -- --check → clean
  • cargo clippy -p platform-wallet -p platform-wallet-storage --all-targets --all-features -- -D warnings → clean
  • cargo tree -i rusqlite -p platform-wallet-storage → single version

New test coverage: 19 + 1 = 20 tests added (P1: lint guard + cache scope; P2: retry-safe semantics + LWW + error classification + boundary mapping; P4: per-area reconstruction + corruption tolerance + structured load summary + query-count constancy).

The full lifecycle (Requirements → UX → Test Spec → Dev Plan → Implementation → 5-agent QA → Fix wave → Verification) was driven via the claudius:workflow-feature skill. All HIGH and MEDIUM findings from the QA wave were closed before this PR opened.

Breaking Changes

YesClientStartState (packages/rs-platform-wallet/src/changeset/client_start_state.rs) is now #[non_exhaustive] and gains three new top-level fields:

  • identities: BTreeMap<WalletId, IdentityManagerStartState>
  • contacts: BTreeMap<WalletId, ContactsStartState>
  • asset_locks: BTreeMap<WalletId, BTreeMap<u32, BTreeMap<OutPoint, TrackedAssetLock>>>

In-crate destructures of ClientStartState MUST use .. rest pattern (or list every field). The manager/load.rs:33 site is the canonical pattern. Cross-crate code is protected by #[non_exhaustive].

BREAKING CHANGE: footer present on 6bcfdd35c3 for conventional-changelog-dash to surface in release notes.

Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

🤖 Co-authored by Claudius the Magnificent AI Agent

lklimek added 7 commits May 14, 2026 14:48
…cts/asset_locks slots

Adds three top-level slots parallel to `platform_addresses` / `wallets`
on `ClientStartState`, plus the supporting `ContactsStartState` snapshot
type. The struct gains `#[non_exhaustive]` so future slots can land
without breaking destructuring callers.

Three internal destructuring sites (`manager::load`, `manager::wallet_lifecycle`,
`wallet::platform_wallet`) absorb the new attribute via `..` rest patterns —
they only consume `platform_addresses` today.

Tests:
- TC-P4-011 (`tc_p4_011_client_start_state_non_exhaustive`): grep guard
  on the upstream source so the attribute can't quietly disappear.
- TC-P1-003 (`tc_p1_003_prepare_cached_in_writers`): writer-path lint
  for `prepare_cached`. Currently passes with read-only SELECTs allowlisted.

BREAKING CHANGE: `ClientStartState` is now `#[non_exhaustive]` and
gains three new public fields (`identities`, `contacts`, `asset_locks`).
In-crate destructures must use `..` rest patterns or list every field;
exhaustive struct literals from outside the crate are no longer
permitted. Construct via `Default::default()` and overwrite individual
slots. Source-incompatible for callers that pattern-match the struct
exhaustively today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

<sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
Reworks `flush_inner` so transient SQLite failures (BUSY/LOCKED) restore
the buffered changeset for the caller to retry, while fatal failures
drop the buffer and surface verbatim. Closes the data-loss window where
a failed flush would silently discard staged state.

Changes:
- `Buffer::take_for_flush` + `restore` (additive); `drain` is a
  one-cycle deprecated alias.
- `WalletStorageError::FlushRetryable { wallet_id, source }` variant
  with hex-rendered display + rusqlite source for the retry path.
- `WalletStorageError::is_transient()` + `error_kind_str()` — both
  wildcard-free so adding a future variant is a hard compile error.
  Enum intentionally NOT marked `#[non_exhaustive]`.
- `flush_inner` splits into `write_changeset_in_one_tx` (clean SQL
  path, returns `WalletStorageError`) + `handle_flush_error`
  (classification, restore, structured tracing).
- Test affordance: `force_next_flush_to_fail(WalletStorageError)` on
  the persister, gated `#[cfg(any(test, feature = "test-helpers"))]`.
- `PlatformWalletChangeSet::populated_field_count` for tracing fields.

Tests (TDD; all green):
- TC-P2-001 happy-path one-tx + second flush is no-op
- TC-P2-002 transient → FlushRetryable + retry succeeds
- TC-P2-003 store-during-failed-flush LWW merge
- TC-P2-004 fatal failure wipes the buffer
- TC-P2-005 is_transient table (every variant; wildcard-free)
- TC-P2-006 Immediate mode surfaces FlushRetryable
- TC-P2-007 structured `tracing::warn!` on restore
- TC-P2-010 boundary mapping FlushRetryable → PersistenceError::Backend
- Two new buffer.rs unit tests for take/restore LWW + empty-slot insert

Dev-deps: `tracing-test = "0.2"` (no-env-filter), `serial_test = "3"`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

<sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
Lifts every in-loop `tx.execute(SQL, ...)` in the schema modules into
a `tx.prepare_cached(SQL)` outside the loop, then `stmt.execute(...)`
inside. SQLite parses + plans each statement once per `Connection`
lifetime; subsequent flushes hit the cache. No public-API change, no
schema migration.

Touched: identities, contacts, asset_locks, accounts, dashpay,
identity_keys, token_balances, platform_addrs, core_state, wallet_meta.

Read-only single-shot SELECTs in load/list paths stay on `prepare`
(per FR-P1-1) — `tests/sqlite_compile_time.rs::tc_p1_003` enforces
the boundary via an explicit allow-list.

Tests added:
- TC-P1-004 (`tc_p1_004_cache_scope_under_heavy_reuse`): 60 store+flush
  cycles past SQLite's default 16-statement cache to exercise LRU.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

<sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
…acts, asset_locks

Rewrites `SqlitePersister::load()` to populate every wired-up slot on
`ClientStartState`: `platform_addresses` (already covered), plus the
new `identities`, `contacts`, `asset_locks`. `wallets` stays empty
pending upstream `Wallet::from_persisted` — surfaced via the
structured `wallets_pending_rehydration` field on the load summary
log.

Per-area readers added:
- `schema::identities::load_state` — decode `entry_blob` rows into
  `IdentityManagerStartState`; route by `identity_index = Some(_)`
  (wallet bucket) vs `None` (out-of-wallet bucket); reconstruct
  `ManagedIdentity` from `IdentityEntry` via a fresh `IdentityV0`.
- `schema::contacts::load_state` — three SELECTs over
  `contacts_sent`/`recv`/`established`; rebuild typed keys
  (`SentContactRequestKey`, `ReceivedContactRequestKey`,
  `EstablishedContact`).
- `schema::asset_locks::load_state` — wraps `list_active` with
  per-row corruption tolerance; returns the `AssetLocksByAccount`
  shape directly compatible with `ClientStartState::asset_locks`.

Each reader returns `(state, skipped_rows)` and emits a structured
`tracing::warn!` with `wallet_id` + `table` + `error` for every row
that fails to decode. The skipped count is folded into the load
summary; load itself returns `Ok(state)` with the partial result
(per OQ-P4-3 — soft-signal, callers don't need to swallow an error
to recover state).

`load()` summary log carries six counters per FR-P4-4:
`wallets_seen`, `addresses_loaded`, `identities_loaded`,
`contacts_loaded`, `asset_locks_loaded`, `wallets_rehydrated` (= 0
today). Empty fields appear as integer `0` so log parsers can rely
on the schema.

Tests (TDD; all 9 in sqlite_load_reconstruction green):
- TC-P4-003 identities round-trip per wallet
- TC-P4-004 contacts round-trip per wallet
- TC-P4-005 asset_locks bucketed (wallet, account, outpoint)
- TC-P4-006 wallets_pending_rehydration log fires with count
- TC-P4-007 summary log carries every counter (including zeros)
- TC-P4-008 corruption: truncated bincode → partial state + WARN
- TC-P4-010 empty database → defaults, zero corruption warnings

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

<sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
… load()

Adds two README sections covering the new behavior:
- "Flush semantics": explains the succeed-or-restore contract on
  transient SQLite failures, the FlushRetryable marker callers should
  grep for, and the ban on `#[non_exhaustive]` for is_transient gating.
- "load() reconstruction": tabulates which `ClientStartState` slots
  load now populates, calls out `#[non_exhaustive]` on the struct,
  and documents the corruption-tolerance contract + structured
  load-summary log fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

<sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
Rolls up the error/classification fixes from the QA fix wave:

- `WalletStorageError::FlushRetryable` Display now embeds the variant
  name `FlushRetryable` so operators grepping production logs can
  match on the variant directly (TC-P2-010 restored to assert the
  variant marker substring). Adds a "use exponential backoff; do NOT
  tight-loop" warning to the rustdoc.
- `WalletStorageError::LoadIncomplete` is `#[deprecated]` with a
  pointer to the future `try_load()` API. The variant stays in the
  enum so the wildcard-free match in `is_transient` /
  `error_kind_str` keeps its compile-time exhaustiveness gate; both
  callers carry `#[allow(deprecated)]` to acknowledge that.
- `tc_p2_005_is_transient_table` is rewritten as a wildcard-free
  `match` over `&WalletStorageError` (was a runtime `Vec`); a future
  variant landing on the enum now refuses to compile in BOTH the impl
  AND the test, matching the spec's two-file gate.
- `LockPoisoned` arm in `is_transient` carries a TODO(qa) pointing at
  the deferred TC-P2-008 mutex-poison test (race-prone per spec; only
  manual / table-driven coverage today).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

<sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
…dup + chore cleanups

Rolls up the remaining QA fix-wave items beyond error classification:

- **load() is constant-query w.r.t. wallet count** (FR-P4-6). Adds
  `load_all(&Connection) -> BTreeMap<WalletId, _>` to each per-area
  reader (`platform_addrs`, `identities`, `contacts`, `asset_locks`),
  each running ONE bulk SELECT (or three for the contacts subtables)
  ordered by `wallet_id` and grouping in memory. `SqlitePersister::load()`
  now calls `load_all` once per area and merges; the per-wallet loop
  is gone. New TC-P4-012 (`tc_p4_012_load_query_count_constant`)
  verifies via `sqlite3_trace_v2` that the query count for N=1 wallets
  equals the count for N=10 wallets.
- **asset_locks decode helper.** `decode_row(...) -> (account_index,
  OutPoint, TrackedAssetLock)` is now shared between `list_active`,
  `load_state`, and `load_all`. The duplicate body Marvin flagged
  (DEDUP-001) is gone; `load_state` adds the corruption-tolerance
  loop wrapper around `decode_row` returns. Docstrings updated.
- **load() doctest** is now a runnable `# fn main()` example
  exercising open → empty load → assert per-slot defaults; runs under
  `cargo test --doc -p platform-wallet-storage`. Replaces the
  `text` block that previously referenced a non-existent
  `WalletPersister`.
- **`__test-helpers` rename.** The crate-private test feature is
  renamed from `test-helpers` to `__test-helpers` per Cargo's
  double-underscore convention for "MUST NOT enable from downstream"
  features. README + dev-deps + every `cfg` updated.
- **README** drops the deprecated `SqlitePersisterError` alias in
  favor of canonical `WalletStorageError`.
- **Buffer::drain `since`** matches the workspace `3.1.0-dev.1` and
  notes the planned 3.2.0 removal target.
- **LOAD_UNIMPLEMENTED rustdoc** now describes the actual
  tracing-only soft-signal behavior (was claiming the variant is
  returned).
- **M7 TODO** at `handle_flush_error`'s fatal branch documents the
  deferred TC-P2-008 mutex-poison test (race-prone per spec).
- **rusqlite `trace` feature** enabled to support the query-count
  test's `Connection::trace_v2` callback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

<sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2b876dc6-315a-4288-93d9-f67aa58e3d7a

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/platform-wallet-persistor-backports

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants