feat(platform-wallet)!: persistor backports — ClientStartState extension, retry-safe flush, functional load()#3643
Draft
Claudius-Maginificent wants to merge 7 commits into
Conversation
…cts/asset_locks slots Adds three top-level slots parallel to `platform_addresses` / `wallets` on `ClientStartState`, plus the supporting `ContactsStartState` snapshot type. The struct gains `#[non_exhaustive]` so future slots can land without breaking destructuring callers. Three internal destructuring sites (`manager::load`, `manager::wallet_lifecycle`, `wallet::platform_wallet`) absorb the new attribute via `..` rest patterns — they only consume `platform_addresses` today. Tests: - TC-P4-011 (`tc_p4_011_client_start_state_non_exhaustive`): grep guard on the upstream source so the attribute can't quietly disappear. - TC-P1-003 (`tc_p1_003_prepare_cached_in_writers`): writer-path lint for `prepare_cached`. Currently passes with read-only SELECTs allowlisted. BREAKING CHANGE: `ClientStartState` is now `#[non_exhaustive]` and gains three new public fields (`identities`, `contacts`, `asset_locks`). In-crate destructures must use `..` rest patterns or list every field; exhaustive struct literals from outside the crate are no longer permitted. Construct via `Default::default()` and overwrite individual slots. Source-incompatible for callers that pattern-match the struct exhaustively today. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> <sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
Reworks `flush_inner` so transient SQLite failures (BUSY/LOCKED) restore
the buffered changeset for the caller to retry, while fatal failures
drop the buffer and surface verbatim. Closes the data-loss window where
a failed flush would silently discard staged state.
Changes:
- `Buffer::take_for_flush` + `restore` (additive); `drain` is a
one-cycle deprecated alias.
- `WalletStorageError::FlushRetryable { wallet_id, source }` variant
with hex-rendered display + rusqlite source for the retry path.
- `WalletStorageError::is_transient()` + `error_kind_str()` — both
wildcard-free so adding a future variant is a hard compile error.
Enum intentionally NOT marked `#[non_exhaustive]`.
- `flush_inner` splits into `write_changeset_in_one_tx` (clean SQL
path, returns `WalletStorageError`) + `handle_flush_error`
(classification, restore, structured tracing).
- Test affordance: `force_next_flush_to_fail(WalletStorageError)` on
the persister, gated `#[cfg(any(test, feature = "test-helpers"))]`.
- `PlatformWalletChangeSet::populated_field_count` for tracing fields.
Tests (TDD; all green):
- TC-P2-001 happy-path one-tx + second flush is no-op
- TC-P2-002 transient → FlushRetryable + retry succeeds
- TC-P2-003 store-during-failed-flush LWW merge
- TC-P2-004 fatal failure wipes the buffer
- TC-P2-005 is_transient table (every variant; wildcard-free)
- TC-P2-006 Immediate mode surfaces FlushRetryable
- TC-P2-007 structured `tracing::warn!` on restore
- TC-P2-010 boundary mapping FlushRetryable → PersistenceError::Backend
- Two new buffer.rs unit tests for take/restore LWW + empty-slot insert
Dev-deps: `tracing-test = "0.2"` (no-env-filter), `serial_test = "3"`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
<sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
Lifts every in-loop `tx.execute(SQL, ...)` in the schema modules into a `tx.prepare_cached(SQL)` outside the loop, then `stmt.execute(...)` inside. SQLite parses + plans each statement once per `Connection` lifetime; subsequent flushes hit the cache. No public-API change, no schema migration. Touched: identities, contacts, asset_locks, accounts, dashpay, identity_keys, token_balances, platform_addrs, core_state, wallet_meta. Read-only single-shot SELECTs in load/list paths stay on `prepare` (per FR-P1-1) — `tests/sqlite_compile_time.rs::tc_p1_003` enforces the boundary via an explicit allow-list. Tests added: - TC-P1-004 (`tc_p1_004_cache_scope_under_heavy_reuse`): 60 store+flush cycles past SQLite's default 16-statement cache to exercise LRU. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> <sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
…acts, asset_locks Rewrites `SqlitePersister::load()` to populate every wired-up slot on `ClientStartState`: `platform_addresses` (already covered), plus the new `identities`, `contacts`, `asset_locks`. `wallets` stays empty pending upstream `Wallet::from_persisted` — surfaced via the structured `wallets_pending_rehydration` field on the load summary log. Per-area readers added: - `schema::identities::load_state` — decode `entry_blob` rows into `IdentityManagerStartState`; route by `identity_index = Some(_)` (wallet bucket) vs `None` (out-of-wallet bucket); reconstruct `ManagedIdentity` from `IdentityEntry` via a fresh `IdentityV0`. - `schema::contacts::load_state` — three SELECTs over `contacts_sent`/`recv`/`established`; rebuild typed keys (`SentContactRequestKey`, `ReceivedContactRequestKey`, `EstablishedContact`). - `schema::asset_locks::load_state` — wraps `list_active` with per-row corruption tolerance; returns the `AssetLocksByAccount` shape directly compatible with `ClientStartState::asset_locks`. Each reader returns `(state, skipped_rows)` and emits a structured `tracing::warn!` with `wallet_id` + `table` + `error` for every row that fails to decode. The skipped count is folded into the load summary; load itself returns `Ok(state)` with the partial result (per OQ-P4-3 — soft-signal, callers don't need to swallow an error to recover state). `load()` summary log carries six counters per FR-P4-4: `wallets_seen`, `addresses_loaded`, `identities_loaded`, `contacts_loaded`, `asset_locks_loaded`, `wallets_rehydrated` (= 0 today). Empty fields appear as integer `0` so log parsers can rely on the schema. Tests (TDD; all 9 in sqlite_load_reconstruction green): - TC-P4-003 identities round-trip per wallet - TC-P4-004 contacts round-trip per wallet - TC-P4-005 asset_locks bucketed (wallet, account, outpoint) - TC-P4-006 wallets_pending_rehydration log fires with count - TC-P4-007 summary log carries every counter (including zeros) - TC-P4-008 corruption: truncated bincode → partial state + WARN - TC-P4-010 empty database → defaults, zero corruption warnings Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> <sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
… load() Adds two README sections covering the new behavior: - "Flush semantics": explains the succeed-or-restore contract on transient SQLite failures, the FlushRetryable marker callers should grep for, and the ban on `#[non_exhaustive]` for is_transient gating. - "load() reconstruction": tabulates which `ClientStartState` slots load now populates, calls out `#[non_exhaustive]` on the struct, and documents the corruption-tolerance contract + structured load-summary log fields. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> <sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
Rolls up the error/classification fixes from the QA fix wave: - `WalletStorageError::FlushRetryable` Display now embeds the variant name `FlushRetryable` so operators grepping production logs can match on the variant directly (TC-P2-010 restored to assert the variant marker substring). Adds a "use exponential backoff; do NOT tight-loop" warning to the rustdoc. - `WalletStorageError::LoadIncomplete` is `#[deprecated]` with a pointer to the future `try_load()` API. The variant stays in the enum so the wildcard-free match in `is_transient` / `error_kind_str` keeps its compile-time exhaustiveness gate; both callers carry `#[allow(deprecated)]` to acknowledge that. - `tc_p2_005_is_transient_table` is rewritten as a wildcard-free `match` over `&WalletStorageError` (was a runtime `Vec`); a future variant landing on the enum now refuses to compile in BOTH the impl AND the test, matching the spec's two-file gate. - `LockPoisoned` arm in `is_transient` carries a TODO(qa) pointing at the deferred TC-P2-008 mutex-poison test (race-prone per spec; only manual / table-driven coverage today). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> <sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
…dup + chore cleanups Rolls up the remaining QA fix-wave items beyond error classification: - **load() is constant-query w.r.t. wallet count** (FR-P4-6). Adds `load_all(&Connection) -> BTreeMap<WalletId, _>` to each per-area reader (`platform_addrs`, `identities`, `contacts`, `asset_locks`), each running ONE bulk SELECT (or three for the contacts subtables) ordered by `wallet_id` and grouping in memory. `SqlitePersister::load()` now calls `load_all` once per area and merges; the per-wallet loop is gone. New TC-P4-012 (`tc_p4_012_load_query_count_constant`) verifies via `sqlite3_trace_v2` that the query count for N=1 wallets equals the count for N=10 wallets. - **asset_locks decode helper.** `decode_row(...) -> (account_index, OutPoint, TrackedAssetLock)` is now shared between `list_active`, `load_state`, and `load_all`. The duplicate body Marvin flagged (DEDUP-001) is gone; `load_state` adds the corruption-tolerance loop wrapper around `decode_row` returns. Docstrings updated. - **load() doctest** is now a runnable `# fn main()` example exercising open → empty load → assert per-slot defaults; runs under `cargo test --doc -p platform-wallet-storage`. Replaces the `text` block that previously referenced a non-existent `WalletPersister`. - **`__test-helpers` rename.** The crate-private test feature is renamed from `test-helpers` to `__test-helpers` per Cargo's double-underscore convention for "MUST NOT enable from downstream" features. README + dev-deps + every `cfg` updated. - **README** drops the deprecated `SqlitePersisterError` alias in favor of canonical `WalletStorageError`. - **Buffer::drain `since`** matches the workspace `3.1.0-dev.1` and notes the planned 3.2.0 removal target. - **LOAD_UNIMPLEMENTED rustdoc** now describes the actual tracing-only soft-signal behavior (was claiming the variant is returned). - **M7 TODO** at `handle_flush_error`'s fatal branch documents the deferred TC-P2-008 mutex-poison test (race-prone per spec). - **rusqlite `trace` feature** enabled to support the query-count test's `Connection::trace_v2` callback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> <sub>🤖 Co-authored by [Claudius the Magnificent](https://github.com/lklimek/claudius) AI Agent</sub>
Contributor
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue being fixed or feature implemented
Follow-up PR layered on top of #3625 (
feat/platform-wallet-sqlite-persistor). Backports three improvements from shumkov's parallel persistor in dash-evo-tool'sfeat/platform-wallet2branch.Backports landed (per
/tmp/persistor-comparison-report.md):prepare_cachedfor hot-path SQL writers (perf)load()reconstruction (identities, contacts, asset_locks now rehydrate at boot)Out of scope (deliberate):
networkcolumn on each row for multi-network in one DB). Multi-network deployments use separate DB files per network instead.ClientStartState.walletsrehydration. That requires an upstreamkey_wallet::Wallet::from_persistedconstructor that does not yet exist; tracked viatracing::info!(wallets_pending_rehydration = N)log line.What was done?
Seven commits, two distinct waves. The first 5 commits are the structural backport; the last 2 are the QA fix wave.
6bcfdd35c3feat(platform-wallet)!: extend ClientStartState with identities/contacts/asset_locks slots369f1de130feat(platform-wallet-storage): retry-safe flush via Buffer take/restore13fab2525bperf(platform-wallet-storage): use prepare_cached for every writer8c1635d70afeat(platform-wallet-storage): functional load() for identities, contacts, asset_locksc1e8f70bbfdocs(platform-wallet-storage): document retry-safe flush + functional load()2cb47948e6fix(platform-wallet-storage): tighten error classification surface7a6e3012246perf(platform-wallet-storage): bulk load_all readers + asset_locks dedup + chore cleanupsNotable design decisions:
BufferAPI additive: newtake_for_flush+restorepair preserves staged data on transient flush failure.Buffer::drainretained as#[deprecated]alias (removal target: 3.2.0).WalletStorageError::FlushRetryableis the typed signal forSQLITE_BUSY/SQLITE_LOCKED. Display string starts with the literal"FlushRetryable: "so operators greppingPersistenceError::Backend(String)at the trait boundary can detect the variant.WalletStorageError::is_transientuses a wildcard-freematch; the enum intentionally does NOT carry#[non_exhaustive]so every future variant is forced through compile-time classification.load()performs onewallet_meta::list_idsSELECT + oneload_allper per-area schema reader (platform_addrs,identities,contacts,asset_locks) — strictly bulk, NOT N+1 over wallets. New TC-P4-012 enforces query count constancy.test-helpers→__test-helpers(Cargo's double-underscore convention) so downstream crates can't pull inforce_next_flush_to_failand bypass typed-API guarantees.How Has This Been Tested?
cargo test -p platform-wallet -p platform-wallet-storage --all-targets --all-features→ 224 tests, 0 failurescargo test --doc -p platform-wallet-storage --all-features→ 1 doctest pass onSqlitePersister::load()cargo fmt --all -- --check→ cleancargo clippy -p platform-wallet -p platform-wallet-storage --all-targets --all-features -- -D warnings→ cleancargo tree -i rusqlite -p platform-wallet-storage→ single versionNew test coverage: 19 + 1 = 20 tests added (P1: lint guard + cache scope; P2: retry-safe semantics + LWW + error classification + boundary mapping; P4: per-area reconstruction + corruption tolerance + structured load summary + query-count constancy).
The full lifecycle (Requirements → UX → Test Spec → Dev Plan → Implementation → 5-agent QA → Fix wave → Verification) was driven via the
claudius:workflow-featureskill. All HIGH and MEDIUM findings from the QA wave were closed before this PR opened.Breaking Changes
Yes —
ClientStartState(packages/rs-platform-wallet/src/changeset/client_start_state.rs) is now#[non_exhaustive]and gains three new top-level fields:identities: BTreeMap<WalletId, IdentityManagerStartState>contacts: BTreeMap<WalletId, ContactsStartState>asset_locks: BTreeMap<WalletId, BTreeMap<u32, BTreeMap<OutPoint, TrackedAssetLock>>>In-crate destructures of
ClientStartStateMUST use..rest pattern (or list every field). Themanager/load.rs:33site is the canonical pattern. Cross-crate code is protected by#[non_exhaustive].BREAKING CHANGE:footer present on6bcfdd35c3forconventional-changelog-dashto surface in release notes.Checklist
For repository code-owners and collaborators only
🤖 Co-authored by Claudius the Magnificent AI Agent