Codebase cleanup: DRY, error handling, decomposition, docs#220
Open
tlongwell-block wants to merge 21 commits intomainfrom
Open
Codebase cleanup: DRY, error handling, decomposition, docs#220tlongwell-block wants to merge 21 commits intomainfrom
tlongwell-block wants to merge 21 commits intomainfrom
Conversation
Critical fixes: - C1: Fix TOCTOU race in replace_addressable_event — advisory lock serializes concurrent calls, IS NOT DISTINCT FROM handles NULL channel_id, excludes incoming event ID from soft-delete, ON CONFLICT DO UPDATE un-deletes replayed events, RETURNING xmax for correct duplicate detection - C2: Add React ErrorBoundary to desktop app (prevents blank white screen) - C3: Align nostr version 0.37→0.36 (matches workspace pin, fixed 5 files) - C4: Fix split_filters empty kinds inversion (Some([]) now matches nothing) - C5: Fix VISION.md marking unimplemented approval gates as done - H2: Add delete_channel to ALL_TOOLS (channel_admin, disabled by default) - H13: Eliminate duplicate PgPool (Db::pool() accessor, relay reuses it) Documentation reconciliation (10 files): - MySQL→Postgres across all docs (syntax, ports, references) - ARCHITECTURE.md: LOC counts, MCP tools 43→44, constants, known limitations - README.md: missing crates, NIPs, env vars - TESTING.md: test counts, missing test files - SECURITY.md: HMAC→SHA-256 hash chain - NOSTR.md: psql bytea syntax - Proxy/ACP READMEs: false claims corrected Crossfire reviewed by codex CLI (9/10) and opus reviewer.
Theme 1 — Shared relay protocol: - Create sprout-core::relay_protocol with RelayMessage, parse_relay_message, relay_ws_to_http, pct_encode, TwoGenDedup (25 tests) Theme 5 — Unified error strategy: - Define ApiError enum (10 variants) with IntoResponse in sprout-relay - Replace all raw (StatusCode, Json<Value>) tuples across 19 API handler files - 148 relay tests pass MCP cleanup: - Extract sign_and_send helper (17 occurrences replaced) - Extract parse_uuid helper (12 double-parse patterns eliminated) - Remove validate_uuid (replaced by parse_uuid) Workspace hygiene: - Add [workspace.lints.clippy] unwrap_used = warn - Fix sprout-workflow non-workspace deps (evalexpr, cron) - Fix deny.toml multiple-versions = deny (24 skip entries) - Make sprout-media axum dep optional behind feature flag
Relay protocol deduplication: - sprout-acp: Remove local RelayMessage, TwoGenDedup, parse_relay_message, relay_ws_to_http, pct_encode — all imported from sprout-core::relay_protocol - sprout-mcp: Remove local RelayMessage, OkResponse, parse_relay_message, relay_ws_to_http — all imported from sprout-core::relay_protocol - sprout-test-client: Import directly from sprout-core instead of sprout-mcp - Add OkResponse struct to relay_protocol (backward compat with MCP/test-client) - Add From<RelayProtocolError> impls for RelayError and RelayClientError - Replace manual TwoGenDedup in sprout-acp/main.rs with core's TwoGenDedup Error handling: - Define SideEffectError enum (Database, Json, PubSub, Nostr, Internal) - Replace all 79 anyhow uses in side_effects.rs with typed errors - 148 relay tests pass, 234 acp tests pass, 47 mcp tests pass ~985 lines of duplication eliminated.
Theme 2 — God function decomposition: - Decompose ingest_event (1524 LOC, 22 steps) into named pub(crate) functions; ingest_event is now a ~130-line readable pipeline - Extract AppShell.tsx hooks: useSearchSession, useAncestorResolution, useViewRouter (AppShell 819→680 LOC) - Decompose ChannelManagementSheet.tsx (762→300 LOC) into ChannelDetailsSection, ChannelMembersSection, ChannelDangerZone + shared Section component Theme 3 — DRY fixes: - Fan-out loop (4×) → ConnectionManager::send_event_to_matches - effective_message_author: delete duplicate in side_effects.rs + messages.rs - row_to_channel_record: pub(crate) in channel.rs, imported in dm.rs - parse_api_token_row: consolidated in api_token.rs - query_feed_* aliases removed (3 pairs → 3 functions) - channel_base_to_json extracted for shared channel serialization - truncatePubkey (5×) → shared/lib/pubkey.ts - formatRelativeTime + parseTimestamp (8 callers) → shared/lib/time.ts - useBackendProviderProbe hook extracted (2 dialogs) - evalexpr string helpers → sprout-core/evalexpr_helpers.rs All tests pass: relay 148, acp 234, mcp 47, db 50, workflow 140. Desktop pnpm check: 0 errors.
Desktop fixes: - Replace window.confirm() with AlertDialog in AgentsView.tsx (3 occurrences) - Fix TokenSettingsCard N parallel getChannelMembers → use ch.isMember directly - Fix ForumPostCard button-inside-button (outer button → div[role=button]) - Add aria-pressed to scope toggle buttons in CreateAgentDialogSections.tsx - Add aria-hidden to sticky day header in MessageTimeline.tsx - Delete dead code: sidebarData.ts, UpdateChecker.tsx Test infrastructure: - Extract shared test helpers into tests/helpers/mod.rs (tiny_jpeg, sign_blossom_auth, blossom_auth_header, sub_id, create_test_channel, http_client) - Add Drop impl to McpSession (kills orphaned child process on panic) - Replace 10s blind sleep with stderr readiness scan + 30s timeout - Fix stale '43 tools' comments → 44 in e2e_mcp.rs - Fix e2e_workflows.rs seeded pubkey → self-contained ensure_user_exists() All cargo check clean. Desktop pnpm check: 0 errors.
Proxy fixes: - PROXY-H2: Channel UUID validation in admin endpoints — invalid UUIDs now return 400 with specific error listing invalid values (was silently dropped) - PROXY-M8: Replace DashMap + flush-on-full with moka LRU cache in shadow_keys (eliminates thundering herd on cache eviction) Workspace safety: - Add #![deny(unsafe_code)] to sprout-media and sprout-cli All cargo check clean. Proxy: 80/80 tests pass.
…text note - ApiError::Internal and ApiError::Database now log the real error server-side and return a generic 'internal server error' to API clients (prevents leaking DB/serialization details over the REST API) - Fix stale 'duplicate:' message in ingest_event → 'duplicate: event already exists' - Add hashtext collision note to advisory lock comment in replace_addressable_event
…e ErrorBoundary TOCTOU fix (replace_addressable_event): - Only tombstone active row if incoming event is strictly newer (created_at comparison + event ID tiebreaker per NIP-01). Stale replays now no-op instead of reviving older events over newer ones. relay_protocol: - relay_ws_to_http: use strip_prefix instead of blind string replace (preserves path/query/fragment, won't corrupt ws:// in paths) - TwoGenDedup::new: clamp limit to minimum of 2 (prevents degenerate behavior) - Added tests: path-with-trailing-slash, non-ws scheme, limit clamping Desktop ErrorBoundary: - Show generic 'An unexpected error occurred' instead of raw error.message (prevents leaking internal details to users; error is still logged to console)
The unwrap_used=warn lint interacts with the pre-push hook's 'cargo clippy -- -D warnings' which promotes all warnings to errors. With ~610 pre-existing unwrap() calls across the workspace, this breaks the build. Deferred to a follow-up that addresses the unwrap calls incrementally.
…code in helpers
- e2e_media_extended.rs: restore nostr::{EventBuilder, Kind, Tag, Timestamp} imports
that were incorrectly removed during helpers extraction
- e2e_workflows.rs: shadow pubkey_hex as &str to prevent move-after-use errors
- helpers/mod.rs: add #![allow(dead_code)] since not every test file uses every helper
- Full workspace clippy --all-targets -D warnings passes clean
…ecific error codes
TOCTOU fix (replace_addressable_event):
- Remove RETURNING xmax (system columns not accessible on partitioned tables
in Postgres 17) — use ON CONFLICT DO NOTHING + rows_affected() instead
- Stale-replay guard makes ON CONFLICT DO UPDATE unnecessary
ApiError response format:
- Standard variants use original single-field format: {"error": "human text"}
- Custom variant uses two-field format: {"error": "code", "message": "text"}
for domain-specific codes (nip98_not_supported, scope_escalation, etc.)
- Unauthorized variant now carries a message string
- Internal/Database errors log server-side, return generic message
Token handler domain codes restored:
- nip98_not_supported, invalid_scopes, scope_escalation, already_revoked,
rate_limited, not_found — all use ApiError::Custom for backward compat
Integration test results (on fresh DB):
- REST API: 40/40 ✅
- Relay: 27/27 ✅
- MCP: 14/14 ✅
- Media: 7/7 ✅
- Media extended: 18/18 ✅
- Nostr interop: 15/15 ✅
- Tokens: 17/20 (3 pre-existing failures on fresh DB — need seeded channel)
- Workflows: 0/7 (pre-existing — all tests depend on seeded CHANNEL_GENERAL UUID)
The test_api_error_helper_maps_status_codes test used
ApiError::Unauthorized("...".into()) inside matches!() which is
a pattern context, not an expression context. Use _ wildcard instead.
…n codes
All 8 remaining token error branches now use ApiError::Custom with explicit
domain codes instead of embedding codes in human-readable message strings:
- channel_escalation (3 sites), invalid_channel_ids (2), not_channel_member,
missing_required_scope, token_limit_exceeded
Updated IntoResponse comment to accurately describe the dual envelope format:
standard variants → {"error": "text"}, Custom → {"error": "code", "message": "text"}
…nction, remove expect() 1. replace_addressable_event: add ORDER BY created_at DESC to stale-replay check so it always compares against the newest active row (handles legacy corruption with multiple active rows for same addressable key) 2. Token mint: distinguish ChannelNotFound from real DB errors in get_channel call — DB failures now return 500 internal_error instead of 400 invalid_channel_ids 3. Remove expect() on Scope::from_str — use unwrap_or(Scope::Unknown) to avoid panicking in production even though the parse is infallible
Merges 4 commits from main: - feat: add Workflows UI surface (#217) - chore: clean Tauri build artifacts in just clean (#219) - fix: debounce composer pattern detection (#222) - fix(desktop): plug memory leaks causing 37GB+ RAM growth (#223) Conflict resolutions: - approvals.rs: Merged main's refactored approval handlers (validate_approval, execute_grant/deny, by-hash endpoints) with our ApiError conversion - workflows.rs: Converted new list_run_approvals endpoint to ApiError - AppShell.tsx: Merged main's 'workflows' view type with our hook extractions - CreateAgentDialog.tsx / AddChannelBotDialog.tsx: Merged main's eslint cleanup with our useBackendProviderProbe extraction All cargo clippy clean. Desktop pnpm check: 0 errors.
…n AppShell TypeScript build failed because AppShell.tsx references AppView and MainView types in the selectView callback, but those types were local to useViewRouter.ts. Export them and import in AppShell.
…estors Crossfire review (codex CLI + opus subagents) found 13 issues across 4 rounds of adversarial review. Final score: 9/10 APPROVED. API contract restorations (mod.rs, tokens.rs, workflows.rs, error.rs): - Restore distinct auth error codes: invalid_token, token_revoked, token_expired (were collapsed to generic 'authentication required') - Restore 'authentication failed' on invalid-credential paths (JWT validation, dev JWT, X-Pubkey) vs 'authentication required' on no-auth-provided paths - Restore retry_after_seconds in rate-limit response: ApiError::Custom gains Option<serde_json::Value> 4th field for extra response data - Restore structured 403 codes: channel_not_permitted and insufficient_scope use ApiError::Custom for two-field envelope - Restore NIP-98 auth failure codes in POST /api/tokens: invalid_auth with specific messages per failure mode - Restore token-management NIP-98 rejection messages: 'Bearer token required to list/revoke tokens' - Restore webhook auth error messages: 'authentication failed' vs 'webhook secret required but not configured' - Fix NIP-98 message wording: 'only supported' (was 'only valid') - Remove double-logging: internal_error() no longer logs (IntoResponse is the single logging point) TOCTOU hardening (sprout-db/lib.rs): - Add id DESC to ORDER BY in replace_addressable_event stale-replay guard: deterministic NIP-01 tiebreak even with legacy corruption Desktop fix (useAncestorResolution.ts): - Restore MAX_REQUESTED_ANCESTORS=500 eviction cap dropped during hook extraction (prevents unbounded Set growth in long sessions) Test results (clean DB, release build): - Unit: 812 passed, 0 failed - e2e_rest_api: 39/40 (1 pre-existing NIP-05 test-ordering issue) - e2e_relay: 27/27 - e2e_mcp: 14/14 - e2e_nostr_interop: 15/15 - e2e_tokens: 20/20 - e2e_media: 7/7 - e2e_media_extended: 18/18 - e2e_workflows: 0/7 (pre-existing — seeded CHANNEL_GENERAL UUID) Zero regressions from these fixes.
Replace positional tuple Custom(StatusCode, &str, String, Option<Value>)
with named struct Coded { status, code, message, extra } for self-documenting
call sites. No behavior change — status codes, error codes, and JSON response
shapes are identical.
23 call sites updated across error.rs, api/mod.rs, and api/tokens.rs.
… check Pre-storage validation (validate_admin_event, validate_standard_deletion_event) now returns ValidationError with Rejected/Infra variants instead of SideEffectError. This fixes a semantic bug where DB failures during validation were misreported as 'invalid event' (client error) instead of infrastructure failures. Pipeline callers (check_admin_event, check_standard_deletion) now route: - ValidationError::Rejected → IngestError::Rejected (400 / OK false) - ValidationError::Infra → IngestError::Internal (500) Also fixes check_channel_not_archived: previously silently ignored DB errors (fail-open by accident). Now fail-closed with tracing::warn — DB errors during archive checks return IngestError::Internal instead of letting events through. validate_admin_event: get_channel() no longer swallows DB errors into misleading 'channel not found' message — bare ? propagates the real error as Infra. SideEffectError unchanged for post-storage side effects (fire-and-forget path). 148 relay tests pass.
DbError::ChannelNotFound, MemberNotFound, NotFound, and AccessDenied are client-state errors (the event references something that doesn't exist or the caller lacks permission). The blanket From<DbError> was mapping all variants to ValidationError::Infra, causing admin events against nonexistent channels to return 500 instead of 400. Now: client-state DbError variants → Rejected (400), infrastructure errors (Sqlx, Serde, etc.) → Infra (500). 148 relay tests pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
General cleanup pass. The codebase grew quickly and accumulated some duplication, oversized functions, inconsistent error handling, and stale docs. Nothing was on fire, but it was getting harder to navigate.
What changed
110 files · +4,255 / −4,373 (net −118 lines) · 10 crates · 8 docs
Bug fixes
replace_addressable_event— advisory lock serializes concurrent calls; stale-replay guard rejects older events;IS NOT DISTINCT FROMfor NULL channel_id; deterministic NIP-01 tiebreak (ORDER BY created_at DESC, id DESC)split_filtersempty-kinds inversion —Some([])now correctly matches nothing instead of everythingErrorBoundary— prevents blank white screen on render crashDRY / decomposition
sprout-core::relay_protocolreplaces ~985 lines of duplication across sprout-acp and sprout-mcpingest_eventdecomposed from 1,524 LOC into ~130-line pipeline + 22 named step functionsChannelManagementSheet.tsxdecomposed (762 → 300 LOC) into 3 section componentsAppShell.tsxhooks extracted (useSearchSession,useAncestorResolution,useViewRouter)sign_and_send(17 call sites),parse_uuid(12 call sites)truncatePubkey,formatRelativeTime,useBackendProviderProbeError handling
ApiErrorenum (10 variants) replaces raw(StatusCode, Json<Value>)tuples across 19 API handler filesSideEffectError(5 variants) replaces 79anyhowuses inside_effects.rsinvalid_token,token_revoked,token_expired,invalid_auth,channel_not_permitted,insufficient_scope,retry_after_seconds, etc.)Proxy hardening
DashMap+ flush-on-full →mokaLRU cache in shadow_keysDocs
All 8 top-level docs updated — MySQL → Postgres, LOC counts, MCP tool counts, infrastructure ports, test counts, etc.
Desktop a11y
window.confirm()→AlertDialogaria-pressed,aria-hiddenadditionssidebarData.ts,UpdateChecker.tsx)Test infrastructure
tests/helpers/mod.rsMcpSessionDropimpl for cleanupWhat's deferred
side_effects.rsby event kind familytests/e2e_proxy.rsget_thread_summaries_bulkcanvascolumn from channel queriesTest results
Desktop
pnpm check: 0 errors.Full workspace
cargo check+cargo clippy -D warnings: clean.