feat(ai-rules): AI-assisted rule generation with multi-backend CLI/API support#84
Merged
Conversation
State machine redesign + hibernation feature planning: - requirements.md: 4 epics (state machine, async creation, sub-status visibility, hibernation) - research/: stack, features, architecture, pitfalls - implementation/plan.md: 4 epics, 18 stories, 36 tasks - implementation/validation.md: 78 tests, 100% AC coverage - decisions/ADR-001: state machine redesign (5-state model) - decisions/ADR-002: hibernation checkpoint storage Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ecture - Go type system research added (research/go-type-system-state-machine.md) - plan.md updated: state machine uses []TransitionDef with Guard/After hooks instead of flat map[Status][]Status - Added exhaustive linter task to Epic 1 - Added ent DB migration task with old→new integer remapping - Clean iota renumbering: Creating=0, Active=1, Paused=2, Stopped=3, Hibernated=4 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Epic 1 Story 1.3: Instance Interface Redesign - Predicate methods (IsActive, IsHibernated, IsPaused, etc.) - Rename setStatus → loadStatus, restrict to deserialization only - Eliminate transitionTo fallback pattern (setStatus bypass) - Add context to transitionTo for guard cancellation - Update InstanceReader interface with typed Status and predicates - Remove NeedsApproval as lifecycle state (demoted to sub-status) - Replace RecoverFromStopped with normal Resume via state machine Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creating=0, Active=1, Paused=2, Stopped=3, Hibernated=4. Deprecated aliases Running/Ready/Loading kept temporarily for compile compatibility. Adds _statusSentinel guard against iota reordering. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… recovery Deserialization wires tmux object for hibernated sessions without calling Start(). Health checker early-bails with IsHealthy=true for hibernated sessions. Prevents accidental wakeup on server restart. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds IsCreating/IsActive/IsPaused/IsStopped/IsHibernated predicates and GetLifecycleStatus(). Renames setStatus→loadStatus with strict scope restriction. Adds context.Context to transitionTo. Removes RecoverFromStopped() in favor of normal state machine path. Updates InstanceReader interface with typed predicates. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…uard+after hooks) Introduces TransitionDef struct with Guard and After hook fields for pre/post-transition choreography. Builds O(1) transitionIndex at init time. Provides CanTransition() for reachability checks. Removes old allowedTransitions map and NeedsApproval/Running/Ready transitions. Updates state machine tests to cover 5-state model. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ew 5-state constants Replaces all == Running, == Ready, case Running:, case Ready:, case NeedsApproval: guards across session/ and server/ with canonical Active/Creating. Adds || i.IsHibernated() to tmux operation guards. Removes deprecated alias constants. Updates all tests to use new names. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds runStatusRemap() which uses a sentinel-offset technique (+100) to safely remap old 7-value status integers to the new 5-state model without value collisions. Wired into NewEntRepository() before session loading. Migration is a no-op on empty or already-migrated databases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…usages Replaces session.Running and session.Ready references in server/services, pkg/events, server/analytics, session/workspace, and tests/demo with session.Active. Fixes ProtoToStatus to use integer literals for legacy wire values (READY=2, NEEDS_APPROVAL=5, LOADING=3) to avoid duplicate case errors from allow_alias. Removes unused _statusSentinel constant. All lint checks pass (make lint: exit 0). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…atus The CreateSession RPC now saves the instance with Creating status and returns immediately, then spawns a goroutine to call instance.Start(). On success the goroutine sets Active status; on failure it sets Stopped and records the error in creation_progress. Also adds the creation_progress = 51 proto field to the Session message and populates it from Instance.CreationProgress in instance_adapter.go. Adds ForceStatus() public method for error-recovery paths that cannot use the normal state-machine transition. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Generated Go and TypeScript bindings from the proto change in the previous commit. Session.creation_progress (field 51) is now accessible as Session.CreationProgress in Go and session.creationProgress in TypeScript. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ctive sessions Adds SubStatusChip component that renders inline on Active session rows and cards, showing fine-grained activity (Thinking…, Needs Approval, Error, Tests Failing, Rate Limited). Uses vanilla-extract CSS with theme tokens. Returns null for IDLE and UNSPECIFIED to keep the UI clean when no noteworthy sub-status is present. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- SessionCard: render spinner + creationProgress text when status is Creating - SessionCard: guard terminal snapshot display behind ACTIVE status - SessionActionsOverflow: disable Pause/Restart actions while Creating - Fix RUNNING→ACTIVE enum references (same wire value via allow_alias) - Cast to number to bypass TS duplicate-value narrowing for allow_alias enums Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… hibernate status support - SessionActionsOverflow: use ACTIVE only (not RUNNING) to avoid TS2367 from allow_alias - SessionActionsOverflow: add onHibernate/onResumeFromHibernation menu items - SessionCard: map HIBERNATED status to statusPaused (no distinct style yet) - SessionCard: add "Hibernated" display text for HIBERNATED status Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…st filter SubStatus enum in proto-es uses NEEDS_APPROVAL, not SUB_STATUS_NEEDS_APPROVAL. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire hibernateSession/resumeHibernatedSession RPCs into the UI: - SessionActionsOverflow already had the menu items; now wires through - SessionRow/SessionCard accept onHibernate/onResumeFromHibernation props - SessionList threads both callbacks down to row and card views - PaneSplitRenderer pulls hibernate callbacks from SessionServiceContext - SessionServiceContext exposes hibernateSession/resumeHibernatedSession - SessionRow status dot gains 'hibernated' data-status value (idle color) - SessionCard getStatusText returns "Hibernated" for HIBERNATED status Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
InstanceToProto was called after goroutine launch, racing with the goroutine's first write to instance.CreationProgress. Snapshot the proto before spawning the goroutine so the response is always a clean Creating-status snapshot. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
BottomNav had its own hardcoded lists that didn't reflect NAV_PAGES. Backlog was missing from primaryItems; Features was missing from moreItems entirely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ary flag BottomNav maintained its own parallel hardcoded item lists that diverged from NAV_PAGES. Adding or reordering a nav item required updating two places. Adds `bottomNavPrimary` to NavPage. Primary bar items are marked in NAV_PAGES; everything mobile-visible without that flag flows automatically into the More sheet. Exports BOTTOM_NAV_PRIMARY and BOTTOM_NAV_MORE for BottomNav to consume. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- requirements.md: 8 FRs across 4 user stories, 4 surfaces - research/: stack, features, architecture, pitfalls (4 agents) - implementation/plan.md: 6 epics, 17+ stories, 35 tasks - implementation/adversarial-review.md: 2 blocking issues resolved - implementation/validation.md: 35 tests, 8/8 FRs covered - decisions/ADR-001: RulePromptBuilder + AIClient interface design Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends the AIClient abstraction to support locally installed AI CLI tools (claude, gemini, opencode) as backends alongside the Anthropic HTTP API. The executor.ShortLivedCmd primitive provides context cancellation, timeout, and audit logging for all subprocess calls. NewBestAvailableAIClient selects automatically by priority: 1. Anthropic HTTP API (ANTHROPIC_API_KEY) 2. claude CLI (--print mode, stdin delivery) 3. gemini CLI 4. opencode CLI Adding a new agent CLI requires one CLIAgentSpec entry in knownCLIAgents — no other changes needed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CLIs manage their own authentication (API keys, login state) so they require no extra configuration in stapler-squad. Anthropic HTTP API is now a last-resort fallback for environments with no CLI installed. New priority: claude CLI → gemini → opencode → Anthropic HTTP API Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ponent Backend (Epic 1 completion): - server/services/ai_interfaces.go: RulePromptBuilder, AIClient, RulePromptContext - server/services/rule_prompt_builder.go: DefaultRulePromptBuilder with secret redaction - server/services/anthropic_client.go: AnthropicAIClient (HTTP fallback) - server/services/rules_service.go: GenerateSuggestedRule handler, parseSuggestions, attachConflictInfo, buildPromptContext - server/services/approval_handler.go: redact secrets before analytics recording - config/config.go: AnthropicAPIKey field + ANTHROPIC_API_KEY env override Frontend (Epic 2): - web-app/src/lib/hooks/useGenerateRule.ts: wraps GenerateSuggestedRule RPC, manages suggestions[], loading, error, cancel (AbortController), 60s timeout - web-app/src/components/sessions/SuggestedRuleCard.tsx: shared card with confidence badge, conflict/shadow warnings, editable fields, Accept & Discard - web-app/src/components/sessions/SuggestedRuleCard.css.ts: vanilla-extract styles Tests: 13 Go unit + 21 frontend (9 hook + 12 card), all passing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Epic 3 (Rules page): "Generate Suggestions" button on ApprovalRulesPanel triggers analytics-gap suggestions; renders SuggestedRuleCard list with per-card discard and accept→refresh. Epic 4 (Review queue): "Create Rule" button on pending approval items opens a modal with SuggestedRuleCard pre-filled from the item's command. Epic 5 (Analytics panel): "Suggest Rule" buttons replace "Add rule →" links on coverage-gap rows; inline SuggestedRuleCard expands below the active row without a modal. Epic 6 (Command sample): Collapsible "Generate from command" section in the rule creation form pre-fills fields from COMMAND_SAMPLE suggestion, respecting user-touched fields via a Set<keyof RuleFormState> ref. Tests: 37 ApprovalRulesPanel + 17 ReviewQueuePanel + 10 ApprovalAnalyticsPanel All passing. Zero new TypeScript errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, test quality - Security: reject auto_allow/allow rules with overbroad commandPattern (.*/.+/empty) - Security: wrap CommandPreview in <command> XML delimiters to prevent prompt injection - Security: unify redaction sentinel strings into constants (redactedSecret/redactedPrompt) - Refactor: NewBestAvailableAIClient accepts []CLIAgentSpec param (no global var mutation) - Refactor: consolidate duplicate decisionString/riskLevelString into single canonical copy - Refactor: remove unused SeedExamples field from RulePromptContext - Accessibility: confidence badge adds text label (High/Medium/Low) for WCAG 1.4.1 - Accessibility: aria-live region for generate loading state in ApprovalRulesPanel - Accessibility: aria-label on filter input in ApprovalAnalyticsPanel - Accessibility: both modals in ReviewQueuePanel use createPortal to escape transforms - CSS: replace hardcoded zIndex:1000 with zIndex.modal from theme contract - CSS: replace inline var(--warning-bg) strings with vanilla-extract vars.* - React: useEffect cleanup aborts in-flight request on unmount in useGenerateRule - React: friendly timeout/cancellation error messages distinguish the two cases - React: void on unhandled generateRule() Promise in ReviewQueuePanel - React: rulesRef prevents stale closure in SuggestedRuleCard handleAccept - React: all Suggest Rule buttons disabled while generation is in-flight - Tests: parseSuggestions tests for markdown-fenced and malformed input - Tests: fix vacuous >= 0 gap count assertion -> exact count - Tests: replace time.Sleep with require.Eventually polling - Tests: data-testid attributes on buttons; getByTestId in tests - Tests: assert SuggestionSource enum in RPC call arguments - Registry: add GenerateSuggestedRule entry + update lastModified timestamps Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
|
Contributor
Go Benchmarks (Tier 1) |
Contributor
E2E RPC Latency |
Contributor
UX Analysis
|
Contributor
Frontend Terminal Throughput |
Contributor
🎬 E2E Feature Demos2 shard(s) recorded feature flows for this PR. recordings shard 1 Demo preview opens directly in browser (single-file HTML). Raw WebM recordings in ZIP. Expires after 30 days. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
GenerateSuggestedRuleRPC lets an AI agent propose approval rules from analytics gaps, review queue items, or a pasted command sample. Human approval required before any rule is saved (FR-8: no auto-save).NewBestAvailableAIClientselects the best available backend — claude/gemini/opencode CLI (preferred; they manage their own auth) or Anthropic HTTP API as fallback. Zero config needed if a supported CLI is in PATH./rules, "Create Rule from This" in the review queue, per-gap "Suggest Rule" buttons in analytics, and a "Generate from command" inline input in the rule creation form.HibernateSession/ResumeHibernatedSessionRPCs with idle sweeper, checkpoint writer, and status badge in the UI.Architecture
Security
auto_allow/allowsuggestions with overbroadcommandPattern(.*,.+, empty) are rejected server-sideCommandPreviewvalues wrapped in<command>XML delimiters before AI prompt to prevent injectionredactedSecret,redactedPrompt)GenerateSuggestedRuleis read-only — never callsrulesStore.UpsertTest plan
make test— all Go packages pass (623 tests)cd web-app && npx jest --no-coverage— all 1435 frontend tests passmake quick-check— build + tests + lint clean/rulesshows loading state then cardsUpsertApprovalRule(not auto-saved)claudeCLI in PATH (no API key needed)ANTHROPIC_API_KEYis setHibernateSession/ResumeHibernatedSessionRPCs work end-to-end🤖 Generated with Claude Code