diff --git a/.agents/agents/docs-curator.md b/.agents/agents/docs-curator.md new file mode 100644 index 0000000..3718ca6 --- /dev/null +++ b/.agents/agents/docs-curator.md @@ -0,0 +1,62 @@ +--- +name: docs-curator +description: Documentation drift detection and sync specialist. Use to update docs/**/*.md after code changes, verify broken refs, and apply patches reflecting recent diffs. +skills: + - oma-docs +--- + +You are a Documentation Curator. Keep `docs/**/*.md` aligned with the live codebase by running the `oma docs` CLI and applying patches that reflect recent code changes. + +## Execution Protocol + +Follow the vendor-specific execution protocol: +- Write results to project root `.agents/results/result-docs.md` (orchestrated: `result-docs-{sessionId}.md`) +- Include: status, summary, files changed, acceptance criteria checklist + + + +## Charter Preflight (MANDATORY) + +Before ANY documentation changes, output this block: + +``` +CHARTER_CHECK: +- Clarification level: {LOW | MEDIUM | HIGH} +- Task domain: docs-curation +- Diff range: {git range or staged} +- Must NOT do: modify code, modify .agents/, auto-apply patches that contradict acceptance criteria +- Success criteria: {docs reflect the diff, broken refs in scope are resolved} +- Assumptions: {defaults applied} +``` + +- LOW: proceed with assumptions +- MEDIUM: list options, proceed with most likely +- HIGH: set status blocked, list questions, DO NOT write docs + + +## Curation Process + +1. **Diff intake**: Determine the git range from the task description (e.g. `HEAD~5..HEAD`, branch diff, or staged). Fall back to `--cached` then `HEAD~1..HEAD`. +2. **Drift baseline**: Run `oma docs verify --json` to capture the current broken-ref state. Persist counts in the result file. +3. **Candidate match**: Run `oma docs sync --json` to get `{ doc, changedFiles, matchedRefs }` candidates. Skip secret-bearing files (CLI already excludes `.env*`, `*.pem`, `*.key`, `id_rsa*`). +4. 
**Patch synthesis**: For each candidate doc, read the doc and `git diff` for `changedFiles`, draft a minimal unified-diff patch. Only edit prose that the diff actually invalidates — leave unrelated content alone. +5. **Apply**: Write the patches directly via `Edit`/`Write`. **Do not** prompt the user; the orchestrator's acceptance criteria authorize autonomous writes for this agent in this context. +6. **Re-verify**: Run `oma docs verify --json` again. Confirm the broken-ref count for in-scope kinds dropped to zero (or matches the acceptance criteria). +7. **Report**: List updated docs with file paths, summarize before/after drift counts, flag any candidates skipped (out of scope, ambiguous diff, secret-adjacent). + +## Auto-Write Authority + +This agent is a write-capable peer of `backend-engineer` / `frontend-engineer`. The interactive `[y/n/d/s]` confirmation in `/docs sync` applies to direct user invocation only — when spawned by `/orchestrate`, `/work`, or `/ultrawork`, the assigned task description IS the consent boundary. + +## Rules + +1. Stay in scope — only update docs related to the assigned diff range or acceptance criteria +2. Minimal edits — change only what the diff invalidates, never reformat or restructure unrelated text +3. Never modify code (`*.ts`, `*.tsx`, `*.py`, `*.go`, etc.) — surface mismatches for `backend-engineer` / `frontend-engineer` instead +4. Never modify `.agents/` files — SSOT protection +5. Never touch secret-bearing files even if surfaced in diffs (`.env*`, `*.pem`, `*.key`, `id_rsa*`) +6. Re-run `oma docs verify --json` after applying patches; record before/after counts in the result file +7. ARB-based localization (`packages/i18n/`): edit ARB source, never regenerate localization code +8. Document out-of-scope drift findings as TODOs for the next session — do NOT silently fix references unrelated to the assigned task +9. 
Follow `oma-docs` host-LLM contract — CLI emits structured data, you do natural-language synthesis and patch drafting +10. Co-Author commits when staging is delegated: `Co-Authored-By: First Fluke ` diff --git a/.agents/agents/variants/claude.json b/.agents/agents/variants/claude.json index f4ce530..ca79aad 100644 --- a/.agents/agents/variants/claude.json +++ b/.agents/agents/variants/claude.json @@ -32,6 +32,10 @@ "tools": "Read, Grep, Glob, Bash", "maxTurns": 15, "effort": "low" + }, + "docs-curator": { + "tools": "Read, Write, Edit, Bash, Grep, Glob", + "maxTurns": 15 } } } diff --git a/.agents/agents/variants/codex.json b/.agents/agents/variants/codex.json index 8653f39..b6bf585 100644 --- a/.agents/agents/variants/codex.json +++ b/.agents/agents/variants/codex.json @@ -56,6 +56,12 @@ "extra": { "sandbox_mode": "read-only" } + }, + "docs-curator": { + "effort": "medium", + "extra": { + "sandbox_mode": "workspace-write" + } } } } diff --git a/.agents/agents/variants/cursor.json b/.agents/agents/variants/cursor.json index f999b7b..0a73915 100644 --- a/.agents/agents/variants/cursor.json +++ b/.agents/agents/variants/cursor.json @@ -47,6 +47,11 @@ "extra": { "is_background": true } + }, + "docs-curator": { + "extra": { + "is_background": true + } } } } diff --git a/.agents/agents/variants/gemini.json b/.agents/agents/variants/gemini.json index 0ce20f3..07392c3 100644 --- a/.agents/agents/variants/gemini.json +++ b/.agents/agents/variants/gemini.json @@ -18,6 +18,7 @@ }, "qa-reviewer": { "tools": ["bash", "glob", "grep", "read", "ask"] - } + }, + "docs-curator": {} } } diff --git a/.agents/config/defaults.yaml b/.agents/config/defaults.yaml index 0b61051..d5b7e01 100644 --- a/.agents/config/defaults.yaml +++ b/.agents/config/defaults.yaml @@ -2,7 +2,7 @@ # Generated: 2026-04-23 | Session: session-20260423-141500 # Claude roles omit effort (cli-session managed). # -# ⚠ This file is a single source of truth (SSOT) shipped with oh-my-agent. 
+# This file is a single source of truth (SSOT) shipped with oh-my-agent. # Do NOT edit it directly. To customize behavior, use one of: # - .agents/oma-config.yaml (agent_cli_mapping, session.quota_cap) # - .agents/config/models.yaml (add or override model slugs) @@ -16,12 +16,12 @@ agent_defaults: architecture: { model: "anthropic/claude-opus-4-7" } qa: { model: "anthropic/claude-sonnet-4-6" } pm: { model: "anthropic/claude-sonnet-4-6" } - backend: { model: "openai/gpt-5.3-codex", effort: "high" } - frontend: { model: "openai/gpt-5.4", effort: "high" } - mobile: { model: "openai/gpt-5.4", effort: "high" } - db: { model: "openai/gpt-5.3-codex", effort: "high" } - debug: { model: "openai/gpt-5.3-codex", effort: "high" } - tf-infra: { model: "openai/gpt-5.4", effort: "high" } + backend: { model: "openai/gpt-5.5", effort: "high" } + frontend: { model: "openai/gpt-5.5", effort: "high" } + mobile: { model: "openai/gpt-5.5", effort: "high" } + db: { model: "openai/gpt-5.5", effort: "high" } + debug: { model: "openai/gpt-5.5", effort: "high" } + tf-infra: { model: "openai/gpt-5.5", effort: "high" } retrieval: { model: "google/gemini-3.1-flash-lite" } runtime_profiles: @@ -43,16 +43,16 @@ runtime_profiles: codex-only: description: "Codex-only — ChatGPT Plus/Pro" agent_defaults: - orchestrator: { model: "openai/gpt-5.4", effort: "medium" } - architecture: { model: "openai/gpt-5.4-pro", effort: "high" } - qa: { model: "openai/gpt-5.4", effort: "high" } - pm: { model: "openai/gpt-5.4", effort: "medium" } - backend: { model: "openai/gpt-5.3-codex", effort: "high" } - frontend: { model: "openai/gpt-5.4", effort: "high" } - mobile: { model: "openai/gpt-5.4", effort: "high" } - db: { model: "openai/gpt-5.3-codex", effort: "high" } - debug: { model: "openai/gpt-5.3-codex", effort: "high" } - tf-infra: { model: "openai/gpt-5.4", effort: "high" } + orchestrator: { model: "openai/gpt-5.5", effort: "medium" } + architecture: { model: "openai/gpt-5.5", effort: "high" } + qa: { model: 
"openai/gpt-5.5", effort: "high" } + pm: { model: "openai/gpt-5.5", effort: "medium" } + backend: { model: "openai/gpt-5.5", effort: "high" } + frontend: { model: "openai/gpt-5.5", effort: "high" } + mobile: { model: "openai/gpt-5.5", effort: "high" } + db: { model: "openai/gpt-5.5", effort: "high" } + debug: { model: "openai/gpt-5.5", effort: "high" } + tf-infra: { model: "openai/gpt-5.5", effort: "high" } retrieval: { model: "openai/gpt-5.4-mini", effort: "low" } gemini-only: @@ -77,25 +77,40 @@ runtime_profiles: architecture: { model: "anthropic/claude-opus-4-7" } qa: { model: "anthropic/claude-sonnet-4-6" } pm: { model: "anthropic/claude-sonnet-4-6" } - backend: { model: "openai/gpt-5.3-codex", effort: "high" } - frontend: { model: "openai/gpt-5.4", effort: "high" } - mobile: { model: "openai/gpt-5.4", effort: "high" } - db: { model: "openai/gpt-5.3-codex", effort: "high" } - debug: { model: "openai/gpt-5.3-codex", effort: "high" } - tf-infra: { model: "openai/gpt-5.4", effort: "high" } + backend: { model: "openai/gpt-5.5", effort: "high" } + frontend: { model: "openai/gpt-5.5", effort: "high" } + mobile: { model: "openai/gpt-5.5", effort: "high" } + db: { model: "openai/gpt-5.5", effort: "high" } + debug: { model: "openai/gpt-5.5", effort: "high" } + tf-infra: { model: "openai/gpt-5.5", effort: "high" } retrieval: { model: "google/gemini-3.1-flash-lite" } + cursor-only: + description: "Cursor-only — Cursor Pro / Pro Student" + agent_defaults: + orchestrator: { model: "cursor/composer-2-fast" } + architecture: { model: "cursor/composer-2" } + qa: { model: "cursor/composer-2-fast" } + pm: { model: "cursor/composer-2-fast" } + backend: { model: "cursor/composer-2" } + frontend: { model: "cursor/composer-2" } + mobile: { model: "cursor/composer-2" } + db: { model: "cursor/composer-2" } + debug: { model: "cursor/composer-2" } + tf-infra: { model: "cursor/composer-2" } + retrieval: { model: "cursor/composer-2-fast" } + qwen-only: description: "Qwen Code — all agents 
routed external (no native parallel); Qwen has no --effort, only binary --thinking" agent_defaults: orchestrator: { model: "qwen/qwen3-coder-next", thinking: false } - architecture: { model: "qwen/qwen3-coder-plus", thinking: true } - qa: { model: "qwen/qwen3-coder-plus", thinking: true } + architecture: { model: "qwen/qwen3.6-plus", thinking: true } + qa: { model: "qwen/qwen3.6-plus", thinking: true } pm: { model: "qwen/qwen3-coder-next", thinking: false } - backend: { model: "qwen/qwen3-coder-plus", thinking: true } - frontend: { model: "qwen/qwen3-coder-plus", thinking: true } - mobile: { model: "qwen/qwen3-coder-plus", thinking: true } - db: { model: "qwen/qwen3-coder-plus", thinking: true } - debug: { model: "qwen/qwen3-coder-plus", thinking: true } - tf-infra: { model: "qwen/qwen3-coder-plus", thinking: true } + backend: { model: "qwen/qwen3.6-plus", thinking: true } + frontend: { model: "qwen/qwen3.6-plus", thinking: true } + mobile: { model: "qwen/qwen3.6-plus", thinking: true } + db: { model: "qwen/qwen3.6-plus", thinking: true } + debug: { model: "qwen/qwen3.6-plus", thinking: true } + tf-infra: { model: "qwen/qwen3.6-plus", thinking: true } retrieval: { model: "qwen/qwen3-coder-next", thinking: false } diff --git a/.agents/hooks/core/constants.ts b/.agents/hooks/core/constants.ts new file mode 100644 index 0000000..bee216d --- /dev/null +++ b/.agents/hooks/core/constants.ts @@ -0,0 +1,22 @@ +// Runtime constants for hooks. Mirrors the convention in `cli/constants/`: +// constants here, types in `types.ts`. The `Vendor` type in `types.ts` is +// derived from `VENDORS` below so the value and the type stay in sync. + +/** + * Host LLM CLIs supported by Oma's hook layer. This is the single source of + * truth for which vendors hooks (keyword-detector, persistent-mode, hud, + * skill-injector) recognise. Adding a new vendor here propagates to the + * `Vendor` type and to runtime guards such as `CLI_INVOCATION_AT_START` + * in `keyword-detector.ts`. 
+ * + * Excludes: + * - `oma` itself (the project's own CLI, listed separately where needed) + * - `copilot` and `hermes` (skill-install targets, not hook runtimes) + * - third-party harnesses (omc, omx, omo, ouroboros) + * + * MUST mirror `cli/types/vendors.ts` VENDORS. Hooks run as standalone + * scripts in user environments and cannot import from cli/, so the value + * is duplicated here intentionally. Keep the two arrays in sync by adding + * or removing the same vendor in both files; CI does not enforce this. + */ +export const VENDORS = ["claude", "codex", "cursor", "gemini", "qwen"] as const; diff --git a/.agents/hooks/core/fs-utils.ts b/.agents/hooks/core/fs-utils.ts new file mode 100644 index 0000000..af3acb7 --- /dev/null +++ b/.agents/hooks/core/fs-utils.ts @@ -0,0 +1,30 @@ +import { existsSync } from "node:fs"; +import { dirname, join, sep } from "node:path"; + +/** + * Normalize a filesystem path to POSIX (forward-slash) form so output + * shown to the model and string comparisons stay platform-independent + * on Windows. Mirrors `cli/utils/fs-utils.ts#toPosixPath`. + */ +export function toPosixPath(p: string): string { + return sep === "/" ? p : p.split(sep).join("/"); +} + +/** + * Walk up from startDir to find the git repository root. + * This prevents CLAUDE_PROJECT_DIR pointing to a subdirectory + * (e.g. packages/i18n during a build) from creating state files + * in the wrong location. + */ +const MAX_DEPTH = 20; + +export function resolveGitRoot(startDir: string): string { + let dir = startDir; + for (let i = 0; i < MAX_DEPTH; i++) { + if (existsSync(join(dir, ".git"))) return dir; + const parent = dirname(dir); + if (parent === dir) return startDir; + dir = parent; + } + return startDir; +} diff --git a/.agents/hooks/core/hook-output.ts b/.agents/hooks/core/hook-output.ts new file mode 100644 index 0000000..b8dc085 --- /dev/null +++ b/.agents/hooks/core/hook-output.ts @@ -0,0 +1,90 @@ +// Vendor-specific hook output builders. 
+// Each runtime (Claude Code, Codex CLI, Cursor, Gemini CLI, Qwen Code) +// expects a slightly different stdout JSON shape; centralize the dialect +// translation here so individual hooks can stay vendor-agnostic. + +import type { Vendor } from "./types.ts"; + +export function makePromptOutput( + vendor: Vendor, + additionalContext: string, +): string { + switch (vendor) { + case "claude": + return JSON.stringify({ additionalContext }); + case "codex": + return JSON.stringify({ + hookSpecificOutput: { + hookEventName: "UserPromptSubmit", + additionalContext, + }, + }); + case "cursor": + return JSON.stringify({ + additionalContext, + additional_context: additionalContext, + hookSpecificOutput: { + hookEventName: "UserPromptSubmit", + additionalContext, + }, + }); + case "gemini": + return JSON.stringify({ + hookSpecificOutput: { + hookEventName: "BeforeAgent", + additionalContext, + }, + }); + case "qwen": + // Qwen Code fork uses hookSpecificOutput (same as Codex) + return JSON.stringify({ + hookSpecificOutput: { + hookEventName: "UserPromptSubmit", + additionalContext, + }, + }); + } +} + +export function makeBlockOutput(vendor: Vendor, reason: string): string { + switch (vendor) { + case "claude": + case "codex": + case "cursor": + case "qwen": + return JSON.stringify({ decision: "block", reason }); + case "gemini": + // Gemini AfterAgent uses "deny" to reject response and force retry + return JSON.stringify({ decision: "deny", reason }); + } +} + +export function makePreToolOutput( + vendor: Vendor, + updatedInput: Record<string, unknown>, +): string { + switch (vendor) { + case "gemini": + return JSON.stringify({ + decision: "rewrite", + tool_input: updatedInput, + }); + case "cursor": + return JSON.stringify({ + updated_input: updatedInput, + hookSpecificOutput: { + hookEventName: "PreToolUse", + updatedInput, + }, + }); + case "claude": + case "codex": + case "qwen": + return JSON.stringify({ + hookSpecificOutput: { + hookEventName: "PreToolUse", + updatedInput, + }, + });
} } diff --git a/.agents/hooks/core/keyword-detector.ts b/.agents/hooks/core/keyword-detector.ts index 392b0f3..459f1c5 100644 --- a/.agents/hooks/core/keyword-detector.ts +++ b/.agents/hooks/core/keyword-detector.ts @@ -20,13 +20,104 @@ import { unlinkSync, writeFileSync, } from "node:fs"; -import { dirname, join } from "node:path"; -import { - type ModeState, - makePromptOutput, - resolveGitRoot, - type Vendor, -} from "./types.ts"; +import { join } from "node:path"; +import { VENDORS } from "./constants.ts"; +import { resolveGitRoot } from "./fs-utils.ts"; +import { makePromptOutput } from "./hook-output.ts"; +import type { ModeState, Vendor } from "./types.ts"; + +// ── Unicode normalization ───────────────────────────────────── + +/** + * Normalize text for keyword matching. + * NFKC converts fullwidth Latin characters produced by CJK IMEs + * (e.g. ｐａｒａｌｌｅｌ → parallel) to their ASCII equivalents, + * then lowercases the result. + * + * Placed here so that Task 3 (KEYWORD_SKIP_PREDICATES) and any + * future layers can import and reuse the same normalization path. + */ +export function normalizeForMatching(text: string): string { + return text.normalize("NFKC").toLowerCase(); +} + +// ── CLI Invocation Guard ────────────────────────────────────── + +/** + * Brands that count as CLI invocations: Oma plus the host LLM CLIs declared + * in `VENDORS` (claude, codex, cursor, gemini, qwen). The vendor list is + * the single source of truth for hook-supported runtimes; pulling from it + * here keeps the brand set in sync when a new vendor is added. + * + * Third-party harnesses (omc, omx, omo) are intentionally NOT included: they + * are separate projects, not host CLIs a user would invoke from an Oma + * session. opencode is also not a supported vendor in this codebase.
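The NFKC path introduced above can be sketched standalone (mirroring `normalizeForMatching`; the fullwidth sample prompt is illustrative, not from the codebase):

```typescript
// NFKC folds fullwidth Latin produced by CJK IMEs onto ASCII,
// then lowercasing makes downstream keyword regexes case-insensitive.
function normalizeForMatching(text: string): string {
  return text.normalize("NFKC").toLowerCase();
}

// Fullwidth "ＰＡＲＡＬＬＥＬ" collapses to plain "parallel".
const folded = normalizeForMatching("ＰＡＲＡＬＬＥＬ mode please");
```

Without this fold, a keyword regex anchored on ASCII "parallel" would silently miss fullwidth input.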
+ */ +const CLI_INVOCATION_BRANDS = ["oma", ...VENDORS] as const; +const CLI_INVOCATION_SIGNALS = [ + "agent", + "auto", + "exec", + "run", + "spawn", + String.raw`--\S+`, + String.raw`\S+:\S+`, +] as const; + +const BRANDS_RE_SOURCE = CLI_INVOCATION_BRANDS.join("|"); +const SIGNALS_RE_SOURCE = CLI_INVOCATION_SIGNALS.join("|"); + +/** + * Matches CLI invocations at the start of the prompt. + * + * All brand names require an explicit CLI signal after the brand. Brand-only + * prefixes are NOT treated as CLI invocations because every brand name can + * appear in natural-language usage ('claude, review this code', 'oma 프로젝트의 brainstorm 알려줘', 'cursor in the editor moves'). Requiring + * an explicit signal avoids false-positive skips on conversational prompts. + * + * Two accepted invocation shapes: + * + * 1. Slash form: '/oma:brainstorm', '/claude:exec'. The leading slash + * plus brand-colon prefix is a definitive CLI marker. Matches + * '/<brand>:<workflow>'. + * + * 2. Bare form: '<brand>\s+<signal>' where <signal> is one of the + * enumerated subcommand verbs (agent / auto / exec / run / spawn), + * a --flag, or a colon-namespaced subcommand ('agent:spawn'). + * Examples: 'oma agent:spawn brainstorm', 'claude --help', + * 'codex exec --workflow ralph', 'gemini agent', 'cursor agent', + * 'qwen run'. + */ +export const CLI_INVOCATION_AT_START = new RegExp( + `^\\s*(?:\\/(?:${BRANDS_RE_SOURCE}):|(?:${BRANDS_RE_SOURCE})\\s+(?:${SIGNALS_RE_SOURCE}))`, + "i", +); + +/** + * Per-workflow skip predicates. A workflow listed here will be skipped when + * its predicate returns true for the (already-normalized) cleaned text. + * The map is intentionally empty at boot — populate it to add workflow-specific + * overrides without restructuring the matching loop. + */ +export const KEYWORD_SKIP_PREDICATES: Record< + string, + (text: string) => boolean +> = {}; + +/** + * Default predicate: skip ALL workflow triggers when the prompt starts with a + * CLI invocation of `oma` or one of the host LLM CLIs in `VENDORS`.
Applies + * to every workflow unless an explicit per-workflow predicate in + * KEYWORD_SKIP_PREDICATES overrides it. + * + * The regex is applied to the NFKC-lowercased `cleaned` text produced by + * normalizeForMatching. All brand names are ASCII so NFKC has no effect on + * them; the `^\s*` start-anchor is unaffected by normalization. + */ +export function shouldSkipAllWorkflows(text: string): boolean { + return CLI_INVOCATION_AT_START.test(text); +} // ── Guard 1: UserPromptSubmit-only trigger ──────────────────── // Hook event names that represent genuine user input (not agent responses) @@ -163,7 +254,7 @@ export function recordKwTrigger( // ── Vendor Detection ────────────────────────────────────────── function inferVendorFromScriptPath(): Vendor | null { - const path = import.meta.path; + const path = import.meta.filename; if (path.includes(`${join(".cursor", "hooks")}`)) return "cursor"; if (path.includes(`${join(".qwen", "hooks")}`)) return "qwen"; if (path.includes(`${join(".claude", "hooks")}`)) return "claude"; @@ -221,6 +312,7 @@ interface TriggerConfig { { persistent: boolean; keywords: Record<string, string[]>; + patterns?: Record<string, string[]>; } >; informationalPatterns: Record<string, string[]>; @@ -230,7 +322,7 @@ } function loadConfig(): TriggerConfig { - const configPath = join(dirname(import.meta.path), "triggers.json"); + const configPath = join(import.meta.dirname, "triggers.json"); return JSON.parse(readFileSync(configPath, "utf-8")); } @@ -268,21 +360,51 @@ export function buildPatterns( if (cjkScripts.includes(lang) || /[^\p{ASCII}]/u.test(kw)) { return new RegExp(escaped, "i"); } - return new RegExp(`\\b${escaped}\\b`, "i"); + return new RegExp(`(?:^|[^\\w-])${escaped}(?:$|[^\\w-])`, "i"); }); } +/** + * Build raw regex patterns from a workflow's `patterns` field. + * Unlike buildPatterns, these strings are compiled directly without + * escaping or word-boundary wrapping — pattern authors are responsible + * for boundary handling.
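Assuming the brand and signal lists stay as declared in `constants.ts` and this hunk, the guard's accept/reject behavior can be rebuilt and spot-checked standalone:

```typescript
// Rebuild CLI_INVOCATION_AT_START from the same brand/signal lists and
// exercise the documented cases: slash form, bare form, conversational prefix.
const VENDORS = ["claude", "codex", "cursor", "gemini", "qwen"] as const;
const brands = ["oma", ...VENDORS].join("|");
const signals = [
  "agent", "auto", "exec", "run", "spawn",
  String.raw`--\S+`, String.raw`\S+:\S+`,
].join("|");
const cliInvocationAtStart = new RegExp(
  `^\\s*(?:\\/(?:${brands}):|(?:${brands})\\s+(?:${signals}))`,
  "i",
);

const slashForm = cliInvocationAtStart.test("/oma:brainstorm the roadmap");
const bareForm = cliInvocationAtStart.test("codex exec --workflow ralph");
const conversational = cliInvocationAtStart.test("claude, review this code");
```

The conversational prompt fails because "claude" is followed by a comma rather than whitespace plus an enumerated signal, which is exactly the false-positive protection the doc comment describes.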
Invalid patterns are skipped silently. + */ +export function buildRawPatterns( + patterns: Record<string, string[]> | undefined, + lang: string, +): RegExp[] { + if (!patterns) return []; + const all = [ + ...(patterns["*"] ?? []), + ...(patterns.en ?? []), + ...(lang !== "en" ? (patterns[lang] ?? []) : []), + ]; + const compiled: RegExp[] = []; + for (const raw of all) { + try { + compiled.push(new RegExp(raw, "iu")); + } catch { + // Skip invalid regex — surfaces during config edit, not at runtime + } + } + return compiled; +} + function buildInformationalPatterns( config: TriggerConfig, lang: string, ): RegExp[] { - const patterns = [...(config.informationalPatterns.en ?? [])]; + const patterns = [ + ...(config.informationalPatterns["*"] ?? []), + ...(config.informationalPatterns.en ?? []), + ]; if (lang !== "en") { patterns.push(...(config.informationalPatterns[lang] ?? [])); } return patterns.map((p) => { if (/[^\p{ASCII}]/u.test(p)) return new RegExp(escapeRegex(p), "i"); - return new RegExp(`\\b${escapeRegex(p)}\\b`, "i"); + return new RegExp(`(?:^|[^\\w-])${escapeRegex(p)}(?:$|[^\\w-])`, "i"); }); } @@ -342,7 +464,7 @@ const QUESTION_PATTERNS: RegExp[] = [ ]; export function isAnalyticalQuestion(prompt: string): boolean { - const firstLine = prompt.split("\n")[0].trim(); + const firstLine = (prompt.split("\n")[0] ?? "").trim(); return QUESTION_PATTERNS.some((p) => p.test(firstLine)); } @@ -355,6 +477,34 @@ export function stripCodeBlocks(text: string): string { .replace(/"[^"\n]*"/g, ""); // quoted strings } +// System echo block patterns — strip pasted hook self-output to prevent +// re-trigger loops where the user pastes back oma's own context messages. +const SYSTEM_ECHO_LINE_PATTERNS: RegExp[] = [ + /^.*\[OMA WORKFLOW:[^\]]*\].*$/gim, + /^.*\[OMA PERSISTENT MODE:[^\]]*\].*$/gim, + /^.*\[OMA AGENT HINT:[^\]]*\].*$/gim, + /^.*\[MAGIC KEYWORD:[^\]]*\].*$/gim, + /^.*\[MAGIC KEYWORDS?
DETECTED:[^\]]*\].*$/gim, + /^.*Stop hook (?:blocking error|feedback|stopped continuation).*$/gim, + /^.*PreToolUse:[^\n]*hook additional context:.*$/gim, + /^.*PostToolUse:[^\n]*hook additional context:.*$/gim, + /^.*hookSpecificOutput.*$/gim, + /^.*The \/[a-z-]+ workflow is still active.*$/gim, +]; + +/** + * Strip pasted system-echo blocks (oma's own hook outputs) so meta-discussion + * about workflows doesn't re-trigger via paste-back. Operates line-by-line + * to preserve surrounding user text. + */ +export function stripSystemEchoes(text: string): string { + let cleaned = text; + for (const pattern of SYSTEM_ECHO_LINE_PATTERNS) { + cleaned = cleaned.replace(pattern, ""); + } + return cleaned; +} + export function startsWithSlashCommand(prompt: string): boolean { return /^\/[a-zA-Z][\w-]*/.test(prompt.trim()); } @@ -393,8 +543,8 @@ export function detectExtensions(prompt: string): string[] { const extPattern = /\.([a-zA-Z]{1,12})\b/g; const extensions = new Set<string>(); for (const match of prompt.matchAll(extPattern)) { - const ext = match[1].toLowerCase(); - if (!EXCLUDE_EXTS.has(ext)) { + const ext = match[1]?.toLowerCase(); + if (ext && !EXCLUDE_EXTS.has(ext)) { extensions.add(ext); } } @@ -474,8 +624,10 @@ export function isDeactivationRequest(prompt: string, lang: string): boolean { ...(DEACTIVATION_PHRASES.en ?? []), ...(lang !== "en" ? (DEACTIVATION_PHRASES[lang] ??
[]) : []), ]; - const lower = prompt.toLowerCase(); - return phrases.some((phrase) => lower.includes(phrase.toLowerCase())); + const normalized = normalizeForMatching(prompt); + return phrases.some((phrase) => + normalized.includes(normalizeForMatching(phrase)), + ); } export function deactivateAllPersistentModes( @@ -532,8 +684,14 @@ async function main() { process.exit(0); } const infoPatterns = buildInformationalPatterns(config, lang); - // Guard 2: Strip code blocks and inline code before scanning for keywords - const cleaned = stripCodeBlocks(prompt); + // Guard 2: Strip code blocks, inline code, and pasted system-echo blocks + // before scanning for keywords. System echo stripping prevents oma's own + // hook outputs (when pasted back into the prompt) from re-triggering. + // NFKC normalization collapses fullwidth Latin from CJK IMEs onto ASCII + // so keyword regexes cannot be silently bypassed by ｐａｒａｌｌｅｌ-style input. + const cleaned = normalizeForMatching( + stripSystemEchoes(stripCodeBlocks(prompt)), + ); const excluded = new Set(config.excludedWorkflows); // Guard 3: Load reinforcement suppression state @@ -545,10 +703,23 @@ async function main() { for (const [workflow, def] of Object.entries(config.workflows)) { if (excluded.has(workflow)) continue; + // Global CLI-invocation guard: prompts that start with a CLI invocation + // of `oma` or a `VENDORS` entry are tool invocations, not natural-language + // workflow requests. Skip silently to avoid false-positive matches. + if (shouldSkipAllWorkflows(cleaned)) continue; + + // Per-workflow override: if a predicate is registered for this specific + // workflow, evaluate it and skip just this workflow when it returns true.
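The echo-stripping step in Guard 2 can be sketched with a two-pattern subset of `SYSTEM_ECHO_LINE_PATTERNS` (the pasted prompt below is illustrative):

```typescript
// Line-anchored patterns remove pasted-back hook output while leaving
// the surrounding user prose untouched.
const echoPatterns: RegExp[] = [
  /^.*\[OMA WORKFLOW:[^\]]*\].*$/gim,
  /^.*hookSpecificOutput.*$/gim,
];

function stripSystemEchoes(text: string): string {
  let cleaned = text;
  for (const pattern of echoPatterns) {
    cleaned = cleaned.replace(pattern, "");
  }
  return cleaned;
}

const pasted = [
  "why did this trigger?",
  "[OMA WORKFLOW: ultrawork] still active",
  "please explain",
].join("\n");
const survived = stripSystemEchoes(pasted);
```

Only the echoed line is blanked; the user's question and follow-up survive, so meta-discussion about a workflow cannot re-arm it.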
+ const workflowPredicate = KEYWORD_SKIP_PREDICATES[workflow]; + if (workflowPredicate?.(cleaned)) continue; + // Analytical questions should never trigger persistent workflows if (analytical && def.persistent) continue; - const patterns = buildPatterns(def.keywords, lang, config.cjkScripts); + const patterns = [ + ...buildPatterns(def.keywords, lang, config.cjkScripts), + ...buildRawPatterns(def.patterns, lang), + ]; for (const pattern of patterns) { const match = pattern.exec(cleaned); diff --git a/.agents/hooks/core/persistent-mode.ts b/.agents/hooks/core/persistent-mode.ts index 53f472b..9ce7176 100644 --- a/.agents/hooks/core/persistent-mode.ts +++ b/.agents/hooks/core/persistent-mode.ts @@ -20,14 +20,11 @@ import { unlinkSync, writeFileSync, } from "node:fs"; -import { dirname, join } from "node:path"; +import { join } from "node:path"; +import { resolveGitRoot } from "./fs-utils.ts"; +import { makeBlockOutput } from "./hook-output.ts"; import { isDeactivationRequest } from "./keyword-detector.ts"; -import { - type ModeState, - makeBlockOutput, - resolveGitRoot, - type Vendor, -} from "./types.ts"; +import type { ModeState, Vendor } from "./types.ts"; const MAX_REINFORCEMENTS = 5; const STALE_HOURS = 2; @@ -51,7 +48,7 @@ interface TriggerConfig { } function loadPersistentWorkflows(): string[] { - const configPath = join(dirname(import.meta.path), "triggers.json"); + const configPath = join(import.meta.dirname, "triggers.json"); try { const config: TriggerConfig = JSON.parse(readFileSync(configPath, "utf-8")); return Object.entries(config.workflows) diff --git a/.agents/hooks/core/skill-injector.ts b/.agents/hooks/core/skill-injector.ts index 6735cf9..c287dbe 100644 --- a/.agents/hooks/core/skill-injector.ts +++ b/.agents/hooks/core/skill-injector.ts @@ -13,6 +13,7 @@ */ import { + type Dirent, existsSync, mkdirSync, readdirSync, @@ -20,7 +21,9 @@ import { writeFileSync, } from "node:fs"; import { basename, dirname, join } from "node:path"; -import { 
makePromptOutput, resolveGitRoot, type Vendor } from "./types.ts"; +import { resolveGitRoot, toPosixPath } from "./fs-utils.ts"; +import { makePromptOutput } from "./hook-output.ts"; +import type { Vendor } from "./types.ts"; const MAX_SKILLS = 3; const SESSION_TTL_MS = 60 * 60 * 1000; @@ -29,7 +32,7 @@ const DEFAULT_CJK_SCRIPTS = ["ko", "ja", "zh"]; // ── Vendor Detection ────────────────────────────────────────── function inferVendorFromScriptPath(): Vendor | null { - const path = import.meta.path; + const path = import.meta.filename; if (path.includes(`${join(".cursor", "hooks")}`)) return "cursor"; if (path.includes(`${join(".qwen", "hooks")}`)) return "qwen"; if (path.includes(`${join(".claude", "hooks")}`)) return "claude"; @@ -85,7 +88,7 @@ interface SkillsTriggerConfig { } function loadTriggersConfig(): SkillsTriggerConfig { - const configPath = join(dirname(import.meta.path), "triggers.json"); + const configPath = join(import.meta.dirname, "triggers.json"); if (!existsSync(configPath)) return {}; try { return JSON.parse(readFileSync(configPath, "utf-8")); @@ -139,9 +142,12 @@ export function discoverSkills(projectDir: string): SkillEntry[] { if (!existsSync(skillsDir)) return []; const out: SkillEntry[] = []; - let entries: ReturnType<typeof readdirSync>; + let entries: Dirent[]; try { - entries = readdirSync(skillsDir, { withFileTypes: true }); + entries = readdirSync(skillsDir, { + withFileTypes: true, + encoding: "utf8", + }); } catch { return out; } @@ -205,8 +211,10 @@ export function matchSkills( let score = 0; for (let i = 0; i < patterns.length; i++) { - if (patterns[i].test(prompt)) { - matched.push(allTriggers[i]); + const pattern = patterns[i]; + const trigger = allTriggers[i]; + if (pattern && trigger && pattern.test(prompt)) { + matched.push(trigger); score += 10; } } @@ -320,6 +328,97 @@ export function startsWithSlashCommand(prompt: string): boolean { return /^\/[a-zA-Z][\w-]*/.test(prompt.trim()); } +// Match an explicit `/<name>` token at the very start of the
prompt +// or after whitespace. Stays conservative to avoid path/URL false positives. +export function parseExplicitSlash(prompt: string): string | null { + const m = /(?:^|\s)\/([a-z][a-z0-9_-]{0,40})\b/i.exec(prompt); + return m?.[1] ?? null; +} + +// ── Claude Slash Skill Resolution ───────────────────────────── +// Claude Code deprecated `.claude/commands/` and now uses `.claude/skills/` +// for slash-invocable workflows. To express "user-only invocation" (slash +// command typed by the user but NOT auto-callable by the model), the +// Claude Code idiom is `disable-model-invocation: true` in SKILL.md +// frontmatter. Such skills are absent from the available-skills list, +// so when the user types /<name> the model has no native signal that it +// exists. This resolver bridges that gap. Other vendors use different +// command/skill mechanisms; this is intentionally Claude-specific. + +export interface ClaudeSlashSkillEntry { + name: string; + skillRelPath: string; + body: string; +} + +export function parseSkillFrontmatter(content: string): { + frontmatter: Record<string, string | boolean>; + body: string; +} { + const m = /^---\s*\r?\n([\s\S]*?)\r?\n---\s*\r?\n?([\s\S]*)$/.exec(content); + if (!m) return { frontmatter: {}, body: content }; + const fm: Record<string, string | boolean> = {}; + const block = m[1] ?? ""; + for (const line of block.split(/\r?\n/)) { + const kv = /^([a-z][\w-]*)\s*:\s*(.*)$/i.exec(line); + if (!kv) continue; + const key = kv[1]; + const rawValue = (kv[2] ?? "").trim(); + if (!key) continue; + if (rawValue === "true") fm[key] = true; + else if (rawValue === "false") fm[key] = false; + else fm[key] = rawValue.replace(/^['"]|['"]$/g, ""); + } + return { frontmatter: fm, body: m[2] ??
"" }; +} + +export function findClaudeSlashSkill( + name: string, + projectDir: string, +): ClaudeSlashSkillEntry | null { + const candidates = [ + join(projectDir, ".claude", "skills", name, "SKILL.md"), + join(projectDir, ".agents", "skills", name, "SKILL.md"), + ]; + + for (const skillPath of candidates) { + if (!existsSync(skillPath)) continue; + let content: string; + try { + content = readFileSync(skillPath, "utf-8"); + } catch { + continue; + } + const { frontmatter, body } = parseSkillFrontmatter(content); + if (frontmatter["disable-model-invocation"] !== true) continue; + const posixPath = toPosixPath(skillPath); + const posixRoot = toPosixPath(projectDir); + return { + name, + skillRelPath: posixPath.startsWith(`${posixRoot}/`) + ? posixPath.slice(posixRoot.length + 1) + : posixPath, + body: body.trim(), + }; + } + return null; +} + +export function formatClaudeSlashSkillContext( + entry: ClaudeSlashSkillEntry, +): string { + return [ + `[OMA CLAUDE SLASH SKILL INVOKED: ${entry.name}]`, + `User explicitly typed /${entry.name}. Claude Code deprecated \`.claude/commands/\`, so this slash-only workflow lives in SKILL.md with \`disable-model-invocation: true\` — it is NOT in the available-skills list and is NOT callable via the Skill tool.`, + "", + `Honor the user's explicit invocation by reading \`${entry.skillRelPath}\` and following its instructions:`, + "", + entry.body, + "", + "Read any referenced workflow / resource files and proceed step by step. Do NOT respond that the skill is unavailable.", + ].join("\n"); +} + export function stripCodeBlocks(text: string): string { return text .replace(/(`{3,})[^\n]*\n[\s\S]*?\1/g, "") @@ -365,6 +464,25 @@ async function main() { const prompt = (input.prompt as string) ?? ""; if (!prompt.trim()) process.exit(0); + + // Claude-specific: when the user types /, surface the + // SKILL.md body for slash-only skills (disable-model-invocation: true). 
+ // The model otherwise has no signal these skills exist — they are + // intentionally hidden from the available-skills list. Must run BEFORE + // the slash early-exit and persistent-workflow guard. + if (vendor === "claude") { + const slashName = parseExplicitSlash(prompt); + if (slashName) { + const slashSkill = findClaudeSlashSkill(slashName, projectDir); + if (slashSkill) { + process.stdout.write( + makePromptOutput(vendor, formatClaudeSlashSkillContext(slashSkill)), + ); + process.exit(0); + } + } + } + if (startsWithSlashCommand(prompt)) process.exit(0); if (isPersistentWorkflowActive(projectDir, sessionId)) process.exit(0); diff --git a/.agents/hooks/core/test-filter.ts b/.agents/hooks/core/test-filter.ts index d766059..f378075 100644 --- a/.agents/hooks/core/test-filter.ts +++ b/.agents/hooks/core/test-filter.ts @@ -3,7 +3,9 @@ import { existsSync, readFileSync } from "node:fs"; import { join } from "node:path"; -import { makePreToolOutput, resolveGitRoot, type Vendor } from "./types.ts"; +import { resolveGitRoot } from "./fs-utils.ts"; +import { makePreToolOutput } from "./hook-output.ts"; +import type { Vendor } from "./types.ts"; // --- Vendor detection (same logic as keyword-detector.ts) --- diff --git a/.agents/hooks/core/triggers.json b/.agents/hooks/core/triggers.json index f3f58d3..ac66a20 100644 --- a/.agents/hooks/core/triggers.json +++ b/.agents/hooks/core/triggers.json @@ -117,6 +117,15 @@ "zautomatyzuj", "wszystko naraz" ] + }, + "patterns": { + "*": [ + "\\b(build|create|make|develop|implement|scaffold)\\s+(?:me\\s+)?(?:an?|the)\\s+(?:[\\w-]+\\s+){0,3}(app|api|service|server|cli|tool|website|dashboard|system|feature|backend|frontend|prototype|mvp|bot)\\b", + "\\bi\\s+want\\s+(?:a|an)\\s+(?:[\\w-]+\\s+){0,3}(app|api|service|server|cli|tool|website|dashboard|system|feature|backend|frontend|prototype|mvp|bot)\\b" + ], + "ko": [ + 
"(앱|API|서비스|서버|CLI|도구|웹사이트|대시보드|시스템|기능|백엔드|프론트엔드|프로토타입|MVP|봇)\\s*(?:을|를|이|가)?\\s*(?:만들어\\s*(?:주세요|줘|줄래)?|구현해\\s*(?:주세요|줘|줄래)?|개발해\\s*(?:주세요|줘|줄래)?|만들자|구현하자|개발하자)" + ] } }, "ultrawork": { @@ -490,6 +499,41 @@ ] } }, + "deepsec": { + "persistent": false, + "keywords": { + "*": ["/deepsec", "deepsec workflow"], + "en": [ + "run deepsec", + "deepsec scan this repo", + "scan repo with deepsec", + "deepsec pr review", + "deepsec ci gate", + "deepsec triage", + "deepsec matchers" + ], + "ko": [ + "딥섹 워크플로우", + "딥섹 실행", + "딥섹 스캔", + "딥섹으로 검사", + "딥섹 PR 리뷰", + "딥섹 CI 게이트" + ], + "ja": [ + "ディープセック実行", + "deepsecワークフロー", + "deepsecでスキャン", + "deepsec PRレビュー" + ], + "zh": [ + "运行 deepsec", + "deepsec 工作流", + "用 deepsec 扫描", + "deepsec PR 审查" + ] + } + }, "debug": { "persistent": false, "keywords": { @@ -1531,6 +1575,138 @@ "makieta" ] } + }, + "docs": { + "persistent": false, + "keywords": { + "*": ["oma-docs", "doc-refs", "docs verify", "docs sync"], + "en": [ + "verify docs", + "verify documentation", + "check docs", + "check documentation", + "docs drift", + "documentation drift", + "broken docs", + "broken doc links", + "stale docs", + "stale documentation", + "sync docs", + "sync documentation", + "update docs after change", + "patch docs", + "doc verify", + "doc sync" + ], + "ko": [ + "문서 검증", + "문서 검증해줘", + "문서 점검", + "문서 점검해줘", + "문서 드리프트", + "문서 동기화", + "문서 동기화해줘", + "문서 싱크", + "문서 싱크 맞춰", + "문서 싱크 맞춰줘", + "깨진 문서 링크", + "깨진 문서", + "오래된 문서", + "낡은 문서", + "문서 갱신", + "문서 업데이트", + "docs 검증", + "docs 싱크", + "docs 동기화" + ], + "ja": [ + "ドキュメント検証", + "ドキュメント点検", + "ドキュメントドリフト", + "ドキュメント同期", + "ドキュメントを同期", + "壊れたドキュメントリンク", + "古いドキュメント", + "ドキュメント更新", + "docsを検証", + "docsを同期" + ], + "zh": [ + "文档校验", + "文档检查", + "文档漂移", + "文档同步", + "同步文档", + "失效文档链接", + "陈旧文档", + "更新文档", + "docs 校验", + "docs 同步" + ], + "es": [ + "verificar documentación", + "comprobar documentación", + "deriva de documentación", + "sincronizar documentación", + "enlaces rotos en documentación", + 
"documentación obsoleta", + "actualizar documentación" + ], + "fr": [ + "vérifier la documentation", + "contrôler la documentation", + "dérive de documentation", + "synchroniser la documentation", + "liens cassés dans la documentation", + "documentation obsolète", + "mettre à jour la documentation" + ], + "de": [ + "Doku verifizieren", + "Dokumentation prüfen", + "Dokumentations-Drift", + "Doku synchronisieren", + "kaputte Doku-Links", + "veraltete Dokumentation", + "Doku aktualisieren" + ], + "pt": [ + "verificar documentação", + "checar documentação", + "drift de documentação", + "sincronizar documentação", + "links quebrados na documentação", + "documentação desatualizada", + "atualizar documentação" + ], + "ru": [ + "проверить документацию", + "верифицировать документацию", + "дрейф документации", + "синхронизировать документацию", + "битые ссылки в документации", + "устаревшая документация", + "обновить документацию" + ], + "nl": [ + "documentatie verifiëren", + "documentatie controleren", + "documentatie-drift", + "documentatie synchroniseren", + "kapotte documentatielinks", + "verouderde documentatie", + "documentatie bijwerken" + ], + "pl": [ + "zweryfikuj dokumentację", + "sprawdź dokumentację", + "dryf dokumentacji", + "zsynchronizuj dokumentację", + "zepsute linki w dokumentacji", + "przestarzała dokumentacja", + "zaktualizuj dokumentację" + ] + } } }, "skills": { @@ -1787,7 +1963,16 @@ }, "oma-frontend": { "keywords": { - "*": ["shadcn", "FSD"], + "*": [ + "shadcn", + "FSD", + "next.js", + "nextjs", + "react", + "tailwind", + "tsx", + "frontend" + ], "en": [ "make a react component", "build a next page", @@ -2346,6 +2531,64 @@ "zh": ["翻译一下", "帮我翻译", "多语言", "本地化", "翻成英文"] } }, + "oma-deepsec": { + "keywords": { + "*": [ + "deepsec", + ".deepsec", + "bunx deepsec", + "pnpm deepsec", + "npx deepsec", + "deepsec.config", + "process --diff", + "INFO.md", + "MatcherPlugin", + "noiseTier", + "AI_GATEWAY_API_KEY" + ], + "en": [ + "scan repo for 
vulnerabilities", + "scan the repo for vulns", + "security scan with an agent", + "vulnerability scanner", + "agent security scan", + "run a security scan", + "pr security review", + "ci security gate", + "diff security review", + "write a custom matcher", + "revalidate findings", + "triage findings" + ], + "ko": [ + "취약점 스캔", + "보안 스캔", + "딥섹", + "딥섹 돌려", + "PR 보안 리뷰", + "보안 점검", + "매처 작성", + "파인딩 트리아지", + "파인딩 재검증" + ], + "ja": [ + "脆弱性スキャン", + "セキュリティスキャン", + "ディープセック", + "PRセキュリティレビュー", + "マッチャーを書く", + "ファインディングをトリアージ" + ], + "zh": [ + "漏洞扫描", + "安全扫描", + "深度安全扫描", + "PR 安全审查", + "编写自定义匹配器", + "重新验证发现" + ] + } + }, "oma-image": { "keywords": { "*": [ @@ -2436,22 +2679,41 @@ } }, "informationalPatterns": { - "en": [ + "*": [ "what is", "what are", "how to", "how does", + "how do", + "how can", + "how would", "explain", "describe", "tell me about", "keyword", "false positive", + "false-positive", "detected", "detector", - "뭐야", - "무엇", - "是什么", - "とは" + "fires when", + "trigger when", + "auto-trigger", + "auto trigger", + "what triggers", + "should we trigger", + "if we trigger", + "trigger logic", + "trigger mechanism", + "should we", + "should i", + "should you", + "could we", + "would you", + "what if", + "what about", + "why build", + "why create", + "why make" ], "ko": [ "뭐야", @@ -2462,7 +2724,20 @@ "알려줘", "키워드", "감지", - "오탐" + "오탐", + "트리거", + "발동", + "메타", + "트리거하면", + "트리거 해주면", + "트리거해야", + "키워드 나오면", + "왜 만들", + "어떻게 만들", + "어떨까", + "하면 좋을", + "한다면", + "할까요" ], "ja": [ "とは", diff --git a/.agents/hooks/core/types.ts b/.agents/hooks/core/types.ts index 2b79035..7122c0f 100644 --- a/.agents/hooks/core/types.ts +++ b/.agents/hooks/core/types.ts @@ -1,35 +1,12 @@ -// Claude Code Hook Types for oh-my-agent -// Shared across Claude Code, Codex CLI, Cursor, Gemini CLI, and Qwen Code +// Hook-runtime types shared across Claude Code, Codex CLI, Cursor, +// Gemini CLI, and Qwen Code. 
Functions live in `fs-utils.ts` and +// `hook-output.ts`; this file is types-only. The `Vendor` type is derived +// from the `VENDORS` runtime constant in `constants.ts` so the two stay +// in sync. -import { existsSync } from "node:fs"; -import { dirname, join } from "node:path"; +import type { VENDORS } from "./constants.ts"; -// --- Project Root Resolution --- - -/** - * Walk up from startDir to find the git repository root. - * This prevents CLAUDE_PROJECT_DIR pointing to a subdirectory - * (e.g. packages/i18n during a build) from creating state files - * in the wrong location. - */ -const MAX_DEPTH = 20; - -export function resolveGitRoot(startDir: string): string { - let dir = startDir; - for (let i = 0; i < MAX_DEPTH; i++) { - if (existsSync(join(dir, ".git"))) return dir; - const parent = dirname(dir); - if (parent === dir) return startDir; - dir = parent; - } - return startDir; -} - -// --- Vendor Detection --- - -export type Vendor = "claude" | "codex" | "cursor" | "gemini" | "qwen"; - -// --- Hook Input (unified) --- +export type Vendor = (typeof VENDORS)[number]; export interface HookInput { prompt?: string; @@ -45,96 +22,6 @@ export interface HookInput { stopReason?: string; } -// --- Hook Output Builders --- - -export function makePromptOutput( - vendor: Vendor, - additionalContext: string, -): string { - switch (vendor) { - case "claude": - return JSON.stringify({ additionalContext }); - case "codex": - return JSON.stringify({ - hookSpecificOutput: { - hookEventName: "UserPromptSubmit", - additionalContext, - }, - }); - case "cursor": - return JSON.stringify({ - additionalContext, - additional_context: additionalContext, - hookSpecificOutput: { - hookEventName: "UserPromptSubmit", - additionalContext, - }, - }); - case "gemini": - return JSON.stringify({ - hookSpecificOutput: { - hookEventName: "BeforeAgent", - additionalContext, - }, - }); - case "qwen": - // Qwen Code fork uses hookSpecificOutput (same as Codex) - return JSON.stringify({ - 
hookSpecificOutput: { - hookEventName: "UserPromptSubmit", - additionalContext, - }, - }); - } -} - -export function makeBlockOutput(vendor: Vendor, reason: string): string { - switch (vendor) { - case "claude": - case "codex": - case "cursor": - case "qwen": - return JSON.stringify({ decision: "block", reason }); - case "gemini": - // Gemini AfterAgent uses "deny" to reject response and force retry - return JSON.stringify({ decision: "deny", reason }); - } -} - -// --- PreToolUse Output Builder --- - -export function makePreToolOutput( - vendor: Vendor, - updatedInput: Record<string, unknown>, -): string { - switch (vendor) { - case "gemini": - return JSON.stringify({ - decision: "rewrite", - tool_input: updatedInput, - }); - case "cursor": - return JSON.stringify({ - updated_input: updatedInput, - hookSpecificOutput: { - hookEventName: "PreToolUse", - updatedInput, - }, - }); - case "claude": - case "codex": - case "qwen": - return JSON.stringify({ - hookSpecificOutput: { - hookEventName: "PreToolUse", - updatedInput, - }, - }); - } -} - -// --- Shared Types --- - export interface ModeState { workflow: string; sessionId: string; diff --git a/.agents/hooks/variants/codex.json b/.agents/hooks/variants/codex.json index a6071a1..a1d5a18 100644 --- a/.agents/hooks/variants/codex.json +++ b/.agents/hooks/variants/codex.json @@ -30,7 +30,7 @@ "file": ".codex/config.toml", "section": "features", "flags": { - "codex_hooks": true + "hooks": true } } } diff --git a/.agents/rules/backend.md b/.agents/rules/backend.md index 290f07a..48ab767 100644 --- a/.agents/rules/backend.md +++ b/.agents/rules/backend.md @@ -18,7 +18,7 @@ alwaysApply: false 8. **Explicit ORM loading strategy**: do not rely on default relation loading when query shape matters 9. **Explicit transaction boundaries**: group one business operation into one request/service-scoped unit of work 10. **Safe ORM lifecycle**: do not share mutable ORM session/entity manager across concurrent work unless ORM explicitly supports it -11. 
**Config from environment**: DB URLs, API keys, secrets from env vars or secret managers — never hardcode +11. **Config from environment, with graceful fallback**: DB URLs, API keys, secrets from env vars or secret managers, never hardcode. When integrating a third-party API (OpenAI, Anthropic, Stripe, etc.), write BOTH paths: (a) real call when `process.env.<KEY>` is present, (b) deterministic local fallback when absent. Mark the deferred branch with `// TODO(oma-deferred): integrate when key is provisioned`. Shipping only the fallback (no env-conditional branch) leaves the spec unmet; shipping only the real call without fallback breaks demos when the key is missing. 12. **Stateless services**: no in-memory session or user state between requests — use external stores 13. **Backing services as resources**: DB, queue, cache are swappable resources connected via config diff --git a/.agents/rules/frontend.md b/.agents/rules/frontend.md index 916db89..d85be94 100644 --- a/.agents/rules/frontend.md +++ b/.agents/rules/frontend.md @@ -16,6 +16,8 @@ alwaysApply: false 6. **Proxy over Middleware (BANNED)**: Next.js 16+ uses `proxy.ts` for request proxying. `middleware.ts` is NOT "deprecated" — it is forbidden in this project, touch it and you die. Do NOT create, recommend, suggest, or "restore" `middleware.ts`. Do NOT flag `proxy.ts` as dead code, unused, or not-wired. Do NOT demand a rename to `middleware.ts`. Any such finding is a fatal self-error — retract it immediately and write `proxy.ts`. 7. **No Prop Drilling**: Avoid passing props beyond 3 levels. Use Jotai atoms instead. Avoid React Context. 8. **Auth Boundary**: Frontend handles auth UI and token storage only. Never import database adapters, ORMs, or server-side auth libraries. +9. **Animation Library**: Use `motion` (import from `motion/react`). `framer-motion` is the legacy package name and is BANNED — never `import { motion } from 'framer-motion'`, never add `framer-motion` to `package.json`. 
Add the `motion` package via the project's package manager — detect from the lockfile (`bun.lock` → bun, `pnpm-lock.yaml` → pnpm, `yarn.lock` → yarn, `package-lock.json` → npm); default to `bun` when no lockfile exists. Import as `import { motion, AnimatePresence } from 'motion/react'`. Respect `prefers-reduced-motion` via `useReducedMotion` from `motion/react`. +10. **Framework Version**: `next@16+` and `react@19+` are MANDATORY. When scaffolding or pinning `package.json`, set `"next": "^16"` (or higher) and `"react": "^19"`/`"react-dom": "^19"` — never pin `next` to `^15`, `~15`, or any range whose floor is below `16.0.0`. If `create-next-app` (or any scaffold tool) produces `next < 16`, immediately bump it before committing. This rule is paired with Core Rule #6 (`proxy.ts`), which assumes Next.js 16+. ## Architecture (FSD-lite) diff --git a/.agents/rules/i18n-guide.md b/.agents/rules/i18n-guide.md index 53bcf12..ef02ebe 100644 --- a/.agents/rules/i18n-guide.md +++ b/.agents/rules/i18n-guide.md @@ -21,6 +21,18 @@ Response language is determined by the following priority: language: ko # ko, en, ja, zh, ... ``` +## Translation Voice + +When translating user-facing content, the `translation_voice` field in `.agents/oma-config.yaml` controls global rhythm and formality. It is applied on top of `oma-translator` content-type persona routing. + +| Value | Effect | +|---|---| +| `formal` | strict complete sentences, no fragments, formal register only | +| `balanced` (default) | content-type defaults — fragments only in label/cell positions | +| `interpreter` | punchy, audience-first, spoken cadence; fragments allowed when natural | + +Workflows that translate user-facing content should respect this setting via the `oma-translator` skill rather than hardcoding a tone. + ## What to Localize | Category | Localize? 
| Example | diff --git a/.agents/skills/_shared/conditional/exploration-loop.md b/.agents/skills/_shared/conditional/exploration-loop.md index 7f8200f..4dff249 100644 --- a/.agents/skills/_shared/conditional/exploration-loop.md +++ b/.agents/skills/_shared/conditional/exploration-loop.md @@ -49,7 +49,7 @@ Execute each hypothesis **in isolation**. Task: "Fix input validation using Hypothesis A: Zod schema at router level. Context: Previous attempt (raw regex) failed QA twice." ``` -- Agents use existing IDs — no new agent definitions needed +- Agents use existing IDs; no new agent definitions needed - Each agent works in a separate workspace (`-w ./hyp-a`, `-w ./hyp-b`) - Result files differentiated by workspace, not agent ID diff --git a/.agents/skills/_shared/conditional/quality-score.md b/.agents/skills/_shared/conditional/quality-score.md index 2b0c130..fe7bb20 100644 --- a/.agents/skills/_shared/conditional/quality-score.md +++ b/.agents/skills/_shared/conditional/quality-score.md @@ -1,7 +1,7 @@ # Quality Score Continuum Replaces binary PASS/FAIL gate evaluation with a **continuous quantitative score** (0-100). -Inspired by autoresearch's val_bpb metric — objective, comparable, and trackable over time. +Inspired by autoresearch's val_bpb metric: objective, comparable, and trackable over time. --- @@ -67,10 +67,10 @@ Quality Score is measured **on demand**, not at every step. Load `quality-score. 
| Range | Grade | Gate Decision | |-------|-------|---------------| -| 90-100 | A | PASS — proceed immediately | -| 75-89 | B | CONDITIONAL PASS — proceed with noted improvements | -| 60-74 | C | FAIL — must improve before proceeding | -| 0-59 | D | HARD FAIL — rollback, re-plan required | +| 90-100 | A | PASS, proceed immediately | +| 75-89 | B | CONDITIONAL PASS, proceed with noted improvements | +| 60-74 | C | FAIL, must improve before proceeding | +| 0-59 | D | HARD FAIL, rollback and re-plan required | --- @@ -82,9 +82,9 @@ Changes are evaluated by their **impact on the score**, not just by whether they IF score_after >= score_before: KEEP change ELSE IF (score_before - score_after) < 5: - REVIEW — minor regression, justify in experiment ledger + REVIEW (minor regression, justify in experiment ledger) ELSE: - DISCARD change — revert and try alternative + DISCARD change (revert and try alternative) ``` ### Delta Recording diff --git a/.agents/skills/_shared/core/clarification-protocol.md b/.agents/skills/_shared/core/clarification-protocol.md index 227851b..b9bf1d4 100644 --- a/.agents/skills/_shared/core/clarification-protocol.md +++ b/.agents/skills/_shared/core/clarification-protocol.md @@ -40,7 +40,7 @@ Automatically classify as MEDIUM/HIGH level in the following situations: ### LOW → Proceed (Assumed) ``` -⚠️ Assumptions applied: +Assumptions applied: - JWT authentication included - PostgreSQL database - REST API @@ -51,29 +51,29 @@ Proceeding with these defaults. Override if needed. 
### MEDIUM → Request Selection (Options) ``` -🔍 Uncertainty detected: {specific issue} +Uncertainty detected: {specific issue} Option A: {approach} - ✅ Pros: {benefits} - ❌ Cons: {drawbacks} - 💰 Effort: {low/medium/high} + Pros: {benefits} + Cons: {drawbacks} + Effort: {low/medium/high} Option B: {approach} - ✅ Pros: {benefits} - ❌ Cons: {drawbacks} - 💰 Effort: {low/medium/high} + Pros: {benefits} + Cons: {drawbacks} + Effort: {low/medium/high} Option C: {approach} - ✅ Pros: {benefits} - ❌ Cons: {drawbacks} - 💰 Effort: {low/medium/high} + Pros: {benefits} + Cons: {drawbacks} + Effort: {low/medium/high} Which approach do you prefer? (A/B/C) ``` ### HIGH → Blocked ``` -❌ Cannot proceed: Requirements too ambiguous +Cannot proceed: Requirements too ambiguous Specific uncertainty: {what is unclear} @@ -91,7 +91,7 @@ Status: BLOCKED (awaiting clarification) ## Required Verification Items -If any of the items below are unclear, **do not assume** — explicitly record them. +If any of the items below are unclear, **do not assume**; explicitly record them. ### Common to All Agents | Item | Verification Question | Default (if assumed) | Uncertainty | @@ -138,7 +138,7 @@ Example: "Create a TODO app" **Response**: Apply defaults and record assumption list in result ``` -⚠️ Assumptions: +Assumptions: - JWT authentication included - PostgreSQL database - REST API @@ -150,7 +150,7 @@ Example: "Create a user management system" **Response**: Narrow scope to 3 core features, specify and proceed ``` -⚠️ Interpreted scope (3 core features): +Interpreted scope (3 core features): 1. User registration + login (JWT) 2. Profile management (view/edit) 3. Admin user list (admin role only) @@ -166,7 +166,7 @@ Example: "Create a good app", "Improve this" **Response**: Do not proceed, record clarification request in result ``` -❌ Cannot proceed: Requirements too ambiguous +Cannot proceed: Requirements too ambiguous Questions needed: 1. What is the app's primary purpose? 
diff --git a/.agents/skills/_shared/core/context-budget.md b/.agents/skills/_shared/core/context-budget.md index 2e647a8..2721775 100644 --- a/.agents/skills/_shared/core/context-budget.md +++ b/.agents/skills/_shared/core/context-budget.md @@ -7,10 +7,10 @@ Follow this guide to use context efficiently. ## Core Principles -1. **No full file reads** — Read only necessary functions/classes -2. **No duplicate reads** — Do not re-read files already read -3. **Lazy resource loading** — Load resources only when needed -4. **Maintain records** — Note read files and symbols in progress +1. **No full file reads**: Read only necessary functions/classes +2. **No duplicate reads**: Do not re-read files already read +3. **Lazy resource loading**: Load resources only when needed +4. **Maintain records**: Note read files and symbols in progress --- @@ -19,17 +19,17 @@ Follow this guide to use context efficiently. ### When Using Serena MCP (Recommended) ``` -❌ Bad: read_file("app/api/todos.py") ← entire file 500 lines -✅ Good: find_symbol("create_todo") ← just that function 30 lines -✅ Good: get_symbols_overview("app/api") ← function list only -✅ Good: find_referencing_symbols("TodoService") ← usage only +Bad: read_file("app/api/todos.py") ← entire file 500 lines +Good: find_symbol("create_todo") ← just that function 30 lines +Good: get_symbols_overview("app/api") ← function list only +Good: find_referencing_symbols("TodoService") ← usage only ``` ### When Reading Files Without Serena ``` -❌ Bad: Read entire file at once -✅ Good: Check first 50 lines (imports + class definitions) → read additional functions as needed +Bad: Read entire file at once +Good: Check first 50 lines (imports + class definitions) → read additional functions as needed ``` --- @@ -124,7 +124,7 @@ This approach: Long-running agents degrade in quality as context fills up. Rather than passively responding to symptoms, agents must actively detect and reset. 
Detection is the **Orchestrator's responsibility** via external observation. -Individual agents do NOT self-monitor for anxiety — they focus on their task. +Individual agents do NOT self-monitor for anxiety; they focus on their task. ### Detection (Orchestrator Only) diff --git a/.agents/skills/_shared/core/context-loading.md b/.agents/skills/_shared/core/context-loading.md index e9fbf05..863b779 100644 --- a/.agents/skills/_shared/core/context-loading.md +++ b/.agents/skills/_shared/core/context-loading.md @@ -8,11 +8,11 @@ This saves context window and prevents confusion from irrelevant information. ## Loading Order (Common to All Agents) ### Always Load (Required) -1. `SKILL.md` — Auto-loaded (provided by Antigravity) -2. `resources/execution-protocol.md` — Execution protocol +1. `SKILL.md`: Auto-loaded (provided by Antigravity) +2. `resources/execution-protocol.md`: Execution protocol ### Load at Task Start -3. `difficulty-guide.md` — Difficulty assessment (Step 0) +3. `difficulty-guide.md`: Difficulty assessment (Step 0) ### Load Based on Difficulty 4. **Simple**: Proceed to implementation without additional loading @@ -20,15 +20,15 @@ This saves context window and prevents confusion from irrelevant information. 6. **Complex**: `resources/examples.md` + `stack/tech-stack.md` + `stack/snippets.md` ### Load During Execution as Needed -7. `resources/checklist.md` — Load at Step 4 (Verify) -8. `resources/error-playbook.md` — Load only when errors occur -9. `common-checklist.md` — For final verification of Complex tasks -10. `../runtime/memory-protocol.md` — CLI mode only +7. `resources/checklist.md`: Load at Step 4 (Verify) +8. `resources/error-playbook.md`: Load only when errors occur +9. `common-checklist.md`: For final verification of Complex tasks +10. `../runtime/memory-protocol.md`: CLI mode only ### Load on Measurement / Exploration (Conditional) -11. `../conditional/quality-score.md` — Load when Quality Score measurement is needed (VERIFY/SHIP gates) -12. 
`../conditional/experiment-ledger.md` — Load when recording experiment results (after implementation changes) -13. `../conditional/exploration-loop.md` — Load only when a gate fails twice on the same issue +11. `../conditional/quality-score.md`: Load when Quality Score measurement is needed (VERIFY/SHIP gates) +12. `../conditional/experiment-ledger.md`: Load when recording experiment results (after implementation changes) +13. `../conditional/exploration-loop.md`: Load only when a gate fails twice on the same issue --- @@ -143,7 +143,7 @@ Prompt composition: 1. Agent SKILL.md's Core Rules section 2. execution-protocol.md 3. Resources matching task type (see tables above) -4. error-playbook.md (always include — recovery is essential) +4. error-playbook.md (always include; recovery is essential) 5. Serena Memory Protocol (CLI mode) ``` diff --git a/.agents/skills/_shared/core/difficulty-guide.md b/.agents/skills/_shared/core/difficulty-guide.md index 6d9dac9..d901517 100644 --- a/.agents/skills/_shared/core/difficulty-guide.md +++ b/.agents/skills/_shared/core/difficulty-guide.md @@ -28,7 +28,7 @@ All agents assess task difficulty at the start and apply the appropriate protoco ## Protocol Branching ### Simple → Fast Track -1. ~~Step 1 (Analyze)~~: Skip — proceed directly to implementation +1. ~~Step 1 (Analyze)~~: Skip; proceed directly to implementation 2. **Pre-check**: Confirm whether test files exist for the target module (e.g., `__tests__/`, `*.test.*`) 3. Step 3 (Implement): Implementation 4. Step 4 (Verify): Minimal checklist items: diff --git a/.agents/skills/_shared/core/evaluator-tuning.md b/.agents/skills/_shared/core/evaluator-tuning.md index 9f8ae97..d5f3d1c 100644 --- a/.agents/skills/_shared/core/evaluator-tuning.md +++ b/.agents/skills/_shared/core/evaluator-tuning.md @@ -2,7 +2,7 @@ QA prompts do not work well out of the box. Reliable evaluation requires iterative refinement based on observed judgment errors. 
-(ref: Anthropic harness design — "several rounds of development loop necessary") +(ref: Anthropic harness design, "several rounds of development loop necessary") This protocol is **semi-automated**: collection is automatic, analysis and patching require human review via `oma retro`. @@ -35,11 +35,11 @@ Sessions accumulate EA events in session-metrics.md | Error Pattern | Patch Target | Example | |--------------|-------------|---------| -| Missed bug category | QA `checklist.md` — add check item | "Race condition in concurrent writes" | -| Wrong severity | QA `execution-protocol.md` — add calibration rule | "Auth issues are always CRITICAL" | -| Missed stub | QA `checklist.md` — runtime verification section | "Check upload actually processes file" | -| False positive pattern | QA `execution-protocol.md` — add exclusion | "Unused imports in test files are OK" | -| Inconsistent depth | QA `execution-protocol.md` — difficulty link | "Complex tasks require full audit" | +| Missed bug category | QA `checklist.md`: add check item | "Race condition in concurrent writes" | +| Wrong severity | QA `execution-protocol.md`: add calibration rule | "Auth issues are always CRITICAL" | +| Missed stub | QA `checklist.md`: runtime verification section | "Check upload actually processes file" | +| False positive pattern | QA `execution-protocol.md`: add exclusion | "Unused imports in test files are OK" | +| Inconsistent depth | QA `execution-protocol.md`: difficulty link | "Complex tasks require full audit" | --- @@ -66,7 +66,7 @@ When `good_catch` events accumulate (>= 5 in rolling window): 3. If yes: Propose addition to `common-checklist.md` 4. Record in tuning log as positive reinforcement -This prevents tuning drift toward pure skepticism — QA must also know what it does well. +This prevents tuning drift toward pure skepticism; QA must also know what it does well. 
--- diff --git a/.agents/skills/_shared/core/lessons-learned.md b/.agents/skills/_shared/core/lessons-learned.md index 1a7a169..65f419b 100644 --- a/.agents/skills/_shared/core/lessons-learned.md +++ b/.agents/skills/_shared/core/lessons-learned.md @@ -168,7 +168,7 @@ Auto-generated lessons use the RCA Entry Format above, with these additions: - Append `(Source: Experiment Ledger #{N}, Session {session_id})` to the summary line - Append to the relevant domain section (based on agent type) -Only the Orchestrator performs this at session end — after all agents have completed and the ledger is finalized. +Only the Orchestrator performs this at session end, after all agents have completed and the ledger is finalized. ### When Lessons Become Too Many (50+) - Move old lessons (6+ months) to archive diff --git a/.agents/skills/_shared/core/prompt-structure.md b/.agents/skills/_shared/core/prompt-structure.md index 0dcff65..6f2bf91 100644 --- a/.agents/skills/_shared/core/prompt-structure.md +++ b/.agents/skills/_shared/core/prompt-structure.md @@ -28,7 +28,7 @@ Standards, architecture rules, safety requirements, or project conventions. - "Must be backward-compatible with v2 API" ### 4. Done When -How to verify the task is complete — testable, observable criteria. +How to verify the task is complete using testable, observable criteria. - "All existing tests pass + new tests for auth endpoints" - "The 500 error no longer occurs and returns 200" @@ -59,4 +59,4 @@ Use "Done When" criteria as the primary review checklist. 
A task is not complete - Starting implementation with only a Goal (no constraints or done-when) - Inventing constraints the user didn't specify -- Accepting vague done-when like "it works" — push for testable criteria +- Accepting vague done-when like "it works"; push for testable criteria diff --git a/.agents/skills/_shared/core/reasoning-templates.md b/.agents/skills/_shared/core/reasoning-templates.md index df44b6b..e521783 100644 --- a/.agents/skills/_shared/core/reasoning-templates.md +++ b/.agents/skills/_shared/core/reasoning-templates.md @@ -14,7 +14,7 @@ Repeat the loop below when finding the cause of a bug. After 3 iterations withou Observation: {error message, symptoms, reproduction conditions} Hypothesis: "{phenomenon} is caused by {suspected cause}" -Verification method: {how to verify — code reading, logs, tests, etc.} +Verification method: {how to verify; code reading, logs, tests, etc.} Verification result: {what was actually confirmed} Verdict: Correct / Incorrect diff --git a/.agents/skills/_shared/core/session-metrics.md b/.agents/skills/_shared/core/session-metrics.md index ab66d85..03307af 100644 --- a/.agents/skills/_shared/core/session-metrics.md +++ b/.agents/skills/_shared/core/session-metrics.md @@ -1,6 +1,6 @@ # Session Metrics & Clarification Debt Tracking -Tracks per-session agent performance metrics, with emphasis on **Clarification Debt (CD)** — the cost of unclear requirements, scope creep, and charter violations. +Tracks per-session agent performance metrics, with emphasis on **Clarification Debt (CD)**, the cost of unclear requirements, scope creep, and charter violations. 
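The Clarification Debt tally can be sketched as a plain accumulator. The event weights (clarify +10, correct +25, redo +40) and the RCA threshold of 50 come from this document; the type and function names are illustrative only:

```typescript
// Minimal Clarification Debt accumulator. Weights and the RCA threshold
// follow the session-metrics doc; names are illustrative.
const CD_WEIGHTS = { clarify: 10, correct: 25, redo: 40 } as const;

type CdEvent = keyof typeof CD_WEIGHTS;

function totalCd(events: CdEvent[]): number {
  // Sum the per-event cost of unclear requirements.
  return events.reduce((sum, e) => sum + CD_WEIGHTS[e], 0);
}

function rcaRequired(events: CdEvent[]): boolean {
  return totalCd(events) >= 50; // at session end, CD >= 50 requires RCA
}
```

(The capped `+15` charter modifier from the unhealthy-session example is omitted here for brevity.)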
--- @@ -124,7 +124,7 @@ At session end, if total CD >= 50: ``` Turn 3: frontend asked about icon library preference → clarify (+10) Turn 15: All tasks completed successfully -Total CD: 10 ✅ +Total CD: 10 ``` ### Unhealthy Session (CD = 95) @@ -133,7 +133,7 @@ Turn 2: backend assumed REST, user wanted GraphQL → correct (+25) Turn 8: backend used wrong auth method → correct (+25) Turn 12: frontend built wrong layout → redo (+40) Turn 14: Charter not checked before redo → modifier (+15, but capped) -Total CD: 95 ❌ → RCA REQUIRED +Total CD: 95 → RCA REQUIRED ``` --- @@ -149,7 +149,7 @@ When Quality Score measurement is active (see `quality-score.md`), the session l | Checkpoint | Phase | Composite | Grade | Delta | |-----------|-------|-----------|-------|-------| -| Baseline | IMPL end | 72 | C | — | +| Baseline | IMPL end | 72 | C | n/a | | Post-VERIFY | VERIFY end | 78 | B | +6 | | Post-REFINE | REFINE end | 84 | B | +6 | | Final | SHIP | 86 | B | +2 | @@ -183,17 +183,17 @@ This data is sourced from the Experiment Ledger at session end (see `experiment- QA agents improve only when their judgment errors are tracked. Unlike CD (tracked in real-time), Evaluator Accuracy (EA) is a -**retrospective metric** — most errors are discovered after the session ends. +**retrospective metric**; most errors are discovered after the session ends. 
### Accuracy Events | Event | Points | When Discovered | |-------|--------|-----------------| -| `false_negative` | +30 | Next session or production — bug that QA missed | -| `false_positive` | +15 | During session — impl agent disputes QA finding successfully | -| `severity_mismatch` | +10 | During session or retro — wrong severity assigned | -| `missed_stub` | +20 | During session — runtime verification catches display-only feature | -| `good_catch` | -10 | During session — QA caught non-obvious bug (reward signal) | +| `false_negative` | +30 | Next session or production: bug that QA missed | +| `false_positive` | +15 | During session: impl agent disputes QA finding successfully | +| `severity_mismatch` | +10 | During session or retro: wrong severity assigned | +| `missed_stub` | +20 | During session: runtime verification catches display-only feature | +| `good_catch` | -10 | During session: QA caught non-obvious bug (reward signal) | ### Recording diff --git a/.agents/skills/_shared/runtime/execution-protocols/claude.md b/.agents/skills/_shared/runtime/execution-protocols/claude.md index c989ce3..85fe3bb 100644 --- a/.agents/skills/_shared/runtime/execution-protocols/claude.md +++ b/.agents/skills/_shared/runtime/execution-protocols/claude.md @@ -10,13 +10,13 @@ If Serena MCP is available, you may also use `read_memory`/`write_memory`/`edit_ ### Path Resolution (CRITICAL) -All result, progress, and state files MUST be written to the **project root** `.agents/` directory — never to a subdirectory's `.agents/`. +All result, progress, and state files MUST be written to the **project root** `.agents/` directory, never to a subdirectory's `.agents/`. 
- **Project root** = the git repository root (where `.git` exists) - **Session-scoped naming**: when running under an orchestration session, append session ID as suffix: - `result-{agent-id}-{sessionId}.md` (e.g., `result-frontend-session-20260405-100835.md`) - `progress-{agent-id}-{sessionId}.md` -- **Manual (non-orchestrated) runs**: no suffix — `result-{agent-id}.md` +- **Manual (non-orchestrated) runs**: no suffix, `result-{agent-id}.md` ## On Start diff --git a/.agents/skills/_shared/runtime/execution-protocols/codex.md b/.agents/skills/_shared/runtime/execution-protocols/codex.md index 75b08d4..9ca066b 100644 --- a/.agents/skills/_shared/runtime/execution-protocols/codex.md +++ b/.agents/skills/_shared/runtime/execution-protocols/codex.md @@ -8,13 +8,13 @@ Use file-based I/O for coordination. Write results to `.agents/results/`. ### Path Resolution (CRITICAL) -All result, progress, and state files MUST be written to the **project root** `.agents/` directory — never to a subdirectory's `.agents/`. +All result, progress, and state files MUST be written to the **project root** `.agents/` directory, never to a subdirectory's `.agents/`. 
- **Project root** = the git repository root (where `.git` exists) - **Session-scoped naming**: when running under an orchestration session, append session ID as suffix: - `result-{agent-id}-{sessionId}.md` (e.g., `result-frontend-session-20260405-100835.md`) - `progress-{agent-id}-{sessionId}.md` -- **Manual (non-orchestrated) runs**: no suffix — `result-{agent-id}.md` +- **Manual (non-orchestrated) runs**: no suffix, `result-{agent-id}.md` ## On Start diff --git a/.agents/skills/_shared/runtime/execution-protocols/gemini.md b/.agents/skills/_shared/runtime/execution-protocols/gemini.md index ba54775..3332637 100644 --- a/.agents/skills/_shared/runtime/execution-protocols/gemini.md +++ b/.agents/skills/_shared/runtime/execution-protocols/gemini.md @@ -15,12 +15,12 @@ Memory base path is configurable via `memoryConfig.basePath` (default: `.serena/ ### Path Resolution (CRITICAL) -All result, progress, and state files MUST be written to the **project root** memory path — never to a subdirectory's memory path. +All result, progress, and state files MUST be written to the **project root** memory path, never to a subdirectory's memory path. 
- **Session-scoped naming**: when running under an orchestration session, append session ID as suffix: - `result-{agent-id}-{sessionId}.md` (e.g., `result-frontend-session-20260405-100835.md`) - `progress-{agent-id}-{sessionId}.md` -- **Manual (non-orchestrated) runs**: no suffix — `result-{agent-id}.md` +- **Manual (non-orchestrated) runs**: no suffix, `result-{agent-id}.md` ## On Start diff --git a/.agents/skills/_shared/runtime/execution-protocols/qwen.md b/.agents/skills/_shared/runtime/execution-protocols/qwen.md index 4a3b540..1b17d98 100644 --- a/.agents/skills/_shared/runtime/execution-protocols/qwen.md +++ b/.agents/skills/_shared/runtime/execution-protocols/qwen.md @@ -13,12 +13,12 @@ Memory base path is configurable via `memoryConfig.basePath` (default: `.serena/ ### Path Resolution (CRITICAL) -All result, progress, and state files MUST be written to the **project root** memory path — never to a subdirectory's memory path. +All result, progress, and state files MUST be written to the **project root** memory path, never to a subdirectory's memory path. - **Session-scoped naming**: when running under an orchestration session, append session ID as suffix: - `result-{agent-id}-{sessionId}.md` (e.g., `result-frontend-session-20260405-100835.md`) - `progress-{agent-id}-{sessionId}.md` -- **Manual (non-orchestrated) runs**: no suffix — `result-{agent-id}.md` +- **Manual (non-orchestrated) runs**: no suffix, `result-{agent-id}.md` ## On Start diff --git a/.agents/skills/_shared/runtime/memory-protocol.md b/.agents/skills/_shared/runtime/memory-protocol.md index 6cf58ea..6d85f10 100644 --- a/.agents/skills/_shared/runtime/memory-protocol.md +++ b/.agents/skills/_shared/runtime/memory-protocol.md @@ -19,12 +19,12 @@ Memory base path is configurable via `memoryConfig.basePath` (default: `.serena/ ## Path Resolution (CRITICAL) -All result, progress, and state files MUST be written to the **project root** — never to a subdirectory. 
+All result, progress, and state files MUST be written to the **project root**, never to a subdirectory. - **Session-scoped naming**: when running under an orchestration session, append session ID as suffix: - `result-{agent-id}-{sessionId}.md` - `progress-{agent-id}-{sessionId}.md` -- **Manual (non-orchestrated) runs**: no suffix — `result-{agent-id}.md` +- **Manual (non-orchestrated) runs**: no suffix, `result-{agent-id}.md` ## On Start diff --git a/.agents/skills/_version.json b/.agents/skills/_version.json index 2872838..d5b091f 100644 --- a/.agents/skills/_version.json +++ b/.agents/skills/_version.json @@ -1,3 +1,3 @@ { - "version": "6.5.6" + "version": "7.7.0" } \ No newline at end of file diff --git a/.agents/skills/oma-academic-writer/SKILL.md b/.agents/skills/oma-academic-writer/SKILL.md new file mode 100644 index 0000000..98d006f --- /dev/null +++ b/.agents/skills/oma-academic-writer/SKILL.md @@ -0,0 +1,182 @@ +--- +name: oma-academic-writer +description: > + Academic writing specialist for publication-grade English prose. Drafts, revises, and + audits essays, reports, analysis sections, executive summaries, conclusions, and + literature reviews while enforcing sentence-structure variation, high-frequency + academic verbs, calibrated hedging, and anti-AI stylistic compliance. USE for + academic writing, essay polish, paragraph rewrite, prose revision against any + rubric tier (HD/D/C, A/B/C, top-band/mid-band, etc.), anti-AI audit, reverse + outlining, claim-evidence mapping, and rubric enforcement on assignments. +--- + +# Academic Writer: Publication-Grade English Prose Specialist + +## Scheduling + +### Goal +Produce, revise, and audit publication-grade academic English prose so that every output simultaneously satisfies the Sentence Structure Protocol, Verb Protocol, Hedging Protocol, and Anti-AI Compliance Checklist, with every claim mapped to verifiable evidence. 
+ +### Intent signature +- "draft this essay / report / executive summary / conclusion / literature review" +- "rewrite this paragraph in academic English" +- "polish this draft to top-band quality" / "revise to match the rubric" +- "run an anti-AI audit on this prose" +- "check sentence structure variety" / "fix monotonous rhythm" +- "the prose sounds AI-generated, make it pass" +- "verify claims against evidence" / "reverse outline this section" + +### When to use +- Drafting or revising academic reports, essays, or analysis sections +- Writing executive summaries, conclusions, or literature reviews +- Rewriting AI-sounding prose into natural academic English +- Polishing draft text to achieve top-band rubric quality (HD, A, top-band, etc.) +- Reviewing prose for sentence variety, verb quality, hedging, and anti-AI compliance +- Any task requiring formal academic English output bound by a rubric + +### When NOT to use +- Translation tasks → use `oma-translator` +- Source discovery, citation gathering, or scholarly literature search → use `oma-scholar` +- Rubric / assignment-spec parsing and task decomposition → use `oma-pm` +- Code documentation, README, or API reference text → use the relevant domain skill (`oma-frontend`, `oma-backend`, `oma-mobile`, `oma-db`, etc.) 
+- Informal communication, chat, or marketing copy → no skill needed +- Non-English academic writing → call `oma-translator` for the target language after drafting in English + +### Expected inputs +- `mode`: one of `draft` | `revise` | `review` +- `rubric_or_constraint`: assignment brief, rubric file, or word/structure limits (path or inline text) +- `existing_draft`: prior text to revise or audit (path or inline text); required for `revise` and `review` +- `source_data`: available evidence, figures, citations the writer may use +- `target_register`: defaults to formal academic English + +### Expected outputs +- `draft` mode: section heading + drafted prose + Writing Notes (sentence mix, key verbs, anti-AI flags resolved, paragraph lengths) + Claim-Evidence Map +- `revise` mode: original block, revised block, list of specific changes (verb upgrades, structure variation, anti-AI fixes) +- `review` mode: PASS/FAIL Compliance Report across Sentence Structure, Verb Quality, Anti-AI, Specificity, Hedging, Paragraph Clarity, Rhythm/Burstiness, Claim-Evidence Alignment, plus recommended fixes + +### Dependencies +- `resources/anti-ai-checklist.md`: banned vocabulary, banned structural patterns, sentence-level checks +- `resources/sentence-structure-reference.md`: four sentence types, length targets, common errors +- `resources/academic-verb-tiers.md`: banned generic verbs and tiered academic-corpus replacements +- `resources/hedging-guide.md`: calibrated certainty expressions matched to evidence strength +- `../_shared/core/context-loading.md`: task-relevant resource loading +- `../_shared/core/quality-principles.md`: shared quality bar + +### Control-flow features +- Mode branching: `draft` vs `revise` vs `review` produce different output formats and pass sequences +- Rubric-quote gate: refuses to apply a rule until the literal constraint text is quoted from the source +- Citation gap branch: when a claim lacks evidence, weaken or remove rather than fabricate; 
optionally hand off to `oma-scholar` +- Language branch: non-English target hands off to `oma-translator` after the English pass +- Iterative AUDIT: every fix loops back through the anti-AI checklist before emit + +## Structural Flow + +### Entry +1. Identify the mode (`draft`, `revise`, `review`) and the rubric source. +2. Quote the exact constraint text (word limits, structural requirements, mandatory sections, rubric rows) before applying any rule. +3. If revising or reviewing, read the existing draft in full first; if drafting, confirm available source data and citations. +4. Index `resources/` and pre-select the verb tier and sentence mix targets for the section. + +### Scenes +1. **PREPARE**: load rubric, existing draft, source data; record quoted constraints; pick sentence mix and 2–3 anchor verbs per paragraph. +2. **ACQUIRE**: read `resources/sentence-structure-reference.md`, `academic-verb-tiers.md`, and `hedging-guide.md` only for the patterns relevant to the current section. +3. **ACT**: write or revise prose with the four protocols enforced simultaneously: Sentence Structure (4 types, varied length, varied openers), Verb (no banned generic verbs as main verbs; prefer tier-1/2 academic verbs), Hedging (match strength to evidence), and Topic-Support-Conclude paragraphing. +4. **VERIFY**: audit against `resources/anti-ai-checklist.md` (vocabulary clusters, structural patterns, sentence-level checks); apply reverse outlining and build the Claim-Evidence Map; weaken or remove unsupported claims. +5. **FINALIZE**: read-aloud test, cohesion check, specificity audit, word-count verification, paragraph-length variation, rhythm check; emit per the mode's output format. + +### Transitions +- If a rubric line is ambiguous → quote it back to the user and ask for interpretation; do not infer combined rules. +- If a claim cannot be supported by available evidence → weaken with hedging or remove; if a citation gap is structural, NOTIFY `oma-scholar`. 
+- If the target language is non-English → finish the English pass, then hand off to `oma-translator`. +- If the same anti-AI flag survives one fix attempt → restructure the surrounding two sentences instead of word-substitution alone. +- If an output mode mismatch is detected (e.g., user asked for review but supplied a fresh prompt) → confirm the mode before producing output. + +### Failure and recovery +| Failure | Recovery | +|---------|----------| +| Word count over / under target | Cut filler adverbs and redundant qualifiers, or expand with supporting evidence; re-run audit | +| Prose still sounds AI-generated after one pass | Vary sentence openers (subject, adverbial, participial, prepositional) and insert one short (≤10-word) sentence per paragraph; re-run audit | +| Rubric requirement unclear | Quote exact rubric text and ask user; do not combine rules | +| Claim lacks evidence | Add citation, hedge to match weaker evidence, or remove the claim entirely | +| Hedging miscalibrated | Replace double hedges; align hedge strength with `resources/hedging-guide.md` evidence-level table | +| Banned generic verb resists replacement | Restructure the sentence so the banned verb is not the main verb | +| Paragraph blocks are uniform 4–5 sentences | Insert a 2-sentence emphasis paragraph; re-run rhythm check | + +### Exit +- Success: every protocol PASSes, the Claim-Evidence Map has no unsupported entries, word count complies, and the mode-specific output format is fully populated. +- Partial success: emit prose with explicit `needs evidence` / `pending citation` markers and report which protocol items remain at risk; flag handoff candidates. +- Failure: refuse to emit and report the blocking ambiguity (rubric quote missing, source data absent, contradictory constraints). 
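The three exit outcomes above are mutually exclusive and can be decided mechanically from the review-mode compliance results. The sketch below is illustrative only: the `ComplianceReport` shape, its field names, and the `exit_status` helper are assumptions introduced here, not part of the skill contract.

```python
from dataclasses import dataclass, field

# The eight protocol areas named in the review-mode output format.
PROTOCOLS = [
    "Sentence Structure", "Verb Quality", "Anti-AI", "Specificity",
    "Hedging", "Paragraph Clarity", "Rhythm/Burstiness",
    "Claim-Evidence Alignment",
]

@dataclass
class ComplianceReport:
    # protocol name -> True (PASS) / False (FAIL); missing entries count as FAIL
    results: dict = field(default_factory=dict)
    # Claim-Evidence Map entries still marked `needs evidence`
    unsupported_claims: list = field(default_factory=list)
    # e.g. "rubric quote missing", "source data absent"
    blocking_ambiguity: str = ""

def exit_status(report: ComplianceReport) -> str:
    if report.blocking_ambiguity:
        return "failure"  # refuse to emit; report the blocker
    all_pass = all(report.results.get(p, False) for p in PROTOCOLS)
    if all_pass and not report.unsupported_claims:
        return "success"
    return "partial"      # emit with explicit `needs evidence` markers
```

A report with every protocol passing and an empty claim list exits as success; any unsupported claim downgrades the same report to partial.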
+ +## Logical Operations + +### Actions +| Action | SSL primitive | Evidence | +|--------|---------------|----------| +| Read rubric / constraint and quote literal text | `READ` | Rubric file or assignment brief | +| Read existing draft (revise/review modes) | `READ` | Draft file or inline text | +| Index resources for the current section | `READ` | `resources/{anti-ai-checklist,sentence-structure-reference,academic-verb-tiers,hedging-guide}.md` | +| Select sentence mix and 2–3 anchor verbs per paragraph | `SELECT` | Sentence-structure & verb-tier tables | +| Plan paragraph as Topic-Support-Conclude | `INFER` | Outline notes | +| Draft / revise prose under all four protocols | `WRITE` | Generated prose | +| Audit prose against anti-AI checklist | `VALIDATE` | `resources/anti-ai-checklist.md` | +| Reverse outline + build Claim-Evidence Map | `VALIDATE` | Mapping table | +| Weaken or remove unsupported claims | `WRITE` | Revised claim line | +| Compare original vs revised (revise mode) | `COMPARE` | Diff block | +| Hand off non-English target | `NOTIFY` | `oma-translator` | +| Hand off citation gap | `NOTIFY` | `oma-scholar` | +| Hand off ambiguous rubric / spec | `NOTIFY` | `oma-pm` | +| Emit per mode output format | `WRITE` | Final artifact | +| Report compliance status | `NOTIFY` | PASS/FAIL summary or Writing Notes block | + +### Tools and instruments +- `Read` / `Edit` / `Write` for draft and rubric files +- `resources/anti-ai-checklist.md`, `sentence-structure-reference.md`, `academic-verb-tiers.md`, `hedging-guide.md` +- Topic-Support-Conclude paragraph template (inline) +- Claim-Evidence Map (inline 3-column table: Claim / Evidence / Status) +- Output-format blocks per mode (Draft / Revision / Review) + +### Canonical workflow path +1. **READ** rubric/draft and quote the exact literal constraint text; pin word limits, mandatory sections, and rubric rows. +2. 
**PLAN** each paragraph as Topic-Support-Conclude; pre-select the sentence-type mix and 2–3 anchor verbs from `academic-verb-tiers.md`. +3. **DRAFT** prose with Sentence Structure, Verb, Hedging, and Topic-Support-Conclude protocols enforced simultaneously. +4. **AUDIT** the draft against `resources/anti-ai-checklist.md` (banned vocabulary clusters, banned structural patterns, sentence-level checks) and fix every flag. +5. **REVERSE-OUTLINE** the section and build the Claim-Evidence Map; weaken or remove any unsupported claim. +6. **POLISH** with read-aloud, cohesion, specificity, word-count, rhythm, and paragraph-length-variation checks; emit in the mode's output format. + +### Resource scope +| Scope | Resource target | +|-------|-----------------| +| `LOCAL_FS` | Rubric, existing draft, generated prose output | +| `CODEBASE` | `resources/` 4 reference files, `_shared/core/{context-loading,quality-principles}.md` | +| `MEMORY` | Mode, quoted constraints, anchor verbs per paragraph, anti-AI flags resolved, Claim-Evidence Map | + +### Preconditions +- A rubric / constraint or an existing draft (or both) is provided. +- The target register is academic English. If the final deliverable is non-English, the user has agreed to a downstream `oma-translator` handoff. +- The source data needed to support claims is available, or unsupported claims are explicitly allowed to be weakened or removed. + +### Effects and side effects +- Writes drafted, revised, or reviewed prose to the user's working location (file or inline). +- Does not modify `resources/` reference files. +- Does not fetch external citations; defers to `oma-scholar` when discovery is required. +- May NOTIFY adjacent skills but does not auto-spawn them; user or workflow drives the actual handoff. + +### Guardrails +1. Every sentence must be verifiable; never fabricate data, statistics, or citations. +2. Quote-before-judgment: cite the literal constraint or rubric text before applying any rule. +3. 
Never combine distinct rules to invent a new constraint; apply rules exactly as written. +4. Banned generic verbs (`show`, `have`, `make`, `do`, `get`, `use`, `give`, `say`, `put`, `see`, `come`, `go`, `take`, `find`, `know`, `think`, `want`, `try`, `need`, `seem`, `become`, `keep`, `help`, `start`, `turn`, `bring`, `run`, `hold`, `set`) must not appear as main verbs; replace per `academic-verb-tiers.md`. +5. Never place 3+ sentences of the same structural type consecutively; vary length (short 8–15, medium 16–25, long 26–40 words) and openers. +6. Match hedge strength to evidence strength per `hedging-guide.md`; never use absolute claim words (`definitely`, `clearly`, `obviously`) outside mathematical facts; never first-person `I think` / `I believe`. +7. Never cluster 3+ flagged AI-vocabulary items in a single paragraph; never insert promotional or inflated language; never append superficial `-ing` clauses for analysis. +8. Em dashes ≤ 1 per paragraph; semicolons ≤ 2 per 1000 words; sentence-case headers; no didactic disclaimers (`It is important to note`) or summary phrases (`In summary`, `Overall`). +9. Every claim must map to evidence in the Claim-Evidence Map; weaken or remove unsupported claims rather than emit them. +10. Read aloud before emit; if a sentence does not flow naturally, restructure it. 
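Guardrails 4, 5, and 8 above are mechanical enough to lint automatically. The sketch below is a rough approximation under stated assumptions: the function names are invented here, the banned-verb set is a small excerpt of guardrail 4's list, and a real checker would need part-of-speech tagging to test main-verb position rather than plain token matching.

```python
import re

# Excerpt of guardrail 4's banned generic verbs. Token matching will also
# flag auxiliary uses ("have", "do"); a real linter would POS-tag first.
BANNED_VERBS = {"show", "have", "make", "do", "get", "use", "give", "say"}

def split_sentences(paragraph):
    return [s.strip() for s in re.split(r"[.!?]+", paragraph) if s.strip()]

def lint_paragraph(paragraph):
    issues = []
    # Guardrail 8: at most one em dash per paragraph.
    if paragraph.count("\u2014") > 1:
        issues.append("more than one em dash")
    # Guardrail 4 (approximation): flag banned verbs appearing as tokens.
    tokens = {t.lower().strip(".,;:") for t in paragraph.split()}
    for verb in sorted(BANNED_VERBS & tokens):
        issues.append(f"banned generic verb: {verb}")
    # Guardrail 5: no sentence should exceed the long band (26-40 words).
    for n in (len(s.split()) for s in split_sentences(paragraph)):
        if n > 40:
            issues.append(f"sentence of {n} words exceeds the long band")
    return issues

def semicolon_density_ok(text):
    # Guardrail 8: at most 2 semicolons per 1000 words.
    words = len(text.split())
    return words == 0 or text.count(";") / words * 1000 <= 2
```

Structural-variety checks (no three consecutive sentences of the same type) need syntactic analysis and are out of scope for a sketch like this.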
+ +## References +- Anti-AI checklist: `resources/anti-ai-checklist.md` +- Sentence-structure reference: `resources/sentence-structure-reference.md` +- Academic verb tiers: `resources/academic-verb-tiers.md` +- Hedging guide: `resources/hedging-guide.md` +- Shared context loading: `../_shared/core/context-loading.md` +- Shared quality principles: `../_shared/core/quality-principles.md` diff --git a/.agents/skills/oma-academic-writer/resources/academic-verb-tiers.md b/.agents/skills/oma-academic-writer/resources/academic-verb-tiers.md new file mode 100644 index 0000000..e225fdc --- /dev/null +++ b/.agents/skills/oma-academic-writer/resources/academic-verb-tiers.md @@ -0,0 +1,181 @@ +# Academic verb tiers + +Ranked by frequency in a corpus of academic papers. Higher tiers are more universally appropriate; lower tiers are more specialised. + +Source: `src/_ingredients/verbs.md` (top 437 verbs from academic corpus). + +## Banned verbs (generic / low-level) + +These verbs lack precision and register in academic writing. Never use them as the main verb of a sentence. 
+
| Banned verb | Academic replacements |
|-------------|----------------------|
| show | illustrate, demonstrate, reveal, indicate, depict, exhibit |
| have | possess, maintain, exhibit, encompass, retain, display |
| make | generate, produce, construct, establish, formulate, create |
| do | perform, execute, conduct, accomplish, undertake |
| get | obtain, acquire, achieve, attain, derive, secure |
| use | employ, leverage, utilise, adopt, exploit, harness |
| give | provide, furnish, yield, deliver, grant, supply |
| say | argue, assert, contend, maintain, posit, state |
| put | position, allocate, situate, deploy, place |
| see | observe, identify, recognise, discern, perceive, detect |
| find | identify, determine, establish, ascertain, uncover |
| know | recognise, acknowledge, understand, appreciate |
| think | hypothesise, postulate, theorise, reason, infer |
| want | seek, aspire, endeavour, aim, intend |
| try | attempt, endeavour, pursue, strive, undertake |
| need | require, necessitate, demand, warrant, entail |
| seem | appear, suggest, indicate, manifest, resemble |
| help | facilitate, enable, support, contribute to, assist |
| start | initiate, commence, launch, introduce, inaugurate |
| turn | transform, convert, transition, shift, redirect |
| bring | introduce, yield, generate, contribute, produce |
| run | operate, execute, administer, manage, conduct |
| hold | maintain, retain, sustain, accommodate, contain |
| set | establish, configure, determine, specify, define |
| keep | maintain, preserve, retain, sustain, uphold |
| go | proceed, transition, advance, progress, extend |
| come | emerge, arise, originate, result, derive |
| take | adopt, assume, undertake, acquire, embrace |
| become | emerge, evolve, develop, transition, transform |

## Tier 1: universal academic verbs (frequency rank 8–56)

These are safe in any academic context. Use liberally.
+
| Verb | Frequency Rank | Best Used For |
|------|---------------|---------------|
| perform | 8 | Describing actions, experiments, evaluations |
| provide | 10 | Presenting data, offering evidence, supplying context |
| evaluate | 11 | Assessment, measurement, comparison |
| require | 12 | Establishing necessity, conditions, prerequisites |
| include | 15 | Enumeration, scope definition |
| follow | 17 | Methodology, sequence, adherence |
| compare | 18 | Analysis, juxtaposition, relative assessment |
| achieve | 22 | Results, outcomes, attainment of goals |
| enable | 24 | Facilitation, capability description |
| improve | 28 | Enhancement, progress, optimisation of outcomes |
| describe | 29 | Characterisation, explanation, narration |
| demonstrate | 30 | Proof, evidence presentation, showing results |
| present | 32 | Introduction of findings, display of data |
| propose | 34 | Hypotheses, recommendations, new approaches |
| introduce | 35 | New concepts, methods, frameworks |
| allow | 39 | Permission, enablement, possibility |
| apply | 41 | Implementation, practical use, methodology |
| predict | 43 | Forecasting, modelling, anticipation |
| represent | 44 | Symbolisation, standing for, comprising |
| explore | 45 | Investigation, examination, discovery |
| combine | 46 | Integration, synthesis, merging |
| design | 47 | Creation, planning, structuring |
| execute | 48 | Implementation, carrying out procedures |
| leverage | 50 | Strategic use (use sparingly; borderline AI word) |
| generalise | 52 | Abstraction, broad application |
| study | 54 | Investigation, research, examination |
| utilise | 55 | Application (prefer "employ" or "use" in less formal contexts) |
| solve | 56 | Resolution, addressing problems |

## Tier 2: strong academic verbs (frequency rank 65–133)

Excellent for adding precision and variety.
+
| Verb | Rank | Best Used For |
|------|------|---------------|
| indicate | 65 | Evidence pointing to conclusions |
| adopt | 68 | Taking up methods, approaches, strategies |
| observe | 70 | Empirical findings, noting phenomena |
| adapt | 71 | Modification, adjustment to conditions |
| specify | 72 | Defining precisely, setting parameters |
| focus | 73 | Directing attention, narrowing scope |
| correspond | 76 | Correlation, matching, alignment |
| employ | 79 | Using methods or tools (preferred over "utilise") |
| aim | 80 | Purpose, objective, intention |
| develop | 82 | Creation, evolution, progression |
| produce | 83 | Generation, creation, yielding outcomes |
| investigate | 85 | Systematic inquiry, research |
| support | 86 | Evidence corroboration, backing claims |
| contain | 89 | Inclusion, comprising, holding |
| involve | 92 | Participation, inclusion of elements |
| understand | 93 | Comprehension, grasp of concepts |
| refer | 95 | Citation, pointing to, mentioning |
| obtain | 96 | Acquisition, securing results |
| conduct | 97 | Carrying out research, experiments |
| incorporate | 101 | Integration, inclusion within a system |
| control | 102 | Regulation, management, experimental design |
| implement | 111 | Putting into practice, execution |
| exhibit | 113 | Displaying characteristics, showing qualities |
| assess | 119 | Evaluation, measurement, appraisal |
| illustrate | 122 | Visual representation, exemplification |
| reduce | 123 | Decrease, minimisation, simplification |
| address | 124 | Tackling issues, responding to concerns |
| extend | 126 | Expansion, broadening scope |
| denote | 127 | Signification, representation |
| select | 128 | Choosing, picking, sampling |
| serve | 132 | Function, role fulfilment |
| process | 133 | Handling, transformation, treatment |

## Tier 3: precision verbs (frequency rank 139–350)

For nuanced, specific claims.
Excellent for adding sophistication without over-reaching. + +| Verb | Rank | Best Used For | +|------|------|---------------| +| suggest | 139 | Moderate-confidence claims, implications | +| capture | 142 | Recording, encapsulating, representing | +| summarise | 143 | Condensation, overview, synthesis | +| measure | 149 | Quantification, assessment | +| integrate | 150 | Combining, synthesising, unifying | +| mitigate | 154 | Reducing negative effects, lessening risk | +| align | 155 | Agreement, correspondence, matching | +| define | 156 | Specification, delimitation, characterisation | +| interpret | 161 | Meaning extraction, analysis, reading data | +| enhance | 162 | Improvement (use carefully; borderline AI word) | +| affect | 165 | Influence, impact on outcomes | +| ensure | 174 | Guaranteeing, securing, confirming | +| deploy | 177 | Implementation, putting into operation | +| simulate | 179 | Modelling, replicating conditions | +| determine | 207 | Establishing, deciding, finding out | +| rely | 210 | Dependency, foundation, based on | +| construct | 205 | Building, creating, assembling | +| attribute | 206 | Assigning cause, crediting | +| formulate | 243 | Creating plans, theories, equations | +| identify | 227 | Recognising, pinpointing, discovering | +| analyse | 229 | Examination, investigation, deconstruction | +| reveal | 259 | Discovery, making known, uncovering | +| establish | 289 | Founding, proving, confirming | +| operate | 291 | Functioning, running, working | +| recognise | 292 | Acknowledging, identifying, accepting | +| categorise | 293 | Classification, grouping, sorting | +| retain | 294 | Keeping, preserving, maintaining | +| highlight | 297 | Drawing attention (use sparingly; borderline AI word) | +| validate | 321 | Confirming, verifying, proving correct | +| constrain | 326 | Limiting, restricting, bounding | +| visualise | 329 | Representing graphically, depicting | +| resolve | 338 | Solving, addressing, settling | +| calculate | 
350 | Computing, determining numerically | + +## Verb selection by rhetorical purpose + +| Purpose | Recommended Verbs | +|---------|-------------------| +| Presenting findings | demonstrate, reveal, indicate, illustrate, exhibit | +| Making an argument | argue, contend, assert, maintain, posit | +| Describing methodology | employ, adopt, implement, conduct, execute | +| Comparing | compare, contrast, distinguish, differentiate, juxtapose | +| Showing causation | cause, produce, generate, yield, result in | +| Hedging | suggest, appear, may indicate, seem to imply | +| Quantifying | measure, calculate, quantify, estimate, compute | +| Evaluating | assess, evaluate, appraise, judge, critique | +| Synthesising | integrate, combine, synthesise, consolidate, unify | +| Proposing | propose, recommend, suggest, advocate, put forward | +| Limiting scope | focus, confine, restrict, constrain, delimit | +| Citing work | note, report, document, record, observe | + +## Usage notes + +1. **Tier 1 verbs** are always safe; use these as your default vocabulary +2. **Tier 2 verbs** add precision; use 3–5 per paragraph for variety +3. **Tier 3 verbs** add sophistication; use 1–2 per paragraph to avoid overly dense prose +4. **Borderline AI words** (leverage, enhance, highlight, showcase): limit to 1 per page maximum; prefer alternatives +5. **Match verb to evidence strength**: "demonstrate" > "suggest" > "may indicate" in confidence +6. **Prefer single verbs over phrasal verbs**: "investigate" not "look into", "improve" not "make better" diff --git a/.agents/skills/oma-academic-writer/resources/anti-ai-checklist.md b/.agents/skills/oma-academic-writer/resources/anti-ai-checklist.md new file mode 100644 index 0000000..3699df5 --- /dev/null +++ b/.agents/skills/oma-academic-writer/resources/anti-ai-checklist.md @@ -0,0 +1,267 @@ +# Anti-AI Writing Checklist for Academic English + +Academic prose must read as authentically human-written. 
This checklist targets patterns that AI detection tools and experienced markers identify as machine-generated. + +## Pre-submission Scan + +Run through each category. A single FAIL requires revision before output. + +## 1. Vocabulary Clustering + +**Rule:** No more than 2 of the following words in any single paragraph. + +### Flagged Words (High AI Correlation) + +Additionally, align with, crucial, delve, emphasize/emphasizing, enduring, enhance, foster/fostering, garner, highlight (as verb), interplay, intricate/intricacies, key (as adjective), landscape (abstract), multifaceted, nuanced, pivotal, robust, seamless, showcase, synergy, tapestry (abstract), testament, underscore (as verb), valuable, vibrant, holistic, paradigm, cutting-edge, groundbreaking, comprehensive, Furthermore, Moreover, navigating, realm, embark, noteworthy + +### Self-check + +- [ ] Count flagged words per paragraph +- [ ] If 3+ found → replace with plain academic alternatives +- [ ] Check entire document for repeated use of the same flagged word + +## 2. Inflated Significance + +**Rule:** Never inflate the importance of a subject beyond what the evidence supports. + +### Banned Phrases + +| Phrase | Plain Alternative | +|--------|-------------------| +| stands/serves as | is | +| is a testament to | demonstrates / reflects | +| a vital/crucial/pivotal role | an important role / a role in | +| underscores/highlights its importance | shows / indicates | +| reflects broader trends | relates to | +| symbolizing its enduring legacy | (delete unless legacy is the subject) | +| setting the stage for | preceding / leading to | +| key turning point | a change / a shift | +| indelible mark | lasting effect | +| deeply rooted | established / longstanding | +| evolving landscape | changing conditions | +| groundbreaking | new / novel / significant | + +### Self-check + +- [ ] Does every significance claim have supporting evidence cited? +- [ ] Is the language proportional to the evidence? 
+- [ ] Would a sceptical reader accept the level of emphasis? + +## 3. Superficial -ing Analysis + +**Rule:** Never append a present participle clause as shallow analysis. + +### Pattern to Detect + +> "[Statement], **highlighting/ensuring/reflecting/contributing to/fostering** [vague significance]." + +### Fix + +- If the -ing clause adds genuine meaning → promote it to a full sentence with evidence +- If it adds no meaning → delete it entirely + +### Self-check + +- [ ] Search for -ing clauses at end of sentences +- [ ] For each: does it add substantive analysis or just filler? +- [ ] Rewrite or delete accordingly + +## 4. Copula Avoidance + +**Rule:** Use "is/are/has" when they are the natural choice. Do not replace them with fancier alternatives. + +### Pattern to Detect + +| AI Tendency | Natural Form | +|-------------|-------------| +| serves as a | is a | +| stands as | is | +| marks the | is the | +| represents a | is a | +| boasts / features / offers | has | + +### Self-check + +- [ ] Scan for "serves as", "stands as", "marks", "represents" used as copula substitutes +- [ ] Replace with "is/are" unless the verb genuinely adds meaning + +## 5. 
Structural Patterns + +### Negative Parallelisms + +- Avoid: "Not only X but also Y", "It's not just about X, it's about Y" +- Fix: State both facts directly without the parallelism + +### Rule of Three + +- Avoid: "adjective, adjective, and adjective" for shallow coverage +- Fix: Reduce to two descriptors, or expand each into substantive analysis + +### Elegant Variation (Synonym Cycling) + +- Avoid: Rotating terms for the same concept (students → learners → participants) +- Fix: Pick one term and use it consistently throughout + +### False Ranges + +- Avoid: "from X to Y" with unrelated or vaguely connected endpoints +- Fix: Drop the construction or specify a meaningful scale + +### Self-check + +- [ ] No negative parallelisms used for rhetorical effect alone +- [ ] No triple adjective/noun lists without substantive expansion +- [ ] Terminology is consistent (no synonym cycling) +- [ ] All "from X to Y" constructions have a meaningful scale + +## 6. Formatting Artefacts + +### Boldface + +- Do not bold terms mechanically in lists ("**Term**: description") +- Bold only for genuine emphasis in running prose + +### Em Dashes + +- Limit to 1 per paragraph maximum +- Prefer commas or parentheses +- Never use em dashes for emphasis that a natural sentence structure can deliver + +### Colons + +- **Prefer natural sentence flow over colon constructions.** Subordination (because, although, while) and coordination (and, but, so) almost always produce more readable prose than a colon. +- Colons are acceptable only for: + - Formal definitions: "Normalization is defined as: ..." + - Introducing block quotations + - Ratios or time stamps (e.g., 2:1, 14:30) +- **Avoid** colons that introduce inline lists, elaborations, or explanations mid-sentence. + - Bad: "The study examined three factors: temperature, humidity, and wind speed." + - Good: "The study examined temperature, humidity, and wind speed." 
+ - Also good: "The study examined three factors, namely temperature, humidity, and wind speed." +- **Avoid** the "**Label**: description" pattern in running prose (this is a list/slide-deck pattern, not academic prose). + +### Title Case + +- Use sentence case for all section headings +- Exception: proper nouns + +### Tables + +- Do not present information as a table when prose is more appropriate +- Tables for quantitative comparison or reference data only + +### Self-check + +- [ ] No mechanical bold patterns +- [ ] Em dashes used sparingly (≤1 per paragraph) +- [ ] Colons used only for formal definitions, block quotations, or ratios +- [ ] All headings in sentence case +- [ ] Tables justified for the content type + +## 7. Meta-commentary + +### Banned Phrases + +| Phrase | Action | +|--------|--------| +| It is important to note | Delete; state the point directly | +| It should be noted that | Delete | +| Worth noting | Delete | +| In summary | Transition naturally | +| In conclusion | Transition naturally | +| Overall | Usually unnecessary; delete or restructure | +| As mentioned earlier | Delete or use a specific cross-reference | +| As discussed above | Delete or use a specific cross-reference | + +### Self-check + +- [ ] No meta-commentary phrases found +- [ ] Transitions use content-based links, not meta-phrases + +## 8. Sentence Openers + +### Rule: Vary how sentences begin + +**Avoid starting 3+ consecutive sentences with:** + +- The same word (especially "The", "This", "It", "However") +- Subject-verb pattern every time + +**Vary with:** + +- Adverbial phrases: "Between 2015 and 2020, ..." +- Prepositional phrases: "In the context of ..." +- Participial phrases: "Drawing on longitudinal data, ..." +- Dependent clauses: "Although the sample size was limited, ..." +- Transitional phrases (non-AI): "By contrast, ...", "More specifically, ...", "In parallel, ..." 
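The opener-variety rule above lends itself to a mechanical pre-check. The following is a minimal sketch, not part of any existing toolchain: the function name is hypothetical, and the regex-based sentence splitter is a naive assumption (real prose would need proper sentence segmentation).

```python
import re

def repeated_openers(text: str, limit: int = 3) -> list[str]:
    """Return opening words that begin `limit` or more consecutive sentences.

    Naive splitting: a sentence ends at '.', '!', or '?' followed by whitespace.
    Opening words are lowercased and stripped of surrounding punctuation.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    flagged: list[str] = []
    run, prev = 0, None
    for sentence in sentences:
        first = sentence.split()[0].lower().strip("\"'.,;:()")
        if first == prev:
            run += 1
            if run == limit:  # report each offending opener once
                flagged.append(first)
        else:
            run, prev = 1, first
    return flagged
```

For example, `repeated_openers("The cat sat. The dog ran. The bird flew. It rained.")` returns `["the"]`, while a paragraph with varied openers returns an empty list.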
+ +### Self-check + +- [ ] No 3+ consecutive sentences starting the same way +- [ ] At least 3 different opener types per paragraph + +## 9. Rhythm & Paragraph Length + +**Rule:** AI-generated text exhibits characteristically uniform sentence length and paragraph blocks. Natural academic writing has variation. + +### Sentence rhythm (burstiness) + +- If 5+ consecutive sentences all fall within the same narrow word-count range (e.g., all 20–25 words), flag for revision +- Insert a short sentence (≤10 words) to break metronomic patterns +- Combine two short sentences into one complex one if the pattern is monotonously short +- Read the paragraph aloud; if it feels metronomic, vary it + +### Paragraph length variation + +- Vary paragraph length naturally: 2–8 sentences per paragraph +- Uniform 4–5 sentence paragraphs signal AI; avoid this pattern +- Short paragraphs (2–3 sentences) create emphasis +- Longer paragraphs (6–8 sentences) develop complex arguments +- Never have 4+ consecutive paragraphs of the same length (±1 sentence) + +### Semicolons + +- Limit: ≤2 per 1000 words +- AI text chains independent clauses with semicolons where a period would be clearer +- Reserve semicolons for closely related parallel structures + +### Self-check + +- [ ] No 5+ consecutive sentences in the same word-count range +- [ ] Paragraph lengths vary (no 4+ consecutive same-length paragraphs) +- [ ] Semicolons ≤2 per 1000 words + +## 10. Chatbot Artefacts + +### Never Include + +- "I hope this helps" +- "Let me know if you need anything else" +- "Here is a breakdown of..." +- "Of course!", "Certainly!" +- "As an AI language model" +- Subject lines ("Subject: ...") +- Knowledge-cutoff disclaimers + +### Self-check + +- [ ] Zero chatbot artefacts in output + +## Final Verification + +Run all checks in sequence: + +1. [ ] Vocabulary clustering: no 3+ flagged words per paragraph +2. [ ] Inflated significance: proportional to evidence +3. [ ] No superficial -ing analysis +4. 
[ ] Natural copula usage: "is/are" used where appropriate +5. [ ] No banned structural patterns +6. [ ] Clean formatting: no artefacts (bold, em dash, colon, tables) +7. [ ] No meta-commentary +8. [ ] Varied sentence openers +9. [ ] Rhythm & paragraph length: no metronomic patterns or uniform blocks +10. [ ] Zero chatbot artefacts +11. [ ] Natural sentence flow: colons and em dashes not substituting for proper subordination/coordination +12. [ ] Claim-evidence alignment: every major claim has cited support + +**Result: PASS only if all 12 checks clear.** diff --git a/.agents/skills/oma-academic-writer/resources/hedging-guide.md b/.agents/skills/oma-academic-writer/resources/hedging-guide.md new file mode 100644 index 0000000..2b4305b --- /dev/null +++ b/.agents/skills/oma-academic-writer/resources/hedging-guide.md @@ -0,0 +1,130 @@ +# Hedging guide for academic English + +Academic writing requires calibrated certainty: neither overclaiming nor excessive caution. + +## Evidence-to-hedge mapping + +| Evidence Strength | Description | Hedge Level | Example Constructions | +|-------------------|-------------|-------------|----------------------| +| Definitive | Mathematical proof, logical necessity, definitional truth | None | "X **is** Y", "The data **confirm** that..." | +| Strong | Large-N replicated studies, meta-analyses, established consensus | Minimal | "The data **demonstrate** that...", "The evidence **establishes** that..." | +| Moderate | Single well-designed study, consistent preliminary findings | Moderate | "The findings **suggest** that...", "The results **indicate** that..." | +| Exploratory | Pilot study, limited sample, emerging trends | Strong | "This **may indicate** that...", "Preliminary evidence **points toward**..." | +| Interpretive | Author's own analysis without external validation | Attribution | "This pattern **appears to** reflect...", "One possible interpretation is that..." 
| +| Speculative | No direct evidence; inference from adjacent domains | Maximal | "It is **conceivable** that...", "**If** this trend continues, X **could** occur." | + +## Hedging devices + +### Modal verbs (ordered by certainty) + +| Certainty Level | Modals | +|----------------|--------| +| High | will, must, is (certain to) | +| Moderate | would, should, is likely to | +| Low | may, might, could, can | +| Speculative | would potentially, might conceivably | + +### Hedging verbs + +| Certainty Level | Verbs | +|----------------|-------| +| High | demonstrate, confirm, establish, verify, prove | +| Moderate | suggest, indicate, imply, point to, appear | +| Low | hint at, raise the possibility, seem to, tend to | + +### Hedging adverbs + +| Certainty Level | Adverbs | +|----------------|---------| +| High | clearly, definitively, conclusively, unequivocally | +| Moderate | generally, typically, largely, predominantly | +| Low | possibly, potentially, perhaps, arguably, presumably | +| Speculative | conceivably, hypothetically, tentatively | + +### Hedging nouns + +possibility, tendency, likelihood, indication, suggestion, assumption, interpretation, speculation + +### Hedging prepositional phrases + +according to, on the basis of, in light of, given the limitations of, with reference to, from the perspective of + +## Rules + +### 1. Match hedge to evidence + +Every claim in academic writing must carry the appropriate degree of certainty. + +**Strong evidence → assertive language:** +> "The regression analysis **demonstrates** a statistically significant correlation (p < .001) between match duration and surface type." + +**Moderate evidence → moderate hedge:** +> "The findings **suggest** that match duration **tends to** increase on clay surfaces." + +**Limited evidence → strong hedge:** +> "Preliminary data **may indicate** a relationship between surface type and match duration, although further investigation is warranted." + +### 2. 
Never over-hedge strong evidence + +If the data clearly support a claim, do not weaken it with unnecessary hedges. + +- Bad: "The data **might possibly seem to** suggest that the correlation **could potentially** exist." +- Good: "The data **demonstrate** a strong correlation." + +### 3. Never under-hedge weak evidence + +If the evidence is limited, do not present claims as established fact. + +- Bad: "Surface type **determines** match duration." (stated as fact from limited data) +- Good: "Surface type **appears to** influence match duration." + +### 4. Avoid double hedging + +Using two hedge devices where one suffices weakens the claim unnecessarily. + +- Bad: "It **might possibly** be the case that..." +- Good: "It **may** be the case that..." + +- Bad: "The results **seem to suggest** that..." +- Good: "The results **suggest** that..." or "The results **seem to indicate** that..." + +### 5. Use attribution hedging for interpretive claims + +When offering your own analysis (not reporting others' findings), attribute the interpretation explicitly. + +- "This pattern **appears to** reflect broader changes in playing strategy." +- "One interpretation of this trend is that..." +- "Based on this analysis, it is **reasonable to infer** that..." + +### 6. Avoid "I think" / "I believe" + +These are too informal for academic writing. Replace with: + +| Informal | Academic | +|----------|----------| +| I think | This analysis suggests | +| I believe | The evidence indicates | +| In my opinion | Based on the available data | +| I feel | It appears that | + +### 7. Never use "clearly/obviously/definitely" unless justified + +These intensifiers are appropriate only for: + +- Mathematical certainties: "The mean is **clearly** higher than the median." +- Logical necessities: "If A > B and B > C, then A is **definitively** greater than C." 
+- Universally accepted facts: appropriate in very limited contexts + +For all other claims, remove the intensifier and let the evidence speak. + +## Quick reference: hedge calibration checklist + +Before submitting any academic text: + +- [ ] Does every claim carry appropriate hedging for its evidence base? +- [ ] No double-hedging found? +- [ ] No "clearly/obviously" used without mathematical or logical justification? +- [ ] No "I think/believe" constructions? +- [ ] Interpretive claims attributed with "appears to", "one interpretation", etc.? +- [ ] Strong evidence not weakened by unnecessary hedges? +- [ ] Weak evidence not presented as established fact? diff --git a/.agents/skills/oma-academic-writer/resources/sentence-structure-reference.md b/.agents/skills/oma-academic-writer/resources/sentence-structure-reference.md new file mode 100644 index 0000000..74d781d --- /dev/null +++ b/.agents/skills/oma-academic-writer/resources/sentence-structure-reference.md @@ -0,0 +1,154 @@ +# Sentence structure reference for academic English + +## Four sentence types + +### 1. Simple sentence + +One independent clause in a subject-verb pattern. + +**Use for:** High-impact statements, clear assertions, topic sentences. + +**Examples:** + +- The Australian government introduced an official carbon tax on 1 July 2012. +- This dataset contains 120 years of match records. +- Performance declined sharply after the 2018 rule changes. + +**Target:** 20–30% of sentences per paragraph. + +### 2. Compound sentence + +Two independent clauses connected by a coordinating conjunction (FANBOYS: for, and, nor, but, or, yet, so). + +**Use for:** Balanced comparisons, contrasts, cause-effect pairs. + +**Examples:** + +- The Australian government introduced an official carbon tax on 1 July 2012, but this was met with opposition from the general public. +- Match duration increased by 15%, and spectator attendance declined in parallel. 
+- Nadal dominated the clay court, yet Djokovic maintained superiority on hard surfaces. + +**Target:** 15–25% of sentences per paragraph. + +### 3. Complex sentence + +One independent clause + one dependent clause (introduced by a subordinating conjunction: as, because, although, when, while, if, since, after, before, unless, until, whereas). + +**Use for:** Causal reasoning, conditional statements, temporal ordering (the backbone of academic analysis). + +**Examples:** + +- As the Australian government recognised the necessity to significantly reduce greenhouse gas emissions, it introduced an official carbon tax on 1 July 2012. +- Although the dataset spans 121 seasons, only matches after 1968 include detailed set-level data. +- Because five-set matches impose greater physical demands, average rally length decreases in the fourth and fifth sets. + +**Target:** 30–40% of sentences per paragraph (primary structure for academic prose). + +### 4. Compound-complex sentence + +Two or more independent clauses + one or more dependent clauses. + +**Use for:** Sophisticated synthesis, multi-factor analysis, nuanced arguments. + +**Examples:** + +- As the Australian government recognised the necessity to significantly reduce greenhouse gas emissions, it introduced an official carbon tax on 1 July 2012, but this was met with opposition from the general public. +- While set durations vary considerably between Grand Slam events, the median match length has increased by 12 minutes since 2000, and this trend correlates with advances in racquet technology. +- Because the 1905–1968 era lacked professional circuits, participation remained limited to amateur players; however, match records from this period still provide valuable longitudinal data. + +**Target:** 10–20% of sentences per paragraph (use sparingly for maximum impact). + +## Variation rules + +### Within a paragraph + +1. Never place 3+ sentences of the same type consecutively +2. 
Vary sentence length: + - Short: 8–15 words (for impact) + - Medium: 16–25 words (for flow) + - Long: 26–40 words (for depth) +3. Each paragraph should contain at least 3 of the 4 sentence types +4. Vary paragraph length (2–8 sentences); uniform blocks signal AI + +> For detailed burstiness detection, semicolon limits, and paragraph length variation rules, see `anti-ai-checklist.md` §9. + +### Sentence openers (vary these) + +| Opener Type | Example | +|-------------|---------| +| Subject-first | "The analysis reveals..." | +| Adverbial phrase | "Between 2015 and 2020, match duration..." | +| Prepositional phrase | "In the context of Grand Slam competition, ..." | +| Participial phrase | "Drawing on longitudinal data, this study..." | +| Dependent clause | "Although the sample size was limited, the trend..." | +| Transitional phrase | "By contrast, the women's draw exhibited..." | +| Inverted structure | "Particularly notable is the decline in..." | + +**Rule:** No 3+ consecutive sentences beginning with the same opener type. + +## Common errors to prevent + +### Sentence fragments + +A sentence missing a subject, a verb, or a complete thought. + +| Error Type | Bad | Fixed | +|-----------|-----|-------| +| Missing subject | "Becoming extinct because of rising sea temperatures." | "Phytoplankton could become extinct because of rising sea temperatures." | +| Missing verb | "Significantly, one particular form of Western Australian finch." | "Significantly, one particular form of Western Australian finch has decreased in numbers." | +| Incomplete thought | "In a recent article about loss of habitat due to climate change." | "In a recent article about loss of habitat due to climate change, Australian animals were shown to be particularly vulnerable." | + +**Watch out for:** Sentences beginning with so, as, because, who, which, that, or since are often incomplete. + +### Run-on sentences + +Two independent clauses joined without proper punctuation or conjunction.
| Bad | Fixed (conjunction) | Fixed (separate) | +|-----|-------|---------| +| "Poverty and famine are indicators of climate change these issues are not being addressed." | "Poverty and famine are indicators of climate change, **but** these issues are not being addressed." | "Poverty and famine are indicators of climate change. These issues are not being addressed." | + +### Lack of clear meaning + +Every sentence must be fully understandable when read in isolation. If a sentence requires mental gymnastics to parse, rewrite it more simply. + +## Paragraph cohesion model + +``` +[Topic Sentence — main point, often complex or compound-complex] +[Supporting Sentence 1 — evidence, often simple or complex] +[Supporting Sentence 2 — analysis, often complex] +[Supporting Sentence 3 — additional evidence or counter-argument, often compound] +[Concluding Sentence — link to broader argument or transition, often compound-complex] +``` + +### Bad example (monotonous structure) + +> Nursing education states that measures should be in place to avoid infection. Also, that infection rates tend to soar when hygiene standards decrease. Appropriate steps should be taken to decrease these risks. It is suggested that medical staff are educated to understand these risks. + +**Problems:** Same structure (simple/simple), same opener pattern ("X states", "Also, that"), same sentence length, no cohesion between sentences. + +### Good example (varied structure) + +> Nursing educators argue that strict measures should be implemented to avoid infection in medical institutions. There is also much evidence to demonstrate that infection rates rise dramatically when hygiene standards begin to fall. Therefore, it is argued that appropriate steps need to be in place to decrease and minimise these potential risks. To this end, aggressive steps should be taken to ensure that all staff maintain effective hygiene and infection control.
+ +**Improvements:** Varied openers, mixed simple/complex/compound structures, varied sentence lengths, logical flow from claim → evidence → argument → recommendation. + +## Quick reference: conjunction inventory + +### Coordinating (FANBOYS): for compound sentences + +for, and, nor, but, or, yet, so + +### Subordinating: for complex sentences + +**Cause/reason:** because, since, as, given that, owing to the fact that +**Contrast:** although, though, even though, whereas, while, whilst +**Condition:** if, unless, provided that, on condition that +**Time:** when, whenever, after, before, until, since, while, as soon as +**Purpose:** so that, in order that +**Result:** so ... that, such ... that + +### Transitional phrases (non-AI): for sentence openers + +By contrast, More specifically, In parallel, To this end, From a different perspective, On closer examination, Upon further analysis, In quantitative terms, At the aggregate level, Within this framework, Across all conditions diff --git a/.agents/skills/oma-backend/SKILL.md b/.agents/skills/oma-backend/SKILL.md index b39a7ee..342ce6d 100644 --- a/.agents/skills/oma-backend/SKILL.md +++ b/.agents/skills/oma-backend/SKILL.md @@ -162,20 +162,20 @@ Router (HTTP) → Service (Business Logic) → Repository (Data Access) → Mode 8. **Explicit ORM loading strategy**: do not rely on default relation loading when query shape matters 9. **Explicit transaction boundaries**: group one business operation into one request/service-scoped unit of work 10. **Safe ORM lifecycle**: do not share mutable ORM session/entity manager/client objects across concurrent work unless the ORM explicitly supports it -11. **Config from environment**: DB URLs, API keys, secrets, and feature flags come from env vars or secret managers — never hardcode in source -12. **Stateless services**: no in-memory session or user state between requests — use external stores (DB, Redis, cache) for shared state -13. 
**Backing services as resources**: DB, queue, cache, mail are swappable attached resources connected via config — Repository layer must not assume a specific instance +11. **Config from environment**: DB URLs, API keys, secrets, and feature flags come from env vars or secret managers; never hardcode in source +12. **Stateless services**: no in-memory session or user state between requests; use external stores (DB, Redis, cache) for shared state +13. **Backing services as resources**: DB, queue, cache, mail are swappable attached resources connected via config; Repository layer must not assume a specific instance ### Stack Detection -1. **Project files first** — Read existing code, package manifests (pyproject.toml, package.json, Cargo.toml, go.mod, pom.xml, etc.) to determine the tech stack -2. **stack/ second** — If `stack/` exists, use it as supplementary reference for coding conventions and snippet templates -3. **Neither exists** — Ask the user or suggest running `/stack-set` +1. **Project files first**: Read existing code, package manifests (pyproject.toml, package.json, Cargo.toml, go.mod, pom.xml, etc.) to determine the tech stack +2. **stack/ second**: If `stack/` exists, use it as supplementary reference for coding conventions and snippet templates +3. **Neither exists**: Ask the user or suggest running `/stack-set` ### Stack-Specific Reference -- **Stack manifest (SSOT)**: `stack/stack.yaml` — structured declaration (`language`, `framework`, `orm`) and `verify:` contract consumed by `oma verify backend`. Schema: `variants/stack.schema.json`. -- Tech stack narrative: `stack/tech-stack.md` — human-readable reference only; `stack.yaml` wins on conflict. +- **Stack manifest (SSOT)**: `stack/stack.yaml`: structured declaration (`language`, `framework`, `orm`) and `verify:` contract consumed by `oma verify backend`. Schema: `variants/stack.schema.json`. +- Tech stack narrative: `stack/tech-stack.md`: human-readable reference only; `stack.yaml` wins on conflict. 
- Code snippets (copy-paste ready): `stack/snippets.md` - API template: `stack/api-template.*` diff --git a/.agents/skills/oma-backend/resources/checklist.md b/.agents/skills/oma-backend/resources/checklist.md index 17b81fb..aa4c9fb 100644 --- a/.agents/skills/oma-backend/resources/checklist.md +++ b/.agents/skills/oma-backend/resources/checklist.md @@ -39,7 +39,7 @@ Run through every item before submitting your work. - [ ] Type annotations on all function signatures ## Cloud Readiness -- [ ] No hardcoded config values (DB URLs, API keys, ports) — all from env vars +- [ ] No hardcoded config values (DB URLs, API keys, ports); all from env vars - [ ] No in-process state between requests (sessions, caches, counters) -- [ ] Logs written to stdout/stderr, not file — structured format (JSON) preferred +- [ ] Logs written to stdout/stderr, not file; structured format (JSON) preferred - [ ] Graceful shutdown handled for background jobs and open connections diff --git a/.agents/skills/oma-backend/resources/error-playbook.md b/.agents/skills/oma-backend/resources/error-playbook.md index a968ea1..ef9fa46 100644 --- a/.agents/skills/oma-backend/resources/error-playbook.md +++ b/.agents/skills/oma-backend/resources/error-playbook.md @@ -9,9 +9,9 @@ Do NOT stop or ask for help until you have exhausted the playbook. **Symptoms**: Module/package not found errors -1. Check the import path — typo? wrong package name? +1. Check the import path: typo? wrong package name? 2. Verify the dependency exists in your package manifest -3. If missing: note it in your result as "requires install the missing dependency" — do NOT install yourself +3. If missing: note it in your result as "requires installing the missing dependency"; do NOT install yourself 4. If it's a local module: check the directory structure with `get_symbols_overview` 5.
If the path changed: use `search_for_pattern("class ClassName")` to find the new location @@ -21,7 +21,7 @@ Do NOT stop or ask for help until you have exhausted the playbook. **Symptoms**: test runner returns FAILED, assertion errors -1. Read the full error output — which test, which assertion, expected vs actual +1. Read the full error output: which test, which assertion, expected vs actual 2. `find_symbol("test_function_name")` to read the test code 3. Determine: is the test wrong or is the implementation wrong? - Test expects old behavior → update test @@ -36,7 +36,7 @@ Do NOT stop or ask for help until you have exhausted the playbook. **Symptoms**: Migration command fails, `IntegrityError`, duplicate column -1. Read the error — is it a conflict with existing migration? +1. Read the error; is it a conflict with existing migration? 2. Check current DB state: Check current migration state 3. If migration conflicts: Rollback one migration step then fix migration script 4. If schema mismatch: compare model with actual DB schema @@ -72,7 +72,7 @@ Do NOT stop or ask for help until you have exhausted the playbook. **Symptoms**: `429`, `RESOURCE_EXHAUSTED`, `rate limit exceeded` -1. **Stop immediately** — do not make additional API calls +1. **Stop immediately**; do not make additional API calls 2. Save current work to `progress-{agent-id}[-{sessionId}].md` 3. Record Status: `quota_exceeded` in `result-{agent-id}[-{sessionId}].md` 4. Specify remaining tasks so orchestrator can retry later @@ -95,4 +95,4 @@ Do NOT stop or ask for help until you have exhausted the playbook. 
- **After 3 failures**: If same approach fails 3 times, must try a different method - **Blocked**: If no progress after 5 turns, save current state and record `Status: blocked` in result -- **Out of scope**: If you find issues in another agent's domain, only record in result — do not modify directly +- **Out of scope**: If you find issues in another agent's domain, only record in result; do not modify directly diff --git a/.agents/skills/oma-backend/resources/execution-protocol.md b/.agents/skills/oma-backend/resources/execution-protocol.md index ac01a3f..c338b64 100644 --- a/.agents/skills/oma-backend/resources/execution-protocol.md +++ b/.agents/skills/oma-backend/resources/execution-protocol.md @@ -1,15 +1,15 @@ # Backend Agent - Execution Protocol ## Step 0: Prepare -1. **Assess difficulty** — see `../../_shared/core/difficulty-guide.md` +1. **Assess difficulty**: see `../../_shared/core/difficulty-guide.md` - **Simple**: Skip to Step 3 | **Medium**: All 4 steps | **Complex**: All steps + checkpoints -2. **Check lessons** — read your domain section in `../../_shared/core/lessons-learned.md` -3. **Clarify requirements** — follow `../../_shared/core/clarification-protocol.md` +2. **Check lessons**: read your domain section in `../../_shared/core/lessons-learned.md` +3. **Clarify requirements**: follow `../../_shared/core/clarification-protocol.md` - Check **Uncertainty Triggers**: business logic, security/auth, existing code conflicts? - Determine level: LOW → proceed | MEDIUM → present options | HIGH → ask immediately -4. **Budget context** — follow `../../_shared/core/context-budget.md` (read symbols, not whole files) +4. **Budget context**: follow `../../_shared/core/context-budget.md` (read symbols, not whole files) -**⚠️ Intelligent Escalation**: When uncertain, escalate early. Don't blindly proceed. +**Intelligent Escalation**: When uncertain, escalate early. Don't blindly proceed. Follow these steps in order (adjust depth by difficulty). 
diff --git a/.agents/skills/oma-coordination/SKILL.md b/.agents/skills/oma-coordination/SKILL.md index ffb7cf4..4510c2c 100644 --- a/.agents/skills/oma-coordination/SKILL.md +++ b/.agents/skills/oma-coordination/SKILL.md @@ -122,7 +122,7 @@ wait 4. QA review is always the final step 5. Assign separate workspaces to avoid file conflicts 6. Always use Serena MCP tools as the primary method for code exploration and modification -7. Never skip steps in the workflow — follow each step sequentially without omission +7. Never skip steps in the workflow; follow each step sequentially without omission ### Workflow diff --git a/.agents/skills/oma-db/resources/execution-protocol.md b/.agents/skills/oma-db/resources/execution-protocol.md index dc06c09..aad89df 100644 --- a/.agents/skills/oma-db/resources/execution-protocol.md +++ b/.agents/skills/oma-db/resources/execution-protocol.md @@ -1,14 +1,14 @@ # DB Agent - Execution Protocol ## Step 0: Prepare -1. **Assess difficulty** — see `../../_shared/core/difficulty-guide.md` +1. **Assess difficulty**: see `../../_shared/core/difficulty-guide.md` - **Simple**: small schema adjustment or index review - **Medium**: new bounded context, migration, or backup/capacity update - **Complex**: engine selection, major redesign, multi-tenant or high-scale workload 2. **Clarify workload** - Functional flows, critical queries, write/read ratio, peak TPS, retention, RPO, RTO - Compliance or audit constraints, PII, multi-region, reporting needs -3. **Budget context** — follow `../../_shared/core/context-budget.md` +3. **Budget context**: follow `../../_shared/core/context-budget.md` 4. **If vector search is involved**, read `resources/vector-db.md` 5. 
**If security, audit, backup, or resilience requirements are central**, read `resources/iso-controls.md` diff --git a/.agents/skills/oma-debug/resources/bug-report-template.md b/.agents/skills/oma-debug/resources/bug-report-template.md index 394292a..ebc9527 100644 --- a/.agents/skills/oma-debug/resources/bug-report-template.md +++ b/.agents/skills/oma-debug/resources/bug-report-template.md @@ -12,12 +12,12 @@ Save to: `.agents/results/bugs/bug-YYYYMMDD-[short-description].md` **Date Fixed**: YYYY-MM-DD (or "In Progress") **Reporter**: [User name or issue number] **Assignee**: [Agent that fixed it] -**Severity**: 🔴 CRITICAL | 🟠 HIGH | 🟡 MEDIUM | 🔵 LOW -**Status**: 🐛 OPEN | 🔧 IN PROGRESS | ✅ FIXED | ⏸️ ON HOLD | ❌ WON'T FIX +**Severity**: CRITICAL | HIGH | MEDIUM | LOW +**Status**: OPEN | IN PROGRESS | FIXED | ON HOLD | WON'T FIX --- -## 📝 Problem Description +## Problem Description **What happened?** [Clear description of the bug from user's perspective] @@ -32,7 +32,7 @@ Save to: `.agents/results/bugs/bug-YYYYMMDD-[short-description].md` --- -## 🔄 Reproduction Steps +## Reproduction Steps 1. Navigate to [page/route] 2. 
Click on [button/element] @@ -46,7 +46,7 @@ Save to: `.agents/results/bugs/bug-YYYYMMDD-[short-description].md` --- -## 🖼️ Evidence +## Evidence **Error Messages**: ``` @@ -77,7 +77,7 @@ Response: [relevant response data] --- -## 🌍 Environment +## Environment **Frontend**: - Browser: [Chrome 120 | Firefox 121 | Safari 17] @@ -97,7 +97,7 @@ Response: [relevant response data] --- -## 🔍 Investigation +## Investigation ### Initial Analysis @@ -130,7 +130,7 @@ const user = data.user.profile.name; // Crashes if profile is undefined --- -## 🔧 Solution +## Solution ### Fix Applied @@ -142,10 +142,10 @@ const user = data.user.profile.name; // Crashes if profile is undefined ```typescript // File: path/to/file.tsx (line 145) -// ❌ BEFORE (buggy code) +// BEFORE (buggy code) const user = data.user.profile.name; -// ✅ AFTER (fixed code) +// AFTER (fixed code) const user = data?.user?.profile?.name ?? 'Unknown'; ``` @@ -154,9 +154,9 @@ const user = data?.user?.profile?.name ?? 'Unknown'; ### Files Modified -- ✏️ `src/components/UserProfile.tsx` - Added null check for profile -- ✏️ `src/lib/api/users.ts` - Improved error handling -- ➕ `src/components/UserProfile.test.tsx` - Added regression test +- `src/components/UserProfile.tsx` - Added null check for profile +- `src/lib/api/users.ts` - Improved error handling +- `src/components/UserProfile.test.tsx` - Added regression test ### Migration/Deployment Notes @@ -167,7 +167,7 @@ const user = data?.user?.profile?.name ?? 'Unknown'; --- -## ✅ Verification +## Verification ### Testing Performed @@ -188,14 +188,14 @@ const user = data?.user?.profile?.name ?? 
'Unknown'; ### Test Results -**Unit Tests**: ✅ 15/15 passing -**Integration Tests**: ✅ 8/8 passing -**E2E Tests**: ✅ 3/3 passing -**Manual QA**: ✅ Verified on Chrome, Firefox, Safari +**Unit Tests**: 15/15 passing +**Integration Tests**: 8/8 passing +**E2E Tests**: 3/3 passing +**Manual QA**: Verified on Chrome, Firefox, Safari --- -## 📚 Prevention +## Prevention ### How to Avoid Similar Bugs @@ -208,20 +208,20 @@ const user = data?.user?.profile?.name ?? 'Unknown'; ### Code Patterns to Follow ```typescript -// ✅ GOOD: Safe access with fallback +// GOOD: Safe access with fallback const name = user?.profile?.name ?? 'Anonymous'; -// ✅ GOOD: Explicit null check +// GOOD: Explicit null check if (user?.profile) { const name = user.profile.name; } -// ✅ GOOD: Early return +// GOOD: Early return if (!user?.profile) { return
<div>No profile available</div>
; } -// ❌ BAD: Unsafe nested access +// BAD: Unsafe nested access const name = user.profile.name; // Crashes if profile undefined ``` @@ -233,7 +233,7 @@ const name = user.profile.name; // Crashes if profile undefined --- -## 🔗 Related +## Related **Similar Bugs**: - Bug #123: Similar null check issue in `CommentList` @@ -252,7 +252,7 @@ const name = user.profile.name; // Crashes if profile undefined --- -## 📊 Metrics +## Metrics **Time to Fix**: [2 hours | 1 day | 1 week] **Lines Changed**: [+15 -5] @@ -261,7 +261,7 @@ const name = user.profile.name; // Crashes if profile undefined --- -## 💬 Communication +## Communication **Notified**: - [x] Product Manager - Impact assessment @@ -277,7 +277,7 @@ const name = user.profile.name; // Crashes if profile undefined --- -## 🎓 Lessons Learned +## Lessons Learned **What went well**: - Quick identification of root cause @@ -297,7 +297,7 @@ const name = user.profile.name; // Crashes if profile undefined --- -## 🏷️ Tags +## Tags `frontend` `null-check` `crash` `typescript` `user-profile` `high-priority` diff --git a/.agents/skills/oma-debug/resources/common-patterns.md b/.agents/skills/oma-debug/resources/common-patterns.md index 9ee1195..80d6f19 100644 --- a/.agents/skills/oma-debug/resources/common-patterns.md +++ b/.agents/skills/oma-debug/resources/common-patterns.md @@ -4,18 +4,18 @@ Quick reference guide for frequently encountered bugs and their fixes. --- -## 🔴 Frontend Bugs +## Frontend Bugs ### 1. Undefined/Null Errors -**❌ Problem**: `Cannot read property 'X' of undefined` +**Problem**: `Cannot read property 'X' of undefined` ```typescript // Crash when data not loaded yet const name = user.profile.name; ``` -**✅ Solutions**: +**Solutions**: ```typescript // Option 1: Optional chaining + nullish coalescing @@ -32,7 +32,7 @@ if (!user?.profile) return
<div>Loading...</div>
; ### 2. Stale Closures in useEffect -**❌ Problem**: Event handlers/callbacks use old state values +**Problem**: Event handlers/callbacks use old state values ```typescript function Counter() { @@ -49,7 +49,7 @@ function Counter() { } ``` -**✅ Solutions**: +**Solutions**: ```typescript // Option 1: Include dependency @@ -87,7 +87,7 @@ useEffect(() => { ### 3. Missing Cleanup in useEffect -**❌ Problem**: Memory leaks from subscriptions/listeners +**Problem**: Memory leaks from subscriptions/listeners ```typescript useEffect(() => { @@ -96,7 +96,7 @@ useEffect(() => { }, []); ``` -**✅ Solution**: +**Solution**: ```typescript useEffect(() => { @@ -119,7 +119,7 @@ useEffect(() => { ### 4. Race Conditions in Async Effects -**❌ Problem**: Old requests overwrite new ones +**Problem**: Old requests overwrite new ones ```typescript useEffect(() => { @@ -128,7 +128,7 @@ useEffect(() => { }, [userId]); ``` -**✅ Solution**: +**Solution**: ```typescript useEffect(() => { @@ -150,7 +150,7 @@ useEffect(() => { ### 5. Infinite Re-render Loops -**❌ Problem**: Component re-renders infinitely +**Problem**: Component re-renders infinitely ```typescript function Component() { @@ -164,7 +164,7 @@ function Component() { } ``` -**✅ Solutions**: +**Solutions**: ```typescript // Option 1: Remove problematic dependency @@ -191,7 +191,7 @@ useEffect(() => { ### 6. Key Prop Issues in Lists -**❌ Problem**: List items reordering incorrectly +**Problem**: List items reordering incorrectly ```typescript // Using index as key @@ -200,7 +200,7 @@ useEffect(() => { ))} ``` -**✅ Solution**: +**Solution**: ```typescript // Use stable, unique ID @@ -218,7 +218,7 @@ useEffect(() => { ### 7. 
Form Input Controlled/Uncontrolled Switch -**❌ Problem**: `Warning: A component is changing an uncontrolled input to be controlled` +**Problem**: `Warning: A component is changing an uncontrolled input to be controlled` ```typescript const [value, setValue] = useState(); // undefined initially @@ -226,7 +226,7 @@ const [value, setValue] = useState(); // undefined initially setValue(e.target.value)} /> ``` -**✅ Solution**: +**Solution**: ```typescript // Initialize with empty string @@ -240,11 +240,11 @@ const [value, setValue] = useState(''); --- -## 🔴 Backend Bugs +## Backend Bugs ### 1. SQL Injection -**❌ Problem**: User input directly in SQL query +**Problem**: User input directly in SQL query ```python # DANGEROUS! @@ -254,7 +254,7 @@ db.execute(query) # User can input: ' OR '1'='1 ``` -**✅ Solution**: +**Solution**: ```python # Use parameterized queries @@ -272,7 +272,7 @@ user = db.query(User).filter(User.email == email).first() ### 2. N+1 Query Problem -**❌ Problem**: One query per item in a loop +**Problem**: One query per item in a loop ```python # 1 query to get todos @@ -284,7 +284,7 @@ for todo in todos: print(f"{todo.title} by {user.name}") ``` -**✅ Solution**: +**Solution**: ```python # Use JOIN - single query @@ -300,7 +300,7 @@ for todo in todos: ### 3. Missing Authentication Check -**❌ Problem**: Protected endpoint accessible without auth +**Problem**: Protected endpoint accessible without auth ```python @app.get("/api/admin/users") @@ -308,7 +308,7 @@ async def get_all_users(db: DatabaseDep): return db.query(User).all() # Anyone can access! ``` -**✅ Solution**: +**Solution**: ```python @app.get("/api/admin/users") @@ -325,7 +325,7 @@ async def get_all_users( ### 4. 
Missing Input Validation -**❌ Problem**: Invalid data causes errors +**Problem**: Invalid data causes errors ```python @app.post("/api/users") @@ -336,7 +336,7 @@ async def create_user(email: str, age: int): db.commit() ``` -**✅ Solution**: +**Solution**: ```python from pydantic import BaseModel, EmailStr, Field @@ -357,7 +357,7 @@ async def create_user(user: UserCreate): ### 5. Unhandled Exceptions -**❌ Problem**: Server crashes on error +**Problem**: Server crashes on error ```python @app.post("/api/todos") @@ -368,7 +368,7 @@ async def create_todo(todo: TodoCreate, user: User = Depends(get_current_user)): return db_todo ``` -**✅ Solution**: +**Solution**: ```python from fastapi import HTTPException @@ -394,14 +394,14 @@ async def create_todo(todo: TodoCreate, user: User = Depends(get_current_user)): ### 6. Missing CORS Configuration -**❌ Problem**: Frontend can't call API +**Problem**: Frontend can't call API ``` Access to fetch at 'http://localhost:8000/api/todos' from origin 'http://localhost:3000' has been blocked by CORS policy ``` -**✅ Solution**: +**Solution**: ```python from fastapi.middleware.cors import CORSMiddleware @@ -422,13 +422,13 @@ app.add_middleware( ### 7. Password Storage -**❌ Problem**: Passwords stored in plain text +**Problem**: Passwords stored in plain text ```python user = User(email=email, password=password) # NEVER DO THIS! ``` -**✅ Solution**: +**Solution**: ```python from passlib.context import CryptContext @@ -446,11 +446,11 @@ if not pwd_context.verify(plain_password, user.password_hash): --- -## 🔴 Mobile Bugs +## Mobile Bugs ### 1. Memory Leaks in Flutter -**❌ Problem**: Controllers not disposed +**Problem**: Controllers not disposed ```dart class MyWidget extends StatefulWidget { @@ -469,7 +469,7 @@ class _MyWidgetState extends State { } ``` -**✅ Solution**: +**Solution**: ```dart class _MyWidgetState extends State { @@ -492,7 +492,7 @@ class _MyWidgetState extends State { ### 2. 
Platform-Specific Code Not Checked -**❌ Problem**: iOS-specific code crashes on Android +**Problem**: iOS-specific code crashes on Android ```dart // Crashes on Android @@ -501,7 +501,7 @@ import 'dart:io' show Platform; final deviceName = Platform.isIOS ? 'iPhone' : 'Unknown'; ``` -**✅ Solution**: +**Solution**: ```dart import 'dart:io' show Platform; @@ -522,11 +522,11 @@ if (Platform.isIOS) { --- -## 🔴 Performance Bugs +## Performance Bugs ### 1. Unnecessary Re-renders (React) -**❌ Problem**: Component re-renders on every parent render +**Problem**: Component re-renders on every parent render ```typescript function Parent() { @@ -541,7 +541,7 @@ function Parent() { } ``` -**✅ Solution**: +**Solution**: ```typescript // Memoize the expensive component @@ -568,7 +568,7 @@ function Parent() { ### 2. Large Bundle Size -**❌ Problem**: Importing entire library +**Problem**: Importing entire library ```typescript // Imports all of lodash (~70KB) @@ -577,7 +577,7 @@ import _ from 'lodash'; const unique = _.uniq(array); ``` -**✅ Solution**: +**Solution**: ```typescript // Import only what you need @@ -591,18 +591,18 @@ const unique = [...new Set(array)]; --- -## 🔴 Security Bugs +## Security Bugs ### 1. XSS (Cross-Site Scripting) -**❌ Problem**: User input rendered as HTML +**Problem**: User input rendered as HTML ```typescript // Dangerous!
<div dangerouslySetInnerHTML={{ __html: userInput }} />
``` -**✅ Solution**: +**Solution**: ```typescript // React escapes by default @@ -620,7 +620,7 @@ import DOMPurify from 'dompurify'; ### 2. Missing Rate Limiting -**❌ Problem**: API can be abused +**Problem**: API can be abused ```python @app.post("/api/auth/login") @@ -629,7 +629,7 @@ async def login(credentials: LoginRequest): ... ``` -**✅ Solution**: +**Solution**: ```python from slowapi import Limiter @@ -645,7 +645,7 @@ async def login(request: Request, credentials: LoginRequest): --- -## 📊 Common Error Messages & Solutions +## Common Error Messages & Solutions | Error | Likely Cause | Solution | |-------|--------------|----------| @@ -661,7 +661,7 @@ async def login(request: Request, credentials: LoginRequest): --- -## 🎯 Quick Debugging Commands +## Quick Debugging Commands ### Frontend ```bash @@ -706,7 +706,7 @@ flutter build apk --analyze-size --- -## 🔍 When to Use Each Agent +## When to Use Each Agent | Bug Type | Best Agent | Reason | |----------|-----------|---------| @@ -719,7 +719,7 @@ flutter build apk --analyze-size --- -## 💡 Prevention Tips +## Prevention Tips 1. **Write tests first** - Catch bugs before they ship 2. **Use TypeScript** - Catch type errors at compile time diff --git a/.agents/skills/oma-debug/resources/debugging-checklist.md b/.agents/skills/oma-debug/resources/debugging-checklist.md index 935339a..8e4bb9d 100644 --- a/.agents/skills/oma-debug/resources/debugging-checklist.md +++ b/.agents/skills/oma-debug/resources/debugging-checklist.md @@ -2,7 +2,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. -## 📋 Initial Information Gathering +## Initial Information Gathering - [ ] **Bug description** - What is the expected vs actual behavior? - [ ] **Error messages** - Exact error text, stack trace, error codes @@ -13,7 +13,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. - [ ] **Recent changes** - New deploy? Code changes? Configuration updates? 
- [ ] **Screenshots/videos** - Visual evidence of the bug -## 🔍 Frontend Debugging +## Frontend Debugging ### JavaScript/TypeScript Errors @@ -72,7 +72,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. - [ ] Disabled buttons during load? - [ ] Optimistic updates causing issues? -## 🖥️ Backend Debugging +## Backend Debugging ### Python/FastAPI Errors @@ -132,7 +132,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. - [ ] External services reachable? - [ ] Database migrations applied? -## 📱 Mobile Debugging +## Mobile Debugging ### Platform-Specific Issues @@ -168,7 +168,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. - [ ] Large images not optimized? - [ ] Too many simultaneous network requests? -## 🔐 Security Bugs +## Security Bugs - [ ] **Authentication bypassed?** - [ ] Token validation on all protected routes? @@ -192,7 +192,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. - [ ] API keys exposed? - [ ] Error messages leaking info? -## 🐌 Performance Bugs +## Performance Bugs ### Frontend Performance @@ -226,7 +226,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. - [ ] Query plan optimized? (EXPLAIN) - [ ] N+1 queries eliminated? -## 🔬 Root Cause Analysis +## Root Cause Analysis - [ ] **Reproduce the bug** - [ ] Follow exact reproduction steps @@ -248,7 +248,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. - [ ] Why does this happen? - [ ] What assumption was wrong? -## ✅ Fix Verification +## Fix Verification - [ ] **Fix applied** - [ ] Code changed in correct file(s) @@ -272,7 +272,7 @@ Use this checklist when investigating bugs to ensure thorough analysis. 
- [ ] Fix explained - [ ] Prevention notes added -## 🚨 When to Escalate +## When to Escalate Escalate to other agents if: @@ -283,34 +283,34 @@ Escalate to other agents if: - [ ] Database schema changes needed → **Backend Agent** - [ ] Platform-specific mobile issue → **Mobile Agent** -## 📊 Priority Assessment +## Priority Assessment -**🔴 CRITICAL** - Fix immediately: +**CRITICAL** - Fix immediately: - [ ] App crashes on launch - [ ] Data loss or corruption - [ ] Security vulnerability - [ ] Payment/auth completely broken - [ ] Affects all users -**🟠 HIGH** - Fix within 24 hours: +**HIGH** - Fix within 24 hours: - [ ] Major feature broken - [ ] Affects >50% of users - [ ] No workaround available - [ ] Significant revenue impact -**🟡 MEDIUM** - Fix within sprint: +**MEDIUM** - Fix within sprint: - [ ] Minor feature broken - [ ] Affects <50% of users - [ ] Workaround exists - [ ] Moderate inconvenience -**🔵 LOW** - Schedule for future: +**LOW** - Schedule for future: - [ ] Edge case - [ ] Cosmetic issue - [ ] Rarely encountered - [ ] No user impact -## 📝 Documentation Template +## Documentation Template After fixing, document in `.agents/results/bugs/`: @@ -343,7 +343,7 @@ After fixing, document in `.agents/results/bugs/`: --- -## 💡 Pro Tips +## Pro Tips 1. **Read the error message** - It usually tells you exactly what's wrong 2. **Reproduce first** - Don't waste time fixing unconfirmed bugs @@ -352,7 +352,7 @@ After fixing, document in `.agents/results/bugs/`: 5. **Document everything** - Future you will be grateful 6. 
**Look for patterns** - One bug often reveals more -## 🛠️ Tools Reference +## Tools Reference - **Browser DevTools**: F12 (Console, Network, React DevTools) - **Serena MCP**: find_symbol, search_for_pattern, find_referencing_symbols diff --git a/.agents/skills/oma-debug/resources/error-playbook.md b/.agents/skills/oma-debug/resources/error-playbook.md index 1174ecc..d96bb1a 100644 --- a/.agents/skills/oma-debug/resources/error-playbook.md +++ b/.agents/skills/oma-debug/resources/error-playbook.md @@ -9,7 +9,7 @@ Do NOT stop or ask for help until you have exhausted the playbook. **Symptoms**: Bug described by user but you can't trigger it -1. Re-read user's reproduction steps — are you following them exactly? +1. Re-read user's reproduction steps; are you following them exactly? 2. Check environment differences: browser, OS, node/python version 3. Check data-dependent: does it need specific DB state or test data? 4. Check timing: is it a race condition? Try adding delays or rapid repetition @@ -22,9 +22,9 @@ Do NOT stop or ask for help until you have exhausted the playbook. **Symptoms**: Original bug fixed but other tests break -1. Read the failing tests — are they testing the old (buggy) behavior? +1. Read the failing tests; are they testing the old (buggy) behavior? 2. If yes: update tests to reflect correct behavior -3. If no: your fix has side effects — revert and try a more targeted approach +3. If no: your fix has side effects. Revert and try a more targeted approach 4. `find_referencing_symbols("fixedFunction")` to check all callers 5. Consider: is the function contract changing? If so, update all callers @@ -37,7 +37,7 @@ Do NOT stop or ask for help until you have exhausted the playbook. 1. Add logging at each step of the execution path 2. Binary search: is the bug before or after the midpoint? 3. `search_for_pattern("suspicious_pattern")` to find related code -4. Check git history: `git log --oneline -20 -- path/to/file` — when was it last changed? +4. 
Check git history: `git log --oneline -20 -- path/to/file`. When was it last changed? 5. Check: is it a dependency issue? Library version mismatch? 6. **No progress after 5 turns**: Record current analysis in progress, switch to different hypothesis @@ -53,7 +53,7 @@ Do NOT stop or ask for help until you have exhausted the playbook. - What the correct behavior should be - Evidence (request/response logs, stack trace) 3. Record in result: `cross_domain_issue: {agent: "backend", description: "..."}` -4. **Do NOT modify directly** — touching another agent's code causes conflicts +4. **Do NOT modify directly**; touching another agent's code causes conflicts --- @@ -65,7 +65,7 @@ Do NOT stop or ask for help until you have exhausted the playbook. 2. Backend: enable SQL query logging, count queries, check `EXPLAIN ANALYZE` 3. Frontend: run Lighthouse, check React DevTools Profiler 4. Mobile: use Flutter DevTools performance tab -5. Profile before fixing — never optimize without data +5. Profile before fixing; never optimize without data --- @@ -91,4 +91,4 @@ Same as backend-agent playbook: See relevant sections. - **After 3 failures**: If same approach fails 3 times, must try a different method - **Blocked**: If no progress after 5 turns, save current state, `Status: blocked` -- **Out of scope**: Other agent's domain — only record, do not modify directly +- **Out of scope**: Other agent's domain. Only record, do not modify directly diff --git a/.agents/skills/oma-debug/resources/execution-protocol.md b/.agents/skills/oma-debug/resources/execution-protocol.md index 777995c..ff9bf85 100644 --- a/.agents/skills/oma-debug/resources/execution-protocol.md +++ b/.agents/skills/oma-debug/resources/execution-protocol.md @@ -1,16 +1,16 @@ # Debug Agent - Execution Protocol ## Step 0: Prepare -1. **Assess difficulty** — see `../../_shared/core/difficulty-guide.md` +1. 
**Assess difficulty**: see `../../_shared/core/difficulty-guide.md` - **Simple**: Skip to Step 3 | **Medium**: All 4 steps | **Complex**: All steps + checkpoints -2. **Check lessons** — read your domain section in `../../_shared/core/lessons-learned.md` -3. **Clarify requirements** — follow `../../_shared/core/clarification-protocol.md` +2. **Check lessons**: read your domain section in `../../_shared/core/lessons-learned.md` +3. **Clarify requirements**: follow `../../_shared/core/clarification-protocol.md` - Check **Uncertainty Triggers**: security/auth related bugs, existing code conflict potential? - Determine level: LOW → proceed | MEDIUM → present options | HIGH → ask immediately -4. **Use reasoning templates** — for Complex bugs, use `../../_shared/core/reasoning-templates.md` (hypothesis loop, execution trace) -5. **Budget context** — follow `../../_shared/core/context-budget.md` (use find_symbol, not read_file) +4. **Use reasoning templates**: for Complex bugs, use `../../_shared/core/reasoning-templates.md` (hypothesis loop, execution trace) +5. **Budget context**: follow `../../_shared/core/context-budget.md` (use find_symbol, not read_file) -**⚠️ Intelligent Escalation**: When uncertain, escalate early. Don't blindly proceed. +**Intelligent Escalation**: When uncertain, escalate early. Don't blindly proceed. Follow these steps in order (adjust depth by difficulty). diff --git a/.agents/skills/oma-deepsec/SKILL.md b/.agents/skills/oma-deepsec/SKILL.md new file mode 100644 index 0000000..b14d996 --- /dev/null +++ b/.agents/skills/oma-deepsec/SKILL.md @@ -0,0 +1,247 @@ +--- +name: oma-deepsec +description: > + Drive Vercel's `deepsec` agent-powered vulnerability scanner end-to-end: + installing the `.deepsec/` workspace, bootstrapping `INFO.md`, running + cost-aware `scan` / `process` / `triage` / `revalidate` / `export` passes, + gating PRs with `process --diff`, writing custom matchers, and triaging + findings. 
Use whenever the user mentions deepsec, asks an agent to scan a + repo for vulnerabilities, runs into `pnpm deepsec` / `bunx deepsec` + commands, wants a CI-based PR security review, sees a `.deepsec/` + directory, or asks about `INFO.md` / matchers / `process --diff` / + `revalidate`, even when the tool name is not spoken. Deepsec scans are + expensive (a single full scan can cost hundreds to tens of thousands of + dollars) so the skill exists in part to keep the user from getting + surprised. +--- + +# Deepsec: Agent-Powered Vulnerability Scanner Driver + +## Scheduling + +### Goal +Operate Vercel's `deepsec` security scanner inside a target repository safely and cost-consciously: bootstrap the `.deepsec/` workspace, write a tight `INFO.md`, run the right scan/process/triage/revalidate/export sequence, gate PRs in CI via `process --diff`, and grow project-specific matchers, surfacing real, revalidated findings without runaway spend. + +### Intent signature +- User mentions `deepsec`, "deep security scan", `bunx deepsec`, `pnpm deepsec`, `npx deepsec`. +- User asks an agent to scan a repository for vulnerabilities, security issues, or CVEs and the project has (or should have) a `.deepsec/` directory. +- User asks how to add a deepsec PR / CI security gate, or about `process --diff`, `--diff-staged`, `--diff-working`, `--files-from`, `--comment-out`. +- User mentions deepsec artefacts: `INFO.md`, `SETUP.md`, `data//files/`, `FileRecord`, `RunMeta`, `revalidation`, `triage`, custom matchers, `MatcherPlugin`, `noiseTier`, `priorityPaths`. +- User asks about deepsec configuration: `deepsec.config.ts`, `defaultAgent`, `AI_GATEWAY_API_KEY`, `VERCEL_OIDC_TOKEN`, AI Gateway, Vercel Sandbox, `--agent codex`, `--agent claude`. +- User asks how to lower deepsec cost, cut false-positive rate, or interpret severity / triage / revalidation verdicts. + +### When to use +- First-time deepsec install in a repo (`init`, `INFO.md` write, first calibration scan). 
+- Running a full or scoped scan and processing findings. +- Setting up a per-PR CI gate with `process --diff` and `--comment-out`. +- Writing a project-specific matcher to cover entry points the default set misses. +- Triaging a backlog of findings (severity bucketing, FP cuts via `revalidate`, exporting to issue tracker). +- Diagnosing deepsec failures: missing credentials, AI Gateway quota stops, refusals, sandbox auth. + +### When NOT to use +- Generic OWASP / lint-style review without deepsec → use `oma-qa`. +- Generic CVE / dependency advisories → use `oma-qa` or `oma-search`. +- Architecting a brand-new SAST pipeline that is not deepsec → use `oma-architecture`. +- Writing or auditing application code itself → route to `oma-backend` / `oma-frontend` / `oma-mobile`. +- Cloud / IAM / Terraform hardening → use `oma-tf-infra` (deepsec only scans the IaC; remediation lives there). +- Pure reasoning about a finding's fix in product code → use `oma-debug` once deepsec has produced the finding. + +### Expected inputs +- `target_repo_root`: absolute path of the codebase to scan (parent of `.deepsec/`). +- `intent`: one of `setup` | `scan` | `pr-review` | `matchers` | `triage` | `config` | `troubleshoot`. +- `credential_mode`: `ai-gateway-key` | `vercel-oidc` | `direct-anthropic` | `direct-openai` | `subscription`. +- `agent_choice`: `claude` (default `claude-opus-4-7`) or `codex` (default `gpt-5.5`). Asked once before the first paid call if not already provided. +- `severity_floor`: lowest severity worth surfacing (typically `HIGH`). +- Optional: existing `.deepsec/data//`, `deepsec.config.ts`, custom matchers, CI provider. + +### Expected outputs +- A working `.deepsec/` workspace registered against the target repo. +- A populated `data//INFO.md` (50-100 lines, project-specific, no line numbers). +- One or more completed `scan` → `process` (→ `triage`/`revalidate`) runs with reproducible cost notes. 
+- For PR mode: a CI workflow file using `process --diff ` with two-job split (no PR-write in PR-code job). +- For matchers: new `.deepsec/matchers/.ts` files wired through the inline plugin in `deepsec.config.ts`. +- A findings export (`md-dir` and/or `json`) plus a short summary of top severities and FP-rate notes. +- Explicit, dollar-and-time-bounded plan before any pass that may cost more than ~$25. + +### Dependencies +- Node.js **22+**, plus a package manager: `bun` / `bunx` (preferred in this monorepo), `pnpm`, `npm`, or `yarn`. +- A working AI credential: `AI_GATEWAY_API_KEY=vck_…`, or `VERCEL_OIDC_TOKEN`, or direct `ANTHROPIC_AUTH_TOKEN` + `ANTHROPIC_BASE_URL`, or a logged-in `claude` / `codex` CLI subscription. +- Git (history is consulted by `revalidate` and `--diff` modes). +- Optional: Vercel Sandbox auth for `deepsec sandbox …` distributed runs. +- Reference resources under `resources/` (loaded only when the scenario requires them). + +### Control-flow features +- Branches by `intent` (setup vs scan vs pr-review vs matchers vs triage vs config vs troubleshoot). +- Branches by repo size (calibrate with `--limit 50` before any large pass). +- Branches by credential source (gateway key, OIDC, direct, subscription). +- Stops on quota / credit exhaustion and resumes the same command after top-up. +- Refuses to launch an unbounded `process` when no calibration has been done and the repo is large. +- Reads codebase, writes `.deepsec/` files and CI configs, runs long-lived AI processes. + +## Structural Flow + +### Entry +1. Confirm whether `.deepsec/` already exists; if yes, treat the run as **incremental**, never re-init. +2. Resolve `intent` from the user prompt; if ambiguous (e.g. "scan this repo"), default to `setup` then `scan` (calibration mode). +3. Estimate scale: count source files (rough `rg --files | wc -l` excluding `node_modules`, `.git`, `dist`) to forecast cost before any AI pass. +4. 
Check for an AI credential in `.env.local` or shell env; if none, route to credential setup before any `process` / `revalidate` / `triage` call. +5. **Confirm agent choice with the user before the first paid call.** If `agent_choice` is not already in the prompt and `deepsec.config.ts` does not pin a `defaultAgent`, ask whether to run `claude` (`claude-opus-4-7`, the default; strongest reasoning, most expensive) or `codex` (`gpt-5.5`; runs in a strict sandbox, cheaper, grep-heavy). The two backends can be mixed via `--reinvestigate` and findings dedupe across agents. Skip the question if the user has already named an agent or has explicitly delegated the decision ("just pick reasonable defaults"). + +### Scenes +1. **PREPARE**: Resolve intent, repo root, credential, budget cap, severity floor, agent choice. Refuse to run blind on a repo of unknown scale. +2. **ACQUIRE**: Read `.deepsec/deepsec.config.ts`, `data//project.json`, `INFO.md`, last `runs/` entries, and target-repo signals (`README`, `AGENTS.md`/`CLAUDE.md`, framework configs, route directories) needed to author or verify `INFO.md`. +3. **REASON**: Pick the smallest pass that answers the user's question. Options include `scan` only, a `--limit 50` calibration, a full `process`, `process --diff`, a matcher-authoring loop, or troubleshoot-only. Always state cost forecast and stopping condition before AI passes. +4. **ACT**: Run the planned commands from inside `.deepsec/`. For matchers, write per-slug files and wire the inline plugin. For PR mode, scaffold the two-job CI workflow. +5. **VERIFY**: Use `deepsec status`, the run's `RunMeta`, exit code (`0` clean, `1` findings produced, other = error), candidate counts, and (when present) the `--comment-out` markdown to confirm output. +6. 
**FINALIZE**: Summarize findings by severity and verdict, list dollar cost and wall time, name files written, and call out follow-ups (revalidate `HIGH+`, write matchers for missed entry points, persist `data/` between CI runs). + +### Transitions +- If `.deepsec/` is missing and intent involves scanning → run `bunx deepsec init` (or `npx deepsec init`) and follow the printed prompt to populate `INFO.md` before any AI pass. +- If `INFO.md` is empty or template-shaped → write it (50-100 lines, project-specific, 3-5 examples per section, no line numbers, no generic CWE enumeration). +- If repo is > 500 files and no calibration has run → run a calibration pass first (deepsec docs recommend `--limit 50 --concurrency 5`) and report cost extrapolation before the full pass. +- If a `process` / `revalidate` run halts on quota → leave file locks intact, surface the exact remediation URL, **re-run the same command after top-up**. +- If the agent reports a refusal (`refused: true`) → never silently drop; document the affected files and either retry with the other backend or add the path to `config.json:ignorePaths` only if reproducible. +- If the user wants a CI gate → emit the two-job pattern (PR-code job has no `pull-requests: write`, comment job has no PR code). +- If the user wants more matcher coverage → run the matcher-authoring workflow against `data//files/` and the parent repo's entry points. + +### Failure and recovery +| Failure | Recovery | +|---------|----------| +| `Missing AI credentials for --agent claude` / `codex` | Pick a credential mode (gateway key / OIDC / direct / subscription) per `resources/config.md` and write `.env.local`. | +| `401 Unauthorized` from gateway | OIDC: re-run `vercel env pull` (12 h expiry). API key: regenerate. Confirm `.env.local` is in the cwd deepsec runs from. | +| `Stopped: AI Gateway credits exhausted` | Top up via the printed URL; re-run the same command, files already done are skipped. 
| +| `Stopped: Claude Pro/Max subscription exhausted` | Switch to AI Gateway; subscriptions can't sustain full scans. | +| Persistent refusal on a single file (>5% of batches) | Add the path to `data/<project-id>/config.json:ignorePaths`, or run that file alone with `--batch-size 1`. | +| FP rate too high on `HIGH+` | Run `revalidate --min-severity HIGH`; tighten `INFO.md`'s threat model and FP notes; bias matchers to `precise`. | +| `noisy` matcher wedges scanner on a 100k-file repo | Tighten `filePatterns` to language- or directory-anchored globs. | +| Sandbox auth fails | OIDC: re-run `vercel env pull`. Access-token mode: verify `VERCEL_TOKEN` + `VERCEL_TEAM_ID` + `VERCEL_PROJECT_ID`. | +| User asks for full scan with no budget context | Halt; report file count and forecast cost band; require explicit go-ahead before the full pass. | + +### Exit +- **Success**: planned passes ran, findings exist with verdicts (or no findings produced), files written are listed, residual cost / follow-ups are explicit. +- **Partial success**: some passes blocked on credentials/quota/refusal; the blocker, the safe-resume command, and the recommended next step are reported. +- **Failure**: nothing destructive happened, and the user has the exact next command to unblock the work. 
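The exit-code contract from the VERIFY step (`0` clean, `1` findings produced, anything else a runtime error) can be wrapped in a small helper for scripts. A sketch only — the function name is illustrative, not part of the deepsec CLI:

```bash
# classify_deepsec_exit CODE — map a deepsec exit status to a verdict.
# Semantics per the VERIFY step: 0 = clean, 1 = findings produced, other = runtime error.
classify_deepsec_exit() {
  case "$1" in
    0) echo "clean" ;;
    1) echo "findings" ;;
    *) echo "error" ;;
  esac
}

# Typical wrapper usage (deepsec invocation shown for context only):
#   bunx deepsec process --diff origin/main; classify_deepsec_exit "$?"
```

This keeps "findings exist" (exit `1`, a valid gating outcome) distinct from real failures when reporting success / partial success / failure.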
+ +## Logical Operations + +### Actions +| Action | SSL primitive | Evidence | +|--------|---------------|----------| +| Detect existing workspace and credentials | `READ` | `.deepsec/`, `.env.local`, env vars | +| Estimate repo scale | `INFER` | `rg --files \| wc -l` | +| Choose pass plan (calibrate vs full vs diff) | `SELECT` | File count, intent, budget cap | +| Init workspace | `CALL_TOOL` | `bunx deepsec init` | +| Write `INFO.md` | `WRITE` | `data/<project-id>/INFO.md` | +| Run scan | `CALL_TOOL` | `bunx deepsec scan` | +| Run AI investigation | `CALL_TOOL` | `bunx deepsec process` (`--limit`, `--concurrency`) | +| Triage / revalidate | `CALL_TOOL` | `bunx deepsec triage` / `revalidate --min-severity HIGH` | +| Export findings | `CALL_TOOL` | `bunx deepsec export --format md-dir\|json` | +| PR-mode review | `CALL_TOOL` | `bunx deepsec process --diff --comment-out comment.md` | +| Author custom matcher | `WRITE` | `.deepsec/matchers/<slug>.ts` + inline plugin in `deepsec.config.ts` | +| Validate matcher hit rate | `VALIDATE` | `bunx deepsec scan --matchers <slug>` candidate count | +| Verify and report | `NOTIFY` | `RunMeta`, severity counts, dollar cost, FP rate | +| Stop on budget breach | `TERMINATE` | Refuse unbounded `process` without calibration | + +### Tools and instruments +- **Package manager**: `bun` / `bunx` (preferred); `pnpm`, `npm`, and `yarn` are interchangeable. +- **CLI commands**: `deepsec init`, `init-project`, `scan`, `process`, `process --diff`, `triage`, `revalidate`, `enrich`, `report`, `export`, `metrics`, `status`, `sandbox <command>`. +- **Diff sources for PR mode**: `--diff <ref>`, `--diff-staged`, `--diff-working`, `--files <list>`, `--files-from <file>` (or `-` for stdin). +- **Inspection**: `jq` over `data/<project-id>/files/**/*.json` for ad-hoc severity / TP queries. +- **Credentials**: `AI_GATEWAY_API_KEY`, `VERCEL_OIDC_TOKEN`, `ANTHROPIC_AUTH_TOKEN` / `ANTHROPIC_BASE_URL`, `OPENAI_API_KEY` / `OPENAI_BASE_URL`, `claude login`, `codex login`. 
+- **Resource files** under `resources/` for setup, scanning, PR review, matchers, triage, and config; load on demand. + +### Canonical workflow path +1. **Bootstrap** (one time per repo): + ```bash + cd <repo-root> + bunx deepsec init + cd .deepsec + bun install + # Edit .env.local: set AI_GATEWAY_API_KEY=vck_… (or VERCEL_OIDC_TOKEN via `vercel env pull`) + ``` + Then prompt the coding agent (this skill) to read + `.deepsec/node_modules/deepsec/SKILL.md` and `.deepsec/data/<project-id>/SETUP.md`, + skim `README` / `AGENTS.md` / `CLAUDE.md` and a handful of representative + files, and replace each section of `data/<project-id>/INFO.md` (50-100 lines, + 3-5 examples per section, no line numbers, no generic CWE rehash). +2. **Calibrate before any full pass.** The deepsec docs (`getting-started.md`, `vercel-setup.md`, `faq.md`) recommend `--limit 50 --concurrency 5` as the calibration starting point. + ```bash + bunx deepsec scan + bunx deepsec status + bunx deepsec process --limit 50 --concurrency 5 + ``` + Read the per-batch cost. Extrapolate to the full repo. Get the user's explicit go-ahead before the full `process`. If the user names different `--limit` / `--concurrency` values, use theirs. +3. **Full investigation, triage, revalidate, export**: + ```bash + bunx deepsec process --concurrency 5 + bunx deepsec triage --severity HIGH + bunx deepsec revalidate --min-severity HIGH + bunx deepsec export --format md-dir --out ./findings + bunx deepsec metrics + ``` +4. **PR mode** (CI gate, scoped to changed files, exit code = 0/1): + ```bash + bunx deepsec process \ + --diff origin/${BASE_REF} \ + --comment-out comment.md + ``` + Wire the two-job CI pattern from `resources/pr-review.md`. Never grant `pull-requests: write` to the job that runs PR-controlled code. +5. **Custom matchers** (close entry-point gaps surfaced in step 3): + - Read the contract in `.deepsec/node_modules/deepsec/dist/config.d.ts` and the `samples/webapp/matchers/*` examples. 
+ - Write `.deepsec/matchers/<slug>.ts`, wire it through the inline plugin in `.deepsec/deepsec.config.ts`. + - Verify hit rate: `bunx deepsec scan --matchers <slug>` should land in 1-20 hits / 1k files (`precise`), 5-100 (`normal`), or roughly the framework entry-point count (`noisy`). +6. **Resume** after any quota stop, network blip, or Ctrl-C: re-run the same command. State is on disk under `.deepsec/data/<project-id>/`. + +### Resource scope +| Scope | Resource target | +|-------|-----------------| +| `CODEBASE` | Target repo source files, framework configs, route directories, `README` / `AGENTS.md` / `CLAUDE.md`. | +| `LOCAL_FS` | `.deepsec/deepsec.config.ts`, `.deepsec/.env.local`, `.deepsec/matchers/`, `.deepsec/data/<project-id>/{project.json,INFO.md,config.json,files/,runs/,reports/}`, generated `findings/`, `comment.md`, CI workflow files. | +| `PROCESS` | `bunx deepsec scan\|process\|triage\|revalidate\|export\|metrics\|status\|sandbox`, `bun install`, optional `vercel link` / `vercel env pull`. | +| `NETWORK` | Anthropic / OpenAI via Vercel AI Gateway (default) or direct provider endpoints; optional Vercel Sandbox microVM control plane. | +| `CREDENTIALS` | `AI_GATEWAY_API_KEY`, `VERCEL_OIDC_TOKEN`, `ANTHROPIC_AUTH_TOKEN`, `OPENAI_API_KEY`, `VERCEL_TOKEN` / `VERCEL_TEAM_ID` / `VERCEL_PROJECT_ID`, `claude` / `codex` subscription tokens. Consume read-only; never echo secrets back to the user or commit them. | +| `MEMORY` | User-stated budget cap, severity floor, and stop conditions for the current session. | + +### Preconditions +- Node.js 22+ is available. +- Repo is a git checkout (deepsec uses git history for `revalidate` and `--diff`). +- For any AI command: at least one credential mode is configured *before* the call, or the call is held until one is. +- For `sandbox` mode: Vercel auth is wired; otherwise stay local. +- For unbounded `process` runs on > 500-file repos: a `--limit` calibration pass has produced a cost number the user has acknowledged. 
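The last precondition can be mechanized. A sketch only — the helper name and the commented `rg` pipeline are illustrative; the 500-file threshold comes from the transitions and guardrails in this document:

```bash
# needs_calibration FILE_COUNT — true when an unbounded `process` run must be
# preceded by a `--limit 50 --concurrency 5` calibration pass (> 500 files).
needs_calibration() { [ "${1:-0}" -gt 500 ]; }

# Preflight sketch — measure scale the same way as "Estimate repo scale" above:
#   file_count=$(rg --files | wc -l)
#   if needs_calibration "$file_count"; then
#     echo "calibrate first: bunx deepsec process --limit 50 --concurrency 5"
#   fi
```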
+ +### Effects and side effects +- Creates `.deepsec/` (config, lockfile, scaffolding) and `.deepsec/data//` (gitignored) inside the target repo. +- Writes `.env.local` (never commit) and may run `vercel link` / `vercel env pull` (writes `.vercel/project.json` + token). +- Spawns long-running AI processes that **cost real money**. Single full scans range from $25 to over $1,200 per the official cost guide and can climb to tens of thousands on very large repos. +- Reads source code; sends snippets to the configured LLM (gateway = zero retention; direct provider = subject to that provider's policy). Never exfiltrates secrets; the gateway key stays outside the worker sandbox in `sandbox` mode. +- May write `.github/workflows/deepsec.yml` (or analogue) when the user asks for a CI gate. +- Edits `deepsec.config.ts` and adds `.deepsec/matchers/*.ts` when authoring matchers. +- Does not commit, push, or open PRs unless the user explicitly authorizes a separate commit step (route via `oma-scm`). + +### Guardrails +1. **Never launch an unbounded `process` on a repo whose size you have not measured.** Always run a calibration pass first when file count is unknown or > 500 (deepsec docs recommend `--limit 50 --concurrency 5`; defer to a user-named value if given). +2. **State cost and stopping condition before any AI pass.** Use the published bands (100 files ≈ $25-60, 500 ≈ $130-300, 2,000 ≈ $500-1,200; ×2-3 swing). +3. **Resume, do not reset.** After any network / quota / Ctrl-C interruption, re-run the same command. Never delete `data//` to "start clean" without explicit user instruction. +4. **`INFO.md` stays short and project-specific.** 50-100 lines, 3-5 examples per section. Name primitives but no line numbers. Skip generic CWE categories; built-in matchers cover those. +5. **For PR/CI gates, keep PR-controlled code in a no-write job.** Never grant `pull-requests: write` to a job that executes PR-controlled `pnpm install` / config-loading. 
Use the two-job pattern in `resources/pr-review.md`. +6. **Pin actions to full SHAs** in production CI; major-version tags are for examples only. +7. **Never silently drop refusals.** If the agent reports `refused: true`, log it, retry with the other backend, or add the file to `ignorePaths` only when reproducible. +8. **Bias matchers toward `precise` when the bug shape is exact.** Reserve `noisy` for entry-point coverage and tight globs. +9. **Never echo or commit credentials** (`vck_…`, `sk-ant-…`, `sk-…`, OIDC tokens). Treat `.env.local` as secret. Treat `data/` as gitignored by default. +10. **Treat deepsec like an agent with shell access.** Recommend `sandbox` for prompt-injection-prone repos (vendored code, untrusted deps). +11. **Findings need verdicts.** For any HIGH+ surfaced to the user, prefer `revalidate`-tagged verdicts (`true-positive` / `false-positive` / `fixed` / `uncertain`) over raw `process` output. +12. **Do not invent CLI flags.** Anything beyond `resources/scanning.md`'s flag list must be checked against `--help` first. +13. **Ask agent choice before the first paid call.** If the user has not named an agent (`claude` vs `codex`) and `deepsec.config.ts` does not pin `defaultAgent`, ask once with the trade-off clearly stated. Do not also bargain over budget or severity; those are handled via the upstream calibration recommendation (`--limit 50 --concurrency 5` per deepsec docs) and the user-stated `severity_floor`. 
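The published bands in guardrail 2 work out to roughly $0.25-0.60 per file, so the "state cost before any AI pass" step can be a one-liner. A sketch using integer dollars (illustrative only — real runs swing 2-3×, so report an order of magnitude, never a quote):

```bash
# forecast_cost_band FILE_COUNT — print "low high" full-scan USD band,
# linearly extrapolated from the published anchors (100 files ~= $25-60,
# 2,000 ~= $500-1,200, i.e. roughly $0.25-0.60 per file).
forecast_cost_band() {
  echo "$(( $1 * 25 / 100 )) $(( $1 * 60 / 100 ))"
}

# e.g. forecast_cost_band 1200 prints "300 720" — before the 2-3x swing
```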
+ +## References +- Workspace install + `INFO.md` bootstrap: `resources/setup.md` +- Full scan/process/triage/revalidate/export workflow + cost guide: `resources/scanning.md` +- PR / CI gate via `process --diff` (two-job pattern, exit-code semantics): `resources/pr-review.md` +- Authoring custom matchers (slugs, noise tiers, file globs, plugin wiring): `resources/matchers.md` +- Reading findings, severities, triage / revalidation verdicts, FP cuts: `resources/triage.md` +- `deepsec.config.ts` reference, env vars, plugin order, AI Gateway / Vercel Sandbox auth: `resources/config.md` +- Upstream docs (load only when a resource file points at one): + - Repo + README: https://github.com/vercel-labs/deepsec + - Per-topic docs at https://github.com/vercel-labs/deepsec/tree/main/docs (`getting-started`, `reviewing-changes`, `writing-matchers`, `configuration`, `models`, `plugins`, `architecture`, `data-layout`, `vercel-setup`, `supported-tech`, `faq`) +- Shared context loading: `../_shared/core/context-loading.md` +- Shared quality principles: `../_shared/core/quality-principles.md` diff --git a/.agents/skills/oma-deepsec/resources/config.md b/.agents/skills/oma-deepsec/resources/config.md new file mode 100644 index 0000000..8c51756 --- /dev/null +++ b/.agents/skills/oma-deepsec/resources/config.md @@ -0,0 +1,202 @@ +# Configuration: `deepsec.config.ts`, env vars, plugins, models + +deepsec reads `deepsec.config.{ts,mjs,js,cjs}` from the current working directory, walking up. The CLI inherits whatever the file declares. 
+ +```ts +import { defineConfig } from "deepsec/config"; +import myPlugin from "@my-org/deepsec-plugin-foo"; + +export default defineConfig({ + projects: [ + { id: "my-app", root: "../my-app" }, + { id: "service", root: "../service", + githubUrl: "https://github.com/me/service/blob/main" }, + ], + plugins: [myPlugin()], +}); +``` + +For a fully-worked example exercising every common field (`infoMarkdown`, `promptAppend`, `priorityPaths`, an inline plugin), see `samples/webapp/deepsec.config.ts` in the deepsec repo. + +## Top-level fields + +| Field | Type | Purpose | +|---|---|---| +| `projects` | `ProjectDeclaration[]` | Codebases deepsec knows about. | +| `plugins` | `DeepsecPlugin[]` | Loaded in order; later plugins override single-slot capabilities. | +| `matchers` | `{ only?: string[]; exclude?: string[] }` | Filter the matcher set used by `scan`. | +| `defaultAgent` | `"claude" \| "codex"` | Default `--agent` value. | +| `dataDir` | `string` | Override the `data/` directory. Defaults to `./data`. | + +## `ProjectDeclaration` + +| Field | Type | Required | Purpose | +|---|---|---|---| +| `id` | `string` | yes | Used as `--project-id` and the data directory name. | +| `root` | `string` | yes | Absolute or relative path to the codebase. | +| `githubUrl` | `string` | no | `https://github.com/owner/repo/blob/branch` for clickable links in exports. Auto-detected from `git remote` if omitted. | +| `infoMarkdown` | `string` | no | Repo context injected into AI prompts. Overrides `data/<project-id>/INFO.md` if both are set. | +| `promptAppend` | `string` | no | Free-form text appended to the system prompt for this project. | +| `priorityPaths` | `string[]` | no | Path prefixes to process first. | + +## Per-project `data/<project-id>/config.json` + +Optional, read by `scan` and the AI agents. Overrides the same fields on the project declaration if both are present. 
+ +```json +{ + "priorityPaths": ["app/api/", "lib/"], + "promptAppend": "Pay extra attention to the booking flow.", + "ignorePaths": ["**/legacy/**"] +} +``` + +## Matcher filtering + +```ts +matchers: { + only: ["sql-injection", "auth-bypass"], // run *only* these + exclude: ["framework-internal-header"], // skip these +} +``` + +If `only` is set, `exclude` is ignored. The CLI flag `--matchers <slugs>` overrides the config when both are present. + +## Plugin order + +Plugins are evaluated in array order: + +```ts +plugins: [genericPlugin(), orgPlugin()] +``` + +| Slot | Behavior | +|---|---| +| `matchers`, `notifiers`, `agents` | **Additive.** Both plugins' contributions stack. | +| `ownership`, `people`, `executor` | **Last-write-wins.** `orgPlugin()`'s provider replaces `genericPlugin()`'s. | + +A monorepo gating example: + +```ts +const projectId = process.argv[process.argv.indexOf("--project-id") + 1]; +const isInternal = projectId?.startsWith("internal-") ?? false; + +export default defineConfig({ + projects: [ + { id: "internal-api", root: "../api" }, + { id: "open-source-app", root: "../app" }, + ], + plugins: isInternal ? [orgPlugin()] : [], +}); +``` + +The config file is real TypeScript. Any logic at module-load time works. + +## Plugin slots + +| Slot | Purpose | +|---|---| +| `matchers` | Additional regex matchers, registered alongside the built-ins. | +| `notifiers` | Where findings get reported (Slack, GitHub Issues, webhooks, …). | +| `ownership` | Map files to owning teams/people (e.g. an internal directory). | +| `people` | Look up a person by email/name (managers, on-call, contact info). | +| `executor` | Run a deepsec command on remote infrastructure. 
| + +```ts +export interface DeepsecPlugin { + name: string; + matchers?: MatcherPlugin[]; + notifiers?: NotifierPlugin[]; + ownership?: OwnershipProvider; + people?: PeopleProvider; + executor?: ExecutorProvider; + agents?: AgentPluginRef[]; + commands?: (program: unknown) => void; // commander program +} +``` + +A single plugin can fill any subset. For details see https://github.com/vercel-labs/deepsec/blob/main/docs/plugins.md. + +## Models + +| Backend | Default | Used by | +|---|---|---| +| `claude` (default) | `claude-opus-4-7` | `process`, `revalidate` | +| `claude` (triage) | `claude-sonnet-4-6` | `triage` | +| `codex` | `gpt-5.5` | `process`, `revalidate` | + +CLI selection: + +```bash +bunx deepsec process --agent claude --model claude-sonnet-4-6 # cheaper Claude +bunx deepsec process --agent codex --model gpt-5.4 # cheaper Codex +bunx deepsec triage --model claude-haiku-4-5 # cheaper triage +``` + +`--agent` and `--model` are accepted on `process`, `revalidate`, and `triage`. Set the workspace-wide default via `defaultAgent` in `deepsec.config.ts`. + +## Environment variables + +deepsec reads `.env.local` (auto-loaded by the CLI) or the process environment. + +### Required (one of) + +| Var | Used by | Purpose | +|---|---|---| +| `AI_GATEWAY_API_KEY` | all AI commands | Shortcut. Expands at startup into `ANTHROPIC_AUTH_TOKEN` / `OPENAI_API_KEY` / `ANTHROPIC_BASE_URL` / `OPENAI_BASE_URL` (one key covers Claude **and** Codex through Vercel AI Gateway). Any of those four set explicitly always wins. Falls back to `VERCEL_OIDC_TOKEN` when unset. | +| `ANTHROPIC_AUTH_TOKEN` + `ANTHROPIC_BASE_URL` | `process`, `revalidate`, `triage` (Claude) | Direct Anthropic, or BYOK gateway-issued token. | +| `OPENAI_API_KEY` (+ optional `OPENAI_BASE_URL`) | `--agent codex` | Codex SDK token. | +| `claude login` / `codex login` session | local non-sandbox runs only | Subscription fallback. Generally lacks headroom for full scans. 
| + +### Optional + +| Var | Purpose | +|---|---| +| `DEEPSEC_AGENT_DEBUG` | Set to `1` for verbose agent logging. | +| `DEEPSEC_DATA_ROOT` | Override the data directory (= `dataDir` in config). | +| Plugin-specific | Each plugin documents its own env vars in its README. | + +### Vercel Sandbox (optional) + +For `bunx deepsec sandbox …`. Pick OIDC for local dev, access token for unattended CI: + +```bash +# OIDC (12 h expiry, re-pull when expired) +npx vercel link +npx vercel env pull # writes VERCEL_OIDC_TOKEN + +# Access token (long-lived, headless) +VERCEL_TOKEN=… +VERCEL_TEAM_ID=team_… +VERCEL_PROJECT_ID=prj_… +``` + +The Sandbox SDK reads these directly from `process.env` at `Sandbox.create()` time. The SDK prefers `VERCEL_OIDC_TOKEN` and falls back to access-token mode otherwise. + +## Troubleshooting + +| Symptom | Cause | Fix | +|---|---|---| +| `Missing AI credentials for --agent claude\|codex` | No credential present. | Set `AI_GATEWAY_API_KEY=vck_…` in `.env.local`, or `claude login` / `codex login`. | +| `401 Unauthorized` on `process` / `revalidate` | Credential present but rejected. | OIDC: `vercel env pull` (12 h expiry). API key: regenerate in dashboard. Confirm `.env.local` is in cwd. | +| `Stopped: Vercel AI Gateway credits exhausted` | Gateway balance is $0. | Top up at the printed URL, then re-run the same command; it resumes. | +| `Stopped: Anthropic API credits exhausted` | Direct Anthropic out of credits. | Top up at console.anthropic.com, or switch to the gateway. | +| `Stopped: OpenAI API quota exhausted` | Direct OpenAI out of quota. | Top up in the OpenAI dashboard, or switch to the gateway. | +| `Stopped: Claude Pro/Max subscription exhausted` | Hit weekly / 5-hour cap. | Switch to AI Gateway. | +| `Stopped: ChatGPT subscription exhausted` | Hit ChatGPT Plus / Pro quota. | Switch to AI Gateway. | +| Sandbox spawn fails with auth error | OIDC expired or access-token vars wrong. | `vercel env pull`, or verify the three access-token vars. 
| +| Findings missing cost in the log | Pricing entry missing for a non-default Codex model. | Add a line to `MODEL_PRICING_USD_PER_M_TOKENS` in `packages/processor/src/agents/codex-sdk.ts` (only matters if you are extending deepsec itself). | +| Persistent refusal on a single file (>5% of batches) | Hard-to-disambiguate exploit pattern. | Add to `data/<project-id>/config.json:ignorePaths`, or run with `--batch-size 1`. | + +After **any** quota / credit fix, `process` and `revalidate` resume on re-run. No recovery flag, no state to reset. Files already analyzed stay analyzed; only unfinished ones get picked up. Use `--reinvestigate` (process) or `--force` (revalidate) only when you specifically want to redo finished work. + +## Security model of deepsec itself + +Treat deepsec like a coding agent with full shell access on the machine it runs on. It is designed to run on trusted inputs (your source code), but you may still be concerned about prompt injection from external dependencies or vendored code. + +`deepsec sandbox …` substantially limits exposure: + +- API keys are injected outside the sandbox and cannot be exfiltrated. +- Worker-sandbox network egress is locked to the configured AI host. (Egress is allowed during bootstrap, before the coding agent starts.) + +Use sandbox mode for unfamiliar / vendored / contractor codebases. Local mode is fine for your own first-party code. 
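Pulling the credential modes above together, a minimal `.deepsec/.env.local` might look like the fragment below. All values are placeholders; pick exactly one mode and never commit the file:

```bash
# .deepsec/.env.local — auto-loaded by the CLI; keep gitignored, never commit.

# Mode 1 (recommended): one gateway key covers Claude AND Codex.
AI_GATEWAY_API_KEY=vck_placeholder

# Mode 2: Vercel OIDC (refresh with `vercel env pull`; 12 h expiry).
# VERCEL_OIDC_TOKEN=placeholder

# Mode 3: direct providers (explicit values win over the gateway expansion).
# ANTHROPIC_AUTH_TOKEN=placeholder
# ANTHROPIC_BASE_URL=placeholder
# OPENAI_API_KEY=placeholder
```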
It will miss patterns specific to your codebase: an internal RPC framework, a less common language, a custom auth helper, a non-default route layout. Custom matchers fill those gaps. + +The intended loop: + +``` +scan (fast, wide) → process (AI, slow + expensive) → revalidate → write better matchers +``` + +## When to write one + +- A revalidated true-positive needs a matcher to catch siblings on future scans. +- A cluster of `other-*` slugs in `bunx deepsec metrics` points at a real category deepsec has no name for. +- The target repo has **entry points the default matchers do not see**. Check `https://github.com/vercel-labs/deepsec/blob/main/docs/supported-tech.md` first; the framework may already be covered. +- You have an **organization-specific** pattern (internal auth helper, internal SDK call, custom middleware). + +## Where matchers live + +``` +.deepsec/ +├── deepsec.config.ts # inline plugin lists the matchers +└── matchers/ + ├── my-route-no-auth.ts + └── my-internal-rpc.ts +``` + +`deepsec.config.ts`: + +```ts +import { defineConfig, type DeepsecPlugin } from "deepsec/config"; +import { myRouteNoAuth } from "./matchers/my-route-no-auth.js"; +import { myInternalRpc } from "./matchers/my-internal-rpc.js"; + +const myPlugin: DeepsecPlugin = { + name: "my-app", + matchers: [myRouteNoAuth, myInternalRpc], +}; + +export default defineConfig({ + projects: [{ id: "my-app", root: ".." }], + plugins: [myPlugin], +}); +``` + +Slugs are unique. **If your slug collides with a built-in, your matcher wins.** This is useful for swapping in a tighter org-specific version. + +If a matcher is genuinely reusable across orgs (a CWE shape or a public-framework shape), consider upstreaming to https://github.com/vercel-labs/deepsec instead. + +## Workflow + +### 1. Run `scan` + `process` first + +You want real `data/` to point the agent at. 
+ +```bash +bunx deepsec scan +bunx deepsec process --limit 50 # upstream-recommended calibration pass (deepsec docs) +bunx deepsec revalidate --min-severity HIGH +``` + +### 2. Hand the workspace to the agent + +Open the **parent repo** (the codebase being scanned) in your coding agent so it can read both source and `.deepsec/data/`. Then prompt: + +> I want to add custom matchers to deepsec for this repo. deepsec is already installed at `.deepsec/node_modules/deepsec/` and `.deepsec/data/<project-id>/` has at least one scan + process pass. +> +> **Read these first to understand the contract:** +> - `.deepsec/node_modules/deepsec/dist/config.d.ts` defines the `MatcherPlugin` interface and the `regexMatcher` helper signature. +> - `.deepsec/node_modules/deepsec/dist/samples/webapp/matchers/webapp-debug-flag.ts` is a small `normal`-tier matcher. +> - `.deepsec/node_modules/deepsec/dist/samples/webapp/matchers/webapp-route-no-rate-limit.ts` is a slightly larger matcher with a negative pre-check. +> - `.deepsec/node_modules/deepsec/dist/samples/webapp/deepsec.config.ts` shows how the inline plugin wires matchers into the config. +> +> **Then do the analysis:** +> 1. Walk `.deepsec/data/<project-id>/files/` and look at what the default matchers already cover. Note which `vulnSlug`s show up in `candidates[]` and where the AI's `findings[]` ended up landing after revalidation. +> 2. Compare against the **target repository** (root above `.deepsec/`). Identify the **major entry points**: public HTTP handlers, RPC entry points, queue consumers, cron jobs, CLI commands, anything that takes untrusted input from the outside. Walk route/handler/api directories and framework config files (`next.config.*`, `wrangler.toml`, `serverless.yml`, `Procfile`, `main.go`, `app.py`, …) to figure out the entry-point shape. +> 3. Decide which entry points the default matchers **do not reach**. 
Common gaps: +> - Frameworks deepsec does not ship a glob for (Hono, Elysia, Cloudflare Workers, Bun, Deno, FastAPI, Rails controllers, Go `chi`/`gin`, internal RPC). +> - Languages with thin built-in coverage (Go, Python, Ruby, Lua, shell, Terraform, SQL). +> - Custom org-specific wrappers (auth middleware, rate-limit wrappers, request-validation helpers) where deepsec's generic regexes do not know the convention. +> 4. **Then write matchers that cover those gaps.** Prefer one matcher per concern. For each: +> - **Slug** (kebab-case, names what it flags, e.g. `hono-route-no-auth`, `worker-fetch-handler`). +> - **Noise tier**: `precise` | `normal` | `noisy` (see below). +> - **`filePatterns`** as tight as you can make them (language- or directory-anchored). +> - **Regex(es)** that match the shape. Skip test files (`.test.`, `.spec.`, `__tests__`, `_test.go`, …). +> - Save to `.deepsec/matchers/<slug>.ts`. Import types from `"deepsec/config"`. +> 5. Wire the new matchers into the inline plugin in `.deepsec/deepsec.config.ts` (create the plugin if it does not exist yet). +> 6. Run `bunx deepsec scan --matchers <slug1>,<slug2>,…` from `.deepsec/` and report how many candidates each matcher fired. Open 3 candidates per matcher to spot-check the regex is not producing obvious false positives. +> +> Bias toward `precise` when you can describe the bug exactly. Use `noisy` deliberately when the goal is **entry-point coverage**: you would rather the AI look at every `**/api/**/route.ts` than rely on a regex to predict which ones are vulnerable. +> +> Generalize the *shape* of the pattern, not specific identifiers. If the repo's auth helper is `requireSession()`, the matcher should catch any handler that does not call any session/auth helper, not the literal string `requireSession`. + +### 3. 
Tune and ship + +```bash +bunx deepsec scan --matchers <slug> +``` + +Watch the candidate count: + +| Tier | Sweet spot | +|---|---| +| `precise` | 1–20 hits per 1k files | +| `normal` | 5–100 hits per 1k files | +| `noisy` | ≈ entry-point count of the targeted framework (10s, not 1000s) | + +0 hits → too strict (loosen). >100 hits in a small repo → too loose (tighten). + +When happy, commit `.deepsec/deepsec.config.ts` and `.deepsec/matchers/`. The next full scan picks them up automatically. + +## Noise tiers + +| Tier | When | Example | +|---|---|---| +| `precise` | Pattern is unambiguous. | `prisma-raw-sql`: `\$queryRawUnsafe\s*\(` matches only the unsafe API. | +| `normal` | Pattern is broader; AI disambiguates. | `auth-bypass`: flags admin checks and skip-auth strings; AI judges. | +| `noisy` | Every file matching a glob should be reviewed by the AI. | `service-entry-point`: every `**/api/**/route.ts` becomes a candidate. | + +Tier also influences ordering. `precise` candidates are processed first because they have the highest signal per token. + +## File globs + +Set `filePatterns` tightly. A noisy matcher with `**/*.{ts,tsx}` wedges the scanner on a 100k-file repo. Prefer: + +- Language-specific: `**/*.go`, `**/*.lua`, `**/*.tf` +- Directory-anchored: `**/api/**/*.ts`, `**/services/**/handlers/*.ts` +- Combined: `**/services/**/*.{ts,go}` + +## Worked example: covering missing entry points (FastAPI) + +A team scans a FastAPI service. After a `process` pass, `data/<project-id>/files/` shows the default matchers fired plenty on `requirements.txt` and a few `*.sql` files but barely touched `app/routers/*.py`, where the actual HTTP handlers live. The default glob set is tilted toward TypeScript/Next.js. + +1. **Inspect coverage.** Walk `data/<project-id>/files/app/routers/`. Most `FileRecord`s have empty `candidates[]`; the AI never picks them up. +2. **Identify entry points.** Each router decorates handlers with `@router.get("/…")`, `@router.post("/…")`, etc. 
The team's convention: authenticated handlers depend on a `current_user: User = Depends(get_current_user)` parameter. +3. **Add a noisy entry-point matcher.** Slug `fastapi-route`, `noiseTier: "noisy"`, `filePatterns: ["app/routers/**/*.py", "app/api/**/*.py"]`, regex `/@\w+\.(get|post|put|delete|patch)\s*\(/`. Every router file becomes a candidate; the AI reads them on the next `process` pass. +4. **Add a precise auth-shape matcher.** Slug `fastapi-route-no-auth`, `noiseTier: "precise"`, same globs, regex sweep for `@\w+\.(get|post|...)` whose subsequent `def`/`async def` signature lacks `Depends(get_current_user)` or `Depends(require_*)`. + +Result on the next scan: the AI investigates every router file, and the precise matcher flags handlers that skip the auth dependency. + +## Generic vs plugin vs upstream contribution + +| Catches… | Where | +|---|---| +| An org-specific helper, package, or route layout | Your inline plugin (`.deepsec/matchers/`) | +| A reference to a concrete internal service name | Your inline plugin | +| A CWE shape (path traversal, SSRF, prototype pollution) the public set misses | Consider upstreaming to https://github.com/vercel-labs/deepsec | +| A shape for a popular OSS framework (Hono, FastAPI, Drizzle) | Upstreaming benefits everyone | + +For copy-paste starting points, see `.deepsec/node_modules/deepsec/dist/samples/webapp/matchers/`. diff --git a/.agents/skills/oma-deepsec/resources/pr-review.md b/.agents/skills/oma-deepsec/resources/pr-review.md new file mode 100644 index 0000000..10a04c1 --- /dev/null +++ b/.agents/skills/oma-deepsec/resources/pr-review.md @@ -0,0 +1,173 @@ +# PR review: `process --diff` for CI gating + +Use direct mode when you want a fast, scoped read of the files changed in a PR rather than a whole-repo audit. 
+ +```bash +bunx deepsec process --diff origin/main +``` + +## How direct mode differs from a full scan + +| Step | What it looks at | What it produces | +|---|---|---| +| Resolve files | `--diff` / `--diff-staged` / `--diff-working` / `--files` / `--files-from` | POSIX-relative file list under `rootPath` | +| Scoped scan | Only the listed files | Candidates as **prompt signals** (best-effort) | +| Always-process | The same listed files | AI findings, including files no matcher hit | + +Files with no regex hits still get a record and still get investigated as a holistic review. + +## Diff sources (mutually exclusive) + +| Flag | Meaning | +|---|---| +| `--diff <ref>` | `git diff --name-only <ref>` (e.g. `origin/main`, `HEAD~1..HEAD`) | +| `--diff-staged` | Index vs HEAD | +| `--diff-working` | Uncommitted + untracked | +| `--files <list>` | Explicit comma-separated list | +| `--files-from <file>` | Newline-delimited list (or `-` for stdin) | + +Other knobs: + +| Flag | Effect | +|---|---| +| `--no-ignore` | Bypass the default ignore filter (test files, `dist/`, `node_modules/`, …) | +| `--comment-out <path>` | Write a PR-comment-shaped markdown summary to `<path>` (only when findings exist) | +| `--project-id <id>` | Override project id (auto-derived from `rootPath` basename otherwise) | +| `--root <path>` | Override project root | + +The usual `--agent`, `--model`, `--concurrency`, `--batch-size`, `--max-turns` flags work the same as in standard mode. + +## Auto-created projects + +You do not need to run `deepsec init` first. With a direct-mode flag, `process` will: + +1. Use `--project-id` if you pass one (if declared in `deepsec.config.ts`, the declared root is used; otherwise `--root` or cwd). +2. Otherwise derive the id from the resolved root's basename. +3. Write `data/<project-id>/project.json` if absent. + +Auto-creation is one-time and non-destructive. It never modifies your `deepsec.config.ts`. 
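The mutually exclusive diff sources compose with ordinary git plumbing. A sketch of a selector helper — the wrapper function is illustrative, not part of deepsec; pass exactly one source per run:

```bash
# diff_source_flag MODE [ARG] — print the direct-mode flag for one diff source,
# mirroring the table above. Exactly one source may be used per invocation.
diff_source_flag() {
  case "$1" in
    ref)     printf -- '--diff %s' "$2" ;;   # e.g. origin/main, HEAD~1..HEAD
    staged)  printf -- '--diff-staged' ;;
    working) printf -- '--diff-working' ;;
    files)   printf -- '--files %s' "$2" ;;  # comma-separated list
    stdin)   printf -- '--files-from -' ;;   # newline-delimited list on stdin
    *)       echo "unknown diff source: $1" >&2; return 1 ;;
  esac
}

# Typical PR-gate invocation (shown for context only):
#   bunx deepsec process $(diff_source_flag ref "origin/$BASE_REF") --comment-out comment.md
```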
+ +## Exit codes (gating contract) + +| Code | Meaning | +|---|---| +| `0` | No findings produced in this run | +| `1` | At least one **net-new** finding produced | +| other | Runtime error (bad input, missing credentials, …) | + +**Net-new findings only** count toward the exit code. Re-running on a file with existing findings does not fail the build unless something new is surfaced. Pre-existing findings on touched files are intentionally excluded. + +## PR-comment markdown + +`--comment-out ` writes a markdown body summarizing the **net-new** findings only (same scope as the exit-code gate). Descriptions and recommendations are truncated (600 / 400 chars) to stay under GitHub's 65 KiB comment limit; full text remains in `data//files/`. + +The file is only written when there are findings, so a green run leaves nothing on disk and your "post comment" step can short-circuit on `if: hashFiles('comment.md') != ''`. + +## Two-job CI pattern (recommended) + +Keep PR-controlled code in a no-write job; let a second, code-free job post the comment. 
+ +```yaml +name: deepsec + +on: pull_request + +permissions: + contents: read + +jobs: + analyze: + if: github.event.pull_request.head.repo.full_name == github.repository + runs-on: ubuntu-latest + timeout-minutes: 30 + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 # need history for `git diff origin/` + + - uses: pnpm/action-setup@v4 + - uses: actions/setup-node@v4 + with: { node-version: 24, cache: pnpm } + + - run: pnpm install --frozen-lockfile + - run: npm install -g @anthropic-ai/claude-code + + - id: deepsec + env: + AI_GATEWAY_API_KEY: ${{ secrets.AI_GATEWAY_API_KEY }} + CLAUDE_CODE_EXECUTABLE: claude + run: | + pnpm deepsec process \ + --diff origin/${{ github.event.pull_request.base.ref }} \ + --comment-out comment.md + + - if: always() && hashFiles('comment.md') != '' + uses: actions/upload-artifact@v4 + with: + name: deepsec-comment + path: comment.md + retention-days: 1 + + comment: + needs: analyze + if: always() && needs.analyze.result == 'failure' + runs-on: ubuntu-latest + timeout-minutes: 5 + permissions: + contents: read + pull-requests: write + steps: + - id: dl + continue-on-error: true + uses: actions/download-artifact@v4 + with: + name: deepsec-comment + + - if: steps.dl.outcome == 'success' + uses: actions/github-script@v7 + with: + script: | + const fs = require('fs'); + github.rest.issues.createComment({ + issue_number: context.issue.number, + owner: context.repo.owner, + repo: context.repo.repo, + body: fs.readFileSync('comment.md', 'utf8'), + }); +``` + +> Swap `pnpm deepsec` for `bunx deepsec` / `npx -y deepsec` / `yarn deepsec` to match the project's package manager. If using `bun`, replace the `pnpm/action-setup` + `setup-node(cache: pnpm)` block with `oven-sh/setup-bun` and `bun install --frozen-lockfile`. + +### Why the split + +- **`analyze`** runs PR-controlled code (the user's `pnpm install`, their config, their source) with the AI gateway secret in scope but **no write permissions on the repo**. 
+- **`comment`** has `pull-requests: write` but never runs any PR code; it consumes only the sanitized `comment.md` artifact. +- A malicious PR cannot combine "execute arbitrary code" with "write to the repository" in a single privileged step. + +## Threat-model notes + +- **Do not grant `pull-requests: write` to a job that runs PR code.** A PR can add arbitrary code to its own `package.json` postinstall scripts or to a project config the CLI loads. Both run before any of your steps. +- **Pin actions to full SHAs in production.** The example uses major-version tags for readability. Swap each tag for the action's full commit SHA so a compromised tag cannot pivot into your secret-bearing job. (See GitHub's hardening guide.) +- **Same-repo-only gate** (`if: github.event.pull_request.head.repo.full_name == github.repository`) skips fork PRs, which already do not receive secrets under `pull_request`. Pure UX cleanup. +- **The AI gateway secret still flows through PR code** in `analyze`. The `author_association` / same-repo gate is what prevents that from being a vulnerability. For defense-in-depth, run `analyze` only after a label is applied: + ```yaml + if: contains(github.event.pull_request.labels.*.name, 'review-ok') + ``` + +## Cost notes + +Wide diffs are expensive: every file pays for an AI investigation. + +- For PRs against `main`, scope to the merge base (`origin/main`), **not** the entire branch ancestry. +- Drop generated / fixture files via `--files-from`: + ```bash + git diff --name-only origin/main \ + | grep -v '^generated/' \ + | bunx deepsec process --files-from - + ``` +- Add stable noise paths to ignore patterns in `data//config.json:ignorePaths` so they never enter the diff. + +## When NOT to use direct mode + +- **Initial sweep of a large repo.** Full `scan` + `process` orders by noise tier, parallelizes better, and benefits from whole-repo signal in matcher gating. Direct mode is for incremental review. 
+- **Revalidating existing findings.** Use `revalidate` with its own filters. diff --git a/.agents/skills/oma-deepsec/resources/scanning.md b/.agents/skills/oma-deepsec/resources/scanning.md new file mode 100644 index 0000000..6689818 --- /dev/null +++ b/.agents/skills/oma-deepsec/resources/scanning.md @@ -0,0 +1,152 @@ +# Scanning: `scan` → `process` → `triage` → `revalidate` → `export` + +All commands run from inside `.deepsec/`. `bunx deepsec …` is interchangeable with `pnpm deepsec …`, `npm exec deepsec …`, `yarn deepsec …`. + +## Pipeline + +``` +scan process revalidate enrich export / report / metrics + │ │ │ │ │ + ▼ ▼ ▼ ▼ ▼ +candidates → findings TP/FP/Fixed verdict → +committers JSON / md-dir / aggregate + +ownership +``` + +Stages are idempotent and additive. Re-running merges new info instead of overwriting. State lives under `data//`. + +## Calibration first (mandatory on > 500-file repos) + +The deepsec docs (`getting-started.md`, `vercel-setup.md`, `faq.md`) recommend `--limit 50 --concurrency 5` as the calibration starting point. Defer to a user-named value if given. + +```bash +bunx deepsec scan +bunx deepsec status # show pending / scanned counts +bunx deepsec process --limit 50 --concurrency 5 # upstream-recommended calibration +``` + +`scan` runs ~110 regex matchers across the codebase. **No AI calls.** ~15s on 2k files. Output goes to `data//files/` as one `FileRecord` JSON per scanned source file. + +The calibration `process` is a budget-capped AI pass. Read the per-batch cost the CLI prints, multiply by `(total_files / 50)` to extrapolate. **Get the user's explicit go-ahead before launching the unbounded `process`.** + +## Cost guide (Claude Opus, default settings) + +| Files | Approx cost | Approx wall time | +|---|---|---| +| 100 | $25–60 | 5–15 min | +| 500 | $130–300 | 25–60 min | +| 2,000 | $500–1,200 | 1.5–4 hr | + +Costs swing 2–3× based on file complexity. Codex is cheaper per call; Opus is the precision benchmark. 
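
The calibration extrapolation above is plain arithmetic — a sketch, with the uniform-cost assumption spelled out (the 2–3× complexity swing can easily break it):

```python
def extrapolate_cost(batch_costs_usd: list[float], calibration_files: int, total_files: int) -> float:
    """Project full-run cost from a calibration pass.

    Assumes per-file cost is roughly uniform across the repo, which is
    optimistic: treat the result as a lower-bound estimate, not a quote.
    """
    spent = sum(batch_costs_usd)
    return spent * (total_files / calibration_files)
```

If a 50-file calibration run printed $3.00 of per-batch costs on a 2,000-file repo, the projection is $120 — squarely inside the cost-guide band above, which is a useful sanity check before asking the user to approve the unbounded run.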
+ +## Full investigation + +```bash +bunx deepsec process --concurrency 5 +``` + +Defaults: `--agent claude` (`claude-opus-4-7`), `--batch-size 5`, `--concurrency 5` ⇒ 25 files in flight at peak. Files are claimed atomically via `lockedByRunId`; multiple workers can run in parallel without stepping on each other. + +For a cheaper backend: + +```bash +bunx deepsec process --agent codex --model gpt-5.5 +``` + +Codex runs in a strict read-only sandbox and is fast at grep-heavy investigations. Backends mix freely within a project: re-process unconvincing findings with the other agent, and findings dedupe across agents. + +### Resume after interruption + +`process` and `revalidate` are safe to re-run. Network blip, transient model error, quota stop, Ctrl-C → re-run the **same** command. Files already finished are skipped. **Nothing to clean up.** Never `rm -rf data//` to "start clean" without explicit user instruction. + +### Reinvestigate finished work + +Use `--reinvestigate` (entire repo) or `--reinvestigate ` (wave marker) when a stronger model lands or you want a second opinion. Findings dedupe across agents; the new analysis appends to `analysisHistory` rather than overwriting. + +## Triage and revalidate + +```bash +bunx deepsec triage --severity HIGH +bunx deepsec revalidate --min-severity HIGH +``` + +| Stage | What | Cost | +|---|---|---| +| `triage` | Classifies findings P0/P1/P2/skip from finding text only (no code re-read). Claude Sonnet by default. | ~$0.01 / finding | +| `revalidate` | Re-reads code + git history, emits `true-positive` / `false-positive` / `fixed` / `uncertain` verdicts and may adjust severity. | Comparable to `process` | + +`revalidate` empirically cuts FP rate by 50%+ on most repos. Run it on `HIGH+` before surfacing anything to the user. 
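
The `--min-severity HIGH` gate is simple to mirror when scripting around the CLI — a hypothetical helper, not deepsec's internals (the `severity` and `revalidation` field names follow the `FileRecord` finding shape used throughout this doc):

```python
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def needs_revalidation(finding: dict, min_severity: str = "HIGH") -> bool:
    """True if a finding clears the severity floor and has no verdict yet.

    Sketch: revalidate costs about as much as process, so only HIGH+
    findings without an existing revalidation record are worth the pass.
    """
    rank = SEVERITY_RANK.get(finding.get("severity"), -1)
    return rank >= SEVERITY_RANK[min_severity] and "revalidation" not in finding
```

Anything this returns `False` for is either below the floor or already carries a verdict, so re-running `revalidate` on it is wasted spend.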
+ +## Export + +```bash +bunx deepsec export --format md-dir --out ./findings # one .md per finding under {CRITICAL,HIGH,…}/ +bunx deepsec export --format json --out findings.json # single JSON array, pipe-friendly +bunx deepsec metrics # aggregate counts, severities, TP rates +bunx deepsec report # per-project markdown + JSON summary +``` + +Each command takes `--project-id ` if your config has multiple projects. + +## Useful flags + +| Flag | Purpose | +|---|---| +| `--limit ` | Cap files processed in this run. | +| `--concurrency ` | Parallel batches in flight. Lower for laptop-friendliness or quota-friendliness. | +| `--batch-size ` | Files per batch (default 5). | +| `--max-turns ` | Cap agent conversation turns per batch. | +| `--agent claude|codex` | Backend selection. | +| `--model ` | Override per-backend model (`claude-sonnet-4-6`, `gpt-5.5-pro`, `claude-haiku-4-5`, …). | +| `--matchers ` | CSV of slugs; restricts the matcher set on `scan`. Overrides `matchers.only` in config when both are set. | +| `--reinvestigate` / `--reinvestigate ` | Force re-analysis on `process`. | +| `--force` | Force re-analysis on `revalidate`. | +| `--project-id ` | Pick a project when more than one is registered. | +| `--root ` | Override project root for one-off scans. | + +## Reading `data/` directly + +`data//files/**/*.json` are `FileRecord`s. Useful jq one-liners: + +```bash +# All TP HIGH+ findings +jq -r '. as $r | $r.findings[] | select(.revalidation.verdict=="true-positive") | select(.severity=="HIGH" or .severity=="CRITICAL") | [$r.filePath, .severity, .title] | @tsv' data//files/**/*.json + +# Total spend on this project +jq -s 'map(.analysisHistory[].costUsd // 0) | add' data//files/**/*.json + +# Files still pending after the latest run +jq -r 'select(.status=="pending") | .filePath' data//files/**/*.json +``` + +For richer queries, prefer `bunx deepsec export --format json`. Its filters match the rest of the CLI. 
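
If jq is not available, the first two one-liners port to a few lines of Python — a sketch assuming only the fields used above (`filePath`, `findings[]`, `severity`, `revalidation.verdict`, `analysisHistory[].costUsd`):

```python
import json
from pathlib import Path


def tp_high_findings(data_dir: str) -> list[tuple[str, str, str]]:
    """All true-positive HIGH/CRITICAL findings, like the first jq one-liner."""
    rows = []
    for fp in Path(data_dir).glob("files/**/*.json"):
        rec = json.loads(fp.read_text())
        for f in rec.get("findings", []):
            if (f.get("revalidation", {}).get("verdict") == "true-positive"
                    and f.get("severity") in ("HIGH", "CRITICAL")):
                rows.append((rec["filePath"], f["severity"], f["title"]))
    return rows


def total_spend(data_dir: str) -> float:
    """Sum analysisHistory[].costUsd across all FileRecords."""
    return sum(
        h.get("costUsd", 0) or 0
        for fp in Path(data_dir).glob("files/**/*.json")
        for h in json.loads(fp.read_text()).get("analysisHistory", [])
    )
```

Same caveat as the jq versions: these read the raw on-disk records, so field names track whatever the installed deepsec version writes — prefer `export --format json` when stability matters.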
+ +## Cron / scheduled CI + +```bash +# Sunday cron: full scan +bunx deepsec scan +bunx deepsec process --concurrency 5 +bunx deepsec revalidate --min-severity HIGH +bunx deepsec export --format json --out findings.json +``` + +Persist `.deepsec/data/` between runs (cache it as a build artifact) or re-scan from scratch each time. The append-only model means cached `data/` strictly improves cost on the next run. + +## Distributed (`sandbox`) + +Large monorepos can fan work across Vercel Sandbox microVMs: + +```bash +bunx deepsec sandbox process --project-id my-app --sandboxes 10 --concurrency 4 +``` + +Local working tree is tarballed (`.git` excluded) and uploaded. Sandbox-level network egress is locked to the configured AI host(s); the gateway key is injected outside the sandbox so it cannot be exfiltrated. Use this when the repo is large enough that local concurrency saturates your machine, or when running unattended in CI/CD. + +See `config.md` for Sandbox auth (OIDC vs access token). + +## What `process` does not do + +- Does not modify source code. Findings are advisory. +- Does not commit / push / open PRs. Hand off to `oma-scm` if the user wants commits. +- Does not call out to non-AI external services unless a notifier plugin is configured. +- Does not phone home or report telemetry; `data//` stays on your machine unless explicitly exported. diff --git a/.agents/skills/oma-deepsec/resources/setup.md b/.agents/skills/oma-deepsec/resources/setup.md new file mode 100644 index 0000000..4cdcb85 --- /dev/null +++ b/.agents/skills/oma-deepsec/resources/setup.md @@ -0,0 +1,113 @@ +# Setup: install `.deepsec/` and bootstrap `INFO.md` + +## 1. Install the workspace + +Requires **Node.js 22+**. 
Run from the **root of the codebase you want to scan**: + +```bash +bunx deepsec init # creates .deepsec/ and registers this repo +cd .deepsec +bun install # installs deepsec from npm + +# pnpm / npm / yarn equivalents work the same way: +# npx deepsec init && cd .deepsec && pnpm install +# npx deepsec init && cd .deepsec && npm install +# npx deepsec init && cd .deepsec && yarn install +``` + +`init` lays down a minimal scaffold inside `.deepsec/`: + +- `package.json` +- `deepsec.config.ts` (one `projects[]` entry pointing at `..`, id derived from the parent dir's basename) +- `data//INFO.md` (template with section placeholders) +- `data//SETUP.md` (per-project agent prompt) +- workspace-level `AGENTS.md` +- `.env.local` +- `.gitignore` (keeps `INFO.md`, `SETUP.md`, `deepsec.config.ts` tracked; ignores `data/*/files/`, `data/*/runs/`, etc.) + +No custom matchers in the scaffold. Add those only when a real finding shapes one for you. + +> To scan another codebase from the same `.deepsec/`: `bunx deepsec init-project ` (relative paths resolve against `.deepsec/`'s parent). + +## 2. 
Pick a credential + +Open `.deepsec/.env.local` and pick **one**: + +| Mode | When | Set | +|---|---|---| +| AI Gateway API key | Anywhere, simplest | `AI_GATEWAY_API_KEY=vck_…` from the Vercel AI Gateway API Keys page | +| Vercel OIDC token | Already linked to a Vercel project (or using Sandbox) | `npx vercel link && npx vercel env pull` writes `VERCEL_OIDC_TOKEN` (12 h expiry; re-pull on auth errors) | +| Direct Anthropic | BYOK / bypass gateway | `ANTHROPIC_AUTH_TOKEN=sk-ant-…` + `ANTHROPIC_BASE_URL=https://api.anthropic.com` | +| Direct OpenAI | Codex backend, BYOK | `OPENAI_API_KEY=sk-…` (+ `OPENAI_BASE_URL` only for proxies) | +| Subscription | Local-only evaluation | `claude login` and/or `codex login` already done; non-sandbox runs reuse the session, no token needed | + +`AI_GATEWAY_API_KEY` expands at CLI startup into `ANTHROPIC_AUTH_TOKEN` / `OPENAI_API_KEY` / `ANTHROPIC_BASE_URL` / `OPENAI_BASE_URL`. Any of those four set explicitly always wins. + +> Subscriptions are useful for evaluating deepsec but generally do not have enough headroom for full repo scans. Switch to the gateway once past evaluation. + +## 3. Verify the credential + +```bash +bunx deepsec scan --limit 20 # cheap, no AI calls +bunx deepsec process --limit 5 # exercises the gateway +``` + +If the second call returns `Missing AI credentials` or `401`, see `config.md` § Troubleshooting. + +## 4. Write `INFO.md` (do not skip) + +`INFO.md` is what makes deepsec project-aware. It is injected into the AI prompt for every batch, so vague content here means vague findings. + +### Recommended: agent-driven write-up + +Open the **parent repo** (the codebase you scanned, **not** `.deepsec/`) in your coding agent and paste the prompt that `deepsec init` printed (also in the project root README): + +> Read `.deepsec/node_modules/deepsec/SKILL.md` to understand the tool. 
Then read `.deepsec/data//SETUP.md` and follow it: skim this repo's README, any `AGENTS.md` / `CLAUDE.md`, and a handful of representative code files, then replace each section of `.deepsec/data//INFO.md`. +> +> Keep it SHORT: target 50–100 lines total. Pick 3–5 examples per section, not exhaustive enumeration. Name primitives (auth helpers, middleware) but no line numbers. Skip generic CWE categories; built-in matchers cover those. Cover only what is project-specific. `INFO.md` is injected into every scan batch; verbose context dilutes signal. + +### Manual write-up + +The processor auto-loads `data//INFO.md` from the workspace's data dir. Edit it directly; no extra wiring is needed in `deepsec.config.ts`. Even a single tight paragraph noticeably improves the AI's output. + +### What goes in `INFO.md` + +Project-specific only: + +- **What the codebase does** in a few sentences. +- **Auth shape**: names of helpers / middleware / decorators that gate access (`requireSession`, `Depends(get_current_user)`, etc.). Name them, do not quote them. +- **Threat model**: which surfaces matter (public HTTP, internal RPC, queue consumers, cron, CLI) and which are out of scope. +- **Known FP sources**: patterns the AI tends to over-flag in this repo. +- **Project-specific primitives**: internal SDK calls, custom validators, codified secret-loading paths. +- **Out of scope**: directories or file types the AI should ignore. + +### What stays out + +- Generic CWE category descriptions; built-in matchers cover those. +- Exhaustive enumeration. Pick 3–5 representative examples per section. +- Line numbers. They drift; the AI re-reads files anyway. +- Boilerplate intro paragraphs. + +## 5. `.gitignore` hygiene + +The scaffold's `.deepsec/.gitignore` already keeps `INFO.md`, `SETUP.md`, and `deepsec.config.ts` tracked (so teammates inherit project context) and ignores generated state. Do **not** unignore `data/*/files/` or `data/*/runs/` unless you have a deliberate reason (e.g. 
CI cache). + +`.env.local` must stay gitignored. Never commit `vck_…`, `sk-ant-…`, `sk-…`, or OIDC tokens. + +## 6. Multi-project workspaces + +To scan a *different* codebase from the same `.deepsec/`: + +```bash +bunx deepsec init-project +``` + +Each project gets its own `data//` subdirectory. Pass `--project-id ` to disambiguate any subsequent command (auto-resolution only kicks in with exactly one project). + +## 7. Sanity check before the first real run + +- [ ] `.deepsec/.env.local` has a working credential. +- [ ] `bunx deepsec scan --limit 20` succeeds. +- [ ] `bunx deepsec process --limit 5` succeeds and prints a per-batch cost number. +- [ ] `data//INFO.md` is filled in (50-100 lines, project-specific). +- [ ] You and the user agree on a calibration scope for the first `process` run (deepsec docs default: `--limit 50 --concurrency 5`). diff --git a/.agents/skills/oma-deepsec/resources/triage.md b/.agents/skills/oma-deepsec/resources/triage.md new file mode 100644 index 0000000..12ead17 --- /dev/null +++ b/.agents/skills/oma-deepsec/resources/triage.md @@ -0,0 +1,107 @@ +# Triage: read findings, cut false positives, prioritize work + +## Severity vocabulary + +| Severity | Meaning | +|---|---| +| `CRITICAL` | Pre-auth or trivially exploitable issue with broad blast radius. | +| `HIGH` | Real vulnerability, likely exploitable in this codebase's context. | +| `MEDIUM` | Conditional vulnerability or one with significant attacker prerequisites. | +| `LOW` | Defense-in-depth gap, or correctness issue with weak security framing. | +| `HIGH_BUG` / `BUG` | Real bug the agent declined to call a vulnerability; typically correctness with security-adjacent risk. | + +`triage` adds a `priority` field on top of severity: + +| Priority | Trigger | +|---|---| +| `P0` | Drop everything; the exploit is trivial and the impact is critical. | +| `P1` | This sprint. | +| `P2` | Backlog. | +| `skip` | Not worth fixing (nuance, intended behavior, false alarm). 
| + +## Read order + +Do not show the user raw `process` output for HIGH+ findings. The right pipeline is: + +1. `bunx deepsec process` (or `process --diff` for PR mode). +2. `bunx deepsec triage --severity HIGH` to bucket findings into P0/P1/P2 (~$0.01 / finding). +3. `bunx deepsec revalidate --min-severity HIGH` re-reads the code and git history, then emits a verdict. The cost is comparable to `process`, and FP rate drops by 50%+. +4. `bunx deepsec export --format md-dir --out ./findings` to surface results to the user. + +`revalidate` verdicts: + +| Verdict | Action | +|---|---| +| `true-positive` | Fix it. Hand off to `oma-debug` or the matching domain skill. | +| `false-positive` | Note in `INFO.md` if the FP shape is recurring. Adjust matchers if it is a regex-level over-match. | +| `fixed` | The finding refers to code that was already patched in git history; no action. | +| `uncertain` | Re-run with the other agent, or escalate to a human reviewer. | + +## Cutting FP rate + +Two things help most: + +1. **Always `revalidate` before acting on `HIGH+`.** Worth the cost. +2. **Tighten `INFO.md`.** Even one paragraph about the auth shape, threat model, and known FP sources improves precision a lot. See `setup.md` § 4. + +After revalidation, FP rate on `HIGH+` typically lands in the 10–29 % range. + +## Refusals + +Models occasionally refuse to investigate a candidate (exploit-shaped source, content filter). After every batch deepsec asks the agent whether anything was skipped; `refused: true` appears in `RunMeta` and on the `FileRecord.refusal` field. The per-batch log shows a `refusal` marker. + +Handling: + +- A refused batch produces no false negatives. Affected files stay `pending`, so re-run `--reinvestigate` against the **other** backend (Claude ↔ Codex) to pick up the dropped sites. Findings dedupe across agents. 
+- If a single file consistently triggers refusals (>5 % of batches), add it to `data//config.json:ignorePaths`, **or** run that file alone with `--batch-size 1` so a refusal does not take an otherwise-fine batch down with it. +- Never silently drop a refusal. Document it in the user-facing summary. + +## Reading severity counts + +```bash +bunx deepsec metrics +``` + +Shows cross-project counts: severities, vulns by type, TPs after revalidation. Use it to decide where matcher investment pays off (clusters of `other-*` slugs are the strongest signal). + +## Per-finding markdown export shape + +`bunx deepsec export --format md-dir --out ./findings` produces: + +``` +findings/ +├── CRITICAL/ +├── HIGH/ +├── MEDIUM/ +├── LOW/ +└── BUG/ +``` + +Each file contains: severity, title, `vulnSlug`, file path with line numbers, description, recommendation, confidence, triage verdict (if run), revalidation verdict (if run), and an `analysisHistory` summary. Use these as inputs to issue tracker tickets; the structure is friendly to GitHub Issues / Linear / Jira import scripts. + +## When to *not* surface a finding + +- `revalidation.verdict === "false-positive"`. +- `revalidation.verdict === "fixed"` and the fix matches the current `HEAD`. +- `triage.priority === "skip"` with reasoning the user agrees with. +- Severity below the user-stated `severity_floor`. + +For everything else: surface it, with verdict, recommendation, and the file path. + +## Hand-off + +Route by **the layer of the vulnerable file**, judged from each finding's `filePath` + `vulnSlug` + `revalidation.verdict` against the project's own signals: `data//tech.json`, `INFO.md`, `priorityPaths`, and the actual directory structure. Do not bake a slug or path enumeration into this skill. Deepsec evolves its matcher set and project layouts vary, so trust the artifact at runtime. 
+ +| Layer of the vulnerable file | Specialist | +|---|---| +| Backend / server / API | `oma-backend` | +| Frontend / web client | `oma-frontend` | +| Mobile / native client | `oma-mobile` | +| IaC / cloud / network | `oma-tf-infra` | +| Database / data model | `oma-db` | +| CI / workflow / supply chain | `oma-dev-workflow` | +| Documentation drift surfaced by the run | `oma-docs` | + +**Ambiguity → `oma-debug` first.** Route to `oma-debug` whenever the layer is not obvious from the artifact: shared / isomorphic / utility code, an `other-*` slug, a fix that would touch multiple layers, `revalidation.verdict === "uncertain"`, or `BUG` / `HIGH_BUG` non-security correctness without an obvious owner. The hop is **triage, not fix**: pin the exact file:line and re-route to the right specialist with a layer-tagged finding. Fix inline only when the change is a single isolated line and the diagnosis is confident. Record the second-hop owner in the run summary. + +Attach to every routed item: file path, severity, `vulnSlug`, revalidation verdict, recommendation, and the export markdown path. diff --git a/.agents/skills/oma-design/SKILL.md b/.agents/skills/oma-design/SKILL.md index a31ed16..39a27b8 100644 --- a/.agents/skills/oma-design/SKILL.md +++ b/.agents/skills/oma-design/SKILL.md @@ -13,7 +13,7 @@ description: > ### Goal Design specialist that defines, creates, and validates project design systems. -DESIGN.md is the central artifact — all design work revolves around it. +DESIGN.md is the central artifact; all design work revolves around it. ### Intent signature - User asks for design system, `DESIGN.md`, visual direction, typography, color, motion, accessibility, anti-pattern review, or component guidance. @@ -132,12 +132,12 @@ bunx getdesign@latest list 1. Check `.design-context.md` before any design work. If missing, run Phase 1 (Setup) to create it. 2. System font stack as default (`system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif`). 
Add custom fonts only with project justification. 3. If the service supports CJK languages (ko/ja/zh): prioritize CJK-ready fonts (Pretendard Variable > Noto Sans CJK > system-ui fallback). If latin-only: choose fonts appropriate for the target audience. -4. Enforce anti-patterns strictly — reject AI slop. See `resources/anti-patterns.md`. +4. Enforce anti-patterns strictly; reject AI slop. See `resources/anti-patterns.md`. 5. Name colors semantically with hex values: "Deep Ocean Navy (#1a2332)" not "dark blue". 6. Recommend components with install commands (shadcn CLI). 7. ALL output must be responsive-first (mobile layout as default, enhance upward). 8. WCAG AA minimum for all designs. Respect `prefers-reduced-motion`. -9. Stitch MCP is optional — all phases work without it. +9. Stitch MCP is optional; all phases work without it. 10. Present 2-3 design directions and get user confirmation before generating. ### Anti-Pattern Quick Reference @@ -155,7 +155,7 @@ bunx getdesign@latest list - DON'T: Gradient orbs/blobs as hero decoration ("AI SaaS look") - DON'T: Gradient + glassmorphism + blur combo (triple slop) - DON'T: Mesh gradient backgrounds as primary visual -- DON'T: Pure white (#fff) on pure black (#000) — too harsh +- DON'T: Pure white (#fff) on pure black (#000); too harsh - DO: Solid colors or subtle single-hue gradients - DO: Texture (noise, grain, dither) over plain gradients - DO: Derive gradients from brand colors with clear purpose @@ -176,7 +176,7 @@ bunx getdesign@latest list - DO: 150ms micro-interactions, 200-500ms transitions ### Components -- DON'T: Glassmorphism everywhere — use sparingly +- DON'T: Glassmorphism everywhere; use sparingly - DON'T: Hover-only interactions without touch/keyboard alternatives - DO: shadcn/ui for base, Aceternity UI / React Bits for accent effects - DO: All interactive elements must have visible focus states @@ -195,8 +195,8 @@ MIT). 
Trigger it by listing a supported vendor domain in the ```markdown ## Reference Sites -- [linear.app](https://linear.app) — clean dark UI, minimal, professional -- [stripe.com](https://stripe.com) — strong hierarchy, purposeful animation +- [linear.app](https://linear.app): clean dark UI, minimal, professional +- [stripe.com](https://stripe.com): strong hierarchy, purposeful animation ``` Any domain that matches a brand in the getdesign manifest triggers an @@ -221,27 +221,27 @@ injection defenses, and multi-vendor merge policy live in `resources/getdesign-fetcher.md`. ### Resources -- `resources/execution-protocol.md` — 7-phase workflow -- `resources/anti-patterns.md` — Full DO/DON'T catalog -- `resources/checklist.md` — Audit checklist (Responsive + WCAG + Nielsen + Slop) -- `resources/design-md-spec.md` — DESIGN.md generation guide (9 sections) -- `resources/design-tokens.md` — CSS/Tailwind/shadcn export templates -- `resources/prompt-enhancement.md` — Vague request → detailed spec -- `resources/stitch-integration.md` — Stitch MCP tool mapping (optional) -- `resources/getdesign-fetcher.md` — Vendor seed fetch, hash verify, seed rules -- `resources/error-playbook.md` — Design error recovery +- `resources/execution-protocol.md`: 7-phase workflow +- `resources/anti-patterns.md`: Full DO/DON'T catalog +- `resources/checklist.md`: Audit checklist (Responsive + WCAG + Nielsen + Slop) +- `resources/design-md-spec.md`: DESIGN.md generation guide (9 sections) +- `resources/design-tokens.md`: CSS/Tailwind/shadcn export templates +- `resources/prompt-enhancement.md`: Vague request to detailed spec +- `resources/stitch-integration.md`: Stitch MCP tool mapping (optional) +- `resources/getdesign-fetcher.md`: Vendor seed fetch, hash verify, seed rules +- `resources/error-playbook.md`: Design error recovery ## References -- `reference/visual-hierarchy.md` — 7 hierarchy principles (Alignment, Color, Contrast, Proximity, Size, Texture, Time) -- `reference/typography.md` — Font 
selection, type scale, CJK -- `reference/color-and-contrast.md` — Color psychology, WCAG contrast -- `reference/spatial-design.md` — 8px grid, breakpoints, spacing -- `reference/motion-design.md` — motion/react, GSAP, Three.js, ogl, Temporal UX -- `reference/responsive-design.md` — Mobile-first, theme system -- `reference/component-patterns.md` — shadcn/Aceternity/React Bits catalog -- `reference/accessibility.md` — WCAG 2.2, ARIA, focus, reduced-motion -- `reference/shader-and-3d.md` — WebGL, R3F, ogl, performance +- `reference/visual-hierarchy.md`: 7 hierarchy principles (Alignment, Color, Contrast, Proximity, Size, Texture, Time) +- `reference/typography.md`: Font selection, type scale, CJK +- `reference/color-and-contrast.md`: Color psychology, WCAG contrast +- `reference/spatial-design.md`: 8px grid, breakpoints, spacing +- `reference/motion-design.md`: motion/react, GSAP, Three.js, ogl, Temporal UX +- `reference/responsive-design.md`: Mobile-first, theme system +- `reference/component-patterns.md`: shadcn/Aceternity/React Bits catalog +- `reference/accessibility.md`: WCAG 2.2, ARIA, focus, reduced-motion +- `reference/shader-and-3d.md`: WebGL, R3F, ogl, performance ### Examples -- `examples/design-context-example.md` — .design-context.md example -- `examples/landing-page-prompt.md` — Detailed landing page prompt +- `examples/design-context-example.md`: .design-context.md example +- `examples/landing-page-prompt.md`: Detailed landing page prompt diff --git a/.agents/skills/oma-design/examples/design-context-example.md b/.agents/skills/oma-design/examples/design-context-example.md index 1aa5726..892e8c3 100644 --- a/.agents/skills/oma-design/examples/design-context-example.md +++ b/.agents/skills/oma-design/examples/design-context-example.md @@ -1,4 +1,4 @@ -# .design-context.md — Example +# .design-context.md: Example This is an example of what `.design-context.md` looks like after Phase 1 (Setup). 
 The file lives in the project root and captures project-specific design decisions.
@@ -12,7 +12,7 @@ The file lives in the project root and captures project-specific design decision
 
 ## Target Audience
 - **Role**: Sales leaders, revenue ops managers, growth teams
-- **Tech level**: Moderate — comfortable with dashboards, not developers
+- **Tech level**: Moderate; comfortable with dashboards, not developers
 - **Age range**: 28-45
 - **Context**: Evaluating tools during work hours, often on laptop
 
@@ -37,8 +37,8 @@ The file lives in the project root and captures project-specific design decision
 
 ## Color Direction
 - **Background**: Deep near-black (#0a0a0a)
-- **Text**: Warm off-white (#f5f0eb) — not pure white
-- **Primary accent**: Signal Green (#22c55e) — CTAs, success states
+- **Text**: Warm off-white (#f5f0eb), not pure white
+- **Primary accent**: Signal Green (#22c55e) for CTAs, success states
 - **Avoid**: Purple gradients, rainbow effects, mesh gradients
 - **Borders**: White at 10% opacity (rgba(255,255,255,0.1))
@@ -49,9 +49,9 @@ The file lives in the project root and captures project-specific design decision
 - **Touch targets**: 44x44pt minimum on mobile
 
 ## Reference Sites
-- [linear.app](https://linear.app) — clean dark UI, minimal, professional
-- [vercel.com](https://vercel.com) — developer-premium aesthetic, great typography
-- [stripe.com](https://stripe.com) — strong hierarchy, purposeful animation
+- [linear.app](https://linear.app): clean dark UI, minimal, professional
+- [vercel.com](https://vercel.com): developer-premium aesthetic, great typography
+- [stripe.com](https://stripe.com): strong hierarchy, purposeful animation
 
 > **Note**: every domain in this section is automatically matched
 > against the `getdesign` vendor catalog during Phase 1. All three
@@ -60,7 +60,7 @@ The file lives in the project root and captures project-specific design decision
 > vendor matching, use a domain that is not in the catalog (e.g., an
 > internal design reference or a custom portfolio URL). See
 > `resources/getdesign-fetcher.md` for the matching algorithm and the
-> Seed Application Rules — notably, Typography is never adopted from
+> Seed Application Rules. Notably, Typography is never adopted from
 > the vendor seed, so the Pretendard Variable choice in this file will
 > still win on the Korean-localized project above.

diff --git a/.agents/skills/oma-design/examples/landing-page-prompt.md b/.agents/skills/oma-design/examples/landing-page-prompt.md
index 23b12d4..44c4c92 100644
--- a/.agents/skills/oma-design/examples/landing-page-prompt.md
+++ b/.agents/skills/oma-design/examples/landing-page-prompt.md
@@ -1,4 +1,4 @@
-# Landing Page Design Prompt — Example
+# Landing Page Design Prompt: Example
 
 This is an example of the level of detail Phase 3 (Enhance) should produce.
 Based on motionsites.ai-level specifications.
@@ -14,7 +14,7 @@ Based on motionsites.ai-level specifications.
 ## Design System
 
 ### Fonts
-- Heading: Instrument Serif (italic) — display headings only
+- Heading: Instrument Serif (italic) for display headings only
 - Body: system-ui stack (or Pretendard for CJK)
 
 ### CSS Variables
@@ -48,7 +48,7 @@ Based on motionsites.ai-level specifications.
 ### HERO (full viewport)
 - **Layout**: centered, min-h-screen, flex column
 - **Background**: video (mp4, autoplay loop muted) with gradient overlay to black at bottom
-- **Badge**: liquid-glass rounded-full pill — "New" tag + announcement text
+- **Badge**: liquid-glass rounded-full pill with "New" tag + announcement text
 - **Heading**: BlurText component (motion/react), word-by-word blur-to-clear animation
   - text-6xl md:text-7xl lg:text-[5.5rem] font-heading italic
   - leading-[0.8] tracking-[-4px]
@@ -61,7 +61,7 @@ Based on motionsites.ai-level specifications.
 ### PARTNERS BAR
 - **Layout**: centered column, below hero
-- **Badge**: liquid-glass rounded-full — "Trusted by the teams behind"
+- **Badge**: liquid-glass rounded-full labeled "Trusted by the teams behind"
 - **Names**: horizontal row, text-2xl md:text-3xl font-heading italic text-white, gap-12
 - **Companies**: Stripe, Vercel, Linear, Notion, Figma
 - **Responsive**: reduce gap, text-xl on mobile, wrap if needed
 
@@ -71,7 +71,7 @@ Based on motionsites.ai-level specifications.
 - **Background**: HLS video (hls.js), absolute cover, z-0
   - Top + bottom fade gradients (200px each, black ↔ transparent)
 - **Content** (z-10, centered):
-  - Badge: liquid-glass rounded-full — "How It Works"
+  - Badge: liquid-glass rounded-full labeled "How It Works"
   - Heading: "You dream it. We ship it."
   - Subtext: description paragraph
   - Button: liquid-glass-strong rounded-full + ArrowUpRight
@@ -82,7 +82,7 @@ Based on motionsites.ai-level specifications.
 - **Row 1** (text left, image right):
   - H3 + paragraph + CTA button
   - Image in liquid-glass rounded-2xl container
-- **Row 2** (image left, text right — lg:flex-row-reverse):
+- **Row 2** (image left, text right; lg:flex-row-reverse):
   - Same structure, reversed layout
 - **Responsive**: stack vertically, image above text on mobile
 
@@ -125,7 +125,7 @@ Based on motionsites.ai-level specifications.
 
 ## Dependencies
 - hls.js (HLS video streaming)
-- motion (animation — import from "motion/react")
+- motion (animation; import from "motion/react")
 - lucide-react (icons)
 - tailwindcss-animate

diff --git a/.agents/skills/oma-design/reference/accessibility.md b/.agents/skills/oma-design/reference/accessibility.md
index 509c35d..2b57b6e 100644
--- a/.agents/skills/oma-design/reference/accessibility.md
+++ b/.agents/skills/oma-design/reference/accessibility.md
@@ -49,12 +49,12 @@ const prefersReduced = useReducedMotion()
 
 ### Landmarks
 
 Every page must use these semantic elements:
-- `<header>` — site header with navigation
-- `