A repository-native specification layer for AI-assisted coding.
Intent Specification Layer helps AI coding agents stop guessing, and stop calling work done before behavior is verified.
The core rule is simple:
docs/explains.spec/governs.
AI coding fails when unstated intent becomes implementation guesswork. PRDs, chat prompts, and plan-mode outputs often describe the happy path while leaving authority, vocabulary, unwanted behavior, rollback, and drift controls implicit. The Intent Specification Layer turns those implicit decisions into a reviewable model of user experience and system behavior.
It is bidirectional:
- before implementation, it reduces AI and human guesswork;
- after implementation, it lets reviewers find missing UX, domain, and failure behavior without reading all the code first.
AI agents can produce plausible code from vague natural language. The dangerous failure is not that the code never runs. The dangerous failure is that the code runs while unstated intent, edge cases, rollback rules, and verification remain implicit.
This layer makes the agent work against explicit obligations:
intent -> EARS behavior -> REQ/S statement ID -> verification obligation
-> generated stub -> non-generated trace -> executed evidence
A generated test stub is not evidence. A code comment is not evidence. The behavior is only complete when a real test, guardrail, smoke check, or named manual UX/runtime review has run or been recorded.
npm install
npm run checkThe check regenerates and validates requirement artifacts, verifies local links, runs agent protocol guardrails, runs generated requirement slots, and executes the included method-comparison simulations.
This repository contains:
- an authoritative
spec/folder that dogfoods the method for this project; - a practical guide to the L0-L3 spec model;
- reusable templates for project constitutions and feature specs;
- a spec review loop for finding product and code gaps from the spec itself;
- a review ledger pattern that keeps spec-only findings separate from code evidence;
- a spec-as-product-standard rule that prevents agents from weakening accepted requirements just because implementation is missing;
- a method-update propagation protocol that prevents agents from stopping after governance files change while feature specs remain unaudited;
- an agent mode router that makes broad user requests choose a concrete workflow and completion rule before the agent acts;
- an agent operating protocol that keeps AI agents from confusing intent, evidence, review findings, and generated stubs;
- feature archetype packs that force predictable edge cases into review before implementation;
- a REQ-ID and statement-level verification bridge that turns each intent statement into a proof obligation, while keeping generated placeholders, planned evidence, non-generated traces, and executed verification separate;
- a worked coupon-order example;
- two reproducible experiments that compare PRD, BDD, EARS, Domain+EARS, and the full spec layer;
- repository health files for citation, licensing, contributions, and releases.
| Layer | Purpose | Required when |
|---|---|---|
| L0 Constitution | Product-wide values, authorities, forbidden shortcuts | Always |
| L1 Domain Truth | Entities, states, vocabulary, invariants, ownership | Shared concepts or 2+ modules |
| L2 Behavior Spec | EARS requirements for system response | Every behavior change |
| L3 Interface Contract | Ordering, payloads, idempotency, rollback, partial failure | Multi-step or cross-resource mutation |
Minimum rule set:
- L0 always exists.
- Behavior changes require L2.
- Shared terms require L1.
- Partial failure, rollback, retry, deletion, payment, entitlement, or idempotency requires L3.
- A reviewer should be able to understand the intended user journey from the spec before reading implementation code.
- Accepted future behavior belongs in
spec/; implementation readiness belongs in tests, evidence records, or review ledgers. - Do not downgrade accepted specs to match incomplete code. If code is missing, record an implementation or evidence gap and keep the standard visible.
- Every REQ-ID or statement ID should generate a test stub and map to a verification path, while generated stubs remain clearly marked as pending evidence.
- A statement is not complete because a stub exists or an
@Spec(...)trace is present. It is complete only when a real test, guardrail, smoke check, or named manual UX review has executed or recorded evidence for that statement. - Tools consume the source layer; they do not define it.
- Use feature archetype packs to force common-sense edge-case discovery before implementation. Async work, source ingestion, external AI, approval, payment, auth, deletion, and external integration each have predictable failure surfaces.
- If a user provides valid input and automation fails, the system must preserve that input and provide a recoverable draft, still-processing state, retry path, or actionable error. Do not collapse valid input into an empty manual-only fallback.
- Customer-visible work that can outlive the generic API timeout needs an explicit latency contract: synchronous, endpoint-specific long request, polling, background job, or streaming.
- Applying a new ILS version or upstream rule to an existing project requires a propagation audit across in-scope authoritative feature specs. Governance updates alone are not completion.
- Broad completion language (
done,complete,ready) is allowed only when the selected task mode's completion rule is satisfied.
ILS specs are product and system standards, not implementation inventories. When accepted spec and current code disagree, keep the accepted spec and record an implementation or evidence gap. Do not downgrade accepted specs to match incomplete code.
Use Spec as product standard when an AI agent is unsure whether to update the spec, code, tests, or review ledger.
- Agent operating rules
- Project specs
- Agent mode router
- Agent operating protocol
- Guide
- Spec as product standard
- Method update propagation
- Spec authoring quality
- Adoption playbook
- Spec review loop
- Spec-to-test bridge
- Naming and structure decision
- Research notes
- References
- GitHub discovery setup
- Korean community launch draft
- Feature spec template
- Experience review template
- Change proposal template
- Spec review finding template
- Review ledger template
- Method update propagation template
- Agent task brief template
- Coupon-order example
- Heading-style EARS example
- Frontmatter schema
spec/
README.md
00_constitution.md
features/<feature>/spec.md
changes/
reviews/
schemas/
docs/
narrative documentation, design notes, research, history
This repository now uses that shape for itself. Use a global structure when the product has one unified domain. Use a feature structure when different domain slices have different authorities, change cadences, or failure modes.
Run the local checks:
npm run checkThis checks syntax, local Markdown links, generated requirement artifacts, agent operating protocol guardrails, generated statement stubs, the generated verification report, the benchmark, and the simulation.
Current diagnostic results:
| Method | Information sufficiency | LLM-style runtime simulation |
|---|---|---|
| PRD | 0.000 | 2/15 |
| BDD/Gherkin | 0.100 | 5/15 |
| EARS only | 0.513 | 11/15 |
| Domain + EARS | 0.750 | 12/15 |
| Full Spec Layer | 1.000 | 15/15 |
These are not universal model benchmarks. They are diagnostic checks showing which failure classes remain implicit when a layer is missing.
This project is not a replacement for Spec Kit, OpenSpec, Kiro, BMAD, Augment Intent, or plan-mode workflows. It is the source layer those tools can consume.
The distinction:
- SDD tools help execute a workflow.
- This layer defines and reviews what must remain true.
If an AI agent implements accepted behavior from this layer, the default action
is to create or update verification evidence in the same change. The agent
should not finish with only generated stubs, planned evidence, or @Spec(...)
breadcrumbs. Those are intermediate states. The implementation is complete only
when each touched statement is verified, manually recorded, or explicitly left
blocked with a reason.
If implementation evidence shows a behavior is missing, the agent should not
weaken the accepted spec. It should classify the mismatch as
missing_implementation, partial_implementation, missing_test, or
wrong_code, then update code and evidence or record the blocker.
If the task is to apply a newer ILS rule to a project, the agent should not
claim completion after updating AGENTS, README, templates, or generated files.
It must inventory the authoritative specs, audit in-scope feature specs under
the new rule, update accepted L1/L2/L3 text, regenerate artifacts, and list any
residual pending_spec_review or gap entries.
For broad requests, the agent should route the work through Agent mode router before editing. This makes the completion claim depend on a mode-specific rule instead of the agent's memory or confidence.
This is an initial public version. The method is usable today, but the empirical checks are deliberately small and should be expanded across more domains, models, and real projects.
Use the GitHub "Cite this repository" control or see CITATION.cff.
- Guide and research notes: CC BY 4.0. See LICENSE-DOCS.md.
- Templates, examples, scripts, and executable experiment code: MIT. See LICENSE-CODE.md.