Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions openspec/CHANGE_ORDER.md
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,12 @@ These changes are the modules-side runtime companions to split core governance a
| governance | 02 | governance-02-exception-management | [#167](https://github.com/nold-ai/specfact-cli-modules/issues/167) | Parent Feature: [#163](https://github.com/nold-ai/specfact-cli-modules/issues/163); core counterpart `specfact-cli#248`; policy runtime `#158` |
| validation | 02 | validation-02-full-chain-engine | [#171](https://github.com/nold-ai/specfact-cli-modules/issues/171) | Parent Feature: [#163](https://github.com/nold-ai/specfact-cli-modules/issues/163); core counterpart `specfact-cli#241`; runtime inputs from `#164` and `#165`; policy semantics from `#158` |

### Code review and sidecar validation improvements

| Module | Order | Change folder | GitHub # | Blocked by |
|--------|-------|---------------|----------|------------|
| code-review + codebase | 01 | code-review-bug-finding-and-sidecar-venv-fix | [#174](https://github.com/nold-ai/specfact-cli-modules/issues/174) | Parent Feature: [#175](https://github.com/nold-ai/specfact-cli-modules/issues/175); Epic: [#162](https://github.com/nold-ai/specfact-cli-modules/issues/162) |

### Documentation restructure

| Module | Order | Change folder | GitHub # | Blocked by |
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-04-08
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
## Context

The `specfact-code-review` bundle runs seven analysis tools in sequence (ruff, radon, semgrep, AST, basedpyright, pylint, contract_runner). External-repo validation across 10 Python repos revealed two gaps:

1. **No bug-finding signal on repos without icontract**: CrossHair already runs (`contract_runner.py:112`) but with `--per_path_timeout 2` and a 30s hard cap — too tight to find counterexamples in type-annotated code. Semgrep runs only `clean_code.yaml` rules (style/architecture); no security or bug-pattern rules exist. `MISSING_ICONTRACT` fires on every public function in any repo that hasn't opted into icontract, producing hundreds of low-value warnings that drown real signal.

2. **Sidecar venv self-scan**: All three framework extractors (FastAPI, Flask, Django/DRF) use `search_path.rglob("*.py")` with no path exclusions. The sidecar dependency installer writes into `.specfact/venv/` before extraction runs. Result: the FastAPI extractor on gpt-researcher picked up FastAPI's own source from the venv and reported 25,947 routes (actual: 19). The fix is a single shared exclusion applied at the `rglob` call site.

## Goals / Non-Goals

**Goals:**
- Add a `--bug-hunt` flag to `specfact code review run` that gives CrossHair meaningful budget (10s/path, 120s total)
- Add a `bugs.yaml` semgrep config with Python security/bug rules, wired as a second semgrep pass
- Auto-suppress `MISSING_ICONTRACT` when no `@require`/`@ensure` imports are found in the reviewed files
- Exclude `.specfact/` from all sidecar framework extractor scan paths

**Non-Goals:**
- Replacing CrossHair with a different symbolic execution engine
- Adding semgrep cloud/registry rules (offline-first; rules must be bundled locally)
- Changing the score model or CI exit codes for bug-hunt findings
- Fixing CrossHair's analysis depth beyond timeout tuning

## Decisions

**D1: `bugs.yaml` as a separate semgrep config, not merged into `clean_code.yaml`**

Keeps clean-code rules and bug-finding rules independently evolvable. The semgrep runner already has `find_semgrep_config` that walks parents for `clean_code.yaml` by name — add a parallel `find_semgrep_bugs_config` that looks for `.semgrep/bugs.yaml` and returns `None` (not an error) when absent. Runner calls both; missing `bugs.yaml` is a no-op.

Alternative considered: a single merged config. Rejected — mixing style warnings and bug errors in one file makes the rule set harder to reason about and harder to disable selectively.

**D2: `--bug-hunt` flag rather than a separate command**

`specfact code review run --bug-hunt` passes `bug_hunt=True` through `ReviewRunRequest` → `run_review` → `run_contract_check`. No new command surface, no CLI help restructuring. The flag increases CrossHair timeouts only; all other tools run at normal speed.

Alternative considered: `specfact code review bug-hunt` as a new subcommand. Rejected — unnecessary API surface; `--bug-hunt` is composable with `--scope full`, `--json`, `--out`, etc.

**D3: MISSING_ICONTRACT auto-suppression via import scan, not a flag**

Scan the reviewed file list for any `from icontract import` or `import icontract` statement before running `contract_runner`. If none found, skip `MISSING_ICONTRACT` findings entirely. This is automatic — no user flag needed, works correctly on both internal bundles (which do use icontract) and external repos (which don't).

Alternative considered: `--suppress-missing-contracts` flag. Rejected — requires users to know to pass it; auto-detection is always correct.

**D4: Sidecar exclusion via a shared `_is_excluded_path` helper on `BaseFrameworkExtractor`**

Add `_EXCLUDED_DIR_NAMES = frozenset({".specfact", ".git", "__pycache__", "node_modules"})` and a `_iter_python_files(root: Path)` generator on `BaseFrameworkExtractor` that yields only files whose parts contain no excluded dir name. All three framework extractors replace `search_path.rglob("*.py")` with `self._iter_python_files(search_path)`.

Alternative considered: filtering at the orchestrator level before passing `repo_path` to extractors. Rejected — extractors own the scan, fixing at the source is cleaner and prevents future extractors from re-introducing the bug.

## Risks / Trade-offs

- **CrossHair false positives in bug-hunt mode**: Longer timeouts mean CrossHair explores more paths and may surface `SideEffectDetected` warnings on I/O-heavy functions. Mitigated: `_IGNORED_CROSSHAIR_PREFIXES` already filters `SideEffectDetected`; no change needed.
- **`bugs.yaml` rule maintenance**: Bundled semgrep rules can go stale. Mitigated: keep the initial set small (5–10 high-confidence rules), document in the file header that additions require a PR with test evidence.
- **MISSING_ICONTRACT auto-suppression masking real gaps**: If a bundle file imports icontract but only for one function, all other uncovered functions are still flagged. The auto-suppression only fires when zero icontract imports exist — so internal bundles are unaffected.
- **Sidecar exclusion breaking legitimate `.specfact/` source scanning**: No known repo puts application source under `.specfact/`. Risk is negligible.

## Migration Plan

No migrations required. Both changes are additive (`--bug-hunt` flag, `bugs.yaml` config) or bug fixes (venv exclusion, MISSING_ICONTRACT suppression). Existing behaviour is unchanged when `--bug-hunt` is not passed and when `bugs.yaml` is absent.

Patch version bump on `specfact-codebase` (sidecar bug fix). No version bump required on `specfact-code-review` for the `--bug-hunt` flag or MISSING_ICONTRACT change — both are backwards-compatible additions. If the semgrep `bugs.yaml` is shipped as a bundled resource, a minor bump on `specfact-code-review` is appropriate.

## Open Questions

- Which specific semgrep rules to include in `bugs.yaml` v1? Candidates: `python.lang.security.audit.dangerous-system-call`, `python.lang.correctness.useless-eqeq`, `python.lang.security.audit.hardcoded-password`, `python.lang.correctness.none-not-null`. Needs sign-off before implementation to avoid shipping noisy rules.
- Should `--bug-hunt` findings appear in the score model? Currently CrossHair counterexamples are `severity=warning`; they could be promoted to `severity=error` in bug-hunt mode. Defer to implementation task.
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
## Why

External repo validation (crewAI, gpt-researcher, and 8 OSS baseline repos) revealed two concrete tool gaps: the code review module produces no bug-finding signal on repos that don't use icontract, and the sidecar route extractor inflates route counts by scanning its own installed venv. Both blockers limit the tool's usefulness on arbitrary Python codebases.

## What Changes

- **Semgrep bug-finding rules**: Add a `bugs.yaml` semgrep config alongside the existing `clean_code.yaml`, covering Python security patterns, unsafe access, known bug-prone patterns. Wire it as a second semgrep pass in the runner.
- **CrossHair timeout increase + bug-hunt mode**: Expose a `--bug-hunt` flag on `specfact code review run` that raises `--per_path_timeout` from 2s to 10s and total timeout from 30s to 120s, giving CrossHair enough budget to find counterexamples in type-annotated code without icontract.
- **MISSING_ICONTRACT suppression on external repos**: When a reviewed codebase has zero icontract usage, suppress `MISSING_ICONTRACT` warnings rather than emitting one per public function. Auto-detect by scanning the reviewed files for any `@require`/`@ensure` import; if none found, omit the rule entirely for that run.
- **Sidecar venv self-scan fix**: Exclude `.specfact/` from the route extraction scan path in all framework extractors (`FlaskExtractor`, `FastAPIExtractor`, `DjangoExtractor`). Currently the sidecar installs deps into `.specfact/venv` then scans the full repo tree, picking up the installed framework's own source (gpt-researcher: 25,947 reported routes vs 19 real).

## Capabilities

### New Capabilities

- `code-review-bug-finding`: Semgrep security/bug rules pass and CrossHair bug-hunt mode — a second analysis layer focused on detecting actual bugs rather than clean-code style issues.

### Modified Capabilities

- `contract-runner`: CrossHair timeout parameters increase for bug-hunt mode; MISSING_ICONTRACT auto-suppressed when no icontract usage is detected in the reviewed files.
- `clean-code-policy-pack`: Second semgrep config (`bugs.yaml`) added alongside `clean_code.yaml`; semgrep runner gains a second config load pass.
- `review-run-command`: New `--bug-hunt` flag wired through `ReviewRunRequest` and `run_command`.

## Impact

- `packages/specfact-code-review/src/specfact_code_review/tools/contract_runner.py` — CrossHair timeout params, MISSING_ICONTRACT suppression logic
- `packages/specfact-code-review/src/specfact_code_review/tools/semgrep_runner.py` — second config pass for `bugs.yaml`
- `packages/specfact-code-review/src/specfact_code_review/run/commands.py` — `--bug-hunt` flag, `ReviewRunRequest` field
- `packages/specfact-code-review/src/specfact_code_review/run/runner.py` — pass `bug_hunt` flag through to tool steps
- `packages/specfact-code-review/.semgrep/bugs.yaml` — new file
- `packages/specfact-codebase/` — sidecar framework extractors: exclude `.specfact/` from scan paths
- No registry, manifest, or semver impact for specfact-code-review (behaviour change only; patch bump on specfact-codebase for the bug fix)
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
## ADDED Requirements

### Requirement: Semgrep bug-finding rules pass

The system SHALL run a second semgrep pass using a `bugs.yaml` config alongside
the existing `clean_code.yaml` pass. The `bugs.yaml` config SHALL cover Python
security and correctness patterns. When `bugs.yaml` is absent the pass SHALL be
silently skipped without error.

#### Scenario: bugs.yaml present — security findings emitted

- **WHEN** `.semgrep/bugs.yaml` exists in the bundle
- **AND** `run_semgrep` is called on Python files matching a bug rule
- **THEN** `ReviewFinding` records are returned with `category="security"` or `category="correctness"`
- **AND** findings reference the matched rule id from `bugs.yaml`

#### Scenario: bugs.yaml absent — pass is a no-op

- **WHEN** no `.semgrep/bugs.yaml` file is discoverable
- **AND** `run_semgrep` is called
- **THEN** no finding is returned for the missing bugs pass
- **AND** no exception propagates to the caller

#### Scenario: bugs.yaml findings are included in the JSON report

- **WHEN** `specfact code review run --json` is executed on a file matching a bug rule
- **THEN** the output JSON contains findings from both the clean-code and bug-finding passes

### Requirement: CrossHair bug-hunt mode

The system SHALL support a `--bug-hunt` flag on `specfact code review run` that
increases the CrossHair per-path timeout to 10 seconds and the total CrossHair
timeout to 120 seconds. Without `--bug-hunt` the existing timeouts SHALL remain
unchanged.

#### Scenario: --bug-hunt increases CrossHair timeouts

- **WHEN** `specfact code review run --bug-hunt --json <file>` is executed
- **THEN** CrossHair runs with `--per_path_timeout 10`
- **AND** the total CrossHair subprocess timeout is 120 seconds

#### Scenario: Default run uses original CrossHair timeouts

- **WHEN** `specfact code review run --json <file>` is executed without `--bug-hunt`
- **THEN** CrossHair runs with `--per_path_timeout 2`
- **AND** the total CrossHair subprocess timeout is 30 seconds

#### Scenario: --bug-hunt is composable with --scope and --out

- **WHEN** `specfact code review run --bug-hunt --scope full --json --out report.json` is executed
- **THEN** the command completes without error
- **AND** the output JSON is written to `report.json`
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
## MODIFIED Requirements

### Requirement: icontract Decorator AST Scan and CrossHair Fast Pass

The system SHALL AST-scan changed Python files for public functions missing
`@require` / `@ensure` decorators, and run CrossHair with a configurable
per-path timeout for counterexample discovery. When no icontract usage is
detected in the reviewed files, `MISSING_ICONTRACT` findings SHALL be
suppressed entirely for that run.

#### Scenario: Public function without icontract decorators produces a contracts finding when icontract is in use

- **GIVEN** a Python file with a public function lacking icontract decorators
- **AND** at least one other reviewed file imports from `icontract`
- **WHEN** `run_contract_check(files=[...])` is called
- **THEN** a `ReviewFinding` is returned with `category="contracts"` and
`severity="warning"`

#### Scenario: MISSING_ICONTRACT suppressed when no icontract usage detected

- **GIVEN** a set of reviewed Python files containing no `from icontract import` or
`import icontract` statements
- **WHEN** `run_contract_check(files=[...])` is called
- **THEN** no `MISSING_ICONTRACT` findings are returned
- **AND** CrossHair still runs on the files

#### Scenario: Decorated public function produces no contracts finding

- **GIVEN** a Python file with a public function decorated with both `@require` and
`@ensure`
- **WHEN** `run_contract_check(files=[...])` is called
- **THEN** no contract-related finding is returned for that function

#### Scenario: Private functions are excluded from the scan

- **GIVEN** a Python file with a function named `_private_helper` and no icontract
decorators
- **WHEN** `run_contract_check(files=[...])` is called
- **THEN** no finding is produced for `_private_helper`

#### Scenario: CrossHair counterexample maps to a contracts warning

- **GIVEN** CrossHair finds a counterexample for a reviewed function
- **WHEN** `run_contract_check(files=[...])` is called
- **THEN** a `ReviewFinding` is returned with `category="contracts"`,
`severity="warning"`, and `tool="crosshair"`

#### Scenario: CrossHair timeout or unavailability degrades gracefully

- **GIVEN** CrossHair times out or is not installed
- **WHEN** `run_contract_check(files=[...])` is called
- **THEN** the AST scan still runs
- **AND** no exception propagates to the caller
- **AND** CrossHair unavailability does not produce a blocking finding

#### Scenario: Bug-hunt mode uses extended CrossHair timeouts

- **GIVEN** `run_contract_check(files=[...], bug_hunt=True)` is called
- **WHEN** CrossHair executes
- **THEN** CrossHair runs with `--per_path_timeout 10`
- **AND** the subprocess timeout is 120 seconds
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
## ADDED Requirements

### Requirement: --bug-hunt flag on review run command

The `specfact code review run` command SHALL accept a `--bug-hunt` flag that
enables extended CrossHair timeouts and is composable with all existing flags.

#### Scenario: --bug-hunt flag accepted without error

- **GIVEN** `specfact code review run --bug-hunt --json <file>` is executed
- **WHEN** the command parses its arguments
- **THEN** the command proceeds without a CLI argument error
- **AND** `ReviewRunRequest.bug_hunt` is `True`

#### Scenario: --bug-hunt flag absent defaults to False

- **GIVEN** `specfact code review run --json <file>` is executed without `--bug-hunt`
- **WHEN** the command parses its arguments
- **THEN** `ReviewRunRequest.bug_hunt` is `False`
- **AND** CrossHair uses the standard 2-second per-path timeout
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
## ADDED Requirements

### Requirement: Framework extractors exclude .specfact from scan paths

All sidecar framework extractors (FastAPI, Flask, Django, DRF) SHALL exclude
`.specfact/` directories from Python file discovery. This prevents the sidecar's
own installed venv and workspace artefacts from being scanned as application
source.

#### Scenario: FastAPI extractor ignores .specfact/venv

- **GIVEN** a repo with a `.specfact/venv/` directory containing FastAPI source
- **WHEN** the FastAPI extractor runs route extraction on that repo
- **THEN** no routes are extracted from files under `.specfact/`
- **AND** only routes from application source files are returned

#### Scenario: Flask extractor ignores .specfact/venv

- **GIVEN** a repo with a `.specfact/venv/` directory containing Flask source
- **WHEN** the Flask extractor runs route extraction on that repo
- **THEN** no routes are extracted from files under `.specfact/`

#### Scenario: Django extractor ignores .specfact/venv

- **GIVEN** a repo with a `.specfact/venv/` directory containing Django source
- **WHEN** the Django extractor runs route extraction on that repo
- **THEN** no routes are extracted from files under `.specfact/`

#### Scenario: Other standard exclusions also apply

- **WHEN** any framework extractor scans a repo
- **THEN** files under `.git/`, `__pycache__/`, and `node_modules/` are also excluded
- **AND** legitimate application source outside these directories is not affected

#### Scenario: Route count reflects real application routes only

- **GIVEN** gpt-researcher repo with 19 real FastAPI routes and `.specfact/venv` installed
- **WHEN** sidecar validation runs route extraction
- **THEN** routes extracted is approximately 19, not 25,947
Loading
Loading