From f7b2387d9d1e77b6ff1785ee10e17ed92c1b6d28 Mon Sep 17 00:00:00 2001
From: Thomas Pluck <thomaspluck95@proton.me>
Date: Sat, 18 Apr 2026 09:07:33 +0100
Subject: [PATCH 1/4] ci: gate fixture runner on analyzer binary presence
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two fixtures were failing CI on master:
- rust_analyzer/ — rust-analyzer is not installed in CI (the workflow
  only installs pyright + semgrep). The runner had no notion of
  optional analyzer binaries, so the fixture failed with 'expected
  hallucinated-import diagnostic'.
- semgrep/ — the equivalent cargo integration test
  (semgrep_fixture_flags_multiple_rules_when_installed) passes, so
  semgrep itself works. The shell runner was redirecting all daemon
  stderr to /dev/null, which made the failure undebuggable.

Fix:
- Map each fixture name to its required external binary and skip the
  fixture when that binary is missing. rust-analyzer / semgrep /
  pyright / tsc are now treated as optional capabilities, the way
  daemon/tests/fixtures.rs already treats them.
- Capture daemon stderr to a temp file and dump it on failure so the
  next regression doesn't require source-code spelunking.
- Update the header comment and the misleading 'expected
  hallucinated-import' message: the script only checks a generic
  non-green invariant, while detailed per-analyzer expectations live
  in daemon/tests/fixtures.rs.
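
The capture-and-dump idea from the second bullet can be sketched in
isolation. This is a minimal illustration, not the runner's code:
`run_capturing` and `flaky` are hypothetical names, and `flaky` stands
in for the daemon. One detail worth showing is that under `set -e` a
plain assignment from a failing command substitution aborts the
script, so the sketch keeps control with `||` before deciding whether
to replay stderr.

```shell
#!/usr/bin/env bash
set -euo pipefail

# run_capturing CMD...: passes CMD's stdout through unchanged. If CMD
# exits non-zero and wrote to stderr, that output is replayed (indented)
# on our stderr. Returns CMD's exit status.
run_capturing() {
  local log status=0
  log="$(mktemp)"
  # `|| status=$?` keeps `set -e` from aborting on a non-zero exit.
  "$@" 2>"$log" || status=$?
  if (( status != 0 )) && [[ -s "$log" ]]; then
    echo "  captured stderr:" >&2
    sed 's/^/    /' "$log" >&2
  fi
  rm -f "$log"
  return "$status"
}

# Hypothetical stand-in for the daemon: prints a summary, then fails.
flaky() {
  echo '{"files": 0}'
  echo "boom" >&2
  return 3
}

# The `||` list is what lets the script survive the failure here.
out="$(run_capturing flaky)" || echo "exit status: $?"
echo "stdout was: $out"
```

With this shape a crash in the wrapped command still reaches the
reporting path instead of ending the script at the assignment.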

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 test/run_fixtures.sh | 58 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 46 insertions(+), 12 deletions(-)

diff --git a/test/run_fixtures.sh b/test/run_fixtures.sh
index ca83b4d..2afdbd0 100755
--- a/test/run_fixtures.sh
+++ b/test/run_fixtures.sh
@@ -1,12 +1,16 @@
 #!/usr/bin/env bash
 # Integration harness for test/fixtures/ai-slop/.
 #
-# For each subdirectory under fixtures/ai-slop, run `ive-daemon scan` and
-# check the returned JSON matches the expectations in the YAML sidecar.
-# This runner is deliberately shell+jq-only (no extra deps) and only checks
-# the invariants that exist in v1.
+# For each subdirectory, run `ive-daemon scan` and check the returned JSON
+# satisfies a generic non-green invariant. Detailed per-analyzer expectations
+# live in daemon/tests/fixtures.rs — this script is the end-to-end smoke
+# check that the released binary's wiring still works.
 #
-# Exit code is non-zero on the first mismatch.
+# Fixtures that depend on an external analyzer binary (rust-analyzer,
+# semgrep, pyright, tsc) are skipped when that binary isn't on PATH so a
+# minimal install doesn't fail CI for missing optional capabilities.
+#
+# Exit code is non-zero on the first hard mismatch.
 
 set -euo pipefail
 
@@ -18,35 +22,65 @@ if [[ ! -x "$DAEMON" ]]; then
   exit 2
 fi
 
+# Map fixture name → required external binary. Empty string means the
+# fixture relies only on the daemon's built-in analyzers (hallucination,
+# crossfile, binding) and must always produce diagnostics.
+required_binary() {
+  case "$1" in
+    rust_analyzer) echo rust-analyzer ;;
+    semgrep)       echo semgrep       ;;
+    pyright)       echo pyright       ;;
+    tsc)           echo tsc           ;;
+    *)             echo ""            ;;
+  esac
+}
+
 FAIL=0
 
 for fixture_dir in "$ROOT"/test/fixtures/ai-slop/*/; do
   name="$(basename "$fixture_dir")"
   echo "── fixture: $name"
-  summary="$("$DAEMON" scan --workspace "$fixture_dir" 2>/dev/null)"
+
+  required="$(required_binary "$name")"
+  if [[ -n "$required" ]] && ! command -v "$required" >/dev/null 2>&1; then
+    echo "  ⤳ skipped: required binary '$required' not on PATH (covered by cargo tests when installed)"
+    continue
+  fi
+
+  # Capture stderr so a daemon panic or analyzer error is visible on failure.
+  stderr_log="$(mktemp)"
+  summary="$("$DAEMON" scan --workspace "$fixture_dir" 2>"$stderr_log")"
   files_total="$(echo "$summary" | grep -o '"files":[[:space:]]*[0-9]*' | head -1 | awk '{print $2}')"
   diagnostics="$(echo "$summary" | grep -o '"diagnostics":[[:space:]]*[0-9]*' | head -1 | awk '{print $2}')"
   red="$(echo "$summary" | grep -o '"redFiles":[[:space:]]*[0-9]*' | head -1 | awk '{print $2}')"
   yellow="$(echo "$summary" | grep -o '"yellowFiles":[[:space:]]*[0-9]*' | head -1 | awk '{print $2}')"
 
-  if [[ "${files_total:-0}" == "0" ]]; then
-    echo "  ✗ expected at least one file in $name"
+  fail_with() {
+    echo "  ✗ $1 in $name"
+    if [[ -s "$stderr_log" ]]; then
+      echo "  ── daemon stderr ──"
+      sed 's/^/    /' "$stderr_log"
+    fi
+    rm -f "$stderr_log"
     FAIL=1
+  }
+
+  if [[ "${files_total:-0}" == "0" ]]; then
+    fail_with "expected at least one file"
     continue
   fi
 
   if [[ "${diagnostics:-0}" == "0" ]]; then
-    echo "  ✗ expected hallucinated-import diagnostic in $name"
-    FAIL=1
+    fail_with "expected at least one diagnostic"
     continue
   fi
 
   if [[ "${red:-0}" == "0" && "${yellow:-0}" == "0" ]]; then
-    echo "  ✗ expected at least one non-green file in $name"
-    FAIL=1
+    fail_with "expected at least one non-green file"
     continue
   fi
 
+  rm -f "$stderr_log"
   echo "  ✓ $name ($files_total files, $diagnostics diagnostics, $red red, $yellow yellow)"
 done
 

From 025d788b8c88f8f4783c065943e4c25c4bf374ec Mon Sep 17 00:00:00 2001
From: Thomas Pluck <thomaspluck95@proton.me>
Date: Sat, 18 Apr 2026 09:12:48 +0100
Subject: [PATCH 2/4] ci: gate on `<bin> --version`, log semgrep errors when 0
 results
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

CI on the previous commit showed that two refinements were needed:

1. `command -v rust-analyzer` succeeds on a stock rustup install
   because rustup ships a shim at ~/.cargo/bin/rust-analyzer that fails
   unless `rustup component add rust-analyzer` has been run. The
   daemon's own `binary_present()` check (`<bin> --version`) catches
   this; the shell runner now mirrors that.

2. The semgrep fixture is producing 0 findings via the daemon binary
   even though semgrep itself is installed and the equivalent cargo
   integration test passes. Captured stderr just shows
   `semgrep pass complete n=0` with no clue why. Add diagnostic
   logging in semgrep.rs that, on a 0-result run, surfaces the
   parsed `errors` field plus a tail of semgrep's stderr — that
   should pinpoint the misconfig in the next CI run.
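
The gap between the two probes in item 1 can be reproduced without
rustup. The sketch below is an illustration only: `fake-analyzer` is a
hypothetical stand-in for a shim that is on PATH but fails when run,
and `on_path` / `usable` are illustrative helper names, not functions
from the runner or the daemon.

```shell
#!/usr/bin/env bash
set -u

# Build a throwaway "shim": executable and on PATH, but it exits
# non-zero on every invocation, including `--version`.
shim_dir="$(mktemp -d)"
trap 'rm -rf "$shim_dir"' EXIT
printf '#!/bin/sh\necho "error: component not installed" >&2\nexit 1\n' \
  > "$shim_dir/fake-analyzer"
chmod +x "$shim_dir/fake-analyzer"
PATH="$shim_dir:$PATH"

# Presence check: only asks whether the name resolves on PATH.
on_path() { command -v "$1" >/dev/null 2>&1; }

# Usability check: actually runs the binary and looks at its exit status.
usable() { "$1" --version >/dev/null 2>&1; }

if on_path fake-analyzer; then echo "on_path: yes"; else echo "on_path: no"; fi
if usable fake-analyzer; then echo "usable: yes"; else echo "usable: no"; fi
```

The presence check should report yes and the usability check no,
which is the case the `--version` gate is meant to catch.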

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 daemon/src/analyzers/semgrep.rs | 40 +++++++++++++++++++++++++++++++--
 test/run_fixtures.sh            |  7 ++++--
 2 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/daemon/src/analyzers/semgrep.rs b/daemon/src/analyzers/semgrep.rs
index 028b669..235a9a6 100644
--- a/daemon/src/analyzers/semgrep.rs
+++ b/daemon/src/analyzers/semgrep.rs
@@ -13,6 +13,7 @@ use crate::contracts::{Diagnostic, DiagnosticSource, Location, Range, Severity};
 use std::path::{Path, PathBuf};
 use std::process::Command;
 use std::time::Duration;
+use tracing::warn;
 
 pub fn binary_present() -> bool {
     if std::env::var("IVE_SKIP_SEMGREP").is_ok() {
@@ -67,8 +68,43 @@ pub fn scan_path(target: &Path, rules: &Path) -> Option<Vec<Diagnostic>> {
         .arg(target)
         .output()
         .ok()?;
-    let parsed: serde_json::Value = serde_json::from_slice(&output.stdout).ok()?;
-    let results = parsed.get("results")?.as_array()?;
+    let parsed: serde_json::Value = match serde_json::from_slice(&output.stdout) {
+        Ok(v) => v,
+        Err(e) => {
+            warn!(
+                error = %e,
+                stderr = %String::from_utf8_lossy(&output.stderr),
+                "semgrep stdout was not valid JSON"
+            );
+            return None;
+        }
+    };
+    let results = match parsed.get("results").and_then(|r| r.as_array()) {
+        Some(r) => r,
+        None => {
+            warn!(
+                stderr = %String::from_utf8_lossy(&output.stderr),
+                "semgrep JSON had no `results` array"
+            );
+            return None;
+        }
+    };
+    if results.is_empty() {
+        // 0 findings is a legitimate outcome, but on the AI-slop fixtures
+        // it's almost always a misconfig. Surface semgrep's own errors so
+        // the failure is debuggable.
+        let errors = parsed
+            .get("errors")
+            .map(|e| e.to_string())
+            .unwrap_or_else(|| "[]".into());
+        let stderr_tail = String::from_utf8_lossy(&output.stderr);
+        warn!(
+            errors = %errors,
+            stderr_len = stderr_tail.len(),
+            stderr_tail = %stderr_tail.lines().rev().take(5).collect::<Vec<_>>().join(" | "),
+            "semgrep returned 0 results"
+        );
+    }
     let mut diagnostics = Vec::with_capacity(results.len());
     for r in results {
         if let Some(d) = result_to_diagnostic(r, target) {
diff --git a/test/run_fixtures.sh b/test/run_fixtures.sh
index 2afdbd0..d565736 100755
--- a/test/run_fixtures.sh
+++ b/test/run_fixtures.sh
@@ -42,8 +42,11 @@ for fixture_dir in "$ROOT"/test/fixtures/ai-slop/*/; do
   echo "── fixture: $name"
 
   required="$(required_binary "$name")"
-  if [[ -n "$required" ]] && ! command -v "$required" >/dev/null 2>&1; then
-    echo "  ⤳ skipped: required binary '$required' not on PATH (covered by cargo tests when installed)"
+  # `command -v` is not enough: rustup ships a rust-analyzer shim that
+  # is on PATH but errors out unless `rustup component add rust-analyzer`
+  # has been run. Mirror the daemon's own check (`<bin> --version`).
+  if [[ -n "$required" ]] && ! "$required" --version >/dev/null 2>&1; then
+    echo "  ⤳ skipped: '$required' not usable (binary missing or rustup shim without component)"
     continue
   fi
 

From 288d95e3a83ddd170cac2cac6a6dc2b7f30b77c4 Mon Sep 17 00:00:00 2001
From: Thomas Pluck <thomaspluck95@proton.me>
Date: Sat, 18 Apr 2026 09:17:35 +0100
Subject: [PATCH 3/4] fix: pass --no-git-ignore to semgrep so subdirectory
 targets work
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Root cause for the semgrep fixture failing in CI but passing in cargo
tests:

When the target directory lives inside a git repo, semgrep auto-applies
a "tracked by git only" filter. From semgrep's own vantage point in CI
(target = /repo/test/fixtures/ai-slop/semgrep, daemon CWD = /repo) the
filter sees 0 files and the scan returns 0 findings, even though the
files are tracked at the repo root. Captured semgrep stderr from the
prior debugging commit:

    Ran 15 rules on 0 files: 0 findings.
    Scan was limited to files tracked by git

The cargo integration test passes because `isolate()` copies the
fixture into a tempdir outside any git repo, so the filter never
engages.

`--no-git-ignore` disables both .gitignore handling and the
git-tracked-only filter. IVE owns workspace traversal end-to-end, so
we never want semgrep second-guessing which files to scan.
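
The difference between the two environments (fixture inside the repo
vs. a copy outside any repo) can be checked with plain git, without
semgrep. This is a shell analogue written for illustration; it is not
how `isolate()` is implemented, and the directory layout and the
`in_git_worktree` helper are assumptions made for the sketch.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT
# Keep git from discovering a repository at or above the scratch
# directory, so the result does not depend on where TMPDIR lives.
export GIT_CEILING_DIRECTORIES="$work"

# A throwaway repo with a fixture in a subdirectory, plus an isolated
# copy of that fixture outside the repo.
git init -q "$work/repo" >/dev/null 2>&1
mkdir -p "$work/repo/test/fixtures/semgrep" "$work/isolated"
echo 'import os' > "$work/repo/test/fixtures/semgrep/app.py"
cp -R "$work/repo/test/fixtures/semgrep/." "$work/isolated/"

# Succeeds when the directory sits inside a git work tree.
in_git_worktree() {
  git -C "$1" rev-parse --is-inside-work-tree >/dev/null 2>&1
}

for dir in "$work/repo/test/fixtures/semgrep" "$work/isolated"; do
  if in_git_worktree "$dir"; then
    echo "in repo: ${dir#"$work"/}"
  else
    echo "no repo: ${dir#"$work"/}"
  fi
done
```

Only the first directory should be reported as inside a work tree,
which is the situation in which git-based file filtering can engage.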

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 daemon/src/analyzers/semgrep.rs | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/daemon/src/analyzers/semgrep.rs b/daemon/src/analyzers/semgrep.rs
index 235a9a6..ec51fa4 100644
--- a/daemon/src/analyzers/semgrep.rs
+++ b/daemon/src/analyzers/semgrep.rs
@@ -59,12 +59,19 @@ pub fn scan_path(target: &Path, rules: &Path) -> Option<Vec<Diagnostic>> {
     // Semgrep ≥1.x exits non-zero when it finds issues — we consume
     // stdout either way and don't pass the flag that older versions used
     // for this (it was renamed/removed across versions).
+    // `--no-git-ignore` is load-bearing: without it, semgrep auto-detects
+    // when the target lives inside a git repo and silently restricts the
+    // scan to files tracked by git from semgrep's own vantage point. That
+    // produces 0 findings on subdirectory targets even when the files are
+    // tracked at the repo root (the daemon's case). We never want that
+    // behaviour — IVE owns workspace traversal — so always opt out.
     let output = Command::new("semgrep")
         .arg("--config")
         .arg(rules)
         .arg("--json")
         .arg("--timeout")
         .arg("10")
+        .arg("--no-git-ignore")
         .arg(target)
         .output()
         .ok()?;

From 75d27c009b3434c44d95415d1991ea30c9fd80cb Mon Sep 17 00:00:00 2001
From: Thomas Pluck <thomaspluck95@proton.me>
Date: Sat, 18 Apr 2026 09:24:34 +0100
Subject: [PATCH 4/4] ci: add permissive .semgrepignore at repo root
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Second piece of the semgrep CI fix. With --no-git-ignore alone, semgrep
still consulted its built-in default .semgrepignore, whose patterns
include test/. Since semgrep walks up from the fixture target to the
IVE git root, both app.py and requirements.txt matched as
'test/fixtures/ai-slop/semgrep/...' and were filtered out. Captured
stderr from the previous CI run:

    Ran 15 rules on 0 files: 0 findings.
    Files matching .semgrepignore patterns: 2

The only override mechanism the semgrep CLI accepts is a .semgrepignore
file at the project (git) root, which it then uses *instead of* the
defaults. A file with no patterns (comments only, as here) means
semgrep scans everything it would otherwise ignore.

This is the right behaviour for IVE specifically: scanner::walk_workspace
already filters node_modules / target / .git / .ive, and our
test/fixtures/ai-slop/ tree exists precisely to be scanned by the
semgrep analyzer in fixture tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .semgrepignore | 10 ++++++++++
 1 file changed, 10 insertions(+)
 create mode 100644 .semgrepignore

diff --git a/.semgrepignore b/.semgrepignore
new file mode 100644
index 0000000..ec979a6
--- /dev/null
+++ b/.semgrepignore
@@ -0,0 +1,10 @@
+# Intentionally minimal — overrides semgrep's built-in default
+# .semgrepignore (which excludes test/, tests/, vendor/, etc.).
+#
+# IVE owns workspace traversal: the daemon already filters node_modules,
+# target, .git, and .ive in scanner::walk_workspace. We don't want
+# semgrep also applying its own opinion about which files are worth
+# scanning, especially since our test/fixtures/ai-slop/ tree literally
+# exists to be scanned by the semgrep analyzer.
+#
+# If you add real exclusions here, document why.