feat: add static analysis pipeline with Semgrep and Qlty integration by amar-zhuri · Pull Request #18 · techdebtgpt/codewave

amar-zhuri · 2026-02-27T13:33:50Z

## Description
## What this does

Runs Semgrep and Qlty on commit files automatically during evaluation. Semgrep catches security issues, Qlty catches code quality problems and code smells. Findings get fed into agent prompts so agents can reference real static analysis data in their evaluations.

## How it works

1. Git diff gives us the changed files.
2. Files get filtered: we skip anything irrelevant (165+ patterns: lock files, binaries, generated code, vendor dirs, docs, etc.).
3. Semgrep and Qlty run in parallel on the remaining files.
4. Findings are filtered to only the lines that actually changed in the commit.
5. Results are deduplicated, sorted by severity, and injected into agent prompts.
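Steps 4 and 5 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Finding` shape, `changedLines`, and `scopeFindings` are hypothetical names, and the parser assumes a plain unified diff with unquoted `b/` paths.

```typescript
// Hypothetical types for illustration; the PR's unified finding type may differ.
type Severity = "error" | "warning" | "info";
interface Finding {
  file: string;
  line: number;
  ruleId: string;
  severity: Severity;
}

// Collect the new-file line numbers of every added line, per file.
function changedLines(diff: string): Map<string, Set<number>> {
  const result = new Map<string, Set<number>>();
  let file: string | null = null;
  let lineNo = 0;
  let inHunk = false;
  for (const raw of diff.split("\n")) {
    if (raw.startsWith("diff --git ")) {
      inHunk = false;
      file = null;
      continue;
    }
    if (!inHunk && raw.startsWith("+++ ")) {
      const path = raw.slice(4).trim();
      file = path === "/dev/null" ? null : path.replace(/^b\//, "");
      continue;
    }
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/.exec(raw);
    if (hunk) {
      lineNo = Number(hunk[1]);
      inHunk = true;
      continue;
    }
    if (!inHunk || file === null) continue;
    if (raw.startsWith("+")) {
      let lines = result.get(file);
      if (!lines) {
        lines = new Set<number>();
        result.set(file, lines);
      }
      lines.add(lineNo);
      lineNo++;
    } else if (raw.startsWith(" ")) {
      lineNo++; // context lines advance the new-file counter
    }
    // Removed ("-") lines do not exist in the new file, so the counter stays put.
  }
  return result;
}

const SEVERITY_RANK: Record<Severity, number> = { error: 0, warning: 1, info: 2 };

// Keep findings on changed lines only, drop duplicates, most severe first.
function scopeFindings(findings: Finding[], diff: string): Finding[] {
  const changed = changedLines(diff);
  const seen = new Set<string>();
  return findings
    .filter((f) => changed.get(f.file)?.has(f.line) ?? false)
    .filter((f) => {
      const key = `${f.file}:${f.line}:${f.ruleId}`;
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    })
    .sort((a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]);
}
```

The deduplication key (file, line, rule) is an assumption; two tools reporting the same issue under different rule IDs would both survive in this sketch.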

## Key decisions

- Tool installation happens at config --init time, not during evaluation, so there are no surprise downloads mid-run.
- User exclusion patterns are additive: if someone adds "excludedPaths": ["legacy/**"] in their config, it adds to the defaults instead of wiping them.
- Graceful degradation: if Semgrep or Qlty isn't installed, evaluation continues without it and logs a warning.
- Config file stays clean: config --init no longer writes 165 exclusion patterns into .codewave.config.json; defaults live in code.
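The additive merge can be pictured with a small sketch. `DEFAULT_EXCLUDED_PATHS` and `mergeExcludedPaths` are hypothetical names, and the three patterns stand in for the real default list, which is not reproduced here.

```typescript
// Stand-in for the built-in defaults that live in code (assumed values).
const DEFAULT_EXCLUDED_PATHS: readonly string[] = [
  "**/node_modules/**",
  "**/*.lock",
  "**/dist/**",
];

// User patterns are appended to the defaults; duplicates are dropped and
// the defaults keep their original order.
function mergeExcludedPaths(userPaths?: readonly string[]): string[] {
  return Array.from(new Set([...DEFAULT_EXCLUDED_PATHS, ...(userPaths ?? [])]));
}
```

With this shape, a config containing only "excludedPaths": ["legacy/**"] yields the defaults plus legacy/**, and a config with no excludedPaths key yields the defaults unchanged.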

## Tests

137 tests across 13 test files covering runners, parsers, scope resolution, config merging, tool installation, and service orchestration.

- pass commitDiff into static-analysis flow and filter findings by changed line ranges
- keep semgrep and qlty execution hybrid (semgrep parallel with serialized qlty pipeline)
- add broad default excludes for docs and multi-ecosystem lockfiles
- improve raw artifact readability by storing parsed stdout/stderr structures

…non-null assertions

- Extract requiresQltyInit() and qltyInitCompleted() into qlty-init-helper.ts, eliminating duplicated code from qlty-runner and qlty-smells-runner
- Remove ! non-null assertions in tool-runner-registry by capturing executable into a local const before the runnable boolean check
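The local-const pattern mentioned above might look like the sketch below. `RunnerEntry` and `runnableExecutables` are hypothetical and not the registry's actual types. It relies on TypeScript narrowing through an aliased condition, which as far as I know requires TypeScript 4.4 or later.

```typescript
// Hypothetical registry entry; the real tool-runner-registry types may differ.
interface RunnerEntry {
  name: string;
  executable?: string;
}

function runnableExecutables(entries: RunnerEntry[]): string[] {
  const executables: string[] = [];
  for (const entry of entries) {
    // Capturing the property in a const lets the compiler carry the narrowing
    // from the boolean check below, so no `entry.executable!` is needed.
    const executable = entry.executable;
    const runnable = executable !== undefined && executable.length > 0;
    if (!runnable) continue;
    executables.push(executable); // narrowed to string here
  }
  return executables;
}
```

Checking `entry.executable` directly would not carry over in the same way, because the compiler cannot assume a mutable property is unchanged between the check and the use.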

- extractFilesFromDiff now captures the b/ (new/renamed) path instead of a/, so renamed files are no longer silently dropped as missing on disk
- Add a quoted-path branch for git C-quoted filenames (spaces, non-ASCII)
- Add unquoteGitPath to decode octal byte sequences and simple escapes

…rruption

- Parse quoted diff --git headers: diff --git "a/..." "b/..."
- Move unquoting before backslash normalization in normalizeDiffPath so octal escape sequences (\303\251) are not corrupted by the \ -> / pass
- Add unquoteGitPath helper for octal byte sequences and simple escapes
- Add test covering non-ASCII filename (é) round-trip through diff parsing
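To illustrate why the ordering matters, here is a rough sketch of the two helpers. The function names come from the commit message, but the bodies are assumptions written for illustration, including the a/ and b/ prefix stripping in `normalizeDiffPath`.

```typescript
// Byte values for the single-character escapes used by git's C-style quoting.
const SIMPLE_ESCAPES: Record<string, number> = {
  a: 7, b: 8, t: 9, n: 10, v: 11, f: 12, r: 13, '"': 34, "\\": 92,
};

// Decode a git C-quoted path such as "caf\303\251.txt" into its UTF-8 string.
// Unquoted input is returned unchanged.
function unquoteGitPath(quoted: string): string {
  if (quoted.length < 2 || !quoted.startsWith('"') || !quoted.endsWith('"')) {
    return quoted;
  }
  const inner = quoted.slice(1, -1);
  const encoder = new TextEncoder();
  const bytes: number[] = [];
  // Alternatives: octal byte, simple escape, literal run, stray backslash.
  const token = /\\([0-7]{1,3})|\\([abtnvfr"\\])|([^\\]+)|(\\)/g;
  let match: RegExpExecArray | null;
  while ((match = token.exec(inner)) !== null) {
    if (match[1] !== undefined) {
      bytes.push(parseInt(match[1], 8));
    } else if (match[2] !== undefined) {
      bytes.push(SIMPLE_ESCAPES[match[2]]);
    } else if (match[3] !== undefined) {
      encoder.encode(match[3]).forEach((b) => bytes.push(b));
    } else {
      bytes.push(92); // unrecognized escape: keep the backslash literally
    }
  }
  // Octal escapes are raw bytes, so decode the whole sequence as UTF-8 at the end.
  return new TextDecoder("utf-8").decode(Uint8Array.from(bytes));
}

// Unquote first, then normalize separators. Doing it the other way round would
// turn \303\251 into /303/251 before it could be decoded.
function normalizeDiffPath(raw: string): string {
  return unquoteGitPath(raw).replace(/\\/g, "/").replace(/^[ab]\//, "");
}
```

Collecting bytes and decoding once at the end is what lets a multi-byte character like é, which git emits as two octal escapes, come back as a single character.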

…n transient failures

- Split cache into per-tool maps (cachedSemgrepByMode, cachedQltyByMode)
- Only cache a tool when it is successfully available, so a transient failure (network down, install timeout) does not lock it out for the process lifetime while the other tool remains available
- Update cache test to toStrictEqual since the cache now reconstructs the ToolAvailability object rather than returning the same reference
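A simplified sketch of the success-only caching idea follows. The map names mirror the commit message, but `AvailabilityCache`, the `Probe` type, and the synchronous probes are assumptions made to keep the example short; real availability checks would presumably be asynchronous.

```typescript
interface ToolAvailability {
  semgrep: boolean;
  qlty: boolean;
}

// Hypothetical probe: returns true when the tool is usable, false or throws otherwise.
type Probe = () => boolean;

class AvailabilityCache {
  // One map per tool, keyed by autoInstall mode. Only successes are stored.
  private cachedSemgrepByMode = new Map<boolean, true>();
  private cachedQltyByMode = new Map<boolean, true>();

  check(autoInstall: boolean, probeSemgrep: Probe, probeQlty: Probe): ToolAvailability {
    // A fresh object is built on every call, so callers get equal values
    // but not the same reference.
    return {
      semgrep: this.resolve(this.cachedSemgrepByMode, autoInstall, probeSemgrep),
      qlty: this.resolve(this.cachedQltyByMode, autoInstall, probeQlty),
    };
  }

  private resolve(cache: Map<boolean, true>, mode: boolean, probe: Probe): boolean {
    if (cache.has(mode)) return true;
    let available = false;
    try {
      available = probe();
    } catch {
      available = false; // a transient failure is reported but never cached
    }
    if (available) cache.set(mode, true);
    return available;
  }
}
```

Because failures are never stored, a tool that was unavailable on one call is probed again on the next, while a tool that already succeeded is not probed again for that mode.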

Add finding-formatter module that routes static analysis results to agents based on category expertise. Each agent receives a filtered, formatted view of findings relevant to their role:

- Category routing: security→architect, quality/style/bug→reviewer, etc.
- Primary agents see all severities; secondary agents see error+warning only
- Round 1 gets full findings; Round 2+ gets condensed error-only reference
- Safety cap (200 findings) prevents prompt bloat in pathological cases

Wire summary through LangGraph state → AgentContext → agent prompts.
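The routing rules above could be expressed along these lines. This is only a sketch: the `ROUTES` table, the choice of secondary agents, and the `findingsForAgent` name are invented for illustration, and only the security→architect and quality/style/bug→reviewer primaries come from the description.

```typescript
type Severity = "error" | "warning" | "info";
type Category = "security" | "quality" | "style" | "bug";

interface Finding {
  category: Category;
  severity: Severity;
  message: string;
}

interface Route {
  primary: string;
  secondary: string[];
}

// Assumed routing table; the secondary agents here are placeholders.
const ROUTES: Record<Category, Route> = {
  security: { primary: "architect", secondary: ["reviewer"] },
  quality: { primary: "reviewer", secondary: [] },
  style: { primary: "reviewer", secondary: [] },
  bug: { primary: "reviewer", secondary: ["architect"] },
};

const MAX_FINDINGS = 200; // safety cap from the description

function findingsForAgent(agent: string, round: number, findings: Finding[]): Finding[] {
  const visible = findings.filter((f) => {
    const route = ROUTES[f.category];
    const isPrimary = route.primary === agent;
    const isSecondary = route.secondary.includes(agent);
    if (!isPrimary && !isSecondary) return false;
    // Round 2 and later: condensed, errors only.
    if (round > 1) return f.severity === "error";
    // Round 1: primary agents see everything, secondary agents skip info.
    return isPrimary || f.severity !== "info";
  });
  return visible.slice(0, MAX_FINDINGS);
}
```

Applying the cap after filtering means each agent gets up to 200 findings relevant to it, rather than sharing one global budget; whether the PR caps per agent or globally is not stated.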

…ll docs

Add production documentation for the static analysis feature across 7 files:

- ARCHITECTURE.md: full pipeline section with ASCII diagram, runners table, unified finding type, changed-lines scoping, category routing, agent injection
- ADVANCED_FEATURES.md: user-facing section covering routing, risk levels, round behavior, graceful degradation, and prompt format
- AGENTS.md: per-agent static analysis categories received blocks
- CONFIGURATION.md: tool installation subsection with caching and re-install
- CHANGELOG.md: v0.0.6 entry with features and fixes
- INDEX.md: navigation entries, feature coverage row, updated stats
- README.md: feature bullet and expanded quick start

…edPaths additive

- Add ~80 new default exclusion patterns covering vendor dirs, IDE configs, generated code, minified/bundled files, binary assets, build outputs, language caches, and CI/CD configs
- Fix config loader to merge user excludedPaths on top of defaults instead of silently replacing them
- Remove excludedPaths from config --init output so the config file stays clean: defaults live in code, and users only add their own patterns
- Add tests for merge, deduplication, and preservation of defaults

amar-zhuri added 24 commits January 30, 2026 13:53

feat: add MCP server implementation

f05b57f

Includes: - MCP server core implementation - Documentation updates - Configuration changes

feat: add static-analysis foundations and fix filesChanged propagation

5c9b9dc

feat: add file-scope resolver and tool installer for static analysis

2c957ee

feat(static-analysis): implement semgrep integration, scope filtering…

93e82f5

…, and warning handling

feat(config): fail fast on invalid staticAnalysis config

2a9ea3b

fix(static-analysis): cache tool availability per autoInstall mode

e0762ad

refactor(static-analysis): split ToolInstaller internals without beha…

2cb2cc3

…vior changes

refactor(static-analysis): introduce runner registry for orchestration

b581c4f

feat(static-analysis): add qlty runner with sarif parsing and init retry

c28fd9f

feat(static-analysis): wire qlty runner through registry and service …

36b4a7e

…tests

fix(static-analysis): harden qlty init/bootstrap recovery

ad54ac1

feat(static-analysis): finalize qlty check/smells integration

1a3ae13

fix(static-analysis): improve qlty auto-install reliability

ce21c79

fix(static-analysis): increase semgrep default timeout to 120s

5ae7ac0

Semgrep was timing out at 60s when running in parallel with qlty tools. Doubling the timeout to 120s gives semgrep enough headroom under concurrent load.

fix(static-analysis): handle escaped quoted git diff paths

7ac3428

feat(static-analysis): move analyzer install to config init

7d442e0

amar-zhuri requested a review from rqirici February 27, 2026 13:33

amar-zhuri self-assigned this Feb 27, 2026

amar-zhuri added the documentation and enhancement labels Feb 27, 2026

amar-zhuri wants to merge 24 commits into main from feat/static-analysis
