feat: add static analysis pipeline with Semgrep and Qlty integration #18
Open
amar-zhuri wants to merge 24 commits into main
Includes:
- MCP server core implementation
- Documentation updates
- Configuration changes
…, and warning handling
- Pass commitDiff into the static-analysis flow and filter findings by changed line ranges
- Keep semgrep and qlty execution hybrid (semgrep runs in parallel with the serialized qlty pipeline)
- Add broad default excludes for docs and multi-ecosystem lockfiles
- Improve raw artifact readability by storing parsed stdout/stderr structures
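The changed-line filtering described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: the `Finding` shape and the names `changedLines` and `filterToChanged` are hypothetical, and the parser only handles plain (unquoted) `+++ b/...` headers.

```typescript
// Hypothetical finding shape; the PR's unified finding type is not shown here.
interface Finding {
  file: string;
  line: number;
  message: string;
}

// Walk a unified diff and record, per file, the new-side line numbers of added lines.
function changedLines(diff: string): Map<string, Set<number>> {
  const changed = new Map<string, Set<number>>();
  let lines: Set<number> | null = null;
  let next = 0; // next new-side line number inside the current hunk
  let inHunk = false;

  for (const raw of diff.split("\n")) {
    if (raw.startsWith("diff --git ")) {
      // A new file section starts; wait for its "+++ b/..." header.
      inHunk = false;
      lines = null;
      continue;
    }
    if (!inHunk) {
      const file = /^\+\+\+ b\/(.+)$/.exec(raw);
      if (file) {
        lines = new Set<number>();
        changed.set(file[1], lines);
        continue;
      }
    }
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/.exec(raw);
    if (hunk) {
      next = Number(hunk[1]);
      inHunk = true;
      continue;
    }
    if (!inHunk || lines === null) continue;
    if (raw.startsWith("+")) {
      lines.add(next);
      next++;
    } else if (raw.startsWith(" ")) {
      next++; // context line: advances the new side but is not a change
    }
    // "-" lines and "\ No newline at end of file" do not advance the new side.
  }
  return changed;
}

// Keep only findings that land on a line the commit added or modified.
function filterToChanged(findings: Finding[], diff: string): Finding[] {
  const changed = changedLines(diff);
  return findings.filter((f) => changed.get(f.file)?.has(f.line) ?? false);
}
```

Tracking individual added lines rather than whole hunk ranges keeps findings on untouched context lines out of the result; whether the PR scopes by hunk or by line is not stated in the commit message.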
…non-null assertions
- Extract requiresQltyInit() and qltyInitCompleted() into qlty-init-helper.ts, eliminating duplicated code from qlty-runner and qlty-smells-runner
- Remove ! non-null assertions in tool-runner-registry by capturing executable into a local const before the runnable boolean check
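For readers unfamiliar with the second bullet's pattern, here is a minimal sketch of how capturing a value in a local `const` lets TypeScript narrow it without `!`. The `ToolRunner` and `RunnableTool` shapes and the `toRunnable` function are hypothetical; the actual registry code is not shown in this PR page. Narrowing through the aliased `runnable` condition relies on TypeScript 4.4 or later.

```typescript
// Hypothetical shapes for illustration only.
interface ToolRunner {
  name: string;
  executable?: string;
}

interface RunnableTool {
  name: string;
  executable: string;
}

function toRunnable(runner: ToolRunner): RunnableTool | null {
  // Capture once: a local const can be narrowed, a repeated property access cannot.
  const executable = runner.executable;
  const runnable = executable !== undefined && executable.length > 0;
  if (runnable) {
    // `executable` is narrowed to string here, so no `!` assertion is needed.
    return { name: runner.name, executable };
  }
  return null;
}
```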
Semgrep was timing out at 60s when running in parallel with qlty tools. Doubling the timeout to 120s gives semgrep enough headroom under concurrent load.
- extractFilesFromDiff now captures the b/ (new/renamed) path instead of a/, so renamed files are no longer silently dropped as missing on disk
- Add a quoted-path branch for git C-quoted filenames (spaces, non-ASCII)
- Add unquoteGitPath to decode octal byte sequences and simple escapes
…rruption
- Parse quoted diff --git headers: diff --git "a/..." "b/..."
- Move unquoting before backslash normalization in normalizeDiffPath so octal escape sequences (\303\251) are not corrupted by the \ -> / pass
- Add unquoteGitPath helper for octal byte sequences and simple escapes
- Add test covering non-ASCII filename (é) round-trip through diff parsing
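The ordering issue in the bullets above can be sketched as follows. The function names `unquoteGitPath` and `normalizeDiffPath` come from the commit message, but the bodies are an assumed reconstruction, not the PR's implementation: escapes are decoded to bytes, the bytes are read as UTF-8, and only then are backslashes turned into forward slashes.

```typescript
// Single-character escapes git emits when C-quoting a path.
const SIMPLE_ESCAPES: Record<string, number> = {
  a: 0x07, b: 0x08, t: 0x09, n: 0x0a, v: 0x0b, f: 0x0c, r: 0x0d,
  '"': 0x22, "\\": 0x5c,
};

// Decode a git C-quoted path ("...") into a plain string; unquoted input is returned as-is.
function unquoteGitPath(path: string): string {
  if (path.length < 2 || !path.startsWith('"') || !path.endsWith('"')) return path;
  const body = path.slice(1, -1);
  const bytes: number[] = [];
  const encoder = new TextEncoder();
  let i = 0;
  while (i < body.length) {
    if (body.charAt(i) !== "\\") {
      // Copy the run of ordinary characters up to the next backslash as UTF-8 bytes.
      const nextEscape = body.indexOf("\\", i);
      const end = nextEscape === -1 ? body.length : nextEscape;
      encoder.encode(body.slice(i, end)).forEach((b) => bytes.push(b));
      i = end;
      continue;
    }
    const octal = /^[0-7]{1,3}/.exec(body.slice(i + 1, i + 4));
    if (octal) {
      bytes.push(parseInt(octal[0], 8) & 0xff); // one raw byte per octal escape
      i += 1 + octal[0].length;
      continue;
    }
    const simple = SIMPLE_ESCAPES[body.charAt(i + 1)];
    if (simple === undefined) {
      bytes.push(0x5c); // unknown escape: keep the backslash literally
      i += 1;
    } else {
      bytes.push(simple);
      i += 2;
    }
  }
  return new TextDecoder().decode(new Uint8Array(bytes));
}

// Unquote first, then normalize separators; the reverse order would turn
// "\303\251" into "/303/251" before it could be decoded.
function normalizeDiffPath(raw: string): string {
  return unquoteGitPath(raw).replace(/\\/g, "/");
}
```

Collecting bytes before decoding matters because a single non-ASCII character such as é arrives as two octal escapes (`\303\251`) that only form a valid character when decoded together.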
…n transient failures
- Split cache into per-tool maps (cachedSemgrepByMode, cachedQltyByMode)
- Only cache a tool when it is successfully available, so a transient failure (network down, install timeout) does not lock it out for the process lifetime while the other tool remains available
- Update cache test to toStrictEqual since the cache now reconstructs the ToolAvailability object rather than returning the same reference
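A minimal sketch of the "cache only on success" idea follows. It assumes a simplified `ToolAvailability` with two booleans and synchronous probes to keep the example short; the real probes are presumably asynchronous, and the helper names `checkCached` and `resolveAvailability` are hypothetical.

```typescript
// Simplified, assumed shape; the PR's ToolAvailability may carry more fields.
interface ToolAvailability {
  semgrep: boolean;
  qlty: boolean;
}

type Probe = () => boolean;

// One cache per tool, keyed by mode. Only successes are recorded.
const semgrepAvailableByMode = new Set<string>();
const qltyAvailableByMode = new Set<string>();

function checkCached(cache: Set<string>, mode: string, probe: Probe): boolean {
  if (cache.has(mode)) return true;
  let available = false;
  try {
    available = probe();
  } catch {
    available = false; // a throwing probe counts as a transient failure
  }
  if (available) cache.add(mode); // failures are never cached, so the next call retries
  return available;
}

function resolveAvailability(mode: string, probeSemgrep: Probe, probeQlty: Probe): ToolAvailability {
  // A fresh object is built on every call, which is why a test would compare
  // by value (toStrictEqual) rather than by reference.
  return {
    semgrep: checkCached(semgrepAvailableByMode, mode, probeSemgrep),
    qlty: checkCached(qltyAvailableByMode, mode, probeQlty),
  };
}
```

With separate caches, a failed Semgrep probe is retried on the next call while an already-verified Qlty stays cached.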
Add finding-formatter module that routes static analysis results to agents based on category expertise. Each agent receives a filtered, formatted view of findings relevant to their role:
- Category routing: security→architect, quality/style/bug→reviewer, etc.
- Primary agents see all severities; secondary agents see error+warning only
- Round 1 gets full findings; Round 2+ gets condensed error-only reference
- Safety cap (200 findings) prevents prompt bloat in pathological cases

Wire summary through LangGraph state → AgentContext → agent prompts.
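The routing rules listed above might be sketched like this. Only the security→architect and quality/style/bug→reviewer mappings, the error+warning rule, the round behavior, and the 200 cap come from the commit message; the `info` severity level, the secondary-agent assignment, and all type and function names are assumptions made for illustration.

```typescript
type Severity = "error" | "warning" | "info"; // "info" is an assumed third level
type Category = "security" | "quality" | "style" | "bug";

interface Finding {
  category: Category;
  severity: Severity;
  message: string;
}

interface Route {
  primary: string;
  secondary: string[];
}

// Primary mappings follow the commit message; the secondary entry is hypothetical.
const ROUTES: Record<Category, Route> = {
  security: { primary: "architect", secondary: ["reviewer"] },
  quality: { primary: "reviewer", secondary: [] },
  style: { primary: "reviewer", secondary: [] },
  bug: { primary: "reviewer", secondary: [] },
};

const MAX_FINDINGS = 200; // safety cap from the commit message

function findingsForAgent(agent: string, round: number, findings: Finding[]): Finding[] {
  const visible = findings.filter((f) => {
    const route = ROUTES[f.category];
    const isPrimary = route.primary === agent;
    const isSecondary = route.secondary.indexOf(agent) !== -1;
    if (!isPrimary && !isSecondary) return false;
    if (round > 1) return f.severity === "error"; // later rounds: errors only
    return isPrimary || f.severity !== "info"; // secondary agents skip low severities
  });
  return visible.slice(0, MAX_FINDINGS);
}
```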
…ll docs
Add production documentation for the static analysis feature across 7 files:
- ARCHITECTURE.md: full pipeline section with ASCII diagram, runners table, unified finding type, changed-lines scoping, category routing, agent injection
- ADVANCED_FEATURES.md: user-facing section covering routing, risk levels, round behavior, graceful degradation, and prompt format
- AGENTS.md: per-agent blocks listing the static analysis categories each agent receives
- CONFIGURATION.md: tool installation subsection with caching and re-install
- CHANGELOG.md: v0.0.6 entry with features and fixes
- INDEX.md: navigation entries, feature coverage row, updated stats
- README.md: feature bullet and expanded quick start
…edPaths additive
- Add ~80 new default exclusion patterns covering vendor dirs, IDE configs, generated code, minified/bundled files, binary assets, build outputs, language caches, and CI/CD configs
- Fix config loader to merge user excludedPaths on top of defaults instead of silently replacing them
- Remove excludedPaths from config --init output so the config file stays clean: defaults live in code, users only add their own patterns
- Add tests for merge, deduplication, and preservation of defaults
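The additive merge described above can be sketched in a few lines. The default patterns listed here are illustrative stand-ins, not the PR's actual list, and `mergeExcludedPaths` is a hypothetical name for whatever the config loader does internally.

```typescript
// Illustrative defaults only; the real list lives in the project's code.
const DEFAULT_EXCLUDED_PATHS: readonly string[] = [
  "docs/**",
  "**/package-lock.json",
  "**/node_modules/**",
];

// Merge user patterns on top of defaults, dropping duplicates and keeping
// defaults first so their order stays stable.
function mergeExcludedPaths(defaults: readonly string[], user?: readonly string[]): string[] {
  return Array.from(new Set([...defaults, ...(user ?? [])]));
}
```

For example, a user config containing `"excludedPaths": ["legacy/**", "docs/**"]` would yield the three defaults followed by `legacy/**`, with the duplicate `docs/**` collapsed.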
## Description
## What this does
Runs Semgrep and Qlty on commit files automatically during evaluation. Semgrep catches security issues, Qlty catches code quality problems and code smells. Findings get fed into agent prompts so agents can reference real static analysis data in their evaluations.
## How it works

## Key decisions
- … `config --init` time, not during evaluation: no surprise downloads mid-run
- … `"excludedPaths": ["legacy/**"]` in their config, it adds to the defaults instead of wiping them
- `config --init` no longer writes 165 exclusion patterns into `.codewave.config.json`; defaults live in code

## Tests
137 tests across 13 test files covering runners, parsers, scope resolution, config merging, tool installation, and service orchestration.