fix(scanner): end-to-end scan pipeline — persistence, parsers, profile#16
Merged
Live test against pentest-ground.com exposed a cascade of pre-existing issues that together made web scans produce 0 findings. Each scan completed instantly with "pending" status in the DB despite the CLI claiming success.

## Persistence fixes (scan_cli.py, engine.py, store.py)

- scan_cli.py no longer executed with `store=` param, so pipeline could not persist findings → pass store through.
- save_scan / save_task used INSERT which crashed on re-save with UNIQUE violation → switched to INSERT OR REPLACE (idempotent upsert).
- Engine._finalize never populated tools_completed / tools_failed / completed_at / started_at → compute from internal task map and stamp.
- Task records stayed at status=pending forever (engine mutated in-memory only) → save all tasks post-execute in scan_cli.
- Scan.finding_count stayed 0 because engine has no visibility into pipeline output → query store.get_scan_findings after execute and set before terminal save.

## Profile fixes (profiles/web_quick.yaml)

- Commands ran bare binaries (whatweb, nuclei, …) assuming PATH install, but this deployment only has the Docker containers from mcp-security-hub → route through `docker exec <tool>-mcp …`.
- `nuclei -json` is a deprecated flag → use `-jsonl -silent`.
- `whatweb --log-json=-` crashes on closed stream under docker exec → write to /tmp and cat the file.
- `nikto -Format json` needs `-output FILE`; the redirect only suppressed stderr, leaving human-readable stdout → use `>/dev/null 2>&1` to drop Nikto's chatty stdout and cat the JSON file.

## New parsers (scanner/parsing/parsers/)

Profile tasks specified `parser: nuclei|nikto|whatweb|waybackurls` but none were registered. The pipeline logged "Parser X not found" and returned 0 findings even when tools succeeded. Added:

- NucleiParser: JSONL, severity mapping, CWE extraction, endpoint precision.
- NiktoParser: JSON (dict or array), heuristic severity from message patterns.
- WhatWebParser: handles concatenated JSON arrays (one per target) via bracket-counting fallback; emits per-plugin technology findings.
- WaybackurlsParser: plain-text URL list → info-level endpoint findings.

Registered all four in ScanPipeline._register_builtin_parsers.

## Live test results (pentest-ground.com)

After the fixes, web-quick profile against the 4 HTTP targets:

- https://pentest-ground.com:4280 (DVWA): 134 findings
- https://pentest-ground.com:5013 (GraphQL): 30 findings
- https://pentest-ground.com:9000 (RestFlaw): 126 findings
- https://pentest-ground.com:81 (GuardianLeaks): 41 findings

All tools (whatweb, waybackurls, nuclei, nikto) complete without failures. Severity distribution now surfaces info/low/medium findings correctly.

## Tests

33 existing CLI tests (plugin_cli, plugin, containers) still pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
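For readers unfamiliar with the bracket-counting fallback mentioned for WhatWebParser, the following is a minimal sketch of the general technique, not the code merged in this PR. The function name `split_concatenated_arrays` and the sample payload are hypothetical. It first tries to parse the output as a single JSON document, and only when that fails does it walk the text, tracking `[`/`]` depth (and skipping brackets inside string literals) to cut out each top-level array.

```python
import json
from typing import Any, List


def split_concatenated_arrays(raw: str) -> List[Any]:
    """Return the entries of one or more JSON arrays written back to back.

    Hypothetical helper for illustration; the real parser may differ.
    """
    text = raw.strip()
    if not text:
        return []
    try:
        # Fast path: the whole output is one valid JSON document.
        parsed = json.loads(text)
        return parsed if isinstance(parsed, list) else [parsed]
    except json.JSONDecodeError:
        pass  # Fall back to bracket counting below.

    entries: List[Any] = []
    depth = 0          # Current nesting level of '[' ... ']'.
    start = None       # Index where the current top-level array began.
    in_string = False  # True while inside a JSON string literal.
    escaped = False    # True right after a backslash inside a string.
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            in_string = True
        elif ch == "[":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
            if depth == 0 and start is not None:
                try:
                    entries.extend(json.loads(text[start:i + 1]))
                except json.JSONDecodeError:
                    pass  # Skip a chunk that is not valid JSON.
                start = None
    return entries


# Illustrative input: two arrays, one per target, separated by a newline.
sample = (
    '[{"target": "a", "plugins": {"X": {}}}]\n'
    '[{"target": "b", "plugins": {"Y": {"string": ["]["]}}}]'
)
results = split_concatenated_arrays(sample)
```

Tracking string state matters because a plugin value can itself contain bracket characters, which would otherwise throw off the depth count.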
Summary
Live test against pentest-ground.com exposed a cascade of pre-existing issues that together produced 0 findings on every web scan despite the CLI claiming success. This PR fixes the full pipeline end-to-end.
Bugs found & fixed
Persistence
- `scan run` never passed `store` to `api.execute()` → pipeline ran but couldn't persist findings
- `save_scan` / `save_task` used `INSERT` → second save crashed with UNIQUE violation
- `_finalize()` never populated `tools_completed` / `tools_failed` / `started_at` / `completed_at`
- Task records stayed at `status=pending` forever (engine mutated in-memory only)
- `finding_count` stayed 0 even when pipeline emitted findings

Profile (`web_quick.yaml`)

- `nuclei -json` is deprecated → `-jsonl -silent`
- `whatweb --log-json=-` crashes under docker exec → file-based output
- `nikto` redirect only suppressed stderr → human text leaked into JSON stream

Missing parsers

Profile specified `parser: nuclei|nikto|whatweb|waybackurls` but none of those parsers existed in `scanner/parsing/parsers/`. Added all four:

- `NucleiParser`: JSONL, severity + CWE mapping
- `NiktoParser`: dict or array of target-results, heuristic severity
- `WhatWebParser`: concatenated JSON arrays handled via bracket-counting
- `WaybackurlsParser`: plain-text URL list

Live test results

- pentest-ground.com:4280
- pentest-ground.com:5013
- pentest-ground.com:9000
- pentest-ground.com:81

All 4 tools (whatweb, waybackurls, nuclei, nikto) complete without failures.
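To illustrate the `INSERT` versus `INSERT OR REPLACE` behavior listed under Persistence, here is a small sketch using Python's `sqlite3` module. It assumes the store is SQLite-backed; the `scans` table layout and the `save_scan` signature shown here are hypothetical and are not taken from `store.py`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the primary key is what makes a second plain INSERT fail.
conn.execute(
    "CREATE TABLE scans ("
    "id TEXT PRIMARY KEY, "
    "status TEXT NOT NULL, "
    "finding_count INTEGER NOT NULL DEFAULT 0)"
)


def save_scan(scan_id: str, status: str, finding_count: int) -> None:
    """Idempotent save: a row with the same id is replaced, not duplicated."""
    conn.execute(
        "INSERT OR REPLACE INTO scans (id, status, finding_count) "
        "VALUES (?, ?, ?)",
        (scan_id, status, finding_count),
    )
    conn.commit()


# First save when the scan is created, second save once it has finished.
save_scan("scan-1", "pending", 0)
save_scan("scan-1", "completed", 134)
rows = conn.execute("SELECT id, status, finding_count FROM scans").fetchall()

# A plain INSERT with the same id reproduces the UNIQUE violation.
try:
    conn.execute(
        "INSERT INTO scans (id, status, finding_count) VALUES (?, ?, ?)",
        ("scan-1", "completed", 134),
    )
    plain_insert_failed = False
except sqlite3.IntegrityError:
    plain_insert_failed = True
```

One caveat of this approach in SQLite: `INSERT OR REPLACE` deletes the conflicting row and inserts a new one, so any column left out of the statement falls back to its default. `INSERT ... ON CONFLICT(id) DO UPDATE` (SQLite 3.24 and later) is an alternative when existing column values need to be preserved.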
Test plan
🤖 Generated with Claude Code