
fix(scanner): end-to-end scan pipeline — persistence, parsers, profile#16

Merged
Emperiusm merged 1 commit into main from feature/scan-fixes
Apr 17, 2026


@Emperiusm
Owner

Summary

Live test against pentest-ground.com exposed a cascade of pre-existing issues that together produced 0 findings on every web scan despite the CLI claiming success. This PR fixes the full pipeline end-to-end.

Bugs found & fixed

Persistence

  • scan run never passed store to api.execute() → pipeline ran but couldn't persist findings
  • save_scan / save_task used INSERT → second save crashed with UNIQUE violation
  • Engine _finalize() never populated tools_completed / tools_failed / started_at / completed_at
  • Task records stayed at status=pending forever (engine mutated in-memory only)
  • Scan finding_count stayed 0 even when pipeline emitted findings
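The finalize fix above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the `Task` and `Scan` dataclasses, their field types, and the `"completed"` / `"failed"` status strings are assumptions. Only the field names `tools_completed`, `tools_failed`, `started_at`, and `completed_at` come from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Task:
    # Hypothetical task record; the real model may differ.
    tool: str
    status: str = "pending"
    started_at: datetime | None = None


@dataclass
class Scan:
    # Hypothetical scan record carrying the fields named in the PR.
    tools_completed: list[str] = field(default_factory=list)
    tools_failed: list[str] = field(default_factory=list)
    started_at: datetime | None = None
    completed_at: datetime | None = None


def finalize(scan: Scan, tasks: dict[str, Task]) -> Scan:
    """Derive the summary fields from the in-memory task map and stamp times."""
    scan.tools_completed = sorted(
        t.tool for t in tasks.values() if t.status == "completed"
    )
    scan.tools_failed = sorted(
        t.tool for t in tasks.values() if t.status == "failed"
    )
    # Earliest task start is used as the scan start (an assumption).
    starts = [t.started_at for t in tasks.values() if t.started_at is not None]
    scan.started_at = min(starts) if starts else None
    scan.completed_at = datetime.now(timezone.utc)
    return scan
```

Because the engine only mutates these objects in memory, the caller still has to save the scan and every task afterwards for the values to reach the database.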

Profile (web_quick.yaml)

  • Assumed tools were on PATH — this deployment has them only in Docker
  • nuclei -json is deprecated → -jsonl -silent
  • whatweb --log-json=- crashes under docker exec → file-based output
  • nikto redirect only suppressed stderr → human text leaked into JSON stream

Missing parsers
Profile specified parser: nuclei|nikto|whatweb|waybackurls but none of those parsers existed in scanner/parsing/parsers/. Added all four:

  • NucleiParser — JSONL, severity + CWE mapping
  • NiktoParser — dict or array of target-results, heuristic severity
  • WhatWebParser — concatenated JSON arrays handled via bracket-counting
  • WaybackurlsParser — plain-text URL list
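As an illustration of the JSONL approach, here is a minimal sketch of a Nuclei-style parser. It is not the `NucleiParser` added in this PR: the function name, the finding dict shape, and the severity fallback are assumptions, and the input field names (`info.name`, `info.severity`, `info.classification.cwe-id`, `matched-at`, `host`, `template-id`) reflect Nuclei's JSONL output as commonly seen and should be checked against the installed version.

```python
import json

# Known severities pass through unchanged; anything else falls back to "info".
KNOWN_SEVERITIES = {"info", "low", "medium", "high", "critical"}


def parse_nuclei_jsonl(output: str) -> list[dict]:
    """Parse one JSON object per line, skipping blank or non-JSON lines."""
    findings = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON output
        if not isinstance(record, dict):
            continue
        info = record.get("info") or {}
        classification = info.get("classification") or {}
        cwe = classification.get("cwe-id") or []
        if isinstance(cwe, str):
            cwe = [cwe]
        severity = str(info.get("severity", "")).lower()
        findings.append({
            "title": info.get("name") or record.get("template-id", "unknown"),
            "severity": severity if severity in KNOWN_SEVERITIES else "info",
            "cwe": [c.upper() for c in cwe],
            # Prefer the precise matched URL over the bare host.
            "endpoint": record.get("matched-at") or record.get("host"),
        })
    return findings
```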

Live test results

| Target | URL | Findings |
| --- | --- | --- |
| DVWA | pentest-ground.com:4280 | 134 |
| GraphQL | pentest-ground.com:5013 | 30 |
| RestFlaw | pentest-ground.com:9000 | 126 |
| GuardianLeaks | pentest-ground.com:81 | 41 |

All 4 tools (whatweb, waybackurls, nuclei, nikto) complete without failures.

Test plan

  • 33 existing CLI tests still pass (plugin_cli, plugin, containers)
  • Live scans produce findings across all 4 HTTP targets
  • Manual: view findings in dashboard TUI
  • Follow-up: parsers should have unit tests (not included in this PR to keep scope tight)

🤖 Generated with Claude Code

Live test against pentest-ground.com exposed a cascade of pre-existing
issues that together made web scans produce 0 findings. Each scan completed
instantly with "pending" status in the DB despite the CLI claiming success.

## Persistence fixes (scan_cli.py, engine.py, store.py)

- scan_cli.py never passed the `store=` param to api.execute(), so the
  pipeline could not persist findings → pass store through.
- save_scan / save_task used INSERT which crashed on re-save with UNIQUE
  violation → switched to INSERT OR REPLACE (idempotent upsert).
- Engine._finalize never populated tools_completed / tools_failed /
  completed_at / started_at → compute from internal task map and stamp.
- Task records stayed at status=pending forever (engine mutated in-memory
  only) → save all tasks post-execute in scan_cli.
- Scan.finding_count stayed 0 because engine has no visibility into pipeline
  output → query store.get_scan_findings after execute and set before
  terminal save.
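The upsert change can be reproduced in isolation with Python's `sqlite3` module. This is a sketch under assumptions: the `scans` table, its columns, and the `save_scan` signature are hypothetical stand-ins for the real schema in store.py.

```python
import sqlite3


def save_scan(conn: sqlite3.Connection, scan_id: str, status: str,
              finding_count: int) -> None:
    """Idempotent save: a second call with the same id replaces the row."""
    conn.execute(
        "INSERT OR REPLACE INTO scans (id, status, finding_count) "
        "VALUES (?, ?, ?)",
        (scan_id, status, finding_count),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scans (id TEXT PRIMARY KEY, status TEXT, finding_count INTEGER)"
)

# First save at scan start, second save after execution. With a plain
# INSERT the second call would raise sqlite3.IntegrityError (UNIQUE).
save_scan(conn, "scan-1", "pending", 0)
save_scan(conn, "scan-1", "completed", 134)

row = conn.execute(
    "SELECT status, finding_count FROM scans WHERE id = ?", ("scan-1",)
).fetchone()
```

One design caveat: `INSERT OR REPLACE` deletes the conflicting row and inserts a new one, so any column not listed in the statement falls back to its default. That is fine when every save writes the full record.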

## Profile fixes (profiles/web_quick.yaml)

- Commands ran bare binaries (whatweb, nuclei, …) assuming PATH install,
  but this deployment only has the Docker containers from mcp-security-hub.
  → Route through `docker exec <tool>-mcp …`.
- `nuclei -json` is a deprecated flag → use `-jsonl -silent`.
- `whatweb --log-json=-` crashes on closed stream under docker exec → write
  to /tmp and cat the file.
- `nikto -Format json` needs `-output FILE`; the redirect only suppressed
  stderr leaving human-readable stdout → use `>/dev/null 2>&1` to drop
  Nikto's chatty stdout and cat the JSON file.
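The command shapes described above could be built like this. It is a hedged sketch, not the contents of web_quick.yaml: the helper names, the `/tmp` file names, the use of `sh -c` inside the container, and the stdout redirect for whatweb are all assumptions. The `<tool>-mcp` container naming and the tool flags are the ones named in this commit message.

```python
import shlex


def docker_exec(tool: str, inner: str) -> list[str]:
    """Run a shell snippet inside the `<tool>-mcp` container."""
    return ["docker", "exec", f"{tool}-mcp", "sh", "-c", inner]


def nuclei_cmd(target: str) -> list[str]:
    # -jsonl replaces the deprecated -json; -silent keeps stdout to results.
    return docker_exec("nuclei", f"nuclei -u {shlex.quote(target)} -jsonl -silent")


def whatweb_cmd(target: str, out: str = "/tmp/whatweb.json") -> list[str]:
    # Write JSON to a file instead of --log-json=-, then print the file.
    q = shlex.quote
    return docker_exec(
        "whatweb",
        f"whatweb --log-json={q(out)} {q(target)} >/dev/null 2>&1; cat {q(out)}",
    )


def nikto_cmd(target: str, out: str = "/tmp/nikto.json") -> list[str]:
    # Drop Nikto's human-readable stdout and stderr; emit only the JSON file.
    q = shlex.quote
    return docker_exec(
        "nikto",
        f"nikto -h {q(target)} -Format json -output {q(out)} >/dev/null 2>&1; cat {q(out)}",
    )
```

Each helper returns an argv list suitable for `subprocess.run(cmd, capture_output=True, text=True)`, so only the final `cat` output reaches the parser.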

## New parsers (scanner/parsing/parsers/)

Profile tasks specified `parser: nuclei|nikto|whatweb|waybackurls` but none
were registered — the pipeline logged "Parser X not found" and returned 0
findings even when tools succeeded. Added:

- NucleiParser — JSONL, severity mapping, CWE extraction, endpoint precision.
- NiktoParser — JSON (dict or array), heuristic severity from message patterns.
- WhatWebParser — handles concatenated JSON arrays (one per target) via
  bracket-counting fallback; emits per-plugin technology findings.
- WaybackurlsParser — plain-text URL list → info-level endpoint findings.
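The bracket-counting fallback mentioned for WhatWeb can be sketched as below. This is an illustration of the technique rather than the `WhatWebParser` code: the function name is hypothetical, and the sketch assumes the output is a sequence of top-level JSON arrays with arbitrary text between them.

```python
import json


def split_json_arrays(text: str) -> list[list]:
    """Split concatenated top-level JSON arrays by counting brackets.

    Brackets inside JSON strings are ignored, including escaped quotes.
    """
    arrays = []
    depth = 0
    start = None
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            in_string = True
        elif ch == "[":
            if depth == 0:
                start = i  # beginning of a new top-level array
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
            if depth == 0 and start is not None:
                try:
                    arrays.append(json.loads(text[start:i + 1]))
                except json.JSONDecodeError:
                    pass  # skip a malformed chunk, keep scanning
                start = None
    return arrays
```

Each returned array can then be walked to emit one technology finding per plugin entry.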

Registered all four in ScanPipeline._register_builtin_parsers.
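To make the failure mode concrete, here is a small sketch of a name-based parser registry together with a plain-text URL parser. Everything in it is hypothetical (the `ParserRegistry` class, the finding dict shape, the de-duplication of URLs); it only mirrors the behaviour described above, where an unregistered parser name logs "Parser X not found" and yields no findings.

```python
import logging
from typing import Callable

logger = logging.getLogger("scanner.pipeline")

Parser = Callable[[str], list[dict]]


def parse_waybackurls(output: str) -> list[dict]:
    """Turn a plain-text URL list into info-level endpoint findings."""
    seen = set()
    findings = []
    for line in output.splitlines():
        url = line.strip()
        # Skip blanks, non-URL noise, and repeats (de-duplication is an assumption).
        if not url.startswith(("http://", "https://")) or url in seen:
            continue
        seen.add(url)
        findings.append({"severity": "info", "type": "endpoint", "endpoint": url})
    return findings


class ParserRegistry:
    def __init__(self) -> None:
        self._parsers: dict[str, Parser] = {}

    def register(self, name: str, parser: Parser) -> None:
        self._parsers[name] = parser

    def parse(self, name: str, output: str) -> list[dict]:
        parser = self._parsers.get(name)
        if parser is None:
            # The silent-zero path: the tool succeeded, but nothing is parsed.
            logger.warning("Parser %s not found", name)
            return []
        return parser(output)
```

With an empty registry, `parse("waybackurls", ...)` returns an empty list even for valid tool output, which matches the zero-findings symptom; after `register("waybackurls", parse_waybackurls)` the same call returns findings.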

## Live test results (pentest-ground.com)

After the fixes, web-quick profile against the 4 HTTP targets:

  https://pentest-ground.com:4280 (DVWA)        134 findings
  https://pentest-ground.com:5013 (GraphQL)      30 findings
  https://pentest-ground.com:9000 (RestFlaw)    126 findings
  https://pentest-ground.com:81   (GuardianLeaks) 41 findings

All tools (whatweb, waybackurls, nuclei, nikto) complete without failures.
Severity distribution now surfaces info/low/medium findings correctly.

## Tests

33 existing CLI tests (plugin_cli, plugin, containers) still pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@Emperiusm Emperiusm merged commit b03ed2c into main Apr 17, 2026
1 check failed