Skip to content

Commit c4681ae

Browse files
hyperpolymathclaude
andcommitted
docs: add TEST-NEEDS.md and/or PROOF-NEEDS.md from audit
Documents testing and proof gaps identified during batch audit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 7920b1b commit c4681ae

File tree

2 files changed

+94
-0
lines changed

2 files changed

+94
-0
lines changed

PROOF-NEEDS.md

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,49 @@
1+
# PROOF-NEEDS.md
2+
<!-- SPDX-License-Identifier: PMPL-1.0-or-later -->
3+
4+
## Current State
5+
6+
- **LOC**: ~9,400
7+
- **Languages**: Haskell, Idris2, Zig
8+
- **Existing ABI proofs**: `src/abi/*.idr` (template-level)
9+
- **Dangerous patterns**: None detected in Haskell source
10+
11+
## What Needs Proving
12+
13+
### Taint Analysis (src/Sanctify/Analysis/Taint.hs)
14+
- Tracks tainted data flow through PHP code
15+
- Prove: taint propagation is sound (no tainted value reaches a sink without sanitization)
16+
- This is the security-critical core of the tool
17+
18+
### Security Analysis (src/Sanctify/Analysis/Security.hs)
19+
- Detects security vulnerabilities in PHP
20+
- Prove: analysis does not have false negatives for the declared vulnerability classes
21+
22+
### Dead Code Analysis (src/Sanctify/Analysis/DeadCode.hs)
23+
- Prove: reported dead code is genuinely unreachable
24+
25+
### Type Checker (src/Sanctify/Analysis/Types.hs)
26+
- PHP type inference
27+
- Prove: type inference is sound with respect to PHP runtime semantics
28+
29+
### Parser Correctness (src/Sanctify/Parser/)
30+
- `Parser.hs`, `Lexer.hs`, `Token.hs`
31+
- Prove: parser accepts valid PHP and rejects invalid PHP (or at minimum, is conservative)
32+
33+
### Transform Soundness (src/Sanctify/Transform/)
34+
- `Sanitize.hs`, `Strict.hs`, `StrictTypes.hs`, `TypeHints.hs`
35+
- Prove: code transformations preserve program semantics
36+
- Prove: sanitization transforms eliminate the security vulnerabilities they claim to fix
37+
38+
### WordPress-Specific Rules (src/Sanctify/WordPress/)
39+
- `Constraints.hs`, `Hooks.hs`, `Security.hs`
40+
- Prove: WordPress hook analysis correctly models WordPress execution order
41+
42+
## Recommended Prover
43+
44+
- **Agda** or **Lean4** — Haskell analysis tools have a strong tradition of formal verification
45+
- **Idris2** for ABI contracts
46+
47+
## Priority
48+
49+
**HIGH** — Security analysis tool. If the taint analysis is unsound, users trust code that is actually vulnerable. False negatives in a security tool are worse than no tool at all.

TEST-NEEDS.md

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
# TEST-NEEDS: sanctify-php
2+
3+
## Current State
4+
5+
| Category | Count | Details |
6+
|----------|-------|---------|
7+
| **Source modules** | 20 | Haskell: AST, Parser (Lexer, Token), Analysis (Advanced, DeadCode, Security, Taint), Transform (Sanitize, Strict, StrictTypes, TypeHints), WordPress (Constraints, Hooks, Security), Config, Emit, Report, Ruleset |
8+
| **Unit tests** | ~67 | SecuritySpec.hs (~30), TransformSpec.hs (~37) |
9+
| **Integration tests** | ~69 | Main.hs test harness |
10+
| **E2E tests** | 0 | No end-to-end with actual PHP files through full pipeline |
11+
| **Test fixtures** | 9 | PHP fixture files for SQL injection, XSS, WordPress, dead code, etc. |
12+
| **Benchmarks** | 0 | None |
13+
14+
## What's Missing
15+
16+
### E2E Tests
17+
- [ ] No test that runs sanctify-php as a binary on a PHP codebase
18+
- [ ] No test that validates transformed PHP output is syntactically valid
19+
20+
### Aspect Tests
21+
- [ ] **Security**: SecuritySpec exists but only ~30 tests for a SECURITY ANALYSIS TOOL. Needs 200+
22+
- [ ] **Performance**: No tests for large PHP codebases (1000+ files)
23+
- [ ] **Concurrency**: No parallel analysis tests
24+
- [ ] **Error handling**: No tests for malformed PHP, encoding issues, huge files
25+
26+
### Benchmarks Needed
27+
- [ ] Parsing throughput (lines/second on real WordPress codebases)
28+
- [ ] Taint analysis scaling with codebase size
29+
- [ ] Memory usage on large projects
30+
31+
### Self-Tests
32+
- [ ] No self-diagnostic mode
33+
34+
## FLAGGED ISSUES
35+
- **A security analysis tool with ~30 security tests** is embarrassing. This needs an order of magnitude more.
36+
- **Taint analysis module has 0 dedicated tests** -- the most critical analysis capability is untested
37+
- **Dead code detection has 0 dedicated tests** (only fixture files exist)
38+
39+
## Priority: P1 (HIGH)
40+
41+
## FAKE-FUZZ ALERT
42+
43+
- `tests/fuzz/placeholder.txt` is a scorecard placeholder inherited from rsr-template-repo — it does NOT provide real fuzz testing
44+
- Replace with an actual fuzz harness (see rsr-template-repo/tests/fuzz/README.adoc) or remove the file
45+
- Priority: P2 — creates false impression of fuzz coverage

0 commit comments

Comments
 (0)