feat: add PostToolUse hook for PII redaction in Claude Code#22

Closed
byapparov wants to merge 2 commits into master from feat/hooks-pii-redaction

@byapparov
Contributor

Summary

  • Adds hush redact-hook — a Claude Code PostToolUse hook that redacts PII from tool outputs (Bash, Read, Grep, WebFetch) before Claude ever sees them
  • Adds hush init --hooks — generates/merges hook config into .claude/settings.json (idempotent, supports --local)
  • CLI subcommand routing with dynamic import() so hook commands never load Express/pino/blessed
  • Works standalone (no proxy needed) or alongside the proxy for defense-in-depth
Local files/commands → [Hook: redact before Claude sees] → Claude's context
                                                               ↓
                                                          API request
                                                               ↓
                                                    [Proxy: redact before cloud]
                                                               ↓
                                                          LLM Provider
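
As a rough illustration of the idempotent merge that `hush init --hooks` is described as performing, the sketch below adds a PostToolUse entry only when one is not already present. The type names, the `mergeHushHook` helper, and the exact settings shape are assumptions made for this example; the real logic lives in src/commands/init.ts and may differ.

```typescript
// Hypothetical types: a simplified view of .claude/settings.json.
interface HookCommand {
  type: "command";
  command: string;
  timeout?: number;
}

interface HookMatcher {
  matcher: string;
  hooks: HookCommand[];
}

interface Settings {
  hooks?: { [event: string]: HookMatcher[] | undefined };
  [key: string]: unknown;
}

const HUSH_COMMAND = "hush redact-hook";

// Returns settings with the hush hook present exactly once.
// Unrelated keys and other hook events are preserved untouched.
function mergeHushHook(settings: Settings): Settings {
  const hooks = settings.hooks ?? {};
  const existing = hooks.PostToolUse ?? [];

  const alreadyInstalled = existing.some((m) =>
    m.hooks.some((h) => h.command === HUSH_COMMAND),
  );
  if (alreadyInstalled) {
    return settings; // Re-running init is a no-op.
  }

  const entry: HookMatcher = {
    matcher: "Bash|Read|Grep|WebFetch",
    hooks: [{ type: "command", command: HUSH_COMMAND, timeout: 10 }],
  };

  return {
    ...settings,
    hooks: { ...hooks, PostToolUse: [...existing, entry] },
  };
}
```

Calling `mergeHushHook` twice on the same object yields a single PostToolUse entry, which is the property the init tests (create, merge, idempotent) are said to check.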

Changes

| File | What |
| --- | --- |
| src/commands/redact-hook.ts | Core hook handler: stdin JSON → Redactor → stdout override |
| src/commands/init.ts | Setup command: creates/merges hook config |
| src/cli.ts | Subcommand routing with dynamic imports |
| tests/redact-hook.test.ts | 9 integration tests (spawns CLI as child process) |
| tests/init.test.ts | 5 tests (create, merge, idempotent, --local, usage) |
| README.md | Hooks Mode section with setup, diagram, comparison table |
| examples/team-config/.claude/settings.json | Defense-in-depth example |

Test plan

  • npm run build compiles cleanly
  • npm test — all 44 tests pass (6 files, including 14 new)
  • echo '{"tool_output":{"stdout":"email: test@foo.com"}}' | node dist/cli.js redact-hook → redacted JSON
  • echo '{"tool_output":{"stdout":"hello world"}}' | node dist/cli.js redact-hook → no output
  • node dist/cli.js init --hooks in temp dir → creates .claude/settings.json
  • node dist/cli.js (no args) → proxy still starts normally

🤖 Generated with Claude Code

Adds `hush redact-hook` command that runs as a Claude Code PostToolUse
hook, redacting PII from tool outputs before Claude ever sees them.
Works standalone or alongside the proxy for defense-in-depth.

- `hush redact-hook`: stdin/stdout hook handler using existing Redactor
- `hush init --hooks`: generates/merges hook config into settings.json
- CLI subcommand routing with dynamic imports (no heavy deps for hooks)
- 14 new tests (redact-hook + init integration tests)
- README: Hooks Mode section with setup, diagram, comparison table
- Team config example updated with defense-in-depth setup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Mar 2, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 78.36% | 250 / 319 |
| 🔵 | Statements | 76.81% | 265 / 345 |
| 🔵 | Functions | 70.21% | 33 / 47 |
| 🔵 | Branches | 68.3% | 125 / 183 |

File Coverage: No changed files found.
Generated in workflow #83 for commit 9a5dc21 by the Vitest Coverage Report Action

@github-actions

github-actions Bot commented Mar 2, 2026

Code Review: PostToolUse Hook for PII Redaction

Summary

This PR adds a Claude Code PostToolUse hook (hush redact-hook) that redacts PII from tool outputs before Claude sees them. The implementation is solid overall, with good test coverage and clean architecture using dynamic imports to avoid loading heavy deps for hooks.


1. Redaction Logic ✅

Strengths:

  • Reuses existing Redactor class ensuring consistency between hook and proxy modes
  • extractText() correctly handles multiple tool output formats (stdout/stderr/content/output)
  • Deterministic token hashing ensures reproducible redaction

Concerns:

  • JSON in tool outputs: The hook concatenates stdout/stderr with \n, which could mangle structured JSON output. Consider whether CLI table outputs with embedded emails (e.g., | alice@example.com | admin |) are fully covered.
  • SECRET pattern length threshold: The regex requires 16+ char secrets ({16,}). This could miss valid shorter API keys/tokens. Consider documenting this threshold or making it configurable.
  • IPv4 false positives: The pattern \b(?:\d{1,3}\.){3}\d{1,3}\b could match version numbers like 1.2.3.4 in package.json or CLI output. Test coverage shows some handling, but edge cases may slip through.

2. Streaming Integrity ✅

Proxy mode (not in this PR's scope but related):

  • createStreamingRehydrator() in TokenVault correctly handles token splitting across SSE chunks
  • Backpressure handling with res.once('drain') is properly implemented in src/index.ts:143-145

Hook mode:

  • No streaming concerns here—hooks receive complete tool output. The implementation is appropriately stateless.

3. Security ✅

Strengths:

  • Vault uses structuredClone() for safe deep copies (prevents prototype pollution)
  • Original PII values are never logged—only token types and hashes
  • Hook is stateless; no persistent storage of sensitive data
  • HUSH_AUTH_TOKEN provides proxy authentication

No concerns identified.


4. Reliability ⚠️

Strengths:

  • Graceful handling of malformed JSON (exit 2 per hooks spec, blocks tool call)
  • Empty stdin and missing tool_output handled without errors
  • 30s upstream timeout with proper error logging in proxy

Concerns:

  • Hook timeout: The timeout: 10 seconds in the hook config may be insufficient for large tool outputs (e.g., cat on a 10MB log file). Consider increasing or documenting this limitation.
  • Tool coverage: Only Bash|Read|Grep|WebFetch are covered. Tools like Write, Edit, or Glob could expose PII through their outputs. Consider whether these need coverage.

Minor Observations

| File | Observation |
| --- | --- |
| src/commands/redact-hook.ts:55-65 | extractText silently ignores unknown fields in tool_output. Consider logging when unexpected fields are encountered for debugging. |
| src/cli.ts | Dynamic imports are a good pattern; they keep hook execution lightweight. |
| tests/redact-hook.test.ts | Good coverage of edge cases (empty stdin, invalid JSON, combined stdout+stderr). |

Verdict

Approve with minor suggestions. The core logic is sound. Consider addressing the 10s timeout and secret length threshold before merging, or document as known limitations.

Reviewed SHA: 1221c38

- Use `tool_response` field (not `tool_output`) matching actual payload
- Use `decision: "block"` + `reason` output format (PostToolUse has no
  outputOverride — confirmed via spec and closed GitHub issues #4635, #18594)
- Handle Read tool's nested `file.content` response shape
- Add Grep content field test case (10 tests total)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
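
To make the payload changes in the commit above concrete, here is a minimal sketch of a handler core that reads `tool_response`, covers the nested `file.content` shape, and emits the `decision: "block"` + `reason` format. The `ToolResponse` interface, `extractText` body, and `redactEmails` stand-in are assumptions for illustration only; the actual handler uses the project's Redactor class and may order or join fields differently.

```typescript
// Hypothetical payload shape, inferred from the field names in this thread.
interface ToolResponse {
  stdout?: string;
  stderr?: string;
  content?: string;
  output?: string;
  file?: { content?: string };
}

// Collects every known text field and joins the non-empty ones with newlines.
function extractText(response: ToolResponse): string {
  const parts = [
    response.stdout,
    response.stderr,
    response.file ? response.file.content : undefined,
    response.content,
    response.output,
  ];
  return parts
    .filter((p): p is string => typeof p === "string" && p.length > 0)
    .join("\n");
}

// Stand-in for the real Redactor: handles emails only, with a fixed placeholder.
function redactEmails(text: string): string {
  return text.replace(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g, "[EMAIL]");
}

// Returns the JSON to print on stdout, or null when nothing was redacted
// (in which case the hook prints nothing).
function buildHookOutput(payload: { tool_response?: ToolResponse }): string | null {
  if (!payload.tool_response) {
    return null;
  }
  const original = extractText(payload.tool_response);
  const redacted = redactEmails(original);
  if (redacted === original) {
    return null;
  }
  return JSON.stringify({ decision: "block", reason: redacted });
}
```

Keeping the core as a pure function of the parsed payload means the stdin/stdout wiring and exit codes can stay in a thin wrapper, and the redaction path can be unit-tested without spawning the CLI.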
@github-actions

github-actions Bot commented Mar 2, 2026

Technical Review: PostToolUse Hook for PII Redaction

✅ Strengths

Hook Spec Compliance

  • Exit codes correct: 0 for success, 2 for malformed input (blocks per spec)
  • Output format matches Claude Code hooks API: {"decision":"block","reason":"..."}
  • Timeout (10s) appropriate for redaction operations

PII Pattern Coverage

  • Handles JSON output from CLI tools (e.g., {"user": "alice@example.com"})
  • Preserves structure in CLI tables while redacting values
  • Multi-line support works correctly (emails, IPs, phones in same output)
  • Extracts text from multiple tool response fields: stdout, stderr, file.content, content, output

Security

  • No token leaks in hook implementation (src/commands/redact-hook.ts:1-124)
  • Vault stores tokens in-memory only with TTL (1h default)
  • Proxy binds to localhost only (127.0.0.1)
  • Auth token handling follows best practices (supports both X-Hush-Token and Authorization: Bearer)

Reliability

  • Idempotent hush init --hooks (won't duplicate hooks on re-run)
  • Graceful error handling with proper exit codes
  • Empty stdin handled correctly (exits 0)

Test Coverage

  • All 45 tests pass (including 14 new for hooks)
  • 100% coverage on redactor middleware
  • Integration tests spawn CLI as child process (matches real usage)

⚠️ Issues Found

Critical: IPv4 Regex False Positives (src/middleware/redactor.ts:38)

IPV4: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g

Problem: Matches dotted four-part version numbers (e.g., 1.2.3.4) and out-of-range addresses (e.g., 999.999.999.999).

Evidence:

$ echo '{"tool_response":{"stdout":"version: 1.2.3.4"}}' | node dist/cli.js redact-hook
{"decision":"block","reason":"version: [NETWORK_IP_6694f8]"}  # ❌ Wrong

$ echo '{"tool_response":{"stdout":"My IP is 999.999.999.999"}}' | node dist/cli.js redact-hook  
{"decision":"block","reason":"My IP is [NETWORK_IP_a464c3]"}  # ❌ Invalid IP

Impact: Redacts four-part version strings in tool outputs (e.g., package versions, API versions).

Recommended Fix:

IPV4: /\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/g
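
The effect of the octet-validated pattern can be checked with a small sketch like the one below. `OCTET` and `IPV4_STRICT` are names chosen for this example. Note that `1.2.3.4` is itself a well-formed IPv4 address, so range validation rejects `999.999.999.999` but still matches a four-part version string; telling those apart would need an additional context check, which is not shown here.

```typescript
// Same octet alternation as the recommended fix, factored out for readability.
const OCTET = "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

// No "g" flag here: RegExp.prototype.test with "g" keeps state in lastIndex,
// which makes repeated calls on different strings unreliable.
const IPV4_STRICT = new RegExp(`\\b(?:${OCTET}\\.){3}${OCTET}\\b`);

const samples = ["192.168.1.10", "999.999.999.999", "256.1.1.1", "1.2.3.4"];
for (const s of samples) {
  console.log(s, IPV4_STRICT.test(s));
}
```

Three-part versions such as `v2.0.0` were never at risk, since both the original and the stricter pattern require four dot-separated groups.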

Medium: SECRET Pattern Over-Matching (src/middleware/redactor.ts:42)

SECRET: /(?:api[-_]?key|secret|password|token|bearer|auth)["']?\s*[:=]\s*["']?([a-zA-Z0-9\-_!@#$%^&*()=+]{16,})["']?/gi

Problem: Long non-secret strings with keywords get redacted.

Evidence:

$ echo '{"tool_response":{"stdout":"api_key=this_is_a_very_long_string_that_is_not_secret"}}' | node dist/cli.js redact-hook
{"decision":"block","reason":"api_key=[SENSITIVE_SECRET_5d9068]"}  # ❌ Over-matched

Impact: False positives in documentation/examples with placeholder text.

Mitigation: The 16-char minimum helps, but consider:

  1. Require special chars in value (e.g., at least one non-alphanumeric)
  2. Add entropy check for high-randomness strings
  3. Document expected false positive rate
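
Option 2 could be approached along these lines: compute the Shannon entropy of the captured value and redact only above a threshold. This is a sketch under assumptions; `shannonEntropy`, `looksRandom`, and the threshold parameter are hypothetical names, and any real threshold would need tuning against actual keys and placeholder strings, since readable text with varied characters can still score fairly high.

```typescript
// Shannon entropy in bits per character, based on character frequencies.
function shannonEntropy(value: string): number {
  if (value.length === 0) {
    return 0;
  }
  const counts: Record<string, number> = {};
  for (let i = 0; i < value.length; i++) {
    const ch = value.charAt(i);
    counts[ch] = (counts[ch] ?? 0) + 1;
  }
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / value.length;
    bits -= p * (Math.log(p) / Math.LN2);
  }
  return bits;
}

// Hypothetical gate: treat a value as secret-like only when its entropy
// reaches the caller-supplied threshold (bits per character).
function looksRandom(value: string, thresholdBits: number): boolean {
  return shannonEntropy(value) >= thresholdBits;
}
```

A string of one repeated character scores 0 bits, while 16 distinct characters score 4 bits, so a gate like `looksRandom(value, 3.0)` would separate those two extremes. Real placeholder text falls somewhere in between, which is why the threshold is left as a parameter.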

🔍 Streaming Integrity (No Changes in PR)

This PR focuses on hooks (pre-Claude redaction), not streaming proxy logic. The existing streaming implementation in src/index.ts:118-153 and src/vault/token-vault.ts:87-211 was not modified.

Current streaming behavior (unchanged):

  • Stateful rehydrator handles tokens split across SSE chunks
  • Backpressure handling with res.once('drain')
  • 30s upstream timeout
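
For readers unfamiliar with the first point, the idea of a stateful rehydrator can be sketched as follows: hold back any trailing text that might be the start of a token until the next chunk arrives. This is not the code in src/vault/token-vault.ts; `createRehydrator`, `MAX_HOLD`, and the token regex (modeled on tokens like `[NETWORK_IP_6694f8]` shown in this thread) are assumptions for illustration.

```typescript
// Upper bound on how much trailing text is held back while waiting for "]".
const MAX_HOLD = 64;

function createRehydrator(vault: Record<string, string>) {
  let pending = "";

  // Swap every complete token for its original value when the vault knows it.
  function rehydrate(text: string): string {
    return text.replace(/\[[A-Z_]+_[0-9a-f]{6}\]/g, (token) =>
      Object.prototype.hasOwnProperty.call(vault, token) ? vault[token] : token,
    );
  }

  return {
    // Feed one chunk; returns the text that is safe to forward downstream.
    push(chunk: string): string {
      const text = pending + chunk;
      const open = text.lastIndexOf("[");
      const unclosed =
        open !== -1 &&
        text.indexOf("]", open) === -1 &&
        text.length - open <= MAX_HOLD;
      const cut = unclosed ? open : text.length;
      pending = text.slice(cut);
      return rehydrate(text.slice(0, cut));
    },
    // Call at end of stream to release whatever is still buffered.
    flush(): string {
      const rest = pending;
      pending = "";
      return rehydrate(rest);
    },
  };
}
```

With a vault of `{ "[NETWORK_IP_6694f8]": "10.0.0.1" }`, pushing `"host [NETWORK_"` forwards only `"host "`, and the following chunk `"IP_6694f8] ok"` forwards `"10.0.0.1 ok"`.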

📊 Test Results

✓ tests/redact-hook.test.ts (9 tests) - Hook integration
✓ tests/init.test.ts (5 tests) - Config generation
✓ tests/redaction.test.ts (7 tests) - PII patterns
✓ tests/vault.test.ts (5 tests) - TTL + streaming
✓ tests/proxy.test.ts (10 tests) - Proxy behavior
✓ tests/universal-proxy.test.ts (8 tests) - Multi-provider

Test Files: 6 passed (6)
Tests: 45 passed (45)
Coverage: redactor.ts 100%, token-vault.ts 87%

🎯 Recommendations

  1. Fix IPv4 regex (blocking issue for semantic versions)
  2. Add tests for version numbers (e.g., 1.2.3.4, v2.0.0-beta.1)
  3. Consider entropy-based secret detection to reduce false positives
  4. Document known false positive patterns in README

✅ Overall Assessment

Approve with critical fix required for IPv4 regex. The hook architecture is solid, follows Claude Code spec correctly, and provides defense-in-depth when combined with proxy mode. The main blocker is the version number false positive, which will impact real-world usage (e.g., npm list output, version checks).

Reviewed SHA: 9a5dc21

@byapparov
Contributor Author

Superseded by #23 which includes all changes from this PR plus the OpenCode plugin, cloud provider key detection, and GitLab E2E pipeline.

@byapparov byapparov closed this Mar 2, 2026
@byapparov byapparov deleted the feat/hooks-pii-redaction branch March 3, 2026 10:25