Redact client secrets and shell-stdin literal transcripts by peyton-alt · Pull Request #1203 · entireio/cli

peyton-alt · 2026-05-13T20:56:44Z

https://entire.io/gh/entireio/cli/trails/372

Summary

Closes structural redaction gaps in transcript content:

*_CLIENT_SECRET, *_API_KEY, *_TOKEN env-shape redaction in both raw text and JSONL structured fields (e.g. GITHUB_CLIENT_SECRET=… was previously leaking).
Shell-stdin literal pattern: printf 'value' | examplectl secret put EXAMPLE_API_KEY is now redacted. These secrets are often human-ish (correct-horse-client) and below the entropy threshold, so entropy + Betterleaks couldn't catch them — the structural shape is the signal.
Identifier env-shape guard: *_id / *_account keys with non-high-entropy values are preserved (so GOOGLE_ADSENSE_ACCOUNT=pub-1234567890123456 no longer collapses to REDACTED by the entropy detector). Values that independently look secret-like still redact.
\$VAR references in secret-put commands continue to be treated as placeholders, not values.

The regex tier stays the deterministic, structural layer — important for layered composition with future ML-based redaction (e.g. OpenAI Privacy Filter), which is strong on free-text PII but lists "missed secrets for novel credential formats" and "over-redaction of benign high-entropy strings" as known failure modes.

Test plan

go test ./redact -count=1 — full package, all green
mise run fmt && mise run lint — 0 issues
mise run test:ci — unit + integration + Vogon canary (58/58) + roger-roger canary (4/4), exit 0
New tests: github_client_secret_env_var, google_adsense_account_is_preserved_as_identifier, TestJSONLContent_StructuredCredentialFieldsRedacted (extended with GITHUB_CLIENT_SECRET), TestJSONLContent_ShellStdinSecretCommandRedactsPrintfLiteral, TestJSONLContent_ShellStdinSecretCommandOverRedactionGuards

🤖 Generated with Claude Code

Note

Medium Risk
Expands and reorders redaction heuristics with new regexes and guard logic, which could introduce false positives/negatives in transcript redaction. Changes are localized to redaction behavior but affect all logged/serialized content.

Overview
Extends deterministic redaction beyond DB-password keys to also redact secret-shaped keys (e.g. *_CLIENT_SECRET, *_API_KEY, *_TOKEN) in both raw text key=value/key:value forms and structured JSON fields.

Adds a new detector for shell transcripts that pipe a literal into secret-management commands (e.g. printf '...' | ... secrets put ...), redacting only the stdin literal while preserving command context and treating $VAR references as placeholders.

Tightens entropy-based redaction by skipping low-entropy identifier assignments like *_id/*_account unless the value itself looks like a high-entropy secret. Updates/extends tests to cover the new secret shapes, shell-stdin redaction, and the new over-redaction guard.

^{Reviewed by Cursor Bugbot for commit 4018d6a. Configure here.}

Adds three structural redaction rules to close transcript leak gaps: - CLIENT_SECRET / API_KEY / TOKEN env-shape values, in both raw text and JSONL structured fields (e.g. GITHUB_CLIENT_SECRET=...). - Shell-stdin literals piped into secret-management commands such as `printf 'value' | examplectl secret put EXAMPLE_API_KEY`, which are often human-ish and below the entropy threshold. - Identifier env-var shapes (`*_id`, `*_account`) are preserved when the value is not independently high-entropy, preventing benign IDs like `GOOGLE_ADSENSE_ACCOUNT=pub-...` from being collapsed to `REDACTED` by the entropy detector. Shell variable references like `\$EXAMPLE_API_KEY` continue to be treated as placeholders rather than secret values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 7daa370e8203

Copilot

Pull request overview

This PR strengthens transcript redaction in the redact package to close three structural gaps: env-shape secret keys (*_CLIENT_SECRET, *_API_KEY, *_TOKEN) in raw text and JSONL fields, shell-stdin literals piped into secret-management CLIs (e.g. printf 'value' | examplectl secret put ...), and an over-redaction guard so *_id/*_account identifiers with non-high-entropy values aren't collapsed by the entropy detector. $VAR references continue to be treated as placeholders.

Changes:

Add secretValueKeyShape/secretValuePattern/secretJSONKeyRegex and a shellStdinSecretPattern regex; wire them into String/detectCredentialValues/detectShellStdinSecrets and into the JSON key sensitivity check.
Add isNonSecretIdentifierAssignment + isHighEntropySecretCandidate to preserve *_id/*_account env assignments with non-entropic values, and isShellVariableReference to treat $VAR as a placeholder.
Extend unit tests covering GITHUB_CLIENT_SECRET, GOOGLE_ADSENSE_ACCOUNT, structured JSONL credential fields, and shell-stdin printf | secret put redaction plus over-redaction guards.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
redact/redact.go	New regexes and helpers to detect env-shape secrets and shell-stdin literals, plus an identifier-assignment guard against entropy over-redaction.
redact/redact_test.go	New tests for client-secret env vars, identifier preservation, structured JSONL credential fields, and shell-stdin `printf

peyton-alt · 2026-05-13T21:40:45Z

@BugBot review

Per the existing pattern documented on dbPasswordKeyShape — the same key list should compose both the env-var regex and the JSON-key regex so the vendor list stays in one place. The hand-written duplicate in secretJSONKeyRegex would have silently diverged if a new key type were added to one side but not the other. Behavior is unchanged: the regex runs on normalized keys, where [_-]+ and _+ match identically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 20953e233268

peyton-alt · 2026-05-13T22:09:54Z

@BugBot review

cursor

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

^{Reviewed by Cursor Bugbot for commit 4018d6a. Configure here.}

Following review feedback on this PR, four scoped changes narrow the new patterns introduced earlier so they stop flat-redacting prose, debug identifiers, and account fields: - secretValuePattern now uses `=` only (not `[:=]`). The JSON-aware path handles `key: value` shapes via secretJSONKeyRegex, so dropping `:` from the raw-text regex prevents prose like "Got the token: foo" from over-redacting the next word. - secretValueKeyShape splits into two alternation branches so bare `secret`/`token` keywords require at least one prefix segment (e.g. csrf_token, auth_secret). Standalone `{"token":"x"}` / `{"secret":"x"}` JSON fields are no longer flat-redacted; entropy and Betterleaks still catch real high-entropy values. - New nonSecretTokenPrefixes allowlist (cancel, continuation, cursor, idempotency, next, page, pagination, prev, previous, sync) is checked in isCredentialJSONSecretKey so debug/pagination tokens like `{"cancel_token":"abc-debug"}` and `{"pagination_token":"..."}` are preserved in transcripts. Real credential-shaped values still get caught by the per-value String() call. - shouldSkipJSONLField now also skips keys ending in `account`, so identifier fields like `{"google_adsense_account":"pub-..."}` are preserved by design rather than by entropy coincidence. Regression tests cover the three previously-leaking prose cases, the seven JSON token-key cases, and a high-entropy account value to prove the protection is structural rather than entropy-dependent. The PR's original coverage of `printf 'value' | examplectl secret put KEY` and structured credential field redaction is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 02c170aecfae

Three tests in manifest_test.go hardcoded `started := time.Date(2026, 5, 8, ...)` as the session.State.StartedAt anchor. session.StateStore.Load auto-deletes sessions whose StartedAt is older than StaleSessionThreshold (7 days) and returns nil, so the hardcoded date silently rots: tests pass while the calendar is inside the 7-day window, then fail forever once it crosses. CI on PR #964 caught this — same SHA passed yesterday (6 days after the hardcoded date) and failed today (7 days after). Unrelated to the streaming/diagnostic work in this PR; the manifest_test.go file isn't touched by any other commit on this branch. Switch all three tests to `time.Now().UTC().Add(-time.Hour)` so the session is always one hour old at test time. Still exercises the 5-second jitter check inside matchReviewSessionState; stays well inside the staleness window. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

peyton-alt requested a review from a team as a code owner May 13, 2026 20:56

Copilot AI review requested due to automatic review settings May 13, 2026 20:56

Copilot started reviewing on behalf of peyton-alt May 13, 2026 20:57 View session

Copilot AI reviewed May 13, 2026

View reviewed changes

cursor Bot reviewed May 13, 2026

View reviewed changes

Comment thread redact/redact.go Outdated

cursor Bot reviewed May 13, 2026

View reviewed changes

peyton-alt and others added 2 commits May 15, 2026 14:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Redact client secrets and shell-stdin literal transcripts#1203

Redact client secrets and shell-stdin literal transcripts#1203
peyton-alt wants to merge 4 commits into
mainfrom
redact-2

peyton-alt commented May 13, 2026 •

edited by cursor Bot

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

peyton-alt commented May 13, 2026

Uh oh!

Uh oh!

peyton-alt commented May 13, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

Conversation

peyton-alt commented May 13, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

peyton-alt commented May 13, 2026

Uh oh!

Uh oh!

peyton-alt commented May 13, 2026

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

peyton-alt commented May 13, 2026 •

edited by cursor Bot

Loading