Skip to content

feat(auth): add failproofai auth login/logout/whoami + fix 3 over-firing builtin policies#396

Open
NiveditJain wants to merge 7 commits into
mainfrom
cli-auth-commands
Open

feat(auth): add failproofai auth login/logout/whoami + fix 3 over-firing builtin policies#396
NiveditJain wants to merge 7 commits into
mainfrom
cli-auth-commands

Conversation

@NiveditJain
Copy link
Copy Markdown
Member

@NiveditJain NiveditJain commented May 29, 2026

Summary

Two-in-one PR.

failproofai auth login | logout | whoami (new)

The follow-up promised by 0.0.11-beta.2's removal of the old device-flow auth (#380). Email-OTP login against the new Rust api-server (/v0/auth/*):

  • failproofai auth login [--email <addr>] — sends a one-time code via POST /v0/auth/login/request, prompts for the 6-digit code, exchanges it via POST /v0/auth/login/verify, persists the session.
  • failproofai auth logout — best-effort POST /v0/auth/logout, then always clears local state so the user can never get stuck with a session file they cannot rm.
  • failproofai auth whoamiGET /v0/auth/me with the access token, auto-refreshing via POST /v0/auth/token/refresh when the access token is expired but the refresh token is still valid; surfaces Session expired and clears state when the refresh token has also expired.

New src/auth/{session-store,api-client,prompts,manager}.ts and matching __tests__/auth/. Session stored at ~/.failproofai/session.json with mode 0600 (same pattern as src/audit/cache.ts). Default base URL https://api.befailproof.ai; override via FAILPROOFAI_API_BASE_URL. Wired into bin/failproofai.mjs alongside policies and audit, with track("cli_auth_<verb>_success") telemetry.

While here, also fixes a packaging quirk: bun's bundler can emit multiple copies of CliError when the class shows up via both a static import inside a dynamically-imported handler AND the top-level dynamic import in bin/failproofai.mjs. err instanceof CliError then returns false across the copies and a clean user-facing error gets reported as Unexpected error: .... The top-level catch is now duck-typed (checks err.name === "CliError" + the exitCode field) so it works regardless of class identity in the bundle.

Three builtin policies stop over-firing on benign source / args

These were blocking ordinary development work — including, ironically, this PR.

  • block-secrets-write no longer denies Write of any path containing the substring credentials or id_rsa. SECRET_FILE_CREDENTIALS_RE now anchors to well-known credential paths (.aws/credentials, .docker/credentials.json, .netrc, …); SECRET_FILE_ID_RSA_RE now requires the SSH-key basename so id_rsa_backup.md is allowed but <anywhere>/id_rsa is still blocked. .pem/.key extensions still block (plus .p12/.pfx/.jks added for completeness). Was previously denying source files like src/auth/credentials.ts.
  • sanitize-connection-strings no longer flags grep/perl arguments that merely contain a scheme name. The regex now requires a URL-safe <user>:<pass>@<host> shape so regex metachars in the pre-@ segment (e.g. 'postgres://|mysql://|@host:port/' in a grep pattern) break the match. Real postgresql://user:pw@host strings still get redacted.
  • warn-repeated-tool-calls canonicalizes the fingerprint (sorts input keys recursively) so the same logical call always hashes the same regardless of property order, and records a warned set in the sidecar so the warning fires once per fingerprint per session instead of repeating every turn. Back-compat with the old bare-counts sidecar format is preserved.

Test plan

  • bun run test:run1719 passed (includes new __tests__/auth/* + new false-positive cases under __tests__/hooks/builtin-policies.test.ts)
  • bun run test:e2e296 passed (2 skipped)
  • bunx tsc --noEmit — clean
  • bun run lint — clean (only pre-existing warnings)
  • bun run build — Next + dist bundles produced
  • npm pack --ignore-scriptsnpm install --prefix <tmp> → run the three subcommands against a clean install — auth --help, auth logout (no session), auth whoami (no session) all output correctly and auth whoami returns Error: Not logged in. (verifying the duck-typed CliError catch works across bundle boundaries).
  • Docker E2E against a real api-server container — not run in this environment (Docker Desktop's WSL integration was off). The steps are written down in the design plan; reproducer:
    • docker build -t failproof-api-server:e2e failproofai/api-server (in monorepo)
    • bring up postgres:16-alpine + the api-server container with FAILPROOF_ENVIRONMENT=local, FAILPROOF_EMAIL_SENDER_BACKEND=log
    • FAILPROOFAI_API_BASE_URL=http://failproof-api:8080 failproofai auth login --email <addr> → grep OTP from docker logs failproof-api → feed to stdin → assert auth whoami returns the user → assert auth logout clears ~/.failproofai/session.json.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added auth CLI: failproofai auth login (email OTP interactive or --email), auth logout, and auth whoami
    • Local session persistence (~/.failproofai/session.json with secure perms), best-effort server revoke on logout, and env override for API base URL
    • whoami token refresh semantics with conditional local clearing and clearer CLI errors
  • Bug Fixes

    • Tighter connection-string and secret-file matching (including SSH/Windows paths) to reduce false positives
    • Deterministic repeated-tool-call fingerprinting and one-time warning behavior
  • Tests

    • Expanded tests covering auth API flows, CLI manager, session store, and policy edge cases

Review Change Stack

…itive fixes

Auth CLI (the follow-up promised by 0.0.11-beta.2's removal of the old
device-flow auth in #380):

- `failproofai auth login [--email <addr>]` — POSTs to
  /v0/auth/login/request, prompts for the 6-digit OTP, exchanges it via
  /v0/auth/login/verify, persists the session.
- `failproofai auth logout` — best-effort POST /v0/auth/logout, then
  always clears local state so the user can never get stuck with a
  session file they cannot rm.
- `failproofai auth whoami` — GET /v0/auth/me with the access token,
  auto-refreshing via /v0/auth/token/refresh when the access token is
  expired but the refresh token is still valid.

New `src/auth/{session-store,api-client,prompts,manager}.ts` and
matching tests. Session stored at `~/.failproofai/session.json` with
mode 0600 (same pattern as `src/audit/cache.ts`). Default base URL
`https://api.befailproof.ai`; override via `FAILPROOFAI_API_BASE_URL`.
Wired into `bin/failproofai.mjs` alongside `policies` and `audit`.

While here, also fix a packaging quirk: bun's bundler can emit multiple
copies of `CliError` when the same class shows up both via a static
import inside a dynamically-imported handler and the top-level dynamic
import in `bin/failproofai.mjs`. `err instanceof CliError` then returns
false across the two copies and a clean user-facing error gets reported
as "Unexpected error: ...". The top-level catch is now duck-typed
(checks `err.name === "CliError"` + the `exitCode` field) so it works
regardless of class identity in the bundle.

Builtin-policy fixes (#NNN):

- `block-secrets-write` no longer denies `Write` of any path containing
  the substring "credentials" or "id_rsa". `SECRET_FILE_CREDENTIALS_RE`
  now anchors to well-known credential paths (.aws/credentials,
  .docker/credentials.json, .netrc, …); `SECRET_FILE_ID_RSA_RE` now
  requires the SSH-key basename (so `id_rsa_backup.md` is allowed but
  `<anywhere>/id_rsa` is still blocked); `.pem/.key` extensions still
  block. Was over-firing on source files like `src/auth/credentials.ts`.

- `sanitize-connection-strings` no longer flags grep/perl arguments that
  merely contain a scheme name. The regex now requires a URL-safe
  `<user>:<pass>@<host>` shape, so regex metachars in the pre-`@`
  segment (e.g. `postgres://|mysql://|@host:port/`) break the match.
  Real `postgresql://user:pw@host` strings still get redacted.

- `warn-repeated-tool-calls` canonicalizes the fingerprint (sorts input
  keys recursively) so the same logical call always hashes the same
  regardless of property order, and records a `warned` set in the
  sidecar so the warning fires once per fingerprint per session instead
  of repeating every turn. Back-compat with the old bare-counts sidecar
  format is preserved.

Tests:

- New `__tests__/auth/` for session-store, api-client (mocked fetch),
  and manager (mocked fetch/prompts/session-store).
- New cases under `__tests__/hooks/builtin-policies.test.ts` for the
  three policy false-positives — both that the previous misfires now
  return `allow` and that the genuine bad-input cases still `deny`.

All 1719 unit tests pass, all 296 e2e tests pass, tsc/eslint clean.
A clean `npm pack` + `npm install -g <tarball>` smoke verified the
three commands and the duck-typed CliError catch end-to-end.
Docker E2E against a real api-server container is in the plan but
Docker Desktop's WSL integration was off in this environment, so the
Docker portion of the plan is unverified here.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 29, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds email-OTP CLI auth (login/logout/whoami) with a typed auth HTTP client, interactive prompts, on-disk session persistence (0600), CLI entrypoint wiring, tests, and tightens builtin policy detection and repeated-tool-call fingerprinting.

Changes

Email-OTP Authentication Flow

Layer / File(s) Summary
Auth API Client & Data Contracts
src/auth/api-client.ts
HTTP client for /v0/auth/* with TypeScript shapes, base-URL resolution (FAILPROOFAI_API_BASE_URL), JSON POST/GET helpers with timeout/auth/error parsing (429 retry-after hint), and endpoint wrappers for login-code request/verify, token refresh, logout, and me.
Session Persistence
src/auth/session-store.ts, __tests__/auth/session-store.test.ts
On-disk session store at ~/.failproofai/session.json with AuthSession shape, read validation, write with 0600 permissions, and idempotent clear.
Interactive CLI Prompts
src/auth/prompts.ts
Readline-based promptEmail() and promptCode() with basic validation (email must include @ and .) and CliError on invalid input.
Auth Command Handlers
src/auth/manager.ts, __tests__/auth/manager.test.ts
login (flag-or-prompt, request/verify OTP, persist session), logout (server revoke best-effort + local clear), whoami (session check, refresh when needed, profile fetch) with tests mocking API/prompts/store and detailed refresh/error semantics.
CLI Entry Point Integration
bin/failproofai.mjs, CHANGELOG.md
Adds auth subcommand with `login
Tests
__tests__/auth/session-store.test.ts, __tests__/auth/manager.test.ts, __tests__/hooks/builtin-policies.test.ts
Vitest coverage for session-store behavior, CLI manager flows, and policy edge-cases (connection-string, secret-file paths, repeated-call fingerprinting).

Sequence Diagram (login, verify, persist):

sequenceDiagram
  participant User
  participant CLI as "failproofai CLI"
  participant ApiClient as "src/auth/api-client"
  participant AuthServer
  participant Store as "session store"

  User->>CLI: run "failproofai auth login" (enter email/code)
  CLI->>ApiClient: requestLoginCode(baseUrl, email)
  ApiClient->>AuthServer: POST /v0/auth/login/request { email }
  AuthServer-->>ApiClient: 200 LoginRequestResponse
  CLI->>ApiClient: verifyLoginCode(baseUrl, email, code)
  ApiClient->>AuthServer: POST /v0/auth/login/verify { email, code }
  AuthServer-->>ApiClient: 200 TokenResponse
  CLI->>Store: writeSession(AuthSession)
  Store-->>CLI: persisted (~/.failproofai/session.json)
  CLI-->>User: "Logged in."
Loading

Builtin Policy Precision Improvements

Layer / File(s) Summary
Connection String & Secret File Tightening
src/hooks/builtin-policies.ts, __tests__/hooks/builtin-policies.test.ts
CONNECTION_STRING_RE now requires credentialed user:password@host URL forms; secret-file write detection expanded to additional key extensions, stricter SSH private-key filenames, and well-known credential file locations; tests extended for embedded/edge cases.
Repeated Tool Call Determinism
src/hooks/builtin-policies.ts, __tests__/hooks/builtin-policies.test.ts
Adds recursive canonicalization of toolInput before fingerprinting, changes tracker format to { counts, warned } with back-compat parsing, and ensures warnings fire once per fingerprint when crossing the threshold; tests cover canonicalization and warned suppression.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • FailproofAI/failproofai#374: Overlaps prior auth implementation removal and CLI auth surface changes; strongly related to restoring auth flows.

Poem

🐰 I hopped to prompt and typed an email bright,

A code arrived and twinkled in the night,
I tucked the session safe with careful keys,
Fingerprints hush now—warnings come but once,
Hooray — the CLI hums tidy, snug, and light.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 13.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title clearly summarizes the main change: adding three auth commands (login/logout/whoami) and fixing three over-firing builtin policies. The title is concise, specific, and directly reflects the core work in the changeset.
Description check ✅ Passed The pull request description covers all template sections: clear summary of changes, explicit test plan with results, and includes implementation details explaining the rationale for the fixes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
__tests__/auth/manager.test.ts (1)

194-208: ⚡ Quick win

Add a transient refresh-failure test case (should not clear session).

Please add a test where refreshTokens fails with a retryable error (e.g., network/5xx) and assert clearSession is not called. This protects the intended “clear only when session is expired” behavior.

As per coding guidelines: “Always add unit tests for new behaviour. Place tests in tests/.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@__tests__/auth/manager.test.ts` around lines 194 - 208, Add a new unit test
variant for whoamiCmd that sets up sessionMocks.readSession to return an expired
access token but a refresh token that is not expired, then mock
apiClientMocks.refreshTokens to reject with a transient error (e.g.,
network/5xx) and assert that whoamiCmd rejects with the refresh error and that
sessionMocks.clearSession was NOT called; use the same test structure as the
existing expired-refresh test but replace the refresh token expiry and have
apiClientMocks.refreshTokens.mockRejectedValueOnce(...) and assert
expect(sessionMocks.clearSession).not.toHaveBeenCalled() while still asserting
refreshTokens was invoked once.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Line 14: Two CHANGELOG.md entries still use the placeholder "(`#NNN`)"; replace
those placeholders with the actual PR numbers and ensure each entry remains a
single line with the description followed by the PR reference (e.g.,
"description (`#1234`)"); update both occurrences of "(`#NNN`)" in the CHANGELOG.md
so they reference the correct PRs and confirm formatting matches the project's
one-line-per-entry guideline.

In `@src/auth/api-client.ts`:
- Around line 80-85: The fetch calls in the API client helpers (the POST helper
shown and the other fetch helper later in the file) currently let network errors
and timeouts bubble up as native errors; wrap each fetch invocation (e.g.,
inside the functions that call fetch at the shown diff and the helper at lines
102-106) in a try/catch, and on any thrown error rethrow a CliError with a
concise user-facing message (e.g., "Network error contacting auth server" or
"Request to auth server timed out") that includes the original error message;
preserve the original error as the cause if CliError supports it to retain
diagnostics.

In `@src/auth/manager.ts`:
- Around line 94-97: The catch block in whoamiCmd (and surrounding refresh
logic) currently always calls clearSession() on any error; change it to only
clear the local session when the refresh failure is definitive (e.g.,
unauthorized/expired token responses such as HTTP 401 or provider error codes
like invalid_grant), otherwise do not clearSession() and rethrow a retryable
CliError. Concretely, inspect the caught err for API response metadata (e.g.,
err.response?.status or provider error code) and only invoke clearSession() when
it indicates token invalid/expired; for other errors preserve the session and
throw the original or a retryable error message instead.

In `@src/hooks/builtin-policies.ts`:
- Around line 966-985: The warning fires one call too late and reports the wrong
count: change the trigger to evaluate the current call (use prevCount + 1 >=
THRESHOLD) and update the message to report the current total (use prevCount +
1) instead of prevCount; keep the rest of the flow (incrementing
tracker.counts[fingerprint] and setting tracker.warned[fingerprint] when
shouldWarn) the same and reference the existing symbols: prevCount, THRESHOLD,
shouldWarn, tracker.counts[fingerprint], tracker.warned[fingerprint],
fingerprint, and ctx.toolName.
- Around line 183-186: The three regexes (SECRET_FILE_RE, SECRET_FILE_ID_RSA_RE,
SECRET_FILE_CREDENTIALS_RE) must be adjusted to (1) accept both POSIX and
Windows separators and (2) avoid matching public key files like id_rsa.pub.
Update SECRET_FILE_RE to account for backslashes (e.g. use [\/\\] or make the
pattern match a filename with a backslash option) so Windows paths are
recognized; change SECRET_FILE_ID_RSA_RE to use a separator class (?:^|[\/\\])
and add a negative lookahead to exclude names ending in .pub (e.g.
(?:^|[\/\\])id_(?:rsa|dsa|ecdsa|ed25519)(?!\.pub)$); and update
SECRET_FILE_CREDENTIALS_RE to use (?:^|[\/\\]) everywhere instead of (?:^|\/) so
.aws, .docker, .kube, .azure, .gcloud, .config/gcloud and .netrc are matched on
Windows paths as well. Ensure you keep the same logical groups/names
(SECRET_FILE_RE, SECRET_FILE_ID_RSA_RE, SECRET_FILE_CREDENTIALS_RE) when
replacing the patterns.

---

Nitpick comments:
In `@__tests__/auth/manager.test.ts`:
- Around line 194-208: Add a new unit test variant for whoamiCmd that sets up
sessionMocks.readSession to return an expired access token but a refresh token
that is not expired, then mock apiClientMocks.refreshTokens to reject with a
transient error (e.g., network/5xx) and assert that whoamiCmd rejects with the
refresh error and that sessionMocks.clearSession was NOT called; use the same
test structure as the existing expired-refresh test but replace the refresh
token expiry and have apiClientMocks.refreshTokens.mockRejectedValueOnce(...)
and assert expect(sessionMocks.clearSession).not.toHaveBeenCalled() while still
asserting refreshTokens was invoked once.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd5cc18f-b344-44f9-8528-fd880ee849ee

📥 Commits

Reviewing files that changed from the base of the PR and between 1a37c48 and 52416bf.

📒 Files selected for processing (11)
  • CHANGELOG.md
  • __tests__/auth/api-client.test.ts
  • __tests__/auth/manager.test.ts
  • __tests__/auth/session-store.test.ts
  • __tests__/hooks/builtin-policies.test.ts
  • bin/failproofai.mjs
  • src/auth/api-client.ts
  • src/auth/manager.ts
  • src/auth/prompts.ts
  • src/auth/session-store.ts
  • src/hooks/builtin-policies.ts

Comment thread CHANGELOG.md Outdated
Comment thread src/auth/api-client.ts Outdated
Comment thread src/auth/manager.ts Outdated
Comment thread src/hooks/builtin-policies.ts Outdated
Comment thread src/hooks/builtin-policies.ts
nk-ag and others added 3 commits May 29, 2026 09:00
- api-client: wrap fetch/timeout failures in CliError so network errors
  land on the "Error:" exit-1 path instead of "Unexpected error". The
  thrown error from HTTP responses carries a `.status` field so callers
  can distinguish "transient" from "auth-rejected".

- manager.whoamiCmd: only clear local session on a refresh response with
  status 401/403. Transient errors (network, 5xx, timeout) now preserve
  the session and surface a "Could not refresh session" message so the
  user can retry instead of being silently logged out.

- builtin-policies / block-secrets-write:
  - `SECRET_FILE_ID_RSA_RE` no longer matches `id_rsa.pub` etc. — public
    keys are not secrets. The trailing `(?:\\.pub)?` was removed and
    `$` now anchors at the bare private-key basename.
  - `SECRET_FILE_ID_RSA_RE` and `SECRET_FILE_CREDENTIALS_RE` accept both
    POSIX `/` and Windows `\` separators so `C:\Users\me\.ssh\id_rsa`
    and `C:\Users\me\.aws\credentials` are caught.

- Tests: add cases for `.pub` allowed, Windows secret paths denied,
  network-error → CliError, timeout → CliError, 401 from refresh clears
  session, transient refresh failure preserves session.

(Skipping the "warn on 3rd call not 4th" suggestion in
warn-repeated-tool-calls: the existing test fixture asserts the
warning fires with sidecar count=3 and the message string "3 times".
Both the threshold semantic and the message wording would have to
change in lockstep to flip to nextCount; deferred so this PR stays
scoped to the false-positive fixes the user reported.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Around line 13-18: The changelog contains multi-line bullets that must be
collapsed into single-line entries; find the multi-line entry beginning with
'access token, surfaces "Session expired"' (and the similar block later) in
CHANGELOG.md and rewrite each as a single-line bullet ending with the PR
reference (e.g., "(`#396`)"), preserving the original wording but joining lines
and removing extra line breaks so every changelog entry is one line as per
guidelines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1cd38c4b-72c1-45d8-a62d-da56c6ec2632

📥 Commits

Reviewing files that changed from the base of the PR and between a235feb and c271633.

📒 Files selected for processing (1)
  • CHANGELOG.md

Comment thread CHANGELOG.md Outdated
nk-ag and others added 3 commits May 29, 2026 11:32
Track and report the count *including* the current call (`nextCount =
prevCount + 1`) and warn when `nextCount >= THRESHOLD`. With
THRESHOLD = 3 that means the first warning fires on the 3rd identical
call (previously it took 4) and the message correctly reports how many
times the tool has been called including this one.

Existing tests updated to seed sidecar count=2 so the 3rd call (the
current one) hits the threshold.

Addresses the last CodeRabbit comment on #396.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants