Skip to content

refactor: extract fix command#163

Closed
ndycode wants to merge 3 commits intorefactor/pr1-doctor-commandfrom
refactor/pr1-fix-command-2
Closed

refactor: extract fix command#163
ndycode wants to merge 3 commits intorefactor/pr1-doctor-commandfrom
refactor/pr1-fix-command-2

Conversation

@ndycode
Copy link
Owner

@ndycode ndycode commented Mar 21, 2026

Summary

  • extract the codex auth fix implementation into a dedicated command module
  • add direct unit coverage for help, invalid-option handling, and empty-storage behavior while preserving the existing fix CLI behavior

What Changed

  • added lib/codex-manager/commands/fix.ts for auto-fix scanning, refresh/live-probe handling, recommendation output, and JSON/human output paths
  • updated lib/codex-manager.ts so runFix(...) delegates to the extracted fix command module through injected dependencies
  • added test/codex-manager-fix-command.test.ts for command-level coverage

Validation

  • npm run test -- test/codex-manager-fix-command.test.ts test/codex-manager-cli.test.ts
  • npm run lint
  • npm run typecheck
  • npm run build

Risk and Rollback

  • Risk level: medium-low
  • Rollback plan: revert 7b98866 to restore the inline fix implementation

Additional Notes

  • this completes the main manager command extraction train before moving on to larger settings/runtime/storage refactors

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

clean extraction of codex auth fix into lib/codex-manager/commands/fix.ts using the established dep-injection pattern. codex-manager.ts now delegates to runFixCommand and the new test file adds five targeted unit cases. the refactor also silently fixes a bug where saveQuotaCache was called in dry-run mode — this behavioral change is correct and tested but should be called out explicitly in the PR description.

  • lib/codex-manager/commands/fix.ts: 561-line extracted command; exports FixCliOptions, FixAccountReport, FixCommandDeps, summarizeFixReports, and runFixCommand
  • lib/codex-manager.ts: runFix reduced to a single return runFixCommand(args, { ... }) call; TokenFailure import correctly dropped
  • test/codex-manager-fix-command.test.ts: covers help, invalid-option, empty-storage (json), live-probe-failure, dry-run no-persist, and human-mode preview
  • behavior change: both json and human output paths now guard saveQuotaCache with !options.dryRun — the old code would persist quota cache even during a dry-run, which was a silent data mutation bug
  • missing vitest coverage: lockout-prevention path (re-enables one account when all are hard-disabled), hard/soft refresh failure outcomes, and the saveAccounts affirmative write are not directly exercised in the new test file; confirm codex-manager-cli.test.ts covers them end-to-end

Confidence Score: 4/5

  • safe to merge after noting the undisclosed dry-run fix in the PR description; no regressions introduced.
  • refactoring is a faithful port, dep-injection wiring matches the original call sites, the added !dryRun guard is a correct fix, and the five new tests validate the primary scenarios. score stays at 4 rather than 5 because the lockout-prevention path (a user-safety mechanism) has no direct unit coverage in the new file and the behavioral change in dry-run handling is undocumented in the PR description.
  • lib/codex-manager/commands/fix.ts lines 410–436 (lockout-prevention) — no direct unit test in the new file.

Important Files Changed

Filename Overview
lib/codex-manager/commands/fix.ts new command module extracting the fix logic with full dep-injection; silently adds !dryRun guard on saveQuotaCache (undisclosed bug fix), otherwise a faithful port of the original inline implementation.
lib/codex-manager.ts runFix now delegates entirely to runFixCommand via injected deps; TokenFailure import correctly removed; delegation wiring looks complete and correct.
test/codex-manager-fix-command.test.ts five vitest cases cover the main happy paths and edge cases; lockout-prevention, hard/soft refresh failure, and the dryRun=false write path are not yet directly covered in this new unit file.

Sequence Diagram

sequenceDiagram
    participant CLI as codex-manager.ts<br/>runFix(args)
    participant CMD as fix.ts<br/>runFixCommand(args, deps)
    participant Store as deps.loadAccounts
    participant QC as deps.loadQuotaCache
    participant Refresh as deps.queuedRefresh
    participant Probe as deps.fetchCodexQuotaSnapshot
    participant Disk as deps.saveAccounts<br/>deps.saveQuotaCache

    CLI->>CMD: delegate(args, injected deps)
    CMD->>Store: loadAccounts()
    Store-->>CMD: storage | null
    alt empty storage
        CMD-->>CLI: 0 (json or plain msg)
    end
    CMD->>QC: loadQuotaCache() [if --live]
    loop each account
        alt token usable + --live
            CMD->>Probe: fetchCodexQuotaSnapshot()
            Probe-->>CMD: snapshot | throws
        else token expired
            CMD->>Refresh: queuedRefresh(refreshToken)
            Refresh-->>CMD: success | failure
            alt --live after refresh
                CMD->>Probe: fetchCodexQuotaSnapshot()
                Probe-->>CMD: snapshot | throws
            end
        end
    end
    alt !dryRun && changed
        CMD->>Disk: saveAccounts(storage)
    end
    alt !dryRun && quotaCacheChanged
        CMD->>Disk: saveQuotaCache(workingCache)
    end
    CMD-->>CLI: exit code 0
Loading

Comments Outside Diff (4)

  1. lib/codex-manager/commands/fix.ts, line 691-692 (link)

    P1 concurrent runFix callers share the same storage file without a lock

    setStoragePath(null)loadAccounts() → mutate accounts in the loop → saveAccounts() is not atomic. on a machine with multiple terminals (common on windows where users run the CLI side-by-side), two concurrent codex auth fix invocations can race: both read the same file, both apply their mutations independently, and the second write silently clobbers the first. the hard-disable lockout recovery at line ~904 is particularly dangerous here — account A's enabled = true anti-lockout write could be lost.

    this is pre-existing behavior from the inline implementation, but the extraction is a good moment to call it out. consider at minimum adding a comment noting that callers must not be run concurrently, or passing a write-queue/mutex dep (consistent with the unified-settings.ts EBUSY/EPERM queued-write pattern already in the codebase).

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: lib/codex-manager/commands/fix.ts
    Line: 691-692
    
    Comment:
    **concurrent `runFix` callers share the same storage file without a lock**
    
    `setStoragePath(null)``loadAccounts()` → mutate accounts in the loop → `saveAccounts()` is not atomic. on a machine with multiple terminals (common on windows where users run the CLI side-by-side), two concurrent `codex auth fix` invocations can race: both read the same file, both apply their mutations independently, and the second write silently clobbers the first. the hard-disable lockout recovery at line ~904 is particularly dangerous here — account A's `enabled = true` anti-lockout write could be lost.
    
    this is pre-existing behavior from the inline implementation, but the extraction is a good moment to call it out. consider at minimum adding a comment noting that callers must not be run concurrently, or passing a write-queue/mutex dep (consistent with the `unified-settings.ts` EBUSY/EPERM queued-write pattern already in the codebase).
    
    How can I resolve this? If you propose a fix, please make it concise.

    Fix in Codex

  2. lib/codex-manager/commands/fix.ts, line 779-782 (link)

    silent catch changes the live-probe fallback flow and drops warning report

    the original code, when a live probe fails for an account with a usable access token, did two things: (1) pushed a "warning-soft-failure" report with the message "live probe failed (${message}), trying refresh fallback", then (2) fell through to push a second "healthy" report ("access token still valid") and continue-d — it did not call queuedRefresh.

    the new code sets needsRefresh = true and falls through to queuedRefresh. this means:

    • the intermediate "warning-soft-failure" probe-failure entry is silently dropped from reports and from json output, reducing observability
    • an extra network call (queuedRefresh) is made for every account whose probe fails but whose access token is still valid — this is new behaviour not present in the original

    the pr description says it's "preserving the existing fix CLI behaviour", but this is a divergence. the new test titled "falls back to refresh when live probe fails" is explicitly testing the new behaviour (the old code never called queuedRefresh in this branch).

    if the intent is to fix the double-report bug in the original, that's worth documenting. if the intent is strict parity, restore the warning push and skip the refresh:

    } catch (error) {
        const message = deps.normalizeFailureDetail(
            error instanceof Error ? error.message : String(error),
            undefined,
        );
        reports.push({
            index: i,
            label,
            outcome: "warning-soft-failure",
            message: `live probe failed (${message}), trying refresh fallback`,
        });
        needsRefresh = true;
    }
  3. lib/codex-manager/commands/fix.ts, line 961-964 (link)

    new dryRun guard on saveQuotaCache in json path is an undocumented behaviour change

    the original runFix did not guard saveQuotaCache with !options.dryRun in the json branch — --dry-run --json would still persist the quota cache. the new code adds && !options.dryRun, which is arguably the correct behaviour (dry-run should not write anything), but it silently changes observable CLI semantics.

    this should be called out in the pr description and should have an explicit test. the existing "does not persist quota cache during dry-run" test only exercises the human (non-json) code path (no json guard), so it doesn't directly cover this branch.

    add a companion test:

    it("does not persist quota cache during dry-run (json mode)", async () => {
        // parseFixArgs returns { dryRun: true, json: true, live: true, ... }
        // loadQuotaCache returns a real cache
        // updateQuotaCacheForAccount returns true
        // → saveQuotaCache must NOT be called
        expect(deps.saveQuotaCache).not.toHaveBeenCalled();
    });
  4. lib/codex-manager/commands/fix.ts, line 450-453 (link)

    silent behavioral fix not mentioned in PR description

    the old inline runFix in codex-manager.ts called await saveQuotaCache(workingQuotaCache) in the --json path without a !options.dryRun guard — meaning codex auth fix --dry-run --json --live would mutate the quota cache on disk despite being a preview run. the new code correctly adds && !options.dryRun here and at line 551, and the test "does not persist accounts or quota cache during dry-run json mode" validates the fix. this is a real behavior change that should be called out in the PR description, since it's not a pure refactor.

Fix All in Codex

Prompt To Fix All With AI
This is a comment left during a code review.
Path: lib/codex-manager/commands/fix.ts
Line: 450-453

Comment:
**silent behavioral fix not mentioned in PR description**

the old inline `runFix` in `codex-manager.ts` called `await saveQuotaCache(workingQuotaCache)` in the `--json` path **without** a `!options.dryRun` guard — meaning `codex auth fix --dry-run --json --live` would mutate the quota cache on disk despite being a preview run. the new code correctly adds `&& !options.dryRun` here and at line 551, and the test "does not persist accounts or quota cache during dry-run json mode" validates the fix. this is a real behavior change that should be called out in the PR description, since it's not a pure refactor.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: test/codex-manager-fix-command.test.ts
Line: 72-104

Comment:
**missing coverage for lockout-prevention and refresh-failure paths**

the five tests added cover help, invalid-option, empty-storage, live-probe-failure, dry-run, and human-mode preview — good foundations. however, these paths in `fix.ts` are not yet exercised by the new unit tests:

- **lockout-prevention** (lines 410–436): when every account gets hard-disabled, one is silently re-enabled with `outcome` mutated to `"warning-soft-failure"`. this is the safety net that prevents a user from being locked out entirely — it's worth having a direct vitest case.
- **hard refresh failure** (`disabled-hard-failure` outcome + `account.enabled = false`): lines 390–407.
- **soft refresh failure** (`warning-soft-failure` from a non-hard refresh failure): lines 401–407.
- **`saveAccounts` called** when `!dryRun && changed`: the dry-run negative is tested but the affirmative write path isn't.

if `codex-manager-cli.test.ts` already covers these end-to-end, a brief comment linking to those test names would help future reviewers know coverage exists.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (3): Last reviewed commit: "fix: preserve live probe fallback report..." | Re-trigger Greptile

@chatgpt-codex-connector
Copy link

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 21, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • skip-review

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e10d613b-dd7a-4df1-90cc-21885e6771da

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

refactored runFix implementation by extracting it to a dedicated fix command module. moved 435 lines of fix logic, account scanning, quota cache mutation, and report summarization from the manager entry point to lib/codex-manager/commands/fix.ts, which now orchestrates account state classification, token refresh, and forecasting via dependency injection.

Changes

Cohort / File(s) Summary
Fix command extraction
lib/codex-manager.ts
Removed 435 lines of in-file runFix implementation. now delegates to runFixCommand from the new commands module, removing local TokenFailure type reference since failure tracking was relocated.
Fix command module
lib/codex-manager/commands/fix.ts
Added 538 lines implementing runFixCommand, FixCliOptions, FixAccountReport, FixCommandDeps, and summarizeFixReports(). handles --help parsing, account state scanning, hard/soft disables, lockout avoidance, optional live quota probing, token refresh orchestration, and JSON/formatted output with recommendations.
Fix command tests
test/codex-manager-fix-command.test.ts
Added 92 lines covering three scenarios: --help exit, unknown option error, and empty storage case. uses mocked FixCommandDeps dependency graph.

Sequence Diagram

sequenceDiagram
    participant CLI
    participant Manager as runFix<br/>(manager.ts)
    participant FixCmd as runFixCommand<br/>(fix.ts)
    participant Storage
    participant Cache as Quota Cache
    participant Token as Token/QoS
    participant Forecast
    participant Output as Formatter

    CLI->>Manager: runFix(args)
    Manager->>FixCmd: runFixCommand(args, deps)
    
    FixCmd->>FixCmd: parseArgs()
    alt --help flag
        FixCmd->>Output: printFixUsage()
        Output-->>FixCmd: exit 0
    else --bad flag
        FixCmd->>FixCmd: log error
        FixCmd-->>Manager: exit 1
    else normal flow
        FixCmd->>Storage: loadStorage()
        Storage-->>FixCmd: accounts[]
        
        alt --live mode
            FixCmd->>Cache: loadAndClone()
            Cache-->>FixCmd: quotaCache
        end
        
        loop per account
            FixCmd->>FixCmd: classify state
            alt needs token refresh
                FixCmd->>Token: refresh tokens
                Token-->>FixCmd: success/failure
                alt live mode
                    FixCmd->>Forecast: probeQuota()
                    Forecast-->>FixCmd: quota state
                end
            else live quota check
                FixCmd->>Forecast: probeQuota()
                Forecast-->>FixCmd: quota state
            end
            FixCmd->>FixCmd: apply lockout avoidance
        end
        
        FixCmd->>FixCmd: summarizeFixReports()
        
        alt not dryRun
            FixCmd->>Storage: persist accounts
            alt live mode
                FixCmd->>Cache: persist quotaCache
            end
        end
        
        alt json output
            FixCmd->>Output: JSON + recommendation
        else formatted output
            FixCmd->>Output: human-readable + rows
        end
        
        FixCmd-->>Manager: exit 0
    end
    
    Manager-->>CLI: exit code
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

detailed concerns

regression testing gap: test coverage is thin relative to logic density. lib/codex-manager/commands/fix.ts:1-538 handles account state classification, quota cache mutation, token refresh orchestration, and lockout-avoidance rules with only three happy-path test cases in test/codex-manager-fix-command.test.ts. missing coverage for:

  • hard/soft disable branching logic in per-account scanning
  • lockout avoidance heuristic (re-enabling one disabled account)
  • quota cache clone + persist flow for --live mode
  • token refresh failure handling and email fallback caching
  • dry-run vs mutation behavior divergence
  • forecast+recommendation flow

concurrency risk: account iteration in the scan loop (FixCmd → per-account state classification → token refresh → optional live probe) modifies accounts in-memory and conditionally persists to storage. no visible locking or transaction safety. if multiple fix invocations run concurrently, quota cache and account persistence could race. recommend explicit mutual exclusion or append-only audit logging.

windows edge cases: quota cache path handling and file I/O in cache load/clone/persist steps (FixCmd dependency graph) should verify cross-platform path handling. no visible path normalization for windows absolute paths in file references.

dependency injection opacity: FixCommandDeps surface is large (10+ injected functions). callers must satisfy all dependencies correctly. test factory in test/codex-manager-fix-command.test.ts:createDeps() helps, but integration tests against real storage/refresh/forecast paths are absent. recommend smoke test with actual account state.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive the pr description covers summary, what changed, and validation checkboxes. however, it omits critical behavior changes: (1) the new !dryRun guard on saveQuotaCache in both json and human paths silently fixes a dry-run mutation bug, and (2) the live-probe fallback now calls queuedRefresh instead of just warning, changing observability and making extra network calls. add an explicit 'behavior changes' subsection documenting the dry-run quota cache fix and the live-probe fallback refactoring. also note the missing coverage for lockout-prevention, hard/soft refresh failures, and concurrent-access races on windows.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed Title follows conventional commits format with correct type and lowercase imperative summary under 72 characters, accurately describing the main refactoring change.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/pr1-fix-command-2
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch refactor/pr1-fix-command-2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot]
coderabbitai bot previously requested changes Mar 21, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/codex-manager/commands/fix.ts`:
- Around line 427-430: The quota-cache is currently saved even during --dry-run;
update both locations that call deps.saveQuotaCache so they only run when
options.json is true AND options.dryRun is false (i.e., change the condition to
check !options.dryRun && options.json && workingQuotaCache && quotaCacheChanged
around the existing saveQuotaCache calls), mirror the same gating for the second
occurrence near the later save, and add/adjust the test
test/codex-manager-fix-command.test.ts to assert that a dry-run does not invoke
deps.saveQuotaCache; additionally, to address Windows IO and concurrency, wrap
the saveQuotaCache invocation (or implement in deps.saveQuotaCache) with a small
retry/backoff/queue that handles EBUSY and HTTP 429 transient errors.
- Around line 176-179: The early return when storage is empty bypasses the
--json output path; change the empty-storage branch in the handler that checks
storage (the block using storage and logInfo) to emit machine-readable JSON when
the json flag is set (e.g., an empty accounts array or a structured result)
instead of calling logInfo and returning immediately—use the same JSON output
helper/format used elsewhere in this command (or check the options.json flag) so
`codex auth fix --json` returns valid JSON on the empty-storage path and then
return the correct exit code.
- Around line 207-264: The live-probe catch currently pushes a
"warning-soft-failure" but then the code immediately appends a second "healthy"
report and continues, preventing the refresh fallback (queuedRefresh) from
running; change the control flow so that when deps.fetchCodexQuotaSnapshot(...)
inside the if (options.live) path throws, you do NOT add the "access token still
valid" healthy report or continue, but instead let execution fall through into
the refresh logic (or explicitly call the queuedRefresh path) so that
queuedRefresh/refresh handling can run; update logic around
deps.fetchCodexQuotaSnapshot, reports.push, deps.updateQuotaCacheForAccount and
deps.hasLikelyInvalidRefreshToken to ensure only one report is emitted per
account, and add/adjust tests (test/codex-manager-fix-command.test.ts:71-92) to
assert the revoked-but-not-expired session goes through queuedRefresh and that
EBUSY/429 retry behavior for refresh queues is covered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 21cd04c9-9fa2-4c56-aaa3-67dac635bf7f

📥 Commits

Reviewing files that changed from the base of the PR and between 81fd616 and 7b98866.

📒 Files selected for processing (3)
  • lib/codex-manager.ts
  • lib/codex-manager/commands/fix.ts
  • test/codex-manager-fix-command.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/codex-manager-fix-command.test.ts
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/codex-manager.ts
  • lib/codex-manager/commands/fix.ts
🔇 Additional comments (1)
lib/codex-manager.ts (1)

2671-2702: delegation boundary looks clean.

lib/codex-manager.ts:2671-2702 passes the storage, quota, auth-refresh, and formatting helpers the extracted command needs, so this stays a thin adapter around the new module. the command-level entry paths are at least smoke-covered in test/codex-manager-fix-command.test.ts:71-92.

@ndycode ndycode added the passed label Mar 22, 2026
@ndycode ndycode dismissed coderabbitai[bot]’s stale review March 22, 2026 15:43

All review threads are resolved and follow-up commits addressed this stale automated change request.

@ndycode
Copy link
Owner Author

ndycode commented Mar 23, 2026

Closing because this work is now included in main via #318 and #319.

@ndycode ndycode closed this Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant