test: cover runtime benchmark script by ndycode · Pull Request #308 · ndycode/codex-multi-auth

ndycode · 2026-03-22T06:02:11Z

Summary

add a focused smoke test for scripts/benchmark-runtime-path.mjs
verify the script writes a runtime benchmark payload with the expected result entries
improve coverage for a previously untested dist-backed benchmark entrypoint

Validation

npm run typecheck
npm run lint -- test/benchmark-runtime-path-script.test.ts
npm run test -- test/benchmark-runtime-path-script.test.ts
npm run build
manual QA: node scripts/benchmark-runtime-path.mjs --iterations=1 --output=<tmp>/runtime-benchmark.json

note: greptile review for oc-chatgpt-multi-auth. cite files like `lib/foo.ts:123`. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

adds a focused smoke test for scripts/benchmark-runtime-path.mjs that copies the script into a temp fixture, stubs the three dist/ modules it depends on, spawns the script via spawnSync, and asserts the output json contains the expected benchmark result names and iteration count.

the prior rmSync concern is fully resolved — cleanup now uses removeWithRetry with EBUSY/EPERM backoff, matching the windows filesystem safety policy in AGENTS.md and test/AGENTS.md.
fixture construction is well-isolated: each test gets its own mkdtempSync root pushed onto tempRoots, cleaned atomically in afterEach.
the stub AccountManager and function exports correctly match the signatures called by the benchmark script, so the subprocess reliably exits 0.
one non-blocking P2: result.error is not asserted before result.status; if the subprocess times out or fails to spawn, the failure message is "received null, expected 0" rather than the actual error.

Confidence Score: 5/5

safe to merge — the only prior blocking concern (bare rmSync) is resolved, and the remaining P2 is a non-blocking diagnostic improvement.
the rmSync → removeWithRetry fix addresses the one concrete windows filesystem safety issue from the previous review round. the test is well-structured, properly isolated, and correctly exercises the full dist-stub → subprocess → json output path. the sole remaining comment (asserting result.error before result.status) is a diagnostic clarity improvement with no correctness impact.
no files require special attention

Important Files Changed

Filename	Overview
test/benchmark-runtime-path-script.test.ts	new smoke test for benchmark-runtime-path.mjs — correctly uses removeWithRetry for cleanup, properly stubs dist modules in a temp fixture dir, and verifies the full benchmark payload shape. one minor P2: result.error is not asserted before result.status, making timeout/spawn failures harder to diagnose.

Sequence Diagram

sequenceDiagram
    participant T as vitest test
    participant F as createFixture()
    participant TMP as tmp/runtime-bench-*
    participant S as spawnSync (node benchmark-runtime-path.mjs)
    participant DIST as stub dist modules

    T->>F: createFixture()
    F->>TMP: mkdtempSync → create scripts/, dist/lib/...
    F->>TMP: copyFileSync(real script → scripts/)
    F->>TMP: writeFileSync stubs (request-transformer.js, tool-utils.js, accounts.js)
    F-->>T: { root, scriptCopy }

    T->>S: spawnSync(node, [scriptCopy, --iterations=1, --output=...], cwd=root)
    S->>DIST: import ../dist/lib/request/request-transformer.js
    S->>DIST: import ../dist/lib/request/helpers/tool-utils.js
    S->>DIST: import ../dist/lib/accounts.js
    S->>TMP: writeFile(runtime-benchmark.json)
    S-->>T: stdout "Runtime benchmark written:", status 0

    T->>TMP: readFileSync(runtime-benchmark.json)
    T->>T: assert payload.iterations, result names
    T->>TMP: afterEach removeWithRetry(root, recursive+force)

Prompt To Fix All With AI

This is a comment left during a code review.
Path: test/benchmark-runtime-path-script.test.ts
Line: 73-76

Comment:
**assert `result.error` before `result.status`**

if `spawnSync` times out (the 15 s budget) or fails to spawn, `result.error` is set and `result.status` is `null`. the current `expect(result.status).toBe(0)` assertion then surfaces as "received null" — not obvious it was a timeout or spawn failure. asserting `result.error` first gives a much clearer failure message on slow ci.

```suggestion
		const result = spawnSync(
			process.execPath,
			[scriptCopy, "--iterations=1", `--output=${outputPath}`],
			{ cwd: root, encoding: "utf8", timeout: 15_000 },
		);

		expect(result.error).toBeUndefined();
```

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: "Fail fast on benchma..."}

chatgpt-codex-connector · 2026-03-22T06:02:16Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

coderabbitai · 2026-03-22T06:02:19Z

📝 Walkthrough

Walkthrough

adds a new test file that spawns the benchmark runtime path script in an isolated temporary directory, verifying successful execution, expected stdout output, and the structure and values of the generated JSON results payload.

Changes

Cohort / File(s)	Summary
Benchmark Runtime Test `test/benchmark-runtime-path-script.test.ts`	new test file (+95 lines) that sets up per-test fixture directories, copies the benchmark script and stubbed runtime files, spawns the script with process.execPath, asserts exit code and stdout, reads/parses generated json, and validates iteration count and result names against expected values.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

notes

watch out for windows edge cases here—process.execPath and the fixture directory setup need to handle path separators correctly. test/benchmark-runtime-path-script.test.ts uses a global fixtureDirectories array with removeWithRetry cleanup, which could race if tests run in parallel; consider using vitest's temp dir utilities or sequential test isolation instead. no regression test visible for the actual benchmark output correctness (malformed json, missing fields, wrong result ordering)—consider adding assertions on more result properties. the removeWithRetry logic needs clear documentation on retry count and timing to avoid flaky test cleanup failures on windows.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	Title follows conventional commits format with `test:` prefix, lowercase imperative summary, and is well under 72 characters (36 chars).
Description check	✅ Passed	The PR description includes summary, what changed, and validation steps, but is missing risk/rollback and docs checklist sections from the template.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch test/pr10-benchmark-runtime-path-smoke-real

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch test/pr10-benchmark-runtime-path-smoke-real

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

test/benchmark-runtime-path-script.test.ts

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/benchmark-runtime-path-script.test.ts`:
- Around line 73-77: The spawnSync call that runs the benchmark script can hang
and block the vitest worker; update the spawnSync invocation in
test/benchmark-runtime-path-script.test.ts (the call creating result) to pass a
timeout option (e.g., timeout: 15000) in its options object (alongside cwd and
encoding) so the child process is killed after 15s and the test fails fast if
the script hangs; ensure the options merge preserves existing keys (cwd,
encoding) and consider specifying a kill signal if needed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fd81a066-20ec-4566-b831-e72c19265622

📥 Commits

Reviewing files that changed from the base of the PR and between 1be5e95 and 7d44c75.

📒 Files selected for processing (1)

test/benchmark-runtime-path-script.test.ts

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (1)

test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

test/benchmark-runtime-path-script.test.ts

🔇 Additional comments (3)

test/benchmark-runtime-path-script.test.ts (3)

16-23: cleanup looks good with removeWithRetry.

this addresses the windows filesystem safety concern from the automated review. using removeWithRetry from test/helpers/remove-with-retry.ts instead of bare fs.rmSync prevents flaky ci on windows where file handles may linger.

25-66: fixture setup is well-structured.

the stubs at test/benchmark-runtime-path-script.test.ts:44-63 correctly export the three symbols required by scripts/benchmark-runtime-path.mjs:3-9. the minimal implementations (returning arrays, exposing the expected method) are sufficient for the smoke test without adding unnecessary complexity.

79-94: assertions are solid for the --output path.

the expected result names at test/benchmark-runtime-path-script.test.ts:87-93 match the benchmark cases defined in scripts/benchmark-runtime-path.mjs:117-140 exactly. the test is deterministic with --iterations=1 and an isolated fixture.

note: as flagged in the pr description, the console-only path (when --output is omitted) remains uncovered. consider adding a follow-up test that asserts json is written to stdout when no --output is provided.

test/benchmark-runtime-path-script.test.ts

test: cover runtime benchmark script

7065009

greptile-apps bot reviewed Mar 22, 2026

View reviewed changes

test/benchmark-runtime-path-script.test.ts Show resolved Hide resolved

Use retry-safe benchmark cleanup

7d44c75

coderabbitai bot requested changes Mar 22, 2026

View reviewed changes

test/benchmark-runtime-path-script.test.ts Show resolved Hide resolved

Fail fast on benchmark test hangs

6de0144

coderabbitai bot approved these changes Mar 22, 2026

View reviewed changes

ndycode mentioned this pull request Mar 22, 2026

release: merge current main-target PR set #318

Merged

ndycode merged commit 8c1d09e into main Mar 23, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: cover runtime benchmark script#308

test: cover runtime benchmark script#308
ndycode merged 3 commits intomainfrom
test/pr10-benchmark-runtime-path-smoke-real

ndycode commented Mar 22, 2026 •

edited by greptile-apps bot

Loading

Uh oh!

chatgpt-codex-connector bot commented Mar 22, 2026

Uh oh!

coderabbitai bot commented Mar 22, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

notes

❌ Failed checks (1 warning)

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ndycode commented Mar 22, 2026 • edited by greptile-apps bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

chatgpt-codex-connector bot commented Mar 22, 2026

Uh oh!

coderabbitai bot commented Mar 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

notes

❌ Failed checks (1 warning)

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ndycode commented Mar 22, 2026 •

edited by greptile-apps bot

Loading

note: greptile review for oc-chatgpt-multi-auth. cite files like `lib/foo.ts:123`. confirm regression tests + windows concurrency/token redaction coverage.

coderabbitai bot commented Mar 22, 2026 •

edited

Loading