feat: Gemini CLI E2E tests (v0.9.2.0)#252

Merged
garrytan merged 3 commits into main from garrytan/gemini-cli-e2e
Mar 20, 2026

@garrytan
Owner

Summary

  • Gemini CLI is now tested end-to-end: two E2E tests verify skill discovery and code review via `gemini -p`
  • Gemini JSONL parser (`parseGeminiJSONL`) handles all event types, covered by 10 unit tests
  • `bun run test:gemini` and `test:gemini:all` scripts added, and `gemini-e2e.test.ts` is wired into the aggregate `test:evals` and `test:e2e` scripts

Test Coverage

CODE PATH COVERAGE
===========================
[+] test/helpers/gemini-session-runner.ts
    ├── parseGeminiJSONL()
    │   ├── [★★★] init → sessionId
    │   ├── [★★★] message (assistant) → output
    │   ├── [★★★] message (user) → ignored
    │   ├── [★★★] tool_use → toolCalls
    │   ├── [★★★] result → tokens
    │   ├── [★★★] malformed lines
    │   ├── [★★★] empty input
    │   └── [★★★] missing fields
    └── runGeminiSkill()
        ├── [★★] binary not found → SKIP
        └── [★★] normal success

COVERAGE: 12/13 paths tested (92%)
QUALITY:  ★★★: 8  ★★: 4
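
For reviewers who want a mental model of the parser paths listed above, here is a minimal sketch of a tolerant NDJSON parser. It is not the code in test/helpers/gemini-session-runner.ts. The function name `parseGeminiJSONLSketch`, the `GeminiResultSketch` shape, and every event field name (`session_id`, `role`, `content`, `tool_name`, `parameters`, `stats.total_tokens`) are assumptions made for illustration and may not match what the Gemini CLI actually emits.

```typescript
// Hypothetical result shape; the real GeminiResult may differ.
interface GeminiResultSketch {
  sessionId: string | null;
  output: string;
  toolCalls: { name: string; input: unknown }[];
  tokens: number;
}

// Parse newline-delimited JSON events, skipping anything malformed.
function parseGeminiJSONLSketch(raw: string): GeminiResultSketch {
  const result: GeminiResultSketch = {
    sessionId: null,
    output: "",
    toolCalls: [],
    tokens: 0,
  };

  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // empty input and blank lines

    // Events are untyped JSON, so `any` plus runtime guards is used here.
    let event: any;
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // malformed line: ignore and keep going
    }
    if (event === null || typeof event !== "object") continue;

    switch (event.type) {
      case "init": // assumed field: session_id
        if (typeof event.session_id === "string") {
          result.sessionId = event.session_id;
        }
        break;
      case "message": // only assistant text is collected; user messages are ignored
        if (event.role === "assistant" && typeof event.content === "string") {
          result.output += event.content;
        }
        break;
      case "tool_use": // assumed fields: tool_name, parameters
        result.toolCalls.push({
          name: typeof event.tool_name === "string" ? event.tool_name : "unknown",
          input: event.parameters ?? {},
        });
        break;
      case "result": // assumed field: stats.total_tokens
        if (typeof event.stats?.total_tokens === "number") {
          result.tokens = event.stats.total_tokens;
        }
        break;
      default:
        break; // tool_result and unknown event types are not aggregated here
    }
  }
  return result;
}
```

The design choice worth noting is that every failure mode (bad JSON, wrong type, missing field) degrades to "skip this line" rather than throwing, which is what the malformed-lines, empty-input, and missing-fields paths above imply.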

Pre-Landing Review

No issues found. Pure test infrastructure — no SQL, no LLM trust boundaries, no security concerns.

Eval Results

Gemini E2E (2/2 passed):

  • gemini-discover-skill: 15,687 tokens, 0 tool calls, 13s — PASS
  • gemini-review-findings: 397,282 tokens, 14 tool calls, 121s — PASS

No prompt-related files changed — LLM evals skipped.

Test plan

  • Parser unit tests pass (10/10)
  • Gemini E2E tests pass (2/2)
  • Graceful skip when gemini not installed
  • Graceful skip when EVALS=1 not set
  • Pre-existing touchfiles.test.ts failure confirmed on main (not caused by this PR)
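
The two graceful-skip items in the test plan can be pictured as a small gate evaluated before the suite runs. The sketch below is illustrative only: `decideGeminiE2E`, its return shape, and the injected `isGeminiInstalled` callback are hypothetical names, not identifiers from this PR. The environment is passed in as a plain object so the logic stays testable without touching the real process environment.

```typescript
// Hypothetical gate result: whether to run, and why not if skipped.
interface GateDecision {
  run: boolean;
  reason?: string;
}

// Decide whether the Gemini E2E suite should run.
// `env` would typically be process.env; `isGeminiInstalled` would typically
// probe the PATH for the gemini binary. Both are injected here (an assumption
// of this sketch) so the decision logic has no side effects.
function decideGeminiE2E(
  env: Record<string, string | undefined>,
  isGeminiInstalled: () => boolean,
): GateDecision {
  if (env.EVALS !== "1") {
    return { run: false, reason: "EVALS=1 not set" };
  }
  if (!isGeminiInstalled()) {
    return { run: false, reason: "gemini binary not found" };
  }
  return { run: true };
}
```

A test file could call this once at the top and register skipped tests when `run` is false, so a machine without the CLI reports a skip rather than a failure.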

🤖 Generated with Claude Code

garrytan and others added 3 commits March 20, 2026 08:19
Subprocess wrapper for `gemini -p --output-format stream-json --yolo`
that spawns the Gemini CLI and parses NDJSON events (init, message,
tool_use, tool_result, result) into a structured GeminiResult.

Includes 10 unit tests for parseGeminiJSONL covering happy path,
malformed input, empty input, missing fields, and multi-tool scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Two E2E tests (gemini-discover-skill, gemini-review-findings) that
verify gstack skills work when invoked by the Gemini CLI. Follows
the same pattern as codex-e2e.test.ts — gated by EVALS=1 + binary
availability, diff-based selection via touchfiles, eval persistence.

- Add test/gemini-e2e.test.ts
- Add Gemini entries to E2E_TOUCHFILES and GLOBAL_TOUCHFILES
- Add test:gemini and test:gemini:all scripts to package.json
- Add gemini-e2e.test.ts to test:evals, test:e2e, and ignore list
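
As a rough illustration of what diff-based selection via touchfiles could look like, here is a sketch under stated assumptions. The semantics shown (a test runs if any changed file matches one of its touchfiles, and a change to a global touchfile runs everything) are a guess at the pattern, not a description of this repo's implementation. `selectTestsSketch` is a hypothetical name, and plain path-prefix matching stands in for whatever matching the real E2E_TOUCHFILES and GLOBAL_TOUCHFILES use.

```typescript
// Map of test name -> paths (or path prefixes) that should trigger it.
type TouchfileMap = Record<string, string[]>;

// Return the names of tests to run for a given list of changed files.
function selectTestsSketch(
  changedFiles: string[],
  perTest: TouchfileMap,
  globalTouchfiles: string[],
): string[] {
  // True if any changed file equals or starts with any of the given patterns.
  const touches = (patterns: string[]): boolean =>
    changedFiles.some((file) =>
      patterns.some((p) => file === p || file.startsWith(p)),
    );

  const allTests = Object.keys(perTest);
  // A global touchfile change selects every test.
  if (touches(globalTouchfiles)) return allTests;
  // Otherwise select only tests whose own touchfiles were changed.
  return allTests.filter((name) => touches(perTest[name] ?? []));
}

// Example entries; the paths on the right are illustrative, not the PR's values.
const exampleTouchfiles: TouchfileMap = {
  "gemini-discover-skill": ["test/gemini-e2e.test.ts"],
  "gemini-review-findings": ["test/gemini-e2e.test.ts", "review/"],
};
```

With these example entries, a diff touching only a file under `review/` would select just gemini-review-findings, while a diff touching a global touchfile would select both.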

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
garrytan merged commit 6a6b2b0 into main on Mar 20, 2026