Context Arbitrage for AI Engineering: Combine Claude Code (precise reasoning) with Gemini CLI (massive context, free tier) and OpenRouter (400+ models) to cut costs by ~90% without sacrificing quality.
Reading is cheap, thinking is expensive.
- Gemini (The Reader): Reads 100k+ tokens of logs, docs, and code. Summarizes, extracts, and researches. Cost: FREE (with Google account).
- Claude (The Thinker): Reviews summaries, makes architectural decisions, writes critical code, and supervises Gemini. Cost: Paid (but used sparingly).
- OpenRouter (The Council): Provides second opinions, specialized models (DeepSeek, Llama 3), and debates.
Result: You can "read" your entire codebase and "fix" complex bugs for pennies instead of dollars.
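To make the "pennies instead of dollars" claim concrete, here is a rough back-of-the-envelope sketch. The token counts are illustrative assumptions, the $3 per 1M tokens price is simply the low end of the Claude range quoted in the provider table of this README, and `estimateCostUSD` is a hypothetical helper that is not part of this repo.

```javascript
// Hypothetical helper: cost in USD for a given number of input tokens.
function estimateCostUSD(tokens, pricePerMillionUSD) {
  return (tokens / 1_000_000) * pricePerMillionUSD;
}

// Assumed workload: 500k tokens of logs and code that need to be read.
const rawTokens = 500_000;
// Assumed size of the summary Gemini hands back to Claude.
const summaryTokens = 5_000;
// Assumed Claude input price (USD per 1M tokens).
const claudePrice = 3;

// Scenario A: Claude reads everything itself.
const direct = estimateCostUSD(rawTokens, claudePrice);
// Scenario B: Gemini reads on the free tier; Claude only reads the summary.
const delegated = estimateCostUSD(summaryTokens, claudePrice);
// Fraction of the reading cost that delegation avoids.
const savings = 1 - delegated / direct;

console.log(`direct: $${direct.toFixed(2)}, delegated: $${delegated.toFixed(3)}`);
console.log(`savings on reading: ${(savings * 100).toFixed(0)}%`);
```

This only models the reading portion. Claude's own reasoning and output tokens are still billed, so the overall saving in practice will be lower than the figure this sketch prints.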
git clone https://github.com/SneezeGUI/Hybrid-CLI-Agent.git
cd Hybrid-CLI-Agent
npm install
npm link # Optional: makes 'hybrid' command available globally
Launch the configuration wizard to set up API keys and register with Claude Code automatically.
npm run setup
Follow the prompts to authenticate Gemini (OAuth recommended for free tier) and optionally configure OpenRouter.
Check which agents are ready and see your usage tier.
node bin/hybrid.js status
Copy the template to use as Claude Code's system instructions:
cp CLAUDE.md.template ~/.claude/CLAUDE.md
# Or copy to your project root as CLAUDE.md
The template teaches Claude to delegate tasks to Gemini for massive cost savings.
You can use the hybrid CLI directly or let Claude Code drive via MCP.
The hybrid command automatically routes tasks to the most cost-effective capable model.
# Ask questions (Routes to Gemini for free context, Claude for complex reasoning)
node bin/hybrid.js ask "How does the authentication middleware work?"
# Research (Uses Gemini to read multiple files - FREE)
node bin/hybrid.js research "Find all places where we use deprecated APIs" -f "src/**/*.ts"
# Draft Code (Gemini drafts, Claude reviews - Supervisor Pattern)
node bin/hybrid.js draft src/utils/rate-limit.ts "Implement a sliding window rate limiter using Redis"
# Code Review (Gemini analyzes, Claude validates)
node bin/hybrid.js review src/services/ --focus "security"
Once installed (via npm run setup), you can talk to Claude Code naturally:
"Analyze the logs in @logs/error.log and find the root cause of the crash."
Claude will see the gemini-worker tools and delegate the heavy reading to Gemini, receiving back a structured summary.
| Provider | Best For | Setup | Cost |
|---|---|---|---|
| Gemini (OAuth) | Heavy Reading | gemini auth login | FREE (within limits)* |
| Claude | Complex Reasoning | claude login | ~$3-15 / 1M tokens |
| OpenRouter | Variety / Debates | API Key | Varies by model |
The free tier is available via OAuth (gemini auth login) - no credit card required.
| Model | RPM | RPD | Best For |
|---|---|---|---|
| gemini-2.5-flash | 15 | 1,500 | General tasks, file reading |
| gemini-2.5-flash-lite | 15 | 1,500 | High-frequency, low-latency |
| gemini-2.5-pro | 2 | 50 | Complex reasoning (limited) |
| gemini-3-pro-preview | 2 | 50 | Frontier model (limited) |
Google One AI Premium ($19.99/mo): ~5x higher limits + priority access to Gemini 3 Pro.
The system automatically falls back when you exceed free tier limits:
- Primary: gemini-2.5-flash (15 RPM / 1,500 RPD) - handles most tasks
- Fallback: gemini-2.5-flash-lite if Flash quota exhausted
- Paid API: If all free quotas hit, falls back to API key billing:
| Model | Input ($/1M) | Output ($/1M) |
|---|---|---|
| gemini-2.5-flash-lite | $0.075 | $0.30 |
| gemini-2.5-flash | $0.10 | $0.40 |
| gemini-2.5-pro | $1.25-2.50 | $5.00-10.00 |
| gemini-3-pro | $2.50 | $10.00 |
💡 Tip: For most dev work, the free tier (1,500 requests/day) is plenty. You'll only hit the paid API if you are doing heavy batch processing.
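The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual implementation: `pickModel`, the shape of `usedToday`, and the choice of gemini-2.5-flash for paid billing are all assumptions, and only the daily quota (RPD) is checked, not the per-minute limit.

```javascript
// Free-tier daily limits copied from the table above, in fallback order.
const FREE_TIER = [
  { model: "gemini-2.5-flash", rpd: 1500 },
  { model: "gemini-2.5-flash-lite", rpd: 1500 },
];

// Hypothetical selector. `usedToday` maps model name -> requests made today.
function pickModel(usedToday, hasApiKey) {
  for (const { model, rpd } of FREE_TIER) {
    if ((usedToday[model] ?? 0) < rpd) {
      return { model, billing: "free" };
    }
  }
  // All free quotas are exhausted: fall back to API key billing if a key exists.
  // Which paid model to use is an assumption made for this sketch.
  if (hasApiKey) {
    return { model: "gemini-2.5-flash", billing: "paid" };
  }
  // Nothing is available until the free quotas reset.
  return null;
}

console.log(pickModel({ "gemini-2.5-flash": 1500 }, false));
```

Keeping the tiers in an ordered array means a change to the fallback order only touches the data, not the selection logic.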
- Task Routing: Automatically sends simple tasks to faster/cheaper models.
- Supervisor Loop: Claude reviews code generated by Gemini before you see it.
- AI Collaboration: Run debates or consensus checks between GPT-4, Claude, and Llama 3.
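As a rough illustration of how task routing and the supervisor loop could fit together, here is a small sketch. The real router's criteria are not documented in this README, so the keyword list, the file-count threshold, and the `routeTask` function are all hypothetical.

```javascript
// Hypothetical hints that a task needs careful reasoning rather than bulk reading.
const REASONING_HINTS = ["architecture", "design", "security", "refactor"];

// Hypothetical routing rule: returns who does the work and who reviews it.
function routeTask(task) {
  const text = task.description.toLowerCase();
  const needsReasoning = REASONING_HINTS.some((hint) => text.includes(hint));
  // Assumed threshold: more than three files counts as heavy reading.
  const heavyReading = (task.files ?? []).length > 3;

  if (heavyReading && needsReasoning) {
    // Supervisor pattern: Gemini does the reading, Claude signs off.
    return { worker: "gemini", reviewer: "claude" };
  }
  if (needsReasoning) {
    return { worker: "claude", reviewer: null };
  }
  // Simple tasks go to the cheaper model.
  return { worker: "gemini", reviewer: null };
}

console.log(routeTask({ description: "Summarize the error logs" }));
```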
The system exposes a streamlined toolkit to Claude Code, focusing on the powerful autonomous agent:
| Category | Tools | Description |
|---|---|---|
| Agent | gemini_agent_task | PRIMARY: Autonomous agent with file system, shell commands, Google Search, and test iteration. |
| Agent | gemini_agent_approve | NEW: Approve/reject file changes from agent task (auto-review workflow). |
| Agent | gemini_agent_list | List active agent sessions. |
| Agent | gemini_agent_clear | Clear completed/failed sessions. |
| System | gemini_auth_status | Check authentication and free tier status. |
| System | gemini_health_check | Check system connectivity and model availability. |
| System | gemini_config_show | Show current configuration. |
| System | hybrid_metrics | View usage stats and costs. |
{
"task_description": "Create feature with tests, run npm test until ALL pass",
"context_files": ["src/**/*.js"],
"stall_timeout_seconds": 120,
"verbose": false,
"max_retries": 0
}
- stall_timeout_seconds (default: 300) - Time before stall detection triggers (5 min)
- verbose (default: false) - Include larger output (up to 100k chars)
- max_retries (default: 0) - Auto-retry on transient failures
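If you build gemini_agent_task arguments programmatically, the documented defaults can be applied with a small helper like the one below. `buildAgentTask` is a hypothetical name, and treating task_description as required is an assumption of this sketch rather than something the README states.

```javascript
// Defaults as documented for gemini_agent_task.
const TASK_DEFAULTS = {
  stall_timeout_seconds: 300,
  verbose: false,
  max_retries: 0,
};

// Hypothetical helper: merges caller options over the documented defaults.
function buildAgentTask(taskDescription, contextFiles = [], options = {}) {
  // Assumption: a task without a description is not useful, so reject it early.
  if (!taskDescription) {
    throw new Error("task_description is required");
  }
  return {
    task_description: taskDescription,
    context_files: contextFiles,
    ...TASK_DEFAULTS,
    ...options, // caller-supplied values win over the defaults
  };
}

const payload = buildAgentTask(
  "Create feature with tests, run npm test until ALL pass",
  ["src/**/*.js"],
  { stall_timeout_seconds: 120 }
);
console.log(JSON.stringify(payload, null, 2));
```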
When gemini_agent_task modifies files, it returns PENDING_REVIEW status. You must call gemini_agent_approve to finalize changes:
{ "session_id": "abc-123", "approved": true }
- ✅ File system: Create, read, write, delete files within workspace
- ✅ Shell commands: npm, node, git, pytest, build tools
- ✅ Google Search: Live web search for docs, APIs, solutions
- ✅ Security sandbox: Path traversal blocked, stays within workspace
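For readers curious what "path traversal blocked" typically means in practice, here is a minimal sketch of a workspace containment check using Node's built-in path module. It is not taken from this repo's code; `isInsideWorkspace` is a hypothetical name, and the check does not resolve symlinks, which a real sandbox would also need to handle.

```javascript
const path = require("path");

// Returns true when `target` resolves to the workspace root or a path inside it.
function isInsideWorkspace(workspace, target) {
  const root = path.resolve(workspace);
  // Relative targets are resolved against the workspace; absolute ones stay absolute.
  const resolved = path.resolve(root, target);
  const relative = path.relative(root, resolved);
  // A path escapes if it climbs out via ".." or lands on another root or drive.
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  return !escapes;
}

console.log(isInsideWorkspace("/work", "src/index.js"));
console.log(isInsideWorkspace("/work", "../etc/passwd"));
```

Comparing the relative path, rather than checking for a string prefix, avoids treating a sibling directory such as /work-other as part of /work.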
- Response Caching: Caches Gemini responses (TTL 30m) to save time on repeated queries.
- Token Tracking: Real-time cost estimation per session.
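The response cache can be pictured as a simple time-to-live (TTL) map. The sketch below is an illustration of that idea only: `TtlCache` is a hypothetical class, and how the repo actually keys and stores cached responses is not described in this README.

```javascript
// Minimal TTL cache. `now` is injectable so expiry can be tested without waiting.
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired: drop the entry and report a miss.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// 30 minutes, matching the TTL mentioned above.
const THIRTY_MINUTES_MS = 30 * 60 * 1000;
const cache = new TtlCache(THIRTY_MINUTES_MS);
cache.set("How does auth work?", "summary text");
console.log(cache.get("How does auth work?"));
```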
graph TD
User[User / Claude Code] --> Orchestrator
Orchestrator -->|Complexity Check| Router
Router -->|Reasoning & Sign-off| Claude[Claude Code - Supervisor]
Router -->|Reading & Execution| Gemini[Gemini CLI - Worker]
Claude -->|Delegate Task| MCPServer[Gemini MCP Server]
MCPServer -->|Execute| Gemini
The npm run setup script creates a .env file for you. You can also configure manually:
# .env file
GEMINI_AGENT_MODE=true # Enable autonomous agent capabilities
OPENROUTER_API_KEY=sk-... # Optional: For 400+ extra models
GEMINI_API_KEY=... # Optional: If not using OAuth
System-Wide Install:
If you want to use the Gemini tools in any Claude Code project, the setup script registers the server globally in your ~/.claude.json.
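As a sketch of how the three .env variables listed above might be interpreted at startup, consider the helper below. `loadConfig` and the returned field names are hypothetical, and the sketch assumes the variables are already present in the environment object passed in (Node does not read a .env file on its own by default).

```javascript
// Hypothetical loader: interprets the variables from the .env example.
function loadConfig(env = process.env) {
  return {
    // Agent mode is opt-in: only the literal string "true" enables it.
    agentMode: env.GEMINI_AGENT_MODE === "true",
    openRouterKey: env.OPENROUTER_API_KEY || null,
    geminiApiKey: env.GEMINI_API_KEY || null,
    // Assumption: with no API key set, OAuth is the intended auth method.
    authMethod: env.GEMINI_API_KEY ? "api-key" : "oauth",
  };
}

console.log(loadConfig({ GEMINI_AGENT_MODE: "true" }));
```

Treating anything other than "true" as disabled keeps the shell-executing agent mode off unless it is explicitly requested.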
For the Context Arbitrage pattern to work, Claude needs instructions on when and how to delegate to Gemini. This repo includes a CLAUDE.md file with these rules.
Copy the contents of CLAUDE.md from this repo to your global Claude instructions file:
Location: ~/.claude/CLAUDE.md
- Windows: C:\Users\<YourUsername>\.claude\CLAUDE.md
- macOS/Linux: ~/.claude/CLAUDE.md
# Create the directory if it doesn't exist
mkdir -p ~/.claude
# Copy the file (adjust source path as needed)
cp CLAUDE.md ~/.claude/CLAUDE.md
⚠️ Important: If you already have a ~/.claude/CLAUDE.md, merge the contents rather than overwriting. The key sections are the "Decision Matrix" and "Workflow" rules that tell Claude to delegate to gemini_agent_task.
The CLAUDE.md file instructs Claude to:
- Delegate heavy reading to Gemini (FREE) instead of using its own Read/Search tools ($$)
- Use gemini_agent_task for coding, testing, R&D, and documentation
- Only act directly for final approvals, git commits, and single tiny fixes
- Review Gemini's output before accepting it
This is what enables the ~90% cost savings. Without these instructions, Claude will do everything itself at full price.
- Fork the repo
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Disclaimer: This tool is designed for local development use. Enabling GEMINI_AGENT_MODE allows the AI to execute shell commands and modify files. Always review actions in critical environments.