Skip to content

SneezeGUI/Hybrid-CLI-Agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

28 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Hybrid CLI Agent

Context Arbitrage for AI Engineering: Combine Claude Code (precise reasoning) with Gemini CLI (massive context, free tier) and OpenRouter (400+ models) to cut costs by ~90% without sacrificing quality.

License Node Version Status

πŸš€ The Core Concept

Reading is cheap, thinking is expensive.

  1. Gemini (The Reader): Reads 100k+ tokens of logs, docs, and code. Summarizes, extracts, and researches. Cost: FREE (with Google account).
  2. Claude (The Thinker): Reviews summaries, makes architectural decisions, writes critical code, and supervises Gemini. Cost: Paid (but used sparingly).
  3. OpenRouter (The Council): Provides second opinions, specialized models (DeepSeek, Llama 3), and debates.

Result: You can "read" your entire codebase and "fix" complex bugs for pennies instead of dollars.


⚑ Quick Start

1. Installation

git clone https://github.com/SneezeGUI/Hybrid-CLI-Agent.git
cd hybrid-cli-agent
npm install
npm link  # Optional: makes 'hybrid' command available globally

2. Interactive Setup (Recommended)

Launch the configuration wizard to set up API keys and register with Claude Code automatically.

npm run setup

Follow the prompts to authenticate Gemini (OAuth recommended for free tier) and optionally configure OpenRouter.

3. Verify Status

Check which agents are ready and see your usage tier.

node bin/hybrid.js status

4. Claude System Instructions (Recommended)

Copy the template to use as Claude Code's system instructions:

cp CLAUDE.md.template ~/.claude/CLAUDE.md
# Or copy to your project root as CLAUDE.md

The template teaches Claude to delegate tasks to Gemini for massive cost savings.


πŸ› οΈ Usage

You can use the hybrid CLI directly or let Claude Code drive via MCP.

CLI Mode (Orchestrator)

The hybrid command automatically routes tasks to the most cost-effective capable model.

# Ask questions (Routes to Gemini for free context, Claude for complex reasoning)
node bin/hybrid.js ask "How does the authentication middleware work?"

# Research (Uses Gemini to read multiple files - FREE)
node bin/hybrid.js research "Find all places where we use deprecated APIs" -f "src/**/*.ts"

# Draft Code (Gemini drafts, Claude reviews - Supervisor Pattern)
node bin/hybrid.js draft src/utils/rate-limit.ts "Implement a sliding window rate limiter using Redis"

# Code Review (Gemini analyzes, Claude validates)
node bin/hybrid.js review src/services/ --focus "security"

MCP Mode (Claude Code Integration)

Once installed (via npm run setup), you can talk to Claude Code naturally:

"Analyze the logs in @logs/error.log and find the root cause of the crash."

Claude will see the gemini-worker tools and delegate the heavy reading to Gemini, receiving back a structured summary.


πŸ”‘ Authentication & Costs

Provider Best For Setup Cost
Gemini (OAuth) Heavy Reading gemini auth login FREE (within limits)*
Claude Complex Reasoning claude login ~$3-15 / 1M tokens
OpenRouter Variety / Debates API Key Varies by model

Gemini Free Tier Limits (December 2025)

The free tier is available via OAuth (gemini auth login) - no credit card required.

Model RPM RPD Best For
gemini-2.5-flash 15 1,500 General tasks, file reading
gemini-2.5-flash-lite 15 1,500 High-frequency, low-latency
gemini-2.5-pro 2 50 Complex reasoning (limited)
gemini-3-pro-preview 2 50 Frontier model (limited)

Google One AI Premium ($19.99/mo): ~5x higher limits + priority access to Gemini 3 Pro.

When Free Limits Are Hit

The system automatically falls back when you exceed free tier limits:

  1. Primary: gemini-2.5-flash (15 RPM / 1,500 RPD) - handles most tasks
  2. Fallback: gemini-2.5-flash-lite if Flash quota exhausted
  3. Paid API: If all free quotas hit, falls back to API key billing:
Model Input ($/1M) Output ($/1M)
gemini-2.5-flash-lite $0.075 $0.30
gemini-2.5-flash $0.10 $0.40
gemini-2.5-pro $1.25-2.50 $5.00-10.00
gemini-3-pro $2.50 $10.00

πŸ’‘ Tip: For most dev work, the free tier (1,500 requests/day) is plenty. You'll only hit paid API if doing heavy batch processing.


🧩 Features

🧠 Multi-Agent Orchestration

  • Task Routing: Automatically sends simple tasks to faster/cheaper models.
  • Supervisor Loop: Claude reviews code generated by Gemini before you see it.
  • AI Collaboration: Run debates or consensus checks between GPT-4, Claude, and Llama 3.

πŸ› οΈ 8 MCP Tools (v0.3.7)

The system exposes a streamlined toolkit to Claude Code, focusing on the powerful autonomous agent:

View All Available Tools
Category Tools Description
Agent gemini_agent_task PRIMARY: Autonomous agent with file system, shell commands, Google Search, and test iteration.
Agent gemini_agent_approve NEW: Approve/reject file changes from agent task (auto-review workflow).
Agent gemini_agent_list List active agent sessions.
Agent gemini_agent_clear Clear completed/failed sessions.
System gemini_auth_status Check authentication and free tier status.
System gemini_health_check Check system connectivity and model availability.
System gemini_config_show Show current configuration.
System hybrid_metrics View usage stats and costs.

Agent Task Parameters (v0.3.7)

{
  "task_description": "Create feature with tests, run npm test until ALL pass",
  "context_files": ["src/**/*.js"],
  "stall_timeout_seconds": 120,
  "verbose": false,
  "max_retries": 0
}
  • stall_timeout_seconds (default: 300) - Time before stall detection triggers (5 min)
  • verbose (default: false) - Include larger output (up to 100k chars)
  • max_retries (default: 0) - Auto-retry on transient failures

Auto-Review Workflow

When gemini_agent_task modifies files, it returns PENDING_REVIEW status. You must call gemini_agent_approve to finalize changes:

{ "session_id": "abc-123", "approved": true }

Verified Capabilities

  • βœ… File system: Create, read, write, delete files within workspace
  • βœ… Shell commands: npm, node, git, pytest, build tools
  • βœ… Google Search: Live web search for docs, APIs, solutions
  • βœ… Security sandbox: Path traversal blocked, stays within workspace

⚑ Performance Features

  • Response Caching: Caches Gemini responses (TTL 30m) to save time on repeated queries.
  • Token Tracking: Real-time cost estimation per session.

πŸ—οΈ Architecture

graph TD
    User[User / Claude Code] --> Orchestrator
    Orchestrator -->|Complexity Check| Router
    Router -->|Reasoning & Sign-off| Claude[Claude Code - Supervisor]
    Router -->|Reading & Execution| Gemini[Gemini CLI - Worker]
    Claude -->|Delegate Task| MCPServer[Gemini MCP Server]
    MCPServer -->|Execute| Gemini
Loading

βš™οΈ Configuration

The npm run setup script creates a .env file for you. You can also configure manually:

# .env file
GEMINI_AGENT_MODE=true        # Enable autonomous agent capabilities
OPENROUTER_API_KEY=sk-...     # Optional: For 400+ extra models
GEMINI_API_KEY=...            # Optional: If not using OAuth

System-Wide Install: If you want to use the Gemini tools in any Claude Code project, the setup script registers the server globally in your ~/.claude.json.


πŸ“‹ Claude System Instructions (Required)

For the Context Arbitrage pattern to work, Claude needs instructions on when and how to delegate to Gemini. This repo includes a CLAUDE.md file with these rules.

Setup: Copy to System-Wide Location

Copy the contents of CLAUDE.md from this repo to your global Claude instructions file:

Location: ~/.claude/CLAUDE.md

  • Windows: C:\Users\<YourUsername>\.claude\CLAUDE.md
  • macOS/Linux: ~/.claude/CLAUDE.md
# Create the directory if it doesn't exist
mkdir -p ~/.claude

# Copy the file (adjust source path as needed)
cp CLAUDE.md ~/.claude/CLAUDE.md

⚠️ Important: If you already have a ~/.claude/CLAUDE.md, merge the contents rather than overwriting. The key sections are the "Decision Matrix" and "Workflow" rules that tell Claude to delegate to gemini_agent_task.

What It Does

The CLAUDE.md file instructs Claude to:

  • Delegate heavy reading to Gemini (FREE) instead of using its own Read/Search tools ($$)
  • Use gemini_agent_task for coding, testing, R&D, and documentation
  • Only act directly for final approvals, git commits, and single tiny fixes
  • Review Gemini's output before accepting it

This is what enables the ~90% cost savings. Without these instructions, Claude will do everything itself at full price.


🀝 Contributing

  1. Fork the repo
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“„ License

Distributed under the MIT License. See LICENSE for more information.


Disclaimer: This tool is designed for local development use. Enabling GEMINI_AGENT_MODE allows the AI to execute shell commands and modify files. Always review actions in critical environments.

About

Multi-agent CLI orchestrator combining Claude Code, Gemini CLI, and OpenRouter.

Topics

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors