feat: Add Google Cloud Vertex AI provider support #22

Open

itdove wants to merge 18 commits into LobsterTrap:midstream from itdove:vertex-claude

Conversation


itdove commented Apr 7, 2026

Summary

Adds complete support for Google Cloud Vertex AI as an inference provider, enabling OpenShell sandboxes to use Claude models via GCP Vertex AI with OAuth authentication.

This implementation includes full end-to-end testing and supports both direct Claude CLI usage and inference routing via inference.local.

Features

Vertex AI Provider

  • Provider discovery: Auto-discovers Vertex AI credentials from environment
  • OAuth token generation: Generates tokens from GCP Application Default Credentials
  • Credential injection: Injects actual values (not placeholders) for CLI tool compatibility
  • Region support: Configurable region via ANTHROPIC_VERTEX_REGION (defaults to us-central1)
  • Auto-configuration: Sets CLAUDE_CODE_USE_VERTEX=1 automatically

Inference Routing

  • URL construction: Builds Vertex-specific URLs with project ID, region, and :streamRawPredict endpoint
  • Model field handling: Removes model from request body (Vertex expects it in URL path)
  • Bearer auth: Uses OAuth tokens as Bearer tokens
  • API version: Uses vertex-2023-10-16 Anthropic API version
  • Model ID format: Supports @ separator (e.g., claude-sonnet-4-5@20250929)
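As a rough sketch of how these pieces fit together (names and types are illustrative, not the actual crate API), the routing layer attaches the OAuth token as a Bearer header and pins the Vertex-specific Anthropic API version in the request body:

```rust
use std::collections::BTreeMap;

// Illustrative sketch only: shapes an outbound Vertex AI request by adding
// Bearer auth headers and the Vertex-specific Anthropic API version.
// The body is modeled as a simple map rather than full JSON.
fn shape_request(
    oauth_token: &str,
    mut body: BTreeMap<String, String>,
) -> (Vec<(String, String)>, BTreeMap<String, String>) {
    let headers = vec![
        ("authorization".to_string(), format!("Bearer {oauth_token}")),
        ("content-type".to_string(), "application/json".to_string()),
    ];
    // Vertex AI expects this exact Anthropic API version string.
    body.insert("anthropic_version".into(), "vertex-2023-10-16".into());
    (headers, body)
}

fn main() {
    let (headers, body) = shape_request("ya29.example-token", BTreeMap::new());
    assert_eq!(headers[0].1, "Bearer ya29.example-token");
    assert_eq!(body["anthropic_version"], "vertex-2023-10-16");
}
```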

Direct Credential Injection

  • Selective injection: Credentials needed by CLI tools are injected as actual environment variables
  • Vertex credentials: ANTHROPIC_VERTEX_PROJECT_ID, VERTEX_OAUTH_TOKEN, CLAUDE_CODE_USE_VERTEX, ANTHROPIC_VERTEX_REGION
  • Security: Only credentials essential for CLI tool compatibility are directly injected
  • HTTP proxy resolution: Other credentials continue using openshell:resolve:env:* placeholders
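A minimal sketch of the selective-injection decision (the allowlist and helper name are illustrative, not the exact `secrets.rs` API): credentials on the allowlist keep their real values, while everything else becomes a proxy-resolved placeholder:

```rust
use std::collections::BTreeMap;

// Hypothetical allowlist: credentials the `claude` CLI must read directly
// from the environment rather than through the HTTP proxy.
const DIRECT_INJECT: &[&str] = &[
    "ANTHROPIC_VERTEX_PROJECT_ID",
    "VERTEX_OAUTH_TOKEN",
    "CLAUDE_CODE_USE_VERTEX",
    "ANTHROPIC_VERTEX_REGION",
];

/// Map each credential either to its real value (direct injection) or to a
/// placeholder that the HTTP proxy resolves per-request.
fn inject(creds: &BTreeMap<&str, &str>) -> BTreeMap<String, String> {
    creds
        .iter()
        .map(|(name, value)| {
            let injected = if DIRECT_INJECT.contains(name) {
                value.to_string()
            } else {
                format!("openshell:resolve:env:{name}")
            };
            (name.to_string(), injected)
        })
        .collect()
}

fn main() {
    let mut creds = BTreeMap::new();
    creds.insert("ANTHROPIC_VERTEX_PROJECT_ID", "my-project");
    creds.insert("SOME_API_KEY", "secret");
    let env = inject(&creds);
    assert_eq!(env["ANTHROPIC_VERTEX_PROJECT_ID"], "my-project");
    assert_eq!(env["SOME_API_KEY"], "openshell:resolve:env:SOME_API_KEY");
}
```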

Network Policy Support

  • Custom policies: Sandboxes require network policy allowing Google Cloud endpoints
  • OAuth endpoints: oauth2.googleapis.com, accounts.google.com
  • Vertex AI endpoints: Regional Vertex AI endpoints (*-aiplatform.googleapis.com)
  • Inference routing: inference.local endpoint for privacy-aware routing
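To illustrate the `*-aiplatform.googleapis.com` pattern above, here is a hedged sketch of hostname matching against an allowlist; the assumption that a leading `*` matches any non-empty prefix is mine, not taken from the actual policy engine:

```rust
/// Sketch: does `host` match any allow pattern? A leading `*` is assumed
/// to match any non-empty prefix (e.g. the region in a Vertex endpoint).
fn host_allowed(host: &str, patterns: &[&str]) -> bool {
    patterns.iter().any(|p| {
        if let Some(suffix) = p.strip_prefix('*') {
            host.ends_with(suffix) && host.len() > suffix.len()
        } else {
            host == *p
        }
    })
}

fn main() {
    // Endpoints named in the example policy for Vertex AI sandboxes.
    let allow = [
        "oauth2.googleapis.com",
        "accounts.google.com",
        "*-aiplatform.googleapis.com",
        "inference.local",
    ];
    assert!(host_allowed("us-east5-aiplatform.googleapis.com", &allow));
    assert!(host_allowed("oauth2.googleapis.com", &allow));
    assert!(!host_allowed("example.com", &allow));
}
```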

Changes

Core Implementation

  • crates/openshell-providers/src/providers/vertex.rs - Vertex AI provider plugin with OAuth generation
  • crates/openshell-core/src/inference.rs - VERTEX_PROFILE with Bearer auth and vertex API version
  • crates/openshell-server/src/inference.rs - Vertex URL construction with project ID and region
  • crates/openshell-router/src/backend.rs - Critical fix: Removes model field from request body for Vertex AI
  • crates/openshell-sandbox/src/secrets.rs - Direct credential injection for CLI compatibility
  • crates/openshell-providers/Cargo.toml - Add gcp_auth dependency
  • crates/openshell-providers/src/lib.rs - Register vertex provider
  • crates/openshell-cli/src/main.rs - Add Vertex to provider type enum

Examples

  • examples/vertex-ai/sandbox-policy.yaml - New: Network policy for Vertex AI endpoints
  • examples/vertex-ai/README.md - New: Quick start guide with documentation references

Development Improvements

  • tasks/scripts/cluster-deploy-fast.sh - Bash 3 compatibility fix (replaces mapfile)
  • scripts/rebuild-cluster.sh - New: Quick rebuild script for development workflow
  • scripts/setup-podman-macos.sh - Increase default memory from 8 GB to 12 GB for better build performance
  • cleanup-openshell-podman-macos.sh - Improved cleanup with sandbox deletion

Documentation

  • docs/sandboxes/manage-providers.md - Updated Vertex provider documentation, removed OAuth limitation note
  • docs/inference/configure.md - Updated Vertex AI setup guide with OAuth token generation
  • docs/get-started/install-podman-macos.md - Added rebuild/cleanup workflow documentation
  • CONTRIBUTING.md - Added development rebuild workflow

Usage

Prerequisites

# Install Google Cloud SDK
brew install google-cloud-sdk

# Configure Application Default Credentials
gcloud auth application-default login

# Set project ID and region
export ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
export ANTHROPIC_VERTEX_REGION=us-east5  # Optional, defaults to us-central1

Quick Start

# Create provider
openshell provider create --name vertex --type vertex --from-existing

# Create sandbox with Vertex AI
openshell sandbox create --name vertex-test --provider vertex \
  --upload ~/.config/gcloud/:.config/gcloud/ \
  --policy examples/vertex-ai/sandbox-policy.yaml

# Use Claude CLI (automatically uses Vertex AI)
claude

Inference Routing (Optional)

# Configure inference routing
openshell inference set --provider vertex --model claude-sonnet-4-5@20250929 --no-verify

# Test inside sandbox
curl -X POST https://inference.local/v1/messages \
  -H "content-type: application/json" \
  -d '{
    "anthropic_version": "vertex-2023-10-16",
    "max_tokens": 32,
    "messages": [{"role": "user", "content": "Say hello"}]
  }'

Testing

Fully tested end-to-end on macOS with:

  • Podman Machine (12 GB RAM)
  • GCP project with Vertex AI Claude models enabled
  • Application Default Credentials configured
  • Provider creation and OAuth token generation
  • Direct Claude CLI usage in sandboxes
  • Inference routing via inference.local
  • Network policy enforcement
  • All regional Vertex AI endpoints

Key Test Results:

  • ✅ OAuth token generation from ADC
  • ✅ Credential injection into sandboxes
  • ✅ Claude CLI auto-detects Vertex AI
  • ✅ Inference routing removes model field correctly
  • ✅ Vertex API responds with successful completions

Technical Details

Router Fix (Critical)

The router was incorrectly inserting the model field into request bodies for all providers. Vertex AI's :streamRawPredict endpoint expects the model in the URL path, not the request body, causing "Extra inputs are not permitted" errors.

Fix: Router now detects Vertex AI endpoints (aiplatform.googleapis.com) and removes the model field from the request body while keeping it in the URL path.
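The fix can be sketched roughly as follows (the body is modeled as a simple map rather than full JSON, and the function name is illustrative): strip `model` from the body only when the backend is a Vertex AI endpoint.

```rust
use std::collections::BTreeMap;

// Illustrative sketch of the router fix: Vertex AI carries the model in the
// URL path, so an in-body `model` field must be removed for those backends.
fn prepare_body(
    backend_url: &str,
    mut body: BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    if backend_url.contains("aiplatform.googleapis.com") {
        // Leaving this in triggers "Extra inputs are not permitted".
        body.remove("model");
    }
    body
}

fn main() {
    let mut body = BTreeMap::new();
    body.insert("model".to_string(), "claude-sonnet-4-5@20250929".to_string());

    // Vertex backend: model field is stripped.
    let vertex = prepare_body(
        "https://us-east5-aiplatform.googleapis.com/v1/projects/p/...",
        body.clone(),
    );
    assert!(!vertex.contains_key("model"));

    // Non-Vertex backend: body passes through unchanged.
    let other = prepare_body("https://api.anthropic.com/v1/messages", body);
    assert!(other.contains_key("model"));
}
```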

Credential Flow

  1. User configures GCP Application Default Credentials
  2. Provider plugin generates OAuth token from ADC at creation time
  3. Credentials are stored in gateway database
  4. When creating sandboxes, credentials are injected as actual environment variables
  5. CLI tools (claude) automatically detect Vertex AI via CLAUDE_CODE_USE_VERTEX=1
  6. OAuth tokens are refreshed from the uploaded ~/.config/gcloud/ directory

URL Structure

Vertex AI requests go to:

https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/publishers/anthropic/models/{model}:streamRawPredict

The router constructs this URL and removes the model field from the JSON body.
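The URL template above can be sketched as a small formatting function (illustrative name, not the actual `build_backend_url` signature); note the model ID keeps its `@` version separator in the path:

```rust
/// Sketch of constructing the Vertex `:streamRawPredict` URL from project,
/// region, and a model ID using the `@` version separator.
fn vertex_url(project: &str, region: &str, model: &str) -> String {
    format!(
        "https://{region}-aiplatform.googleapis.com/v1/projects/{project}\
         /locations/{region}/publishers/anthropic/models/{model}:streamRawPredict"
    )
}

fn main() {
    let url = vertex_url("my-project", "us-east5", "claude-sonnet-4-5@20250929");
    assert!(url.starts_with("https://us-east5-aiplatform.googleapis.com/"));
    assert!(url.contains("/projects/my-project/locations/us-east5/"));
    assert!(url.ends_with("claude-sonnet-4-5@20250929:streamRawPredict"));
}
```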

Development Workflow

Rebuilding After Changes

# Quick rebuild for testing code changes
bash scripts/rebuild-cluster.sh

# Recreate provider and sandbox after rebuild
openshell provider create --name vertex --type vertex --from-existing
openshell sandbox create --name test --provider vertex \
  --upload ~/.config/gcloud/:.config/gcloud/ \
  --policy examples/vertex-ai/sandbox-policy.yaml

Breaking Changes

None. All changes are additive.

Related Issues

Addresses the need for Vertex AI provider support for users who:

  • Need to use Claude via GCP Vertex AI for billing/compliance
  • Want to use existing GCP credentials and infrastructure
  • Require OAuth-based authentication instead of API keys
  • Work in organizations with GCP-only AI policies

Checklist

  • Code follows project style guidelines
  • Tests pass (cargo check and cargo test succeed)
  • End-to-end testing completed successfully
  • Documentation updated
  • Example policy and README added
  • Commit messages follow Conventional Commits format
  • No secrets or credentials committed
  • Router fix verified with live Vertex AI endpoints
  • Network policy tested and working
  • OAuth token generation tested

itdove added 10 commits April 6, 2026 17:20
- Add vertex provider plugin with ANTHROPIC_VERTEX_PROJECT_ID credential
- Add vertex inference profile with Anthropic-compatible protocols
- Register vertex in provider registry and CLI
- Add vertex to supported inference provider types
- Fix scripts/podman.env to use correct env var names for local registry
- Update docs for simplified CLI install workflow

Known limitation: GCP OAuth authentication not yet implemented.
Vertex provider can be created and configured but API calls will fail
until OAuth token generation is added.
- Note that mise run cluster:build:full builds AND starts the gateway
- Add verification step after build completes
- Clarify that gateway is already running before sandbox creation
- Add vertex to supported provider types table in manage-providers.md
- Add Vertex AI provider tab in inference configuration docs
- Clarify two usage modes: direct API calls vs inference.local routing
- Document prerequisites (GCP project, Application Default Credentials)
- Note OAuth limitation only affects inference routing, not direct calls
- Keep Vertex docs in provider/inference pages, not installation guides
- Add gcp_auth dependency for OAuth token generation
- Generate OAuth tokens from Application Default Credentials in vertex provider
- Store tokens as VERTEX_OAUTH_TOKEN credential for router authentication
- Update inference profile to use Bearer auth with OAuth tokens
- Construct Vertex-specific URLs with :streamRawPredict endpoint
- Support project ID from credentials for URL construction
- Add model parameter to build_backend_url for Vertex routing
Avoid tokio runtime nesting panic by spawning OAuth token generation
in a separate OS thread with its own runtime. This allows provider
discovery to work when called from within an existing tokio context.
…r ordering

- Delete all sandboxes before destroying gateway
- Explicitly stop and remove cluster and registry containers by name
- Remove images by specific tags (localhost/openshell/*)
- Run cargo clean for build artifacts
- Add reinstall instructions to completion message
- Better error handling with 2>/dev/null redirects
…iables

Add selective direct injection for provider credentials that need to be
accessible as real environment variables (not placeholders). This allows
tools like `claude` CLI to read Vertex AI credentials directly.

Changes:
- Add direct_inject_credentials() list for credentials requiring direct access
- Modify from_provider_env() to support selective direct injection
- Inject ANTHROPIC_VERTEX_PROJECT_ID, VERTEX_OAUTH_TOKEN, and
  ANTHROPIC_VERTEX_REGION as actual values instead of placeholders
- Other credentials continue using openshell:resolve:env:* placeholders
  for HTTP proxy resolution

Security note: Directly injected credentials are visible via /proc/*/environ,
unlike placeholder-based credentials which are only resolved within HTTP
requests. Only credentials essential for CLI tool compatibility are included.
- Add CLAUDE_CODE_USE_VERTEX to direct injection list
- Automatically set CLAUDE_CODE_USE_VERTEX=1 in Vertex provider credentials
- Enables claude CLI to auto-detect Vertex AI without manual config

Now sandboxes with Vertex provider will automatically have:
- ANTHROPIC_VERTEX_PROJECT_ID (from env)
- VERTEX_OAUTH_TOKEN (generated from GCP ADC)
- CLAUDE_CODE_USE_VERTEX=1 (auto-set)

The claude CLI can now use Vertex AI with zero manual configuration.
…rmance

- Change Podman machine default memory from 8 GB to 12 GB
- Update documentation to reflect 12 GB default
- Update troubleshooting to suggest 16 GB for build issues

12 GB provides better performance for Rust compilation and reduces
out-of-memory issues during parallel builds.
Replace manual 'cargo build + cp' with 'cargo install --path'
Add verification step with 'openshell gateway info'
Keep correct 'mise run cluster:build:full' command
Vertex AI's :streamRawPredict endpoint expects the model in the URL
path, not in the request body. The router was incorrectly inserting
the model field, causing "Extra inputs are not permitted" errors.

Changes:
- Router now detects Vertex AI endpoints and removes model field
- Added bash 3 compatibility fix for cluster-deploy-fast.sh
- Added scripts/rebuild-cluster.sh for development workflow
- Updated documentation for Vertex AI setup and rebuild process

Fixes inference routing to Vertex AI via inference.local endpoint.

itdove and others added 7 commits April 6, 2026 23:22
Added examples/vertex-ai/ directory with:
- sandbox-policy.yaml: Network policy for Vertex AI endpoints
- README.md: Quick start guide with links to full documentation

Provides ready-to-use policy file for Vertex AI integration.
Podman does not support --push flag in build command like Docker buildx.
This commit fixes two issues:

1. docker-build-image.sh: Filter out --push flag and execute push as
   separate command after build completes

2. docker-publish-multiarch.sh: Use safe array expansion syntax to avoid
   unbound variable errors with set -u when EXTRA_TAGS is empty

Note: Multi-arch builds with Podman still require manual workflow due to
cross-compilation toolchain issues. Use /tmp/build-multiarch-local.sh
for local multi-arch builds with QEMU emulation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…h.sh

Add Podman-specific multi-architecture build logic to complement existing
Docker buildx support. Podman builds each platform sequentially using
manifest lists, while Docker buildx builds in parallel.

Changes:
- Detect Podman and use manifest-based approach for multi-arch builds
- Build each platform (arm64, amd64) separately with explicit TARGETARCH
- Create and push manifest list combining all architectures
- Preserve existing Docker buildx workflow unchanged
- Add informative logging about sequential vs parallel builds

Build times:
- Podman: Sequential builds (~30-40 min on Linux, ~45-60 min on macOS)
- Docker buildx: Parallel builds (~20-30 min)

This enables multi-arch image publishing on systems using Podman as the
container runtime, supporting both Apple Silicon and Intel architectures.
Fix CI formatting check failures:
- Split long .insert() calls across multiple lines
- Reformat MockDiscoveryContext initialization

No functional changes, formatting only.
Remove short-lived OAuth token generation and storage in gateway database.
Tokens are now generated on-demand inside sandboxes from uploaded ADC files.

Changes:
- Remove generate_oauth_token() function and gcp_auth dependency
- Remove VERTEX_OAUTH_TOKEN from direct credential injection
- Remove OAuth token insertion in discover_existing()
- Add unset IMAGE_TAG/TAG_LATEST in podman.env to prevent build conflicts
- Update Cargo.lock to remove gcp_auth dependency tree

Benefits:
- No stale token pollution in database
- Tokens generated fresh on-demand (auto-refresh via ADC)
- Simpler provider creation (synchronous, no async OAuth)
- Reduced dependency footprint (removes 32 packages)
- Better security (tokens not persisted in database)

Token lifecycle:
- Provider stores only ANTHROPIC_VERTEX_PROJECT_ID and region
- Sandboxes require --upload ~/.config/gcloud/ for token generation
- Claude CLI uses gcp_auth to generate/refresh tokens from ADC
- Tokens valid for 1 hour, automatically refreshed via refresh token
- Check for ADC in both GOOGLE_APPLICATION_CREDENTIALS and default location
- Add critical warning about --upload ~/.config/gcloud/ requirement
- Document security model for credential injection strategy
- Add comprehensive troubleshooting section with solutions for:
  - Authentication failures (missing ADC)
  - Project not found errors
  - Region not supported errors