Add a dedicated Chutes sidecar route without changing existing /v1/chat/completions behavior #8

@mondaylord

Summary

We want to integrate Chutes models into vllm-proxy while preserving the current production-stable behavior for existing users.

Instead of modifying the current /v1/chat/completions path, add a dedicated Chutes route:

  • POST /v1/chutes/chat/completions
  • GET /v1/chutes/models

This lets upstream gateways (e.g., Redpill) continue receiving standard OpenAI-style /v1/chat/completions requests and selectively forward Chutes-bound traffic to the new sidecar route.

Goals

  • Keep existing /v1/chat/completions behavior unchanged
  • Minimize coupling with, and risk to, the current production path
  • Add Chutes integration behind an explicit route and a config flag

Proposed behavior

Existing route (unchanged)

POST /v1/chat/completions

  • continues to forward to local vLLM backend as today

New Chutes route

POST /v1/chutes/chat/completions

  • reuses existing E2EE parse/decrypt logic on ingress
  • forwards plaintext request to Chutes OpenAI-compatible endpoint over TLS:
    • ${CHUTES_BASE_URL}/v1/chat/completions
    • with Authorization: Bearer ${CHUTES_API_KEY}
  • reuses existing E2EE encrypt logic on egress
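The ingress/forward/egress flow above could be sketched roughly as follows. This is only an illustration under stated assumptions: `decrypt`, `send`, and `encrypt` are hypothetical callables standing in for the proxy's existing E2EE and HTTP client code, whose real names and signatures are not given in this issue.

```python
import json
import urllib.request
from typing import Callable


def build_chutes_chat_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the upstream POST to the Chutes chat endpoint."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def handle_chutes_chat(
    raw_body: bytes,
    decrypt: Callable[[bytes], dict],                   # hypothetical: existing E2EE parse/decrypt
    send: Callable[[urllib.request.Request], bytes],    # hypothetical: HTTP client call over TLS
    encrypt: Callable[[bytes], bytes],                  # hypothetical: existing E2EE encrypt
    base_url: str,
    api_key: str,
) -> bytes:
    payload = decrypt(raw_body)                         # ingress: ciphertext -> plaintext request
    request = build_chutes_chat_request(base_url, api_key, payload)
    upstream_body = send(request)                       # plaintext travels to Chutes over TLS
    return encrypt(upstream_body)                       # egress: re-encrypt for the client
```

Passing the E2EE steps in as callables keeps the sketch independent of the existing /v1/chat/completions handler, in line with the goal of leaving that path untouched. Streaming responses are not covered here.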

New Chutes models route

GET /v1/chutes/models

  • proxies ${CHUTES_BASE_URL}/v1/models with Chutes auth header
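A minimal sketch of the models proxy call, using only the standard library. The function name `fetch_chutes_models`, the `opener` parameter (there to make the call testable), and the 10-second timeout are assumptions for illustration, not part of the proposal.

```python
import json
import urllib.request


def fetch_chutes_models(base_url: str, api_key: str,
                        opener=urllib.request.urlopen, timeout: float = 10.0) -> dict:
    """GET {base_url}/v1/models with the Chutes auth header and return the parsed JSON."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/models",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # The response object is used as a context manager so the connection is closed.
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read())
```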

Config

  • CHUTES_ENABLED (default false)
  • CHUTES_BASE_URL (default https://llm.chutes.ai)
  • CHUTES_API_KEY (required if enabled)
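One possible way to load and validate these settings is sketched below. The environment variable names and defaults come from the list above. Everything else is an assumption: the `ChutesConfig` class, the set of strings treated as "true", and the OpenAI-style error body.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class ChutesConfig:
    enabled: bool
    base_url: str
    api_key: Optional[str]


def load_chutes_config(env: Mapping[str, str] = os.environ) -> ChutesConfig:
    # Assumption: "1", "true", and "yes" (case-insensitive) enable the route.
    enabled = env.get("CHUTES_ENABLED", "false").strip().lower() in {"1", "true", "yes"}
    base_url = env.get("CHUTES_BASE_URL", "https://llm.chutes.ai").rstrip("/")
    api_key = env.get("CHUTES_API_KEY") or None  # empty string counts as unset
    return ChutesConfig(enabled=enabled, base_url=base_url, api_key=api_key)


def misconfig_response(cfg: ChutesConfig) -> Optional[Tuple[int, dict]]:
    """Return (503, error body) if the route is enabled without an API key, else None."""
    if cfg.enabled and not cfg.api_key:
        # Assumption: the error body follows an OpenAI-style shape; no secrets are included.
        return 503, {"error": {"message": "CHUTES_ENABLED is set but CHUTES_API_KEY is missing",
                               "type": "service_unavailable"}}
    return None
```

How a request to a disabled route should be answered is not specified in this issue, so the sketch leaves that decision to the caller.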

Security note

This is not pure client-to-model cryptographic E2EE.
It is a practical TEE-mediated segmented model:

  • Client ↔ Proxy: existing E2EE
  • Proxy ↔ Chutes: TLS
  • Plaintext is only visible inside trusted runtime boundaries

Acceptance criteria

  1. Existing route behavior remains unchanged
  2. The Chutes route supports both streaming and non-streaming responses
  3. Existing E2EE nonce/replay checks remain in effect
  4. Misconfiguration (e.g., CHUTES_ENABLED is true but CHUTES_API_KEY is missing) returns an explicit 503 error
  5. No API-key leakage in logs
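For criterion 5, one defensive option is a logging filter that scrubs bearer tokens before records are emitted. This is a sketch under assumptions, not the project's logging setup: the class name `RedactBearerFilter` and the regular expression are illustrative, and the filter only catches keys that appear in `Bearer <token>` form.

```python
import logging
import re

# Matches "Bearer <token>" and captures the prefix so only the token is replaced.
_BEARER = re.compile(r"(Bearer\s+)[^\s\"']+", re.IGNORECASE)


class RedactBearerFilter(logging.Filter):
    """Replace bearer tokens in log messages with a fixed placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()            # message with %-args already applied
        redacted = _BEARER.sub(r"\1[REDACTED]", message)
        if redacted != message:
            record.msg = redacted                # store the scrubbed, fully formatted text
            record.args = None                   # args were already merged into the message
        return True                              # never drop the record, only rewrite it
```

The filter could be attached to a handler with `handler.addFilter(RedactBearerFilter())`. It complements, rather than replaces, simply not logging request headers in the first place.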
