Add Per-Model Provider Routing Overrides #146

@coniferconcepts

Problem

Mirrowel currently has:

  • provider model inventories
  • global routing weights
  • model_aliases.*.default_provider

That is not enough to express model-specific routing constraints when provider availability differs by model.

Current examples:

  • nemotron-3-super should route only to Ollama Cloud
  • qwen3.5 should route only among Ollama Cloud and Chutes
  • GO should not be considered for qwen3.5

The current config can suggest a default provider, but it cannot enforce:

  • provider exclusion per model
  • per-model custom weights
  • strict single-provider routes

Proposed Design

Add a routing.model_overrides section, and resolve weighted-router/<model> requests to a concrete provider before provider lock-in in RotatingClient.

Precedence (highest first):

  1. routing.model_overrides[model]
  2. model_aliases[model].default_provider
  3. global routing.weights
  4. fallback tier rules
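
The precedence order above could be sketched as a small lookup that reports which config layer decides the route. This is only an illustration: the function name resolve_selection_source, the returned labels, and the dict-shaped config are assumptions, not existing Mirrowel code.

```python
def resolve_selection_source(model: str, config: dict) -> str:
    """Return the config layer that decides routing for `model`.

    Hypothetical helper: layers are checked in the precedence order
    listed in this issue, highest first.
    """
    routing = config.get("routing", {})

    # 1. routing.model_overrides[model]
    if model in routing.get("model_overrides", {}):
        return "model_override"

    # 2. model_aliases[model].default_provider
    alias = config.get("model_aliases", {}).get(model, {})
    if alias.get("default_provider"):
        return "alias_default_provider"

    # 3. global routing.weights
    if routing.get("weights"):
        return "global_weights"

    # 4. fallback tier rules
    return "fallback_tier"
```

Returning a label rather than a provider keeps this step separate from provider selection, and the same label could feed the selection_source field in the route log.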

Proposed Config

routing:
  model_overrides:
    nemotron-3-super:
      strategy: "single"
      primary: ollama
      allowed_providers: [ollama]
      fallback_providers: []
      strict: true
      log_decision: true
      allow_global_fallback: false
      reason: "Only available on Ollama Cloud"

    qwen3.5:
      strategy: "weighted"
      allowed_providers: [ollama, chutes]
      weights:
        ollama: 0.85
        chutes: 0.15
      fallback_providers: [zen]
      strict: true
      log_decision: true
      allow_global_fallback: false
      reason: "Shared between Ollama Cloud and Chutes; exclude GO"
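
A minimal sketch of how one entry of this config might be parsed and validated at load time. The ModelOverride dataclass, the parse_override function, the specific validation rules, and the known_providers set are all assumptions made for illustration; the real routing_policy.py may look different.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ModelOverride:
    """Parsed form of one routing.model_overrides entry (hypothetical)."""
    strategy: str
    allowed_providers: List[str]
    primary: Optional[str] = None
    weights: Dict[str, float] = field(default_factory=dict)
    fallback_providers: List[str] = field(default_factory=list)
    strict: bool = True


def parse_override(model: str, raw: dict, known_providers: Set[str]) -> ModelOverride:
    """Validate one override entry; raise ValueError on bad config."""
    strategy = raw.get("strategy")
    if strategy not in ("single", "weighted"):
        raise ValueError(f"{model}: unknown strategy {strategy!r}")

    allowed = list(raw.get("allowed_providers", []))
    if not allowed:
        raise ValueError(f"{model}: allowed_providers must not be empty")

    fallbacks = list(raw.get("fallback_providers", []))
    # Assumed rule: every referenced provider must exist in the inventory.
    unknown = (set(allowed) | set(fallbacks)) - known_providers
    if unknown:
        raise ValueError(f"{model}: unknown providers {sorted(unknown)}")

    primary = raw.get("primary")
    weights = dict(raw.get("weights", {}))
    if strategy == "single" and primary not in allowed:
        raise ValueError(f"{model}: primary must be one of allowed_providers")
    if strategy == "weighted":
        # Assumed rule: weights may only name allowed providers.
        if not weights or set(weights) - set(allowed):
            raise ValueError(f"{model}: weights must cover only allowed_providers")
        if any(w < 0 for w in weights.values()) or sum(weights.values()) <= 0:
            raise ValueError(f"{model}: weights must be non-negative with a positive sum")

    return ModelOverride(
        strategy=strategy,
        allowed_providers=allowed,
        primary=primary,
        weights=weights,
        fallback_providers=fallbacks,
        strict=bool(raw.get("strict", True)),
    )
```

Raising at parse time is one way to meet the "invalid override config fails at load time" criterion, since a bad entry stops startup instead of surfacing on the first request.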

Implementation Touch Points

  • add mirrowel-src/src/rotator_library/routing_policy.py
  • update mirrowel-src/src/rotator_library/client.py
    • rewrite the model ID before provider lock-in at client.py:1186
    • integrate near existing model-id resolution at client.py:1236

v1 Scope

  • parse and validate routing.model_overrides
  • implement the single strategy
  • pin nemotron-3-super to Ollama only
  • add route-decision logging
  • add tests

v2 Scope

  • implement the weighted strategy
  • add qwen3.5 override
  • optionally add a sequential strategy

Acceptance Criteria

  • weighted-router/nemotron-3-super always rewrites to ollama/nemotron-3-super
  • invalid override config fails at load time
  • weighted-router/qwen3.5 only selects ollama or chutes
  • route decisions are visible in logs
  • existing provider-specific retry/credential behavior remains unchanged
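
The first and third criteria can be illustrated with a rough sketch of the rewrite step. The function rewrite_model, the ROUTER_PREFIX constant, and the plain-dict override shape are assumptions; the weighted branch uses the standard library's random.Random.choices, which may not match the router's actual selection logic.

```python
import random

ROUTER_PREFIX = "weighted-router/"  # prefix taken from the criteria above


def rewrite_model(requested: str, overrides: dict, rng: random.Random) -> str:
    """Rewrite 'weighted-router/<model>' to '<provider>/<model>' (hypothetical)."""
    if not requested.startswith(ROUTER_PREFIX):
        return requested  # already provider-qualified; leave untouched

    model = requested[len(ROUTER_PREFIX):]
    override = overrides.get(model)
    if override is None:
        return requested  # no override; lower-precedence layers decide

    if override["strategy"] == "single":
        provider = override["primary"]
    else:
        # Weighted draw restricted to allowed_providers, so an excluded
        # provider can never be selected.
        allowed = override["allowed_providers"]
        weights = [override["weights"].get(p, 0.0) for p in allowed]
        provider = rng.choices(allowed, weights=weights, k=1)[0]

    return f"{provider}/{model}"
```

Passing the random.Random instance in explicitly lets tests seed it, which makes the weighted path deterministic under test.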

Suggested Tests

  • single-provider rewrite for nemotron-3-super
  • validation failure for an invalid provider or model
  • no-fallback single route fails fast
  • weighted qwen3.5 excludes GO
  • route log contains selection_source and override_applied
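
The last test could check a structured log record like the one sketched here. Only selection_source and override_applied come from this issue; the logger name route_decision, the function log_route_decision, and the other fields are assumptions.

```python
import json
import logging

# Hypothetical logger name; the real module may use a different one.
logger = logging.getLogger("route_decision")


def log_route_decision(model: str, provider: str,
                       selection_source: str, override_applied: bool) -> dict:
    """Emit one JSON log line describing a routing decision and return it."""
    payload = {
        "model": model,
        "provider": provider,
        "selection_source": selection_source,
        "override_applied": override_applied,
    }
    # sort_keys gives a stable field order, which is easier to grep and diff.
    logger.info(json.dumps(payload, sort_keys=True))
    return payload
```

Emitting JSON lets a test parse the record and assert on individual fields rather than match a formatted string.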
