feat(api): AIN-116 Piece #1 · provider_model_name column + routing fallback by hizrianraz · Pull Request #44 · ainfera-ai/api

hizrianraz · 2026-05-18T15:36:08Z

Summary

Schema prerequisite for AIN-116 v1.0.1 catalog mapping. Adds a nullable provider_model_name TEXT column on models and updates the routing layer to prefer it over slug when set.

Why

Ainfera slug is the canonical-for-us identifier (e.g., claude-opus-4-7). The provider's API expects its own canonical name (e.g., Anthropic's claude-opus-4-20251110). Currently routing.dispatch_inference passes slug directly to upstream, which works for the AAMC 5 (where slug equals the upstream name) but blocks the 43 deactivated non-AAMC slugs whose names diverge.

Audit chain integrity

Per the "no stealth substitution" rule, audit events still record model_slug (Ainfera canonical). This change only affects what we send upstream.
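A minimal sketch of the fallback and the audit rule, for illustration only. The `Model` dataclass, the two helper functions, and the `inference.dispatched` event name are hypothetical; only the `provider_model_name or slug` expression and the `model_slug` audit field come from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    # Hypothetical stand-in for a row of the models table.
    slug: str                                  # Ainfera canonical identifier
    provider_model_name: Optional[str] = None  # provider's name; NULL until backfilled


def resolve_upstream_model(model: Model) -> str:
    """Name sent upstream: provider_model_name when set, otherwise the slug."""
    # `or` also falls back on an empty string, not only on None.
    return model.provider_model_name or model.slug


def build_audit_event(model: Model) -> dict[str, str]:
    """Audit events always record the Ainfera slug, never the upstream name."""
    # The event name is an assumption made for this sketch.
    return {"event": "inference.dispatched", "model_slug": model.slug}


mapped = Model(slug="claude-opus-4-7", provider_model_name="claude-opus-4-20251110")
unmapped = Model(slug="claude-opus-4-7")
print(resolve_upstream_model(mapped))
print(resolve_upstream_model(unmapped))
print(build_audit_event(mapped))
```

Keeping the resolution in one small function means the adapter call site is the only place where the upstream name appears; everything else keeps working with the slug.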

Deferred (separate sub-tickets needed)

Piece #2: Fix 4 provider base_urls (groq/fireworks/deepinfra/novita). Requires prod DB UPDATE on providers.
Piece #3: Backfill provider_model_name for the 43 deactivated rows. Requires per-provider /v1/models research + sacrificial test key + T9 re-fanout.

Both pieces fall under the Discipline #6 corollary (prod DB writes). File them as sub-tickets for explicit founder auth.

Test plan

ruff + ruff-format + mypy --strict + pytest (pre-commit): all green
After deploy + migration apply: existing AAMC 5 calls still work (slug fallback)
After Piece #3 backfill: previously-deactivated rows return 200 when re-enabled

🤖 Generated with Claude Code

…llback

Ainfera `slug` is the canonical-for-us identifier (e.g., `claude-opus-4-7`). The provider's API expects its own canonical name (e.g., Anthropic's `claude-opus-4-20251110`). Currently routing.dispatch_inference passes `slug` directly to upstream, which works for AAMC 5 (slug == upstream name) but blocks the 43 deactivated non-AAMC slugs whose names diverge.

## Schema (migration 0016)

- `models.provider_model_name TEXT NULL`
- Forward-compatible add-column (no backfill)

## Routing fallback

`upstream_model = model.provider_model_name or model_slug`

- NULL = adapter receives Ainfera slug as-is (current AAMC 5 behavior preserved)
- Populated = adapter receives the provider's canonical name

## Audit chain integrity

Per "no stealth substitution" rule, audit events still record `model_slug` (Ainfera canonical). This change only affects what we send upstream.

## Deferred to follow-up

- Piece #2: fix 4 provider base_urls (groq/fireworks/deepinfra/novita); requires prod DB UPDATE against `providers` table
- Piece #3: backfill provider_model_name for the 43 deactivated rows; requires per-provider /v1/models research + sacrificial test key access + T9 re-fanout

Both pieces are Discipline #6 corollary territory (prod DB writes); file as separate sub-tickets for explicit founder auth.

Co-Authored-By: Claude <noreply@anthropic.com>
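The add-column step from the commit message can be illustrated with a small sketch. It uses an in-memory SQLite database purely as a stand-in (the production database engine and the migration tooling are not shown in this PR, so both are assumptions), and `apply_migration` is a hypothetical helper name.

```python
import sqlite3


def apply_migration(conn: sqlite3.Connection) -> None:
    # Columns are nullable unless declared NOT NULL, so existing rows
    # simply read back NULL and no backfill is needed.
    conn.execute("ALTER TABLE models ADD COLUMN provider_model_name TEXT")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (slug TEXT PRIMARY KEY)")  # simplified table
conn.execute("INSERT INTO models (slug) VALUES ('claude-opus-4-7')")

apply_migration(conn)

# A pre-existing row has NULL in the new column.
row = conn.execute("SELECT slug, provider_model_name FROM models").fetchone()
print(row)

# The routing fallback expressed in SQL: COALESCE returns the slug while
# provider_model_name is NULL.
upstream = conn.execute(
    "SELECT COALESCE(provider_model_name, slug) FROM models"
).fetchone()[0]
print(upstream)
```

One difference to keep in mind: `COALESCE` falls back only on NULL, while the Python `or` in the routing layer also falls back on an empty string.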

linear-code · 2026-05-18T15:36:12Z

AIN-116 ainfera-auto routing v1.1 + v1.0.1 catalog mapping — provider_model_name, latency, geo, prefs

Update 2026-05-16 19:25 WIB · v1.0.1 prerequisite added

T9 fanout hit the cap-rule (11/54 pass) and 43 new model rows were
deactivated. Two upstream blockers must land in v1.0.1 BEFORE we can
ship the v1.1 routing improvements below:

v1.0.1 (blocker · ~3h)

- provider_model_name TEXT NULL column on models · routing
  layer uses it when set, falls back to slug when NULL. Ainfera
  slug is the canonical-for-us identifier; provider_model_name
  is what we pass to the upstream provider's API. Migration:
  forward-compatible add-column.
- Fix 4 provider base_urls · current rows have URLs that produce
  /v1/v1/chat/completions when concatenated with the OpenAI-compat
  adapter path. Fix:
  - groq base_url → https://api.groq.com/openai (drop trailing /v1)
  - fireworks base_url → https://api.fireworks.ai/inference (drop trailing /v1)
  - deepinfra + novita need custom chat_completions_path (add column
    to providers OR subclass OpenAICompatAdapter)
- Slug-to-provider-name mapping · for each of the 43 deactivated
  rows, research each provider's /v1/models endpoint and populate
  provider_model_name. Re-run fanout. Re-activate passing rows.
  Target ≥ 35 pass before re-enabling.
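The doubled path in the base_url item can be reproduced with a short sketch. The adapter path constant is inferred from the /v1/v1/chat/completions symptom, and `build_url` is a hypothetical helper, not the actual OpenAICompatAdapter code.

```python
# Assumed adapter path, inferred from the doubled URL described in the ticket.
OPENAI_COMPAT_PATH = "/v1/chat/completions"


def build_url(base_url: str, path: str = OPENAI_COMPAT_PATH) -> str:
    """Join a provider base_url with the adapter path by plain concatenation."""
    return base_url.rstrip("/") + path


def has_doubled_version(url: str) -> bool:
    """Detect the /v1/v1/ symptom that a base_url ending in /v1 produces."""
    return "/v1/v1/" in url


before = build_url("https://api.groq.com/openai/v1")  # base_url with trailing /v1
after = build_url("https://api.groq.com/openai")      # base_url proposed in the ticket

print(before, has_doubled_version(before))
print(after, has_doubled_version(after))
```

A check like `has_doubled_version` could run over every providers row before a fanout, so a bad base_url is caught without spending a test key.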

See .launch-snapshots/T9-FANOUT-INCIDENT.md for the full incident
write-up + the 11 currently-active slugs.

v1.1 (after v1.0.1 lands)

- latency_budget_ms axis: when caller passes
  routing_hint.latency_budget_ms, prefer Groq tier (sub-50ms p50)
  and other fast-capable models. Hard cap on AA Index loss vs the
  unconstrained pick.
- Geographic stack hints: routing_hint.stack accepts
  china, eu, sea, us, avoid-china. Maps the same way
  /v1/models?stack= does today.
- Smarter cost estimation: current default assumes
  expected_output_ratio = 0.5. Learn the per-model ratio from the
  last N receipts and use that to estimate. Fall back to 0.5 when
  fewer than N samples.
- Per-tenant model preferences: new fields on
  tenants.spend_policy:
  - banned_providers: list[str]
  - preferred_providers: list[str]
  - banned_slugs: list[str]
- Per-call routing telemetry: emit inference.routed audit
  events with decision_log (top-3 candidates that survived each
  filter step).
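The cost-estimation item could look roughly like the sketch below. It assumes expected_output_ratio means output tokens divided by input tokens, that a receipt carries both token counts, and that N is a configurable window; none of these are specified in the ticket, and `OutputRatioEstimator` is a hypothetical name.

```python
from collections import deque
from typing import Deque, Dict

DEFAULT_RATIO = 0.5  # current default from the ticket


class OutputRatioEstimator:
    """Rolling per-model mean of output/input token ratios over the last N receipts."""

    def __init__(self, window: int = 50) -> None:
        self.window = window  # N; the real value is not given in the ticket
        self._ratios: Dict[str, Deque[float]] = {}

    def record(self, slug: str, input_tokens: int, output_tokens: int) -> None:
        if input_tokens <= 0:
            return  # skip receipts that cannot yield a ratio
        samples = self._ratios.setdefault(slug, deque(maxlen=self.window))
        samples.append(output_tokens / input_tokens)

    def expected_output_ratio(self, slug: str) -> float:
        samples = self._ratios.get(slug)
        # Fall back to the default until a full window of N samples exists.
        if samples is None or len(samples) < self.window:
            return DEFAULT_RATIO
        return sum(samples) / len(samples)


estimator = OutputRatioEstimator(window=3)
estimator.record("claude-opus-4-7", input_tokens=100, output_tokens=20)
print(estimator.expected_output_ratio("claude-opus-4-7"))  # still the default
estimator.record("claude-opus-4-7", input_tokens=100, output_tokens=40)
estimator.record("claude-opus-4-7", input_tokens=100, output_tokens=30)
print(estimator.expected_output_ratio("claude-opus-4-7"))  # learned mean
```

The bounded deque drops the oldest receipt automatically, so the estimate tracks recent traffic without a separate pruning step.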

Out of scope (v1.2+)

- Streaming-aware routing (drop non-streaming-capable models when
  caller requests stream)
- Multi-region routing (preference for low-RTT provider POPs)
- Cost-vs-quality Pareto frontier exposed at /v1/models/explain

Acceptance after v1.0.1 + v1.1

- /v1/models | jq 'length' returns ≥ 40 (post-mapping re-activation)
- routing_hint.latency_budget_ms=500 routes through Tier 4 Groq if any model qualifies
- routing_hint.stack=china routes Novita-only
- Banned providers in tenant policy are never selected (test with synthetic tenant)
- Decision-log audit event present on every ainfera-auto call
- Backward compatible: existing model=ainfera-auto calls with no hint still work the same as v1.0
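The banned-provider and decision-log criteria could be exercised with a sketch like the one below. The three policy field names come from the ticket; the `Candidate` and `SpendPolicy` classes, the `apply_policy` function, the step names, and the shape of each decision_log entry are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Candidate:
    slug: str
    provider: str


@dataclass
class SpendPolicy:
    # Field names follow the ticket; everything else here is hypothetical.
    banned_providers: list[str] = field(default_factory=list)
    preferred_providers: list[str] = field(default_factory=list)
    banned_slugs: list[str] = field(default_factory=list)


def apply_policy(candidates: list[Candidate], policy: SpendPolicy):
    """Filter candidates by tenant policy and log the top-3 survivors per step."""
    decision_log = []

    def log(step: str, survivors: list[Candidate]) -> None:
        decision_log.append({"step": step, "top": [c.slug for c in survivors[:3]]})

    survivors = [c for c in candidates if c.provider not in policy.banned_providers]
    log("banned_providers", survivors)

    survivors = [c for c in survivors if c.slug not in policy.banned_slugs]
    log("banned_slugs", survivors)

    # Stable sort: preferred providers move to the front, order otherwise kept.
    survivors = sorted(survivors, key=lambda c: c.provider not in policy.preferred_providers)
    log("preferred_providers", survivors)

    return survivors, decision_log


candidates = [
    Candidate("m1", "groq"),
    Candidate("m2", "novita"),
    Candidate("m3", "fireworks"),
    Candidate("m4", "groq"),
]
policy = SpendPolicy(
    banned_providers=["novita"],
    preferred_providers=["fireworks"],
    banned_slugs=["m4"],
)
picked, decision_log = apply_policy(candidates, policy)
print([c.slug for c in picked])
print(decision_log)
```

A synthetic-tenant test can then assert that no picked candidate belongs to a banned provider and that the log has one entry per filter step.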



hizrianraz merged commit 14999e2 into main May 18, 2026
3 checks passed

hizrianraz deleted the feat/ain-116-piece-1-provider-model-name-column branch May 18, 2026 15:37
