chore(api): SP-4 PR-C · dark-host activation scaffold (AIN-248, founder-gated) by hizrianraz · Pull Request #81 · ainfera-ai/api

hizrianraz · 2026-05-24T11:20:56Z

SP-Ω-RUN resurrection of #75 (closed by base-deletion). Rebased onto main post-#80 squash.

Original PR: #75
Source: chore/dark-host-prep

Note

Low Risk
Adds new documentation and a standalone founder-run smoke-test script; no production routing, schema, or data-path changes unless the script/runbook is manually executed.

Overview
Adds a founder-gated dark-host activation runbook describing a phased process (smoke test, ontology decision, parametrized alembic migration template, and post-deploy verification) without applying any migrations.

Adds a Model×Host ontology proposal doc outlining alternative schema approaches (multi-row slugs vs model_hosts junction) to support multi-venue hosting.

Introduces scripts/dark_host_smoke.py, a read-only async CLI that uses existing provider adapters to make two test chat() calls against a specified venue (keys via env) and emits a JSON latency/shape report.
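
A minimal sketch of the two-call pattern described above, not the script's actual code. `FakeAdapter`, `timed_call`, `smoke`, and the `FAKE_API_KEY` variable are hypothetical names introduced for illustration; the real harness goes through the repo's existing provider adapters, whose signatures are not shown on this page.

```python
import asyncio
import json
import os
import time


class FakeAdapter:
    """Hypothetical stand-in for a provider adapter (not the repo's class)."""

    def __init__(self, api_key: str, base_url: str) -> None:
        self.api_key = api_key
        self.base_url = base_url

    async def chat(self, model: str, prompt: str) -> dict:
        await asyncio.sleep(0.01)  # simulate network latency
        return {"model": model, "content": "pong"}


async def timed_call(adapter, model: str, prompt: str) -> dict:
    """Run one chat() call and return a JSON-serializable result dict."""
    start = time.perf_counter()
    try:
        reply = await adapter.chat(model, prompt)
    except Exception as exc:  # report failures as data instead of raising
        return {"ok": False, "error": type(exc).__name__, "detail": str(exc)}
    elapsed_ms = (time.perf_counter() - start) * 1000
    return {"ok": True, "latency_ms": round(elapsed_ms, 2), "keys": sorted(reply)}


async def smoke(venue: str, model: str, key_env: str, base_url: str) -> dict:
    """Make two consecutive calls so cold and warm latency can be compared."""
    api_key = os.environ.get(key_env)  # key comes from env, never argv
    if not api_key:
        return {"venue": venue, "ok": False, "error": "missing_key", "detail": key_env}
    adapter = FakeAdapter(api_key, base_url)
    calls = [await timed_call(adapter, model, "ping") for _ in range(2)]
    return {"venue": venue, "model": model, "ok": all(c["ok"] for c in calls), "calls": calls}


if __name__ == "__main__":
    os.environ.setdefault("FAKE_API_KEY", "not-a-real-key")
    report = asyncio.run(
        smoke("fake-venue", "fake-model", "FAKE_API_KEY", "https://example.invalid")
    )
    print(json.dumps(report, indent=2))
```

Returning error dicts rather than raising keeps every outcome in the JSON report, which is the property the PR description relies on for audit evidence.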

^{Reviewed by Cursor Bugbot for commit b93ec86. Bugbot is set up for automated code reviews on this repo. Configure here.}

…er-gated)

Adds three pieces of scaffolding for the dark-host activation pass. **Activates nothing. Zero schema change. Zero catalog change.** Per the SP-4 §1 moat guardrails, this PR ships ONLY founder-gated artifacts.

Stacks on SP-2 api#72 (`feat/ain271-streaming-tooluse`); independent of PR-A (#73) and PR-B (#74).

## What's new

### 1. `scripts/dark_host_smoke.py`: adapter smoke harness

A CLI that exercises the existing ProviderAdapter against a (provider, upstream_model, base_url) target and prints a JSON latency/cost/shape report. Two consecutive `.chat()` calls give a coarse cold-vs-warm variance read.

- Reads keys from env (Doppler-injectable), never argv.
- Covers the 5 open-weight venues (Groq, DeepInfra, Together, Fireworks, Novita) + Anthropic for parity check.
- Returns JSON-serializable error dicts on every failure mode (no bare exceptions to stderr) so the founder can pipe the output straight into the activation runbook as evidence.
- **Aulë does NOT run this.** The harness needs live provider credits (~$45 total: DeepInfra $15 + Together $15 + Fireworks $10 + Groq $0 + Novita $5) + Doppler keys. Founder runs it after topping up.

### 2. `docs/dark-host-activation-runbook.md`: the 4-phase tap

The exact, ordered steps to light one (logical-model, venue) row:

- Phase 1, smoke (founder, no DB): run the harness per venue, save the JSON reports for §16 audit.
- Phase 2, Model x Host ontology decision (Disc#12): see proposal below; founder picks Path A / B / C.
- Phase 3, activation migration TEMPLATE: not yet a real alembic file; it lives as a snippet in the doc to keep the `alembic/versions/` directory clean until authorized. Parametrized on slug, upstream_model, costs, q_prior, brand.
- Phase 4, verify (post-deploy): catalog row active, brain enrols it, audit chain intact. Rollback = `alembic downgrade -1`.

The runbook is explicit that activation is **founder-gated** on three signals: credits + Doppler keys + ontology authorization.

### 3. `docs/dark-host-ontology-proposal.md`: Disc#12 schema decision

Lays out 3 schema paths for representing the same logical model on multiple hosts (verified live: 0 cross-host slugs today; the schema is operationally one-model-one-host):

- Path A: flat `models` table, venue-suffixed slugs (`llama-3.3-70b-groq`). Lightest migration; zero engine change.
- Path B: `model_hosts` M:N junction. Cleanest semantics; biggest migration; touches `routing_outcomes` (§16 schema, which violates SP-4 §1 immutability unless additive).
- Path C: Path A + nullable `models.logical_slug` for cross-venue aggregates.

Aulë's recommendation: **Path A** for the SP-4 activation pass. Migrate to Path B in a follow-up sprint when the multi-host catalog density justifies the §16-additive migration.

Four Disc#12 questions for the founder are listed at the bottom of the proposal. Activation runbook stays parked until they're answered.

## §0/P5 finding (documented for the audit chain)

Live read against Supabase `dftfpwzqxoebwzepygzl`:

- 47 inactive models distributed across 10 providers (novita 9 + deepinfra 6 + together 6 + gemini 5 + groq 5 + openai 4 + anthropic 3 + fireworks 3 + mistral 3 + xai 3).
- **0 model slugs appear across multiple providers**, which confirms one-model-one-host today. The Model x Host ontology change IS a real schema migration; PR-C ships ONLY the proposal doc.

## Pre-commit

ruff + ruff-format + mypy --strict + pytest unit+smoke = 505 green. Zero new tests in this PR: the smoke harness is exercised against live providers (founder-run); the runbook + ontology are docs.

## Out of scope (per SP-4 §1 moat guardrails)

- `routing_outcomes` schema: immutable, untouched.
- The routing engine in `routing/ainfera_routing/decide.py`: untouched.
- `models` schema: untouched.
- Catalog activation: no model becomes `active=true` from this PR.
- Online learning (AIN-246): Backlog/deferred.
- M_allowed / q_prior / q_empirical semantics.

## Founder action to unblock

1. $45 credits across the 5 open-weight venues (DeepInfra $15 + Together $15 + Fireworks $10 + Groq $0 + Novita $5).
2. Doppler keys mirroring those into the api Doppler env.
3. Disc#12 authorization of the Model x Host ontology path (the 4 questions at the bottom of the proposal doc).

Once all three are in place, run the smoke harness per venue, then materialize the activation migration template into an actual alembic file and apply.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

linear-code · 2026-05-24T11:21:00Z

AIN-248 [Catalog] Full-catalog routing enrolment — q_prior backfill + 6 gated brands (price + q_prior + M_allowed) — feeds AIN-245

Feeds AIN-245 Routing v0. The engine is N-agnostic and routes the full catalog (11+ brands, growing) via emergent gating — a model enrols as a routing candidate iff it clears 3 gates: price + q_prior + M_allowed verdict. AIN-245 ships the engine + the 5 §C frontier anchors enrolled; this ticket grows enrolment to the full catalog.
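
The three-gate enrolment rule can be read as a single predicate. The sketch below is illustrative only: `CatalogModel`, its field names, and `is_candidate` are hypothetical, and it assumes the price gate means "a price is recorded", which the ticket does not spell out.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class CatalogModel:
    """Hypothetical catalog row; field names are assumptions for illustration."""

    slug: str
    price_per_mtok: Optional[Decimal]  # assumption: gate passes when a price exists
    q_prior: Optional[Decimal]  # None = no prior seeded
    m_allowed: Optional[bool]  # None = no verdict issued yet


def is_candidate(model: CatalogModel) -> bool:
    """A model enrols iff it clears all three gates: price, q_prior, M_allowed."""
    return (
        model.price_per_mtok is not None
        and model.q_prior is not None
        and model.m_allowed is True
    )


# Example rows with made-up prices; only the 0.95 prior comes from the ticket.
anchor = CatalogModel("opus-4-7", Decimal("15.00"), Decimal("0.95"), True)
no_verdict = CatalogModel("example-gated-model", Decimal("1.00"), Decimal("0.70"), None)
```

With this shape, a missing M_allowed verdict keeps a model out of the candidate set without any separate active flag deciding the outcome.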

Rulings — LOCKED 2026-05-22 (Discipline #12, founder "Go")

q_prior = new numeric(3,2) column. Do NOT inherit aa_intelligence_index — it is sourced from the retired AAMC engine; carrying it forward re-imports a dead methodology.
q_prior seeding rule: the 5 §C frontier anchors = their locked §C values (opus-4-7 0.95 · gpt-5-5 0.93 · gemini-3-1-pro 0.90 · grok-4 0.86 · mistral-large-3 0.80); every other model = Artificial Analysis Intelligence Index v4.0 ÷ 100 (traceable; same source the public leaderboard already cites). No AA entry and not a §C anchor → not enrolled (no fabricated priors — §D3).
Emergent gating: no hardcoded active flag decides routing — clearing the 3 gates does. The 6 Chinese-origin brands cannot enter the candidate set until an M_allowed verdict exists. The architecture enforces the legal gate pre-launch.
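
The q_prior seeding rule above can be sketched as a small function. The anchor values are the locked §C values quoted in the ruling; `seed_q_prior`, the `aa_index` parameter, and the half-up rounding to two decimals (to fit `numeric(3,2)`) are assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

# Locked §C anchor values, as listed in the ruling.
ANCHORS = {
    "opus-4-7": Decimal("0.95"),
    "gpt-5-5": Decimal("0.93"),
    "gemini-3-1-pro": Decimal("0.90"),
    "grok-4": Decimal("0.86"),
    "mistral-large-3": Decimal("0.80"),
}


def seed_q_prior(slug: str, aa_index: Optional[float]) -> Optional[Decimal]:
    """Return the q_prior for a model, or None when it must not be enrolled.

    aa_index is the Artificial Analysis Intelligence Index v4.0 score (0-100),
    or None when no AA entry exists. Rounding mode is an assumption.
    """
    if slug in ANCHORS:
        return ANCHORS[slug]  # anchors keep their locked values
    if aa_index is None:
        return None  # no AA entry and not an anchor: not enrolled (§D3)
    scaled = Decimal(str(aa_index)) / Decimal(100)
    return scaled.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Returning `None` rather than a default keeps the "no fabricated priors" rule visible at the call site: a caller has to handle the not-enrolled case explicitly.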

Scope

| Work | Detail | Owner |
| --- | --- | --- |
| q_prior backfill (active non-anchor) | claude-haiku-4-5, claude-sonnet-4-6, gpt-5, gpt-5-mini ← AA v4.0 ÷ 100 | Aule (data) |
| 6 gated brands: price | Alibaba (Qwen), DeepSeek, Meta, MiniMax, Moonshot AI, Z.ai (GLM) | Aule (data) |
| 6 gated brands: q_prior | AA v4.0 ÷ 100 where published; else hold out of v0 set | Aule (data) |
| 6 gated brands: M_allowed verdict | data-residency / legal per brand | Ulmo + founder |

Gate / dependency

M_allowed verdicts for the 6 gated brands = founder/Ulmo compliance call. Until issued, those brands stay out of the candidate set by design.
All writes via Alembic; real values only (§D3 — no fabricated priors).

NOT in this ticket

The q_prior column DDL + the 5 §C anchor seeds — those land in AIN-245's first migration (so the engine routes the frontier set immediately).
Latency data (none exists in catalog today; tracked separately — see AIN-245 caveat).


cursor

Cursor Bugbot has reviewed your changes and found 3 potential issues.

^{Bugbot Autofix is ON. A cloud agent has been kicked off to fix the reported issues.}


cursor · 2026-05-24T11:34:56Z

+        "GROQ_API_KEY",
+        "https://api.groq.com/openai",
+        OpenAICompatAdapter,
+    ),


Smoke base URLs mismatch production

High Severity

Default base_url values for deepinfra, groq, and fireworks differ from providers.base_url seeded in 20260516_0007_t9_catalog_providers.py. OpenAICompatAdapter appends /v1/chat/completions, so smoke and adapter_for_provider() hit different hosts. Phase 1 runbook commands omit --base-url, so a passing smoke run may not reflect production dispatch.

Additional Locations (1)

docs/dark-host-activation-runbook.md#L20-L55
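
One way to catch this class of drift is to compare the final chat URL each side would hit. The sketch below is an assumption-laden illustration: `chat_url` and `find_mismatches` are hypothetical helpers, trailing-slash handling is assumed, and only the `/v1/chat/completions` suffix and the Groq smoke default come from this review. The seeded values would come from the `providers.base_url` column.

```python
from typing import Dict, Optional, Tuple

# Suffix the review says OpenAICompatAdapter appends to base_url.
CHAT_SUFFIX = "/v1/chat/completions"


def chat_url(base_url: str) -> str:
    """Build the final endpoint; stripping a trailing slash is an assumption."""
    return base_url.rstrip("/") + CHAT_SUFFIX


def find_mismatches(
    smoke_defaults: Dict[str, str], seeded: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return venues whose smoke default resolves to a different URL than prod."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for venue, smoke_base in smoke_defaults.items():
        prod_base = seeded.get(venue)
        if prod_base is None or chat_url(smoke_base) != chat_url(prod_base):
            mismatches[venue] = (smoke_base, prod_base)
    return mismatches


# The smoke default is taken from the diff above; the seeded value is a
# placeholder, not the real row in the providers table.
smoke_defaults = {"groq": "https://api.groq.com/openai"}
seeded_example = {"groq": "https://example.invalid/groq"}
print(find_mismatches(smoke_defaults, seeded_example))
```

A check like this could run before Phase 1 so that a passing smoke report is known to exercise the same host as production dispatch.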


cursor · 2026-05-24T11:34:56Z

+            active = TRUE
+        WHERE slug = '{_MODEL_SLUG}'
+          AND provider_id = (SELECT id FROM providers WHERE slug = '{_VENUE}');
+    """)


Activation template only updates rows

Medium Severity

The Phase 3 migration template only runs UPDATE models for a new Path A slug such as llama-3.3-70b-deepinfra. If no matching row exists yet, upgrade() succeeds with zero rows changed and the model stays inactive, which is easy to miss.
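
A possible guard is to check the affected row count and fail loudly when it is not exactly one. The sketch below uses the standard-library `sqlite3` module and a hypothetical two-table schema so it runs standalone; `make_demo_db` and `activate_model` are illustrative names. In an Alembic migration the analogous check would read `rowcount` from the result of `op.get_bind().execute(...)`.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory schema; table layout is a hypothetical stand-in."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE providers (id INTEGER PRIMARY KEY, slug TEXT UNIQUE);
        CREATE TABLE models (
            id INTEGER PRIMARY KEY,
            slug TEXT UNIQUE,
            provider_id INTEGER REFERENCES providers(id),
            active INTEGER NOT NULL DEFAULT 0
        );
        INSERT INTO providers (id, slug) VALUES (1, 'deepinfra');
        INSERT INTO models (slug, provider_id) VALUES ('llama-3.3-70b-deepinfra', 1);
        """
    )
    return conn


def activate_model(conn: sqlite3.Connection, slug: str, venue: str) -> None:
    """Activate one (slug, venue) row and raise if the UPDATE matched nothing."""
    cur = conn.execute(
        """
        UPDATE models SET active = 1
        WHERE slug = ?
          AND provider_id = (SELECT id FROM providers WHERE slug = ?)
        """,
        (slug, venue),
    )
    if cur.rowcount != 1:
        raise RuntimeError(
            f"expected to activate exactly 1 row for {slug!r} on {venue!r}, "
            f"got {cur.rowcount}"
        )
```

With the guard in place, a missing row surfaces as a failed upgrade instead of a silent no-op. Whether the template should instead insert the row when it is absent is a separate design decision for the runbook.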


cursor · 2026-05-24T11:34:56Z

+
+Before any DB change, the founder authorizes the schema shape from [dark-host-ontology-proposal.md](./dark-host-ontology-proposal.md). Two paths the proposal lays out:
+
+- **Path A (minimal):** keep the existing `models` table; add multiple rows for the same logical model (e.g. three rows for `llama-3.3-70b` differentiated by `provider_id`). Slug becomes non-unique → schema change.


Path A slug uniqueness contradiction

Medium Severity

Phase 2 says Path A makes slug non-unique across providers, but ModelORM enforces global UniqueConstraint("slug", name="uq_models_slug"). Path A needs distinct suffixed slugs per venue, not duplicate slugs on different provider_id values.

Additional Locations (1)

docs/dark-host-ontology-proposal.md#L28-L44
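
The difference between the two readings of Path A can be shown against a table with a global unique constraint on `slug`. This is a standalone sketch using `sqlite3`; the table layout and the `venue_slug` helper are hypothetical, while the `uq_models_slug` constraint name and the `llama-3.3-70b-groq` slug shape come from this PR.

```python
import sqlite3


def venue_slug(logical_slug: str, venue: str) -> str:
    """Build a venue-suffixed slug, e.g. llama-3.3-70b + groq."""
    return f"{logical_slug}-{venue}"


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE models (
        id INTEGER PRIMARY KEY,
        slug TEXT NOT NULL,
        provider_id INTEGER NOT NULL,
        CONSTRAINT uq_models_slug UNIQUE (slug)
    )
    """
)

# Suffixed slugs: three rows for one logical model, all accepted.
for provider_id, venue in enumerate(["groq", "deepinfra", "together"], start=1):
    conn.execute(
        "INSERT INTO models (slug, provider_id) VALUES (?, ?)",
        (venue_slug("llama-3.3-70b", venue), provider_id),
    )

# Duplicate bare slugs differing only by provider_id: the second insert
# is rejected by the unique constraint.
conn.execute("INSERT INTO models (slug, provider_id) VALUES ('llama-3.3-70b', 1)")
try:
    conn.execute("INSERT INTO models (slug, provider_id) VALUES ('llama-3.3-70b', 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Under the existing constraint, only the suffixed form works without a schema change, which matches the proposal's description of Path A rather than the runbook's wording.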


cursor Bot reviewed May 24, 2026

hizrianraz wants to merge 1 commit into main from chore/dark-host-prep
