chore(api): SP-4 PR-C · dark-host activation scaffold (AIN-248, founder-gated)#81
hizrianraz wants to merge 1 commit into
…er-gated)

Adds three pieces of scaffolding for the dark-host activation pass. **Activates nothing. Zero schema change. Zero catalog change.** Per the SP-4 §1 moat guardrails, this PR ships ONLY founder-gated artifacts.

Stacks on SP-2 api#72 (`feat/ain271-streaming-tooluse`); independent of PR-A (#73) and PR-B (#74).

## What's new

### 1. `scripts/dark_host_smoke.py`: adapter smoke harness

A CLI that exercises the existing ProviderAdapter against a (provider, upstream_model, base_url) target and prints a JSON latency/cost/shape report. Two consecutive `.chat()` calls give a coarse cold-vs-warm variance read.

- Reads keys from env (Doppler-injectable), never argv.
- Covers the 5 open-weight venues (Groq, DeepInfra, Together, Fireworks, Novita) + Anthropic for parity check.
- Returns JSON-serializable error dicts on every failure mode (no bare exceptions to stderr) so the founder can pipe the output straight into the activation runbook as evidence.
- **Aulë does NOT run this.** The harness needs live provider credits (~$45 total: DeepInfra $15 + Together $15 + Fireworks $10 + Groq $0 + Novita $5) + Doppler keys. Founder runs it after topping up.

### 2. `docs/dark-host-activation-runbook.md`: the 4-phase tap

The exact, ordered steps to light one (logical-model, venue) row:

- Phase 1, smoke (founder, no DB): run the harness per venue, save the JSON reports for §16 audit.
- Phase 2, Model x Host ontology decision (Disc#12): see proposal below; founder picks Path A / B / C.
- Phase 3, activation migration TEMPLATE (not yet a real alembic file; it lives as a snippet in the doc to keep the `alembic/versions/` directory clean until authorized). Parametrized on slug, upstream_model, costs, q_prior, brand.
- Phase 4, verify (post-deploy): catalog row active, brain enrols it, audit chain intact. Rollback = `alembic downgrade -1`.

The runbook is explicit that activation is **founder-gated** on three signals: credits + Doppler keys + ontology authorization.

### 3. `docs/dark-host-ontology-proposal.md`: Disc#12 schema decision

Lays out 3 schema paths for representing the same logical model on multiple hosts (verified live: 0 cross-host slugs today; the schema is operationally one-model-one-host):

- Path A: flat `models` table, venue-suffixed slugs (`llama-3.3-70b-groq`). Lightest migration; zero engine change.
- Path B: `model_hosts` M:N junction. Cleanest semantics; biggest migration; touches `routing_outcomes` (§16 schema; violates SP-4 §1 immutability unless additive).
- Path C: Path A + nullable `models.logical_slug` for cross-venue aggregates.

Aulë's recommendation: **Path A** for the SP-4 activation pass. Migrate to Path B in a follow-up sprint when the multi-host catalog density justifies the §16-additive migration.

Four Disc#12 questions for the founder are listed at the bottom of the proposal. Activation runbook stays parked until they're answered.

## §0/P5 finding (documented for the audit chain)

Live read against Supabase `dftfpwzqxoebwzepygzl`:

- 47 inactive models distributed across 10 providers (novita 9 + deepinfra 6 + together 6 + gemini 5 + groq 5 + openai 4 + anthropic 3 + fireworks 3 + mistral 3 + xai 3).
- **0 model slugs appear across multiple providers**, which confirms one-model-one-host today. The Model x Host ontology change IS a real schema migration; PR-C ships ONLY the proposal doc.

## Pre-commit

ruff + ruff-format + mypy --strict + pytest unit+smoke = 505 green. Zero new tests in this PR: the smoke harness is exercised against live providers (founder-run); the runbook + ontology are docs.

## Out of scope (per SP-4 §1 moat guardrails)

- `routing_outcomes` schema: immutable, untouched.
- The routing engine in `routing/ainfera_routing/decide.py`: untouched.
- `models` schema: untouched.
- Catalog activation: no model becomes `active=true` from this PR.
- Online learning (AIN-246): Backlog/deferred.
- M_allowed / q_prior / q_empirical semantics.

## Founder action to unblock

1. $45 credits across the 5 open-weight venues (DeepInfra $15 + Together $15 + Fireworks $10 + Groq $0 + Novita $5).
2. Doppler keys mirroring those into the api Doppler env.
3. Disc#12 authorization of the Model x Host ontology path (the 4 questions at the bottom of the proposal doc).

Once all three are in place, run the smoke harness per venue, then materialize the activation migration template into an actual alembic file and apply.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
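
For readers without access to the script, the harness behaviour described in the PR description (key read from env, two consecutive `.chat()` calls, JSON-serializable error dicts instead of raised exceptions) might look roughly like the sketch below. This is not the code in `scripts/dark_host_smoke.py`: `FakeAdapter`, `timed_call`, `smoke`, and every report field name are hypothetical, and the real ProviderAdapter interface may differ.

```python
import asyncio
import json
import os
import time
from typing import Any


class FakeAdapter:
    """Hypothetical stand-in for a provider adapter (not the real ProviderAdapter)."""

    def __init__(self, api_key: str, base_url: str) -> None:
        self.api_key = api_key
        self.base_url = base_url

    async def chat(self, model: str, prompt: str) -> dict[str, Any]:
        await asyncio.sleep(0.01)  # simulate a network round trip
        return {"model": model, "content": "pong"}


async def timed_call(adapter: FakeAdapter, model: str) -> dict[str, Any]:
    """Run one chat call; report latency and response keys, or an error dict."""
    start = time.perf_counter()
    try:
        reply = await adapter.chat(model, "ping")
    except Exception as exc:  # every failure becomes JSON, never a traceback
        return {"ok": False, "error_type": type(exc).__name__, "error": str(exc)}
    latency_ms = (time.perf_counter() - start) * 1000
    return {"ok": True, "latency_ms": round(latency_ms, 2), "shape": sorted(reply)}


async def smoke(provider: str, model: str, base_url: str, key_env: str) -> dict[str, Any]:
    """Two consecutive calls give a coarse cold-vs-warm latency comparison."""
    api_key = os.environ.get(key_env)  # key comes from env, never argv
    if not api_key:
        return {"provider": provider, "ok": False, "error": f"{key_env} is not set"}
    adapter = FakeAdapter(api_key, base_url)
    cold = await timed_call(adapter, model)
    warm = await timed_call(adapter, model)
    return {
        "provider": provider,
        "model": model,
        "base_url": base_url,
        "ok": cold["ok"] and warm["ok"],
        "cold": cold,
        "warm": warm,
    }
```

Calling `print(json.dumps(asyncio.run(smoke("fake", "demo-model", "https://example.invalid", "FAKE_API_KEY"))))` emits a single JSON object either way, which is what makes the output safe to paste into the runbook as evidence.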
AIN-248 [Catalog] Full-catalog routing enrolment — q_prior backfill + 6 gated brands (price + q_prior + M_allowed) — feeds AIN-245
Feeds AIN-245 Routing v0. The engine is N-agnostic and routes the full catalog (11+ brands, growing) via emergent gating: a model enrols as a routing candidate iff it clears 3 gates (price + q_prior + M_allowed verdict). AIN-245 ships the engine + the 5 §C frontier anchors enrolled; this ticket grows enrolment to the full catalog.
Rulings: LOCKED 2026-05-22 (Discipline #12, founder "Go")
Scope
Gate / dependency
NOT in this ticket
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix is ON. A cloud agent has been kicked off to fix the reported issues.
Reviewed by Cursor Bugbot for commit b93ec86. Configure here.
        "GROQ_API_KEY",
        "https://api.groq.com/openai",
        OpenAICompatAdapter,
    ),
Smoke base URLs mismatch production
High Severity
Default base_url values for deepinfra, groq, and fireworks differ from providers.base_url seeded in 20260516_0007_t9_catalog_providers.py. OpenAICompatAdapter appends /v1/chat/completions, so smoke and adapter_for_provider() hit different hosts. Phase 1 runbook commands omit --base-url, so a passing smoke run may not reflect production dispatch.
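
One way to catch this class of drift before a smoke run is to compare the harness defaults against the seeded `providers.base_url` values and fail on any difference. The sketch below is illustrative only: `endpoint` and `find_mismatches` are hypothetical helpers, the production URL is a placeholder (the seeded values are not shown in this thread), and the trailing-slash handling is an assumption about how the suffix is appended.

```python
CHAT_SUFFIX = "/v1/chat/completions"  # suffix the review says OpenAICompatAdapter appends


def endpoint(base_url: str) -> str:
    """Final URL a request would hit (assumes a plain join with no double slash)."""
    return base_url.rstrip("/") + CHAT_SUFFIX


def find_mismatches(
    smoke_defaults: dict[str, str], production: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Map provider -> (smoke endpoint, production endpoint) wherever they differ."""
    mismatches: dict[str, tuple[str, str]] = {}
    for provider, smoke_url in smoke_defaults.items():
        prod_url = production.get(provider)
        if prod_url is None:
            continue  # provider not seeded; nothing to compare against
        if endpoint(smoke_url) != endpoint(prod_url):
            mismatches[provider] = (endpoint(smoke_url), endpoint(prod_url))
    return mismatches


# The Groq smoke default is taken from the diff above; the production value
# is a PLACEHOLDER, not the real seeded providers.base_url.
smoke_defaults = {"groq": "https://api.groq.com/openai"}
production = {"groq": "https://placeholder.invalid/groq"}

drift = find_mismatches(smoke_defaults, production)
```

Running such a check at harness start-up, or making the Phase 1 commands pass `--base-url` explicitly, would mean a green smoke run exercises the same host that production dispatch uses.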
            active = TRUE
        WHERE slug = '{_MODEL_SLUG}'
          AND provider_id = (SELECT id FROM providers WHERE slug = '{_VENUE}');
    """)
Activation template only updates rows
Medium Severity
The Phase 3 migration template only runs UPDATE models for a new Path A slug such as llama-3.3-70b-deepinfra. If no matching row exists yet, upgrade() succeeds with zero rows changed and the model stays inactive, which is easy to miss.
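
A possible guard is to check the affected-row count and abort when it is not exactly one. The sketch below shows the idea with the standard-library `sqlite3` module so it runs on its own; the table layout and the `activate` helper are assumptions for illustration, not the template's actual code. In an Alembic migration the same check could presumably be made against the `rowcount` of the result returned by the connection's `execute`.

```python
import sqlite3


def make_catalog() -> sqlite3.Connection:
    """In-memory stand-in for the catalog (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE providers (id INTEGER PRIMARY KEY, slug TEXT UNIQUE)")
    conn.execute(
        "CREATE TABLE models ("
        "id INTEGER PRIMARY KEY, slug TEXT UNIQUE, "
        "provider_id INTEGER REFERENCES providers(id), active INTEGER DEFAULT 0)"
    )
    conn.execute("INSERT INTO providers (id, slug) VALUES (1, 'deepinfra')")
    conn.execute(
        "INSERT INTO models (slug, provider_id) VALUES ('llama-3.3-70b-deepinfra', 1)"
    )
    return conn


def activate(conn: sqlite3.Connection, model_slug: str, venue: str) -> None:
    """Flip one row to active and fail loudly if no row (or several) matched."""
    cursor = conn.execute(
        "UPDATE models SET active = 1 "
        "WHERE slug = ? "
        "AND provider_id = (SELECT id FROM providers WHERE slug = ?)",
        (model_slug, venue),
    )
    if cursor.rowcount != 1:
        raise RuntimeError(
            f"expected to activate 1 row for {model_slug!r} on {venue!r}, "
            f"matched {cursor.rowcount}"
        )
```

With this shape, `activate(conn, "llama-3.3-70b-deepinfra", "deepinfra")` succeeds, while a slug that was never inserted raises instead of silently leaving the model inactive. The alternative is for the template to insert the row when it is missing, which is a larger design decision.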
Before any DB change, the founder authorizes the schema shape from [dark-host-ontology-proposal.md](./dark-host-ontology-proposal.md). Two paths the proposal lays out:

- **Path A (minimal):** keep the existing `models` table; add multiple rows for the same logical model (e.g. three rows for `llama-3.3-70b` differentiated by `provider_id`). Slug becomes non-unique → schema change.
Path A slug uniqueness contradiction
Medium Severity
Phase 2 says Path A makes slug non-unique across providers, but ModelORM enforces global UniqueConstraint("slug", name="uq_models_slug"). Path A needs distinct suffixed slugs per venue, not duplicate slugs on different provider_id values.
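
The conflict can be reproduced in miniature. The sketch below uses `sqlite3` with a simplified, assumed `models` table carrying a constraint named `uq_models_slug`; `venue_slug` and `try_insert` are hypothetical helpers. Under a global unique constraint, a second row with the same slug is rejected even when `provider_id` differs, whereas venue-suffixed slugs insert cleanly.

```python
import sqlite3


def venue_slug(logical_slug: str, venue: str) -> str:
    """Build a venue-suffixed slug in the style of `llama-3.3-70b-groq`."""
    return f"{logical_slug}-{venue}"


def make_models_table() -> sqlite3.Connection:
    """Simplified stand-in for the models table with a global unique slug."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE models ("
        "id INTEGER PRIMARY KEY, "
        "slug TEXT NOT NULL, "
        "provider_id INTEGER NOT NULL, "
        "CONSTRAINT uq_models_slug UNIQUE (slug))"
    )
    return conn


def try_insert(conn: sqlite3.Connection, slug: str, provider_id: int) -> bool:
    """Return True if the row was inserted, False if the constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO models (slug, provider_id) VALUES (?, ?)",
            (slug, provider_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_models_table()
first = try_insert(conn, "llama-3.3-70b", 1)      # accepted
duplicate = try_insert(conn, "llama-3.3-70b", 2)  # rejected: same slug, other provider
suffixed = [try_insert(conn, venue_slug("llama-3.3-70b", v), i)
            for i, v in enumerate(["groq", "deepinfra", "together"], start=10)]
```

Here `duplicate` is `False` and every entry in `suffixed` is `True`, which matches the reviewer's point: with the constraint in place, Path A only works with distinct slugs per venue.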


SP-Ω-RUN resurrection of #75 (closed by base-deletion). Rebased onto main post-#80 squash.
Original PR: #75
Source: chore/dark-host-prep
Note
Low Risk
Adds new documentation and a standalone founder-run smoke-test script; no production routing, schema, or data-path changes unless the script/runbook is manually executed.
Overview
Adds a founder-gated dark-host activation runbook describing a phased process (smoke test, ontology decision, parametrized alembic migration template, and post-deploy verification) without applying any migrations.
Adds a Model×Host ontology proposal doc outlining alternative schema approaches (multi-row slugs vs `model_hosts` junction) to support multi-venue hosting.
Introduces `scripts/dark_host_smoke.py`, a read-only async CLI that uses existing provider adapters to make two test `chat()` calls against a specified venue (keys via env) and emits a JSON latency/shape report.
Reviewed by Cursor Bugbot for commit b93ec86. Bugbot is set up for automated code reviews on this repo.