From cef3b941ce39b6fd3c2ff083f97a74d2a5db0dc8 Mon Sep 17 00:00:00 2001
From: Brian O'Kelley
Date: Mon, 11 May 2026 07:33:19 -0400
Subject: [PATCH] docs(verification): rewrite framing for (Sandbox) verdict (closes docs half of #4380)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Three pages updated to reflect the (Sandbox) verdict from #4379:

aao-verified.mdx:
- Title / description / intro / TL;DR — (Sandbox) replaces (Live) as the higher tier
- "What each axis certifies" tables — (Sandbox) tested against registered production URL with account.sandbox: true; same storyboard suite as (Spec), zero real-world side effects
- "How agents earn each axis" — (Sandbox) earn instructions describe the seller-side sandbox-account gate
- "Reading a badge" — display rows updated
- "Lifecycle" — (Sandbox) lifecycle replaces (Live), cross-mode leakage flagged as immediate-revoke trigger
- "Coverage gaps" — universal storyboards run as standard suite, no observability carve-out
- JWT verification_modes — ["spec"] / ["spec", "sandbox"]
- Supporting specs — #3755/#4382 (schema gate), #4028/#4384 (mode-gate storyboard), #4226/#4228 (UNKNOWN grading)
- Warning banner above the deprecated canonical-campaign sections (eight checks, webhook ownership, two discovery paths, maintenance windows) — full removal in follow-up sweep

conformance.mdx:
- Intro / "Two words, not three" — (Sandbox) replaces (Live)
- "Storyboard conformance vs. AAO Verified" — both qualifiers run the same storyboards; (Sandbox) is the stronger claim because it attests prod-stack sandbox tolerance

comply-test-controller.mdx:
- Prominent Note callout at the top: controller is dev/staging-only, AAO grading does NOT require or use it. Connects to (Sandbox) framing via #4379. Production-stack sandbox-flag honoring is what (Sandbox) attests, not controller exposure.

Push A item 4 of 4 in the compliance reporting fidelity initiative is now complete.
Follow-up: remove deprecated canonical-campaign sections from aao-verified.mdx (tracked in #4380).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .changeset/docs-sandbox-framing.md | 4 +
 .../by-layer/L3/comply-test-controller.mdx | 10 +-
 docs/building/verification/aao-verified.mdx | 120 ++++++++++--------
 docs/building/verification/conformance.mdx | 12 +-
 4 files changed, 88 insertions(+), 58 deletions(-)
 create mode 100644 .changeset/docs-sandbox-framing.md

diff --git a/.changeset/docs-sandbox-framing.md b/.changeset/docs-sandbox-framing.md
new file mode 100644
index 0000000000..882fb96e25
--- /dev/null
+++ b/.changeset/docs-sandbox-framing.md
@@ -0,0 +1,4 @@
+---
+---
+
+Rewrites the AAO Verified framing in `docs/building/verification/aao-verified.mdx`, `docs/building/verification/conformance.mdx`, and `docs/building/by-layer/L3/comply-test-controller.mdx` to reflect the (Sandbox) verdict from #4379. Closes the docs half of #4380. Front-of-page sections (intro, axis tables, badge meaning, lifecycle, claim instructions, supporting-spec references) now describe the (Sandbox) tier: same storyboards as (Spec), real production endpoint, `account.sandbox: true` flagging, zero real-world side effects. `comply_test_controller` doc gains a prominent dev/staging-only callout. Deprecated canonical-campaign sections in aao-verified.mdx (eight observability checks, two discovery paths, webhook ownership, maintenance windows) carry a Warning banner pointing readers at the new framing; full removal of those sections tracked as a follow-up sweep.
diff --git a/docs/building/by-layer/L3/comply-test-controller.mdx b/docs/building/by-layer/L3/comply-test-controller.mdx
index 9aa2cbc216..878fc2f9b9 100644
--- a/docs/building/by-layer/L3/comply-test-controller.mdx
+++ b/docs/building/by-layer/L3/comply-test-controller.mdx
@@ -6,9 +6,17 @@ description: "Optional sandbox tool that lets the storyboard runner walk full li
 # Compliance test controller
+
+**The compliance test controller is a dev/staging-only affordance, not a production-time concept.** AAO grading does NOT require or use it. The AAO compliance heartbeat drives storyboards against the seller's registered production URL with `account.sandbox: true` on every request, and the seller's prod stack is responsible for honoring the flag — no controller endpoint needed.
+
+Sellers MAY implement the controller in their dev or staging environment to support their own integration testing — walking lifecycle state machines deterministically, seeding fixtures, forcing transitions that would otherwise require waiting for real time. That's its purpose. It MUST NOT be exposed on production deployments (see [Sandbox gating](#sandbox-gating) below).
+
+Confused about how the controller relates to AAO Verified (Sandbox)? See [#4379](https://github.com/adcontextprotocol/adcp/issues/4379) for the framing decision: (Sandbox) attests "real production endpoint correctly handles sandbox-flagged traffic across the full storyboard suite." The controller is the developer-side affordance for *your* testing, not the AAO-side grading mechanism.
+
+
 AdCP defines lifecycle state machines for accounts, creatives, media buys, SI sessions, and delivery reporting. Many transitions in these state machines are seller-initiated — creative approval, account suspension, budget depletion, delivery accrual. A storyboard runner can only exercise buyer-initiated flows, leaving seller-initiated transitions untested.
-The **compliance test controller** is an optional tool that sellers expose in sandbox mode. It allows a storyboard runner to trigger seller-side state transitions deterministically, enabling end-to-end lifecycle verification.
+The **compliance test controller** is an optional tool sellers expose in their dev/staging environment to support deterministic local testing. It allows a runner to trigger seller-side state transitions on demand, enabling end-to-end lifecycle verification during development.
 ## Motivation
diff --git a/docs/building/verification/aao-verified.mdx b/docs/building/verification/aao-verified.mdx
index 4c4f293c5f..d6acd9ece2 100644
--- a/docs/building/verification/aao-verified.mdx
+++ b/docs/building/verification/aao-verified.mdx
@@ -1,28 +1,33 @@
 ---
 title: AAO Verified
 sidebarTitle: AAO Verified
-description: "The public trust mark for AdCP agents. Two axes — Verified (Spec) for storyboard conformance, Verified (Live) for AAO-observed real production traffic. Earn either or both."
+description: "The public trust mark for AdCP agents. Two qualifiers — Verified (Spec) for wire-format conformance, Verified (Sandbox) for production-surface sandbox tolerance. Earn either or both."
 "og:title": "AdCP — AAO Verified"
 ---
 **Status**: Request for Comments
-**Last Updated**: April 24, 2026
+**Last Updated**: May 11, 2026
-**AAO Verified** is the public trust mark for AdCP agents. It carries one of two qualifiers in parens — **(Spec)** or **(Live)** — and may carry both. The qualifier names *which axis* of verification an agent has earned.
+**AAO Verified** is the public trust mark for AdCP agents. It carries one of two qualifiers in parens — **(Spec)** or **(Sandbox)** — and may carry both. The qualifier names *which axis* of verification an agent has earned.
 It is two axes, not two tiers. The qualifiers answer different questions:
-- **Verified (Spec)** — your AdCP protocol implementation matches the spec. Storyboards run against your test-mode endpoint; wire format, task shape, error semantics, and state-machine transitions all check out. Issued by the AAO compliance heartbeat.
-- **Verified (Live)** — AAO has observed real production traffic flowing through your agent. The compliance engine continuously watches delivery against your live ad-server integration over a 7–14 day rolling window. *Lights up in 3.1 once the canonical-campaign runner is operational; the observability machinery on this page already ships.*
+- **Verified (Spec)** — your AdCP protocol implementation matches the spec. Storyboards pass somewhere — could be a test deployment, could be local dev. Wire format, task shape, error semantics, state-machine transitions all check out. Attests *wire-format conformance*, not production tolerance.
+- **Verified (Sandbox)** — your **real production endpoint** correctly honors `account.sandbox: true`. AAO runs the full storyboard suite against your registered `agent_url` with sandbox-flagged traffic; your prod stack processes it with schema-valid responses, correct lifecycle transitions, proper error envelopes, and **no real-world side effects** (no real spend, no real persistence, no real platform calls). Attests *the production code path tolerates test traffic correctly*.
-An agent can earn either axis or both. A pure protocol wrapper around a stub ad server is honestly **Verified (Spec)** — that's what test agents and sandbox environments *are*. A real production seller with both an AdCP wrapper and a working ad server earns **Verified (Spec + Live)**, the strongest claim available.
+An agent can earn either axis or both. A pure protocol wrapper around a stub ad server is honestly **Verified (Spec)** — that's what test agents and dev environments *are*. A real production seller whose prod URL handles sandbox traffic across the full storyboard suite earns **Verified (Spec + Sandbox)**, the strongest claim available.
-The two axes are **orthogonal** — neither is a prerequisite for the other. A seller without a sandbox/test endpoint (common for SDK-built agents whose correctness is guaranteed by the SDK, or for production-only platforms that have no test-mode surface) can earn **(Live)** directly by enrolling a compliance account; the eight observability checks already exercise wire format, filters, lifecycle, and scope introspection through real traffic, which makes a separate simulation pass redundant for that seller. Conversely, a test agent that can never serve real impressions earns **(Spec)** as a complete claim.
+The two axes are **orthogonal** — neither is a prerequisite for the other. A seller without a separate test deployment (production-only platforms that have no test-mode surface) can earn **(Sandbox)** directly by exposing their prod URL to AAO's runner with sandbox flagging — no separate test endpoint needed. Conversely, a test agent that can never serve real impressions earns **(Spec)** as a complete claim.
-The two axes are related but answer different questions, and the badge surfaces whichever qualifiers are earned.
+The badge surfaces whichever qualifiers are earned.
-**TL;DR for sellers.** (Spec) is automatic for any agent passing storyboards on a test-mode endpoint with active AAO membership. (Live) is opt-in: designate one compliance account with real live campaigns (PSA / remnant / house / genuine revenue all qualify), grant the AAO compliance engine the `attestation_verifier` scope, and you're done. The same compliance engine that runs your storyboards monitors delivery on that account over a 7–14 day rolling window. Signal healthy → (Live) qualifier holds; signal degrades → it lapses. Today (AdCP 3.x) the webhook-attached path requires a dedicated compliance tenant because `reporting_webhook` is single-slot; AdCP 4.0 relaxes that via [#3009](https://github.com/adcontextprotocol/adcp/issues/3009).
+**TL;DR for sellers.** Both qualifiers run the same storyboards through the same AAO compliance heartbeat. The difference is *where* the runner targets and *what* the seller's stack does with sandbox-flagged traffic:
+
+- **(Spec)** runs storyboards against a test deployment / local dev / a sandbox endpoint. The agent's prod surface is not exercised.
+- **(Sandbox)** runs storyboards against the seller's registered production `agent_url` with `account.sandbox: true` on every request. The seller's prod stack MUST honor the flag — return schema-valid responses, transition state correctly, surface errors properly, and have **zero real-world side effects** (no billing, no persistence beyond the sandbox account, no third-party platform calls).
+
+The seller-side gate is normative: every comply_test_controller request includes `account: { sandbox: true }`, and the seller MUST verify the targeted account is sandbox by looking up the persisted account record — not by trusting the field. See [comply_test_controller](/docs/building/by-layer/L3/comply-test-controller) for the dev-side affordance; AAO grading itself does not require or use the controller.
 ## What each axis certifies
@@ -31,25 +36,29 @@ The two axes are related but answer different questions, and the badge surfaces
 | | |
 |---|---|
-| **Tested against** | Your test-mode / sandbox endpoint |
-| **What it proves** | AdCP wire format, task shape, error semantics, state-machine transitions, declared specialisms map to working tools, schema conformance, filter behavior, idempotency semantics |
-| **How** | Storyboards from the [Compliance Catalog](/docs/building/verification/compliance-catalog) run against the agent on AAO's compliance heartbeat |
-| **Cadence** | ~12-24h heartbeat |
+| **Tested against** | Any endpoint the agent owner registers — test deployment, local dev, sandbox-only stack. The runner does not distinguish. |
+| **What it proves** | AdCP wire format, task shape, error semantics, state-machine transitions, declared specialisms map to working tools, schema conformance, filter behavior, idempotency semantics — exercised in isolation from real-world production state. |
+| **How** | Storyboards from the [Compliance Catalog](/docs/building/verification/compliance-catalog) run against the registered agent URL on AAO's compliance heartbeat. |
+| **Cadence** | ~1h heartbeat |
 | **Eligibility** | Any agent that passes the storyboards for its declared specialisms + holds an active AAO membership with API-access tier |
 | **Status** | **Available now** |
-### Verified (Live)
+### Verified (Sandbox)
 | | |
 |---|---|
-| **Tested against** | Your real production endpoint |
-| **What it proves** | A real ad server / decisioning engine / creative renderer is behind the protocol. Real impressions delivered on real inventory. Reporting flows through AdCP. Lifecycle transitions surface correctly. The eight observability checks below all hold. |
-| **How** | Continuous observability on a designated compliance account, run by the AAO compliance engine over a 7–14 day rolling window |
-| **Cadence** | Continuous; mark expires automatically when signals degrade |
-| **Eligibility** | Has enrolled in observability (designated compliance account + `attestation_verifier` scope granted) + signals are healthy across the rolling window. **Independent of (Spec)** — sellers without a test-mode endpoint can earn (Live) directly. |
-| **Status** | **Lights up in 3.1** when the canonical-campaign runner is operational. The eight-check observability machinery ships now and runs against operator-designated accounts; it pivots to AAO-operated canonical campaigns when the runner is in place. |
+| **Tested against** | Your registered production `agent_url`, with `account.sandbox: true` on every request. |
+| **What it proves** | Your production code path correctly honors sandbox flagging — same storyboards as (Spec), but exercised against the real prod stack buyers actually hit. Schema-valid responses, correct lifecycle transitions, proper error envelopes, **zero real-world side effects**. |
+| **How** | Same storyboard suite as (Spec), driven against the seller's registered URL with sandbox-flagged traffic. No separate canonical-campaign infrastructure. |
+| **Cadence** | Same ~1h heartbeat as (Spec) |
+| **Eligibility** | Same as (Spec), PLUS the seller's prod surface accepts `account.sandbox: true` requests and processes them without persisting real state, calling third-party platforms, or billing |
+| **Status** | **Foundation shipping** in [#4382](https://github.com/adcontextprotocol/adcp/pull/4382) (account.sandbox schema gate), [#4384](https://github.com/adcontextprotocol/adcp/pull/4384) (live-mode denial storyboard). Full grading framework following. |
-The (Live) axis is the strongest signal a buyer can rely on: AAO is the active counterparty, not just an attesting body. If something breaks in the production code path, the compliance engine sees it within days and the qualifier expires.
+The (Sandbox) qualifier replaces the earlier draft's `Verified (Live)` framing. The change: instead of attesting "your real-money production code path delivers impressions correctly" (canonical campaigns running through your stack), (Sandbox) attests "your real production code path correctly handles sandbox-flagged traffic across the full storyboard suite." Both are real-prod-surface claims; the difference is what gets tested. (Sandbox) is universally achievable across specialisms with no new AAO operational infrastructure. See [#4379](https://github.com/adcontextprotocol/adcp/issues/4379) for the reframe verdict.
+
+
+**Re: `comply_test_controller`**: the controller is a **dev/staging-only** affordance for adopters' own integration testing. AAO's (Sandbox) grading does not require or use it. Sellers MAY implement controller endpoints in their dev environment to support deterministic local testing, but the production stack does not need to expose `comply_test_controller` to earn (Sandbox). The seller-side sandbox gate is what (Sandbox) attests — schema and lifecycle correctness under flagged traffic, on real prod.
+
 ## Naming history
@@ -67,7 +76,7 @@ The earlier draft's rejection of "Tier 1 / Tier 2" remains correct: tiering the
 ## Coverage gaps are explicit
-Some claims don't have a clean (Live) observability path. The universal `signed_requests` storyboard, for example, is a per-request transport heartbeat with no canonical campaign to observe — agents declaring `request_signing.supported: true` are graded as **Verified (Spec)** for that capability with no (Live) counterpart, and that's an honest claim. The same applies to specialisms whose surface is intrinsically observational (e.g., catalog-only signal agents) rather than delivery-shaped. See [#3046](https://github.com/adcontextprotocol/adcp/issues/3046) for the per-storyboard coverage analysis.
+Under the (Sandbox) framing, every applicable storyboard runs against the seller's production endpoint with sandbox-flagged traffic. There's no observability carve-out — universal storyboards (`signed_requests`, `pagination_integrity`, etc.) run as part of the standard suite. The (Sandbox) qualifier attests that the seller's prod stack handles all of them correctly under flagged traffic, not just the ones with a real-money observability path. See [#4379](https://github.com/adcontextprotocol/adcp/issues/4379) for the framing decision that replaced the earlier (Live) observability model.
 ## Reading a badge
@@ -75,9 +84,9 @@ Badges render as a single shields.io-style image with the qualifiers in parens:
 | Display | Meaning |
 |---|---|
-| `AAO Verified Sales Agent (Spec)` | Storyboards pass for declared media-buy specialisms; live traffic not yet observed, not yet enrolled, or no Live path exists for this agent's specialisms. Common for test agents and pre-production rollouts. |
-| `AAO Verified Sales Agent (Spec + Live)` | Both axes earned. The strongest claim. |
-| `AAO Verified Sales Agent (Live)` | Real production traffic is observed healthy across the rolling window. Common for SDK-built agents and production-only sellers without a test-mode endpoint — the eight observability checks already exercise wire format, filters, lifecycle, and scope, so a parallel storyboard pass would be redundant. |
+| `AAO Verified Sales Agent (Spec)` | Storyboards pass for declared media-buy specialisms against a test deployment / dev / sandbox-only endpoint. Wire format and protocol semantics are correct; production-stack sandbox tolerance is not yet attested. Common for test agents and pre-production rollouts. |
+| `AAO Verified Sales Agent (Spec + Sandbox)` | Both axes earned. The strongest claim. The agent's registered production URL handles the full storyboard suite under sandbox-flagged traffic with no real-world side effects. |
+| `AAO Verified Sales Agent (Sandbox)` | Storyboards pass against the seller's registered production endpoint under `account.sandbox: true`. The seller's prod stack correctly honors the sandbox gate. Common for production-only sellers without a separate test deployment. |
 | `AAO Verified — Not Verified` | No badge issued for this agent + role, or the badge has been revoked. |
 The badge URL is stable per agent + role. As an agent earns or loses an axis, the SVG content updates without changing the URL — embedded badges automatically reflect the current state.
@@ -104,16 +113,21 @@ The badge URL is stable per agent + role. As an agent earns or loses an axis, th
 The compliance heartbeat picks it up automatically — no manual enrollment needed beyond [registering your agent](/docs/building/index).
-### To earn Verified (Live):
+### To earn Verified (Sandbox):
+
+A seller earns (Sandbox) by:
-A seller enrolls in the (Live) observability program by:
+1. Registering their **production `agent_url`** with AAO. This is the same registration that earns (Spec) — no separate "compliance account" or "test deployment" needed.
+2. Implementing the **sandbox-account gate** in their production stack: when a request arrives with `account.sandbox: true`, the seller verifies the targeted account is a sandbox account in the persisted record (not trusting the field), and processes the request with full schema/lifecycle correctness while producing **zero real-world side effects** — no real spend, no real ad-server orders, no third-party platform calls, no production persistence beyond the sandbox account's bounded state.
+3. Holding an active AAO membership at an API-access tier.
-1. Designating a **dedicated compliance account** on the seller's production platform. The account MUST contain real live campaigns — PSA, remnant, house, or genuine revenue campaigns all work. "Real" means: the campaigns serve real impressions on real inventory, not simulated data. The account MUST be one on which the AAO compliance engine can own the `reporting_webhook` slot without displacing another buyer's webhook (see [Webhook ownership](#webhook-ownership) below). Sellers MAY declare an expected flight cadence on the account (see [Maintenance windows](#maintenance-windows)) so seasonal gaps or scheduled quiet periods do not flap the mark.
-2. Granting the AAO compliance engine the [`attestation_verifier`](/docs/accounts/overview#standard-named-scope-attestation_verifier) scope on the compliance account. The scope is declared via the `authorization` object on `sync_accounts` / `list_accounts` responses. It is narrowly defined — read + `update_media_buy{reporting_webhook}` only. No `create_media_buy`, no `sync_creatives`, no budget, date, or cancellation mutations.
+That's it. The compliance heartbeat runs the same storyboards as (Spec), but targets the registered production URL with `account.sandbox: true` on every request. Pass → (Sandbox) qualifier issues.
-That's it. No ground-truth exports, no admin API credentials, no parallel verification campaigns, no seller-side harness.
+**Key requirement: sandbox-account isolation.** Sellers MUST persist a clear sandbox/live distinction at the account level. A request asserting `sandbox: true` against a live account MUST be refused with a structured error — see [#4028](https://github.com/adcontextprotocol/adcp/issues/4028) and the `comply-controller-mode-gate` storyboard for the canonical denial check. Cross-mode leakage is the failure mode (Sandbox) attests against.
-(Live) today presumes the seller's stack has **tenant-level isolation** — a concept every SSP, ad server, and premium publisher has, but one that social platforms and walled-garden-adjacent surfaces without a dedicated-account model do not. Sellers in that shape can still claim **(Spec)** and can participate in (Live) via the polling-only Path B1 described below; the qualifier is the same either way.
+
+**Sections below describe the deprecated canonical-campaign / (Live) observability model.** That model is superseded by the (Sandbox) framing decided in [#4379](https://github.com/adcontextprotocol/adcp/issues/4379) — no separate compliance account, no `attestation_verifier` scope, no eight-check observability infrastructure. (Sandbox) attests via the same storyboard heartbeat that drives (Spec), targeting the registered production URL with sandbox-flagged traffic. The deep technical content below remains for historical context only and will be removed in a follow-up sweep.
+
 ### Webhook ownership
@@ -230,7 +244,7 @@ The token claims:
 `adcp_version` is the AdCP release this badge was issued against (`MAJOR.MINOR`). Pairs with the `(agent_url, role, adcp_version)` identity used by the badge URL routes. **Verifiers MUST check `adcp_version` against the AdCP version they care about** — a 3.0 token presented as proof of 3.1 conformance is not authoritative. The signed claim is shape-validated at signing time (`^[1-9][0-9]*\.[0-9]+$`); verifiers SHOULD apply the same regex defensively.
-`verification_modes` is the array of axes earned. `["spec"]` until (Live) lights up; `["spec", "live"]` thereafter for agents that enroll. `protocol_version` is the full semver of the spec build the badge was tested against — informational metadata for support and audits.
+`verification_modes` is the array of axes earned. `["spec"]` for test-deployment storyboard pass only; `["spec", "sandbox"]` for agents whose production endpoint also passes under sandbox-flagged traffic. `protocol_version` is the full semver of the spec build the badge was tested against — informational metadata for support and audits.
 The registry API is authoritative for real-time status; the JWT is a 30-day cacheable proof.
@@ -242,15 +256,15 @@ Verification is continuously re-evaluated, not a one-time certificate.
 - **Issued** — first heartbeat with all declared-specialism storyboards passing + active membership.
 - **Active** — re-checked every heartbeat; JWT auto-renewed.
 - **Degraded** — first storyboard regression starts a 48-hour grace; the badge continues to render (Spec) while the operator investigates.
-- **Revoked** — 48h continuous failure → `(Spec)` qualifier drops from the badge. (Live), if held, is unaffected — the axes are independent.
+- **Revoked** — 48h continuous failure → `(Spec)` qualifier drops from the badge. (Sandbox), if held, is unaffected — the axes are independent.
 - **Recovery** — passing storyboards reissue (Spec) automatically.
-### (Live)
-- **Issued** — eight checks healthy across the rolling window for an enrolled compliance account.
-- **Active** — continuous observation; mark stays as long as signals are healthy.
-- **Degraded** — any check fails → 48-hour grace. Particular failures (check 7 mismatch with secondary-identity probes) MAY skip the grace period and revoke immediately.
-- **Revoked** — 48h continuous failure → `(Live)` qualifier drops. (Spec), if held, is unaffected.
-- **Recovery** — sustained healthy window reissues (Live).
+### (Sandbox)
+- **Issued** — first heartbeat with all declared-specialism storyboards passing against the registered production URL under `account.sandbox: true` + active membership. The seller's prod stack must additionally pass the `comply-controller-mode-gate` storyboard (refuses controller dispatch against live-mode accounts — the seller-side sandbox isolation contract).
+- **Active** — re-checked every heartbeat; JWT auto-renewed.
+- **Degraded** — first storyboard regression starts a 48-hour grace; the badge continues to render (Sandbox) while the operator investigates. Cross-mode leakage (a sandbox request producing real-world side effects, or a live-mode account accepting a sandbox-flagged controller call) MAY skip the grace period and revoke immediately — that's the (Sandbox) attestation's whole point.
+- **Revoked** — 48h continuous failure → `(Sandbox)` qualifier drops. (Spec), if held, is unaffected.
+- **Recovery** — passing storyboards (including the mode-gate check) reissue (Sandbox).
 Membership lapse revokes the entire badge regardless of test results — public trust marks require active membership.
@@ -347,32 +361,36 @@ When AAO serves brand.json data for a registered brand, agent entries get an `aa
 3. Pass the storyboards your declarations obligate (universal + protocol baselines + specialism baselines) at a specific AdCP major version.
 4. The AAO compliance heartbeat issues **AAO Verified (Spec)** automatically and re-verifies on each heartbeat cycle.
-### To claim **(Live)** *(lights up in 3.1)*
+### To claim **(Sandbox)**
-(Live) is **independent of (Spec)** — sellers without a test-mode endpoint can earn (Live) directly. The eight observability checks below already exercise wire format, filters, lifecycle, and scope through real traffic.
+(Sandbox) is **independent of (Spec)** — sellers without a separate test deployment can earn (Sandbox) directly by exposing their production endpoint to AAO's runner with sandbox flagging.
-1. Designate a compliance account in your tenant containing real live campaigns. PSA, remnant, house, or genuine revenue all qualify — the only requirement is that real impressions serve on real inventory and reporting flows through AdCP.
-2. Grant the [`attestation_verifier`](/docs/accounts/overview#standard-named-scope-attestation_verifier) scope to the AAO compliance engine's identity via the [`authorization`](/docs/accounts/overview#caller-authorization) object on `sync_accounts` / `list_accounts` responses for that account.
-3. *(Optional)* Declare maintenance windows on the compliance account if your inventory has expected quiet periods (cap: 14 days per window, 30 days cumulative per rolling 90-day window).
-4. The AAO compliance engine watches the eight checks above over a 7–14 day rolling window and issues **(Live)** when signals are healthy. The qualifier expires automatically if signal degrades.
+1. Hold an active AAO membership with API-access tier.
+2. Declare your `supported_protocols` and `specialisms` in `get_adcp_capabilities` (same as (Spec)).
+3. Register your **production `agent_url`** with AAO. The compliance heartbeat will target it with `account.sandbox: true` on every storyboard request.
+4. Implement the sandbox-account gate in your production stack: verify the targeted account is sandbox in your persisted records (not by trusting the field), and process the request with full schema/lifecycle correctness while producing **zero real-world side effects** — no real spend, no real ad-server orders, no third-party platform calls, no production persistence beyond the bounded sandbox account state.
+5. Pass the [`comply-controller-mode-gate`](https://adcontextprotocol.org/compliance/latest/universal/comply-controller-mode-gate) storyboard, which exercises the seller-side isolation contract (refuse controller dispatch against live-mode accounts).
+6. The AAO compliance heartbeat issues **AAO Verified (Sandbox)** when the full applicable storyboard suite passes against the registered URL with sandbox-flagged traffic.
-No report uploads, no admin credentials, no ground-truth exports, no parallel verification flow.
+Same storyboards as (Spec). Same heartbeat cadence. Different attestation surface: prod, with sandbox flagging, instead of any-registered-endpoint.
 ## What AAO Verified is not
 - **Not a regulatory or financial attestation.** SOC 2, ISO 27001, ISAE 3402 and similar frameworks address operational and financial-control posture — distinct questions, with their own audit paths. AAO Verified is wire-and-delivery correctness for AdCP.
-- **Not hard ground-truth reconciliation in v1.** The eight (Live) checks observe internal consistency of AdCP responses against real delivery; they do not reconcile AdCP-reported numbers against the seller's internal ad-server dashboard. Hard reconciliation is deferrable to a future opt-in upgrade — buyer attestation, or seller-exported reports — without affecting the v1 mark.
+- **Not hard ground-truth reconciliation.** (Sandbox) attests the production code path handles sandbox-flagged traffic correctly across the protocol surface. It does not reconcile real-money AdCP-reported numbers against the seller's internal ad-server dashboard under live traffic. Hard reconciliation is a separate kind of attestation tracked outside the (Sandbox) tier.
 - **Not certification beyond AAO membership.** The [AgenticAdvertising.org certification program](/docs/learning/overview) composes with AAO Verified — verification is necessary input to certification, but verification is not certification itself.
 - **Not a SLA.** AAO Verified does not guarantee uptime, latency, or commercial outcomes. It attests that the seller's AdCP surface continuously reflects real delivery; commercial reliability is between buyer and seller.
 - **Not a substitute for due diligence.** Buyers SHOULD still vet sellers' contractual terms, billing posture, governance practices, and incident-response posture independently. AAO Verified is one input, not the whole picture.
 ## Relationship to supporting specs
-AAO Verified (Live) depends on three normative pieces of the AdCP spec, each tracked on the issue tracker:
+AAO Verified (Sandbox) rests on a small set of normative AdCP spec elements:
+
+- **[`account.sandbox` schema gate (#3755 / #4382)](https://github.com/adcontextprotocol/adcp/issues/3755)** — pins `account.sandbox: true` on every `comply_test_controller` request (when present). Defense-in-depth on top of the seller-side gate; a request asserting `sandbox: false` schema-rejects.
+- **[`comply-controller-mode-gate` storyboard (#4028 / #4384)](https://github.com/adcontextprotocol/adcp/issues/4028)** — verifies sellers correctly refuse controller dispatch against live-mode accounts. Keystone of the (Sandbox) isolation contract.
+- **[UNKNOWN_SCENARIO grading (#4226 / #4228)](https://github.com/adcontextprotocol/adcp/issues/4226)** — sellers MAY implement controller selectively; the runner grades absent operations as `not_applicable` rather than `failed`. Controller is dev-only per the (Sandbox) framing.
-- **[`get_media_buys` account-ownership scope (#2963)](https://github.com/adcontextprotocol/adcp/issues/2963)** — `get_media_buys` MUST return all account-owned media buys regardless of how they were created, so the brownfield Path B can discover non-AdCP-created campaigns. Without this, (Live) cannot observe campaigns trafficked outside AdCP.
-- **[Per-account `authorization` envelope + `attestation_verifier` scope + RBAC error codes (#2964)](https://github.com/adcontextprotocol/adcp/issues/2964)** — the narrow-scope grant the AAO compliance engine holds on the seller's compliance account, plus the introspection mechanism that backs check 7 (introspection consistency).
-- **[Behavioral filter assertions (#2902)](https://github.com/adcontextprotocol/adcp/issues/2902)** — the (Spec) axis runs filter-behavior assertions against `simulate_delivery`; (Live) re-runs them against real data to close the "filtering silently no-ops" gap.
+The earlier (Live) framing's supporting issues (#2963, #2964, #2902 — `attestation_verifier` scope, `get_media_buys` ownership, behavioral filter assertions on real data) are deferred. They remain relevant if AAO ever returns to a canonical-campaign model, but are not load-bearing under (Sandbox).

 A fourth supporting issue tracks 4.0:
diff --git a/docs/building/verification/conformance.mdx b/docs/building/verification/conformance.mdx
index cf211b049c..3b0cb0284b 100644
--- a/docs/building/verification/conformance.mdx
+++ b/docs/building/verification/conformance.mdx
@@ -13,22 +13,22 @@ description: "What 'AdCP-conformant' means, defined by the storyboards that veri
 AdCP conformance has two load-bearing terms. A third (one you'll hear in the wild) is a trap.

 - **Conformant** — the agent meets the normative rules. Defined by the storyboards this document indexes.
-- **Verified** — AAO has tested the agent recently and issued a signed attestation. Gated on active membership and a live heartbeat. The [AAO Verified badge](/docs/building/verification/aao-verified) carries one of two qualifiers: **(Spec)** for storyboard-conformance against a test-mode endpoint, **(Live)** for AAO-observed real production traffic via canonical campaigns. An agent can earn either or both.
+- **Verified** — AAO has tested the agent recently and issued a signed attestation. Gated on active membership and a live heartbeat. The [AAO Verified badge](/docs/building/verification/aao-verified) carries one of two qualifiers: **(Spec)** for storyboard-conformance against a test deployment or dev endpoint, **(Sandbox)** for storyboard-conformance against the seller's real production endpoint under `account.sandbox: true` flagging. An agent can earn either or both.
 - **"Compliant"** — self-attested, unverified, no external check. Don't claim it; don't design for it. This document uses *conformant* and *verified* exclusively.

 Put differently:

 - Conformance is a property of the agent's wire behavior.
-- Verification is a time-bounded third-party attestation. **(Spec)** verifies conformance via simulation; **(Live)** verifies the underlying production capability via real-traffic observation. Each independently demonstrates conformance through different evidence.
-- The two axes are independent: a seller without a test-mode endpoint can earn **(Live)** directly, and a test agent that can never serve real impressions earns **(Spec)** as a complete claim.
+- Verification is a time-bounded third-party attestation. **(Spec)** attests wire-format conformance against any registered endpoint; **(Sandbox)** attests the same storyboard suite passes against the seller's real production endpoint under sandbox-flagged traffic. Same storyboards, different attestation surface.
+- The two axes are independent: a seller without a separate test deployment can earn **(Sandbox)** directly on production; a test agent that can never serve real impressions earns **(Spec)** as a complete claim.

 ## Storyboard conformance vs. AAO Verified

-This page indexes **storyboard conformance** — the property an agent's wire behavior has when it matches the spec, verified by storyboards running against seeded test data. Storyboard passing earns the **AAO Verified (Spec)** qualifier on an agent's badge.
+This page indexes **storyboard conformance** — the property an agent's wire behavior has when it matches the spec, verified by storyboards running against seeded test data. Storyboard passing earns the **AAO Verified (Spec)** or **AAO Verified (Sandbox)** qualifier (or both) on an agent's badge, depending on where the runner targeted.

-A second axis — **AAO Verified (Live)** — verifies the production code path behind the protocol actually delivers real impressions on real inventory. (Live) is opt-in, observed continuously over a 7–14 day rolling window on a designated compliance account, and orthogonal to storyboard conformance: a seller can pass every storyboard with a broken ad-server integration because `simulate_delivery` is a parallel code path from the production reporting path. (Live) closes that gap.
+A second axis — **AAO Verified (Sandbox)** — verifies the seller's real production endpoint correctly handles the full storyboard suite under `account.sandbox: true` flagging. (Sandbox) is the stronger claim: a seller can pass (Spec) on a test deployment while their production stack has a broken sandbox gate (real-world side effects under flagged traffic, missing account-mode verification, etc.) — (Sandbox) closes that gap.

-The two qualifiers share one brand mark — **AAO Verified** — and an agent can earn either or both. **(Spec) and (Live) are independent**: each independently demonstrates conformance through different evidence. (Spec) verifies it via simulated interactions against a test endpoint; (Live) verifies it via observed real traffic that the eight checks exercise (wire format, filters, lifecycle, scope introspection). A seller without a test-mode endpoint can earn (Live) directly; a test agent that can never serve real impressions earns (Spec) as a complete claim. See [AAO Verified](/docs/building/verification/aao-verified) for the full qualifier model and the (Live) spec; the rest of this page indexes the storyboards that back (Spec).
+The two qualifiers share one brand mark — **AAO Verified** — and an agent can earn either or both. **(Spec) and (Sandbox) are independent**: each independently demonstrates conformance through different evidence. (Spec) attests wire-format conformance against any registered endpoint; (Sandbox) attests the production code path correctly tolerates sandbox-flagged traffic. See [AAO Verified](/docs/building/verification/aao-verified) for the qualifier model and the [Sandbox framing verdict](https://github.com/adcontextprotocol/adcp/issues/4379); the rest of this page indexes the storyboards that back both qualifiers.

 ## Storyboards are the truth
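
Reviewer illustration, not part of the patch: a minimal Python sketch of the seller-side sandbox-account gate that the (Sandbox) claim steps and the `account.sandbox` schema-gate and `comply-controller-mode-gate` bullets describe. It is a sketch under stated assumptions rather than AdCP reference code. The `account_id` key, the in-memory `PERSISTED_ACCOUNTS` store, the `GateDecision` type, and the reason labels are all hypothetical; only the `account.sandbox` field and the rule of checking persisted records instead of trusting the request field come from the text of the patch.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical persisted records: account_id -> is this a sandbox account?
# A real seller would read this from its own datastore.
PERSISTED_ACCOUNTS = {"acct_sandbox_1": True, "acct_live_1": False}


@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    reason: str  # illustrative labels, not AdCP error codes


def sandbox_gate(request: Mapping, accounts: Mapping[str, bool]) -> GateDecision:
    """Decide whether a sandbox-flagged request may be processed."""
    account = request.get("account") or {}

    # Schema-level check: the request itself must assert sandbox: true.
    if account.get("sandbox") is not True:
        return GateDecision(False, "schema_rejected")

    # Seller-side check: consult the persisted record, never the request field.
    persisted: Optional[bool] = accounts.get(account.get("account_id"))
    if persisted is None:
        return GateDecision(False, "unknown_account")
    if not persisted:
        # Request claims sandbox, but the stored account is live-mode: refuse.
        return GateDecision(False, "live_account_refused")

    # Safe to process with real-world side effects disabled.
    return GateDecision(True, "sandbox_ok")


if __name__ == "__main__":
    spoofed = {"account": {"account_id": "acct_live_1", "sandbox": True}}
    genuine = {"account": {"account_id": "acct_sandbox_1", "sandbox": True}}
    print(sandbox_gate(spoofed, PERSISTED_ACCOUNTS))
    print(sandbox_gate(genuine, PERSISTED_ACCOUNTS))
```

The two checks are deliberately separate: the first mirrors the defense-in-depth schema gate, while the second is the isolation contract itself, so a request that merely sets `sandbox: true` against a live account is still refused.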