RFC: AAO Verified via canonical test campaigns (supersedes two-tier model in #2965) #3046

@bokelley

AAO Verified (Live) via Canonical Test Campaigns

Status: Revised draft (v3) — naming aligned with PR #2153: single brand mark "AAO Verified" with axis qualifiers (Spec) and (Live) instead of the v2 draft's two distinct mark names ("AdCP Conformant" / "AAO Verified"). The wire format reflects this with verification_modes: string[] in the JWT and registry API — ["spec"] today, ["spec", "live"] once canonical campaigns light up. End-state machinery is unchanged; only the public framing collapses to a single brand word with composable qualifiers. See PR #2153 for the rename in code + docs.
Author: Brian O'Kelley + Claude
Date: April 2026
Milestone: 4.0 (final trigger flip for (Live)). Transitional machinery already shipping in 3.x: merged PR #3001 (eight observability checks + attestation_verifier scope) + PR #2153 (qualifier framing in code, JWT, registry API, SVG).

TL;DR

AAO Verified (Live) is the top-tier trust mark for AdCP agents. Today (3.x), AAO issues it by continuously observing the seller's own campaigns running on a designated compliance account (docs/building/aao-verified.mdx). In 4.0, AAO becomes the operator: it runs a canonical test campaign per declared specialism, weekly, through the seller's real ad-server integration. AAO Verified (Live) issues when the canonical campaign stays healthy and revokes when it doesn't. The same eight checks from 3.x apply per canonical campaign; only the issuance trigger changes — sellers don't re-plumb.

AAO Verified (Spec) (storyboard-issued, 3.x and 4.0) remains the lower-weight publishable mark. Verified (Live) ⇒ Verified (Spec) — a storyboard regression blocks AAO Verified (Live) issuance.

If you've read docs/building/aao-verified.mdx, the end state is: same machinery, different trigger. This RFC defines the trigger flip, scopes the per-specialism work to get there, and sizes AAO's operational commitment honestly.


The problem with "passes the basic tests"

Our current badge path (#2153) issues a mark to agents that pass storyboards against simulate_delivery. That's wire-format conformance. A buyer seeing "Verified" on an agent's website is not thinking "the JSON shapes are right." They're thinking "I can actually buy media through this agent."

Those are different claims. Storyboards prove the first. Only a live campaign proves the second. The two-mark model preserves both claims as distinct publishable signals; canonical campaigns flip what "Verified" means.

Two marks, not one

AdCP 3.x (shipping via merged #3001) defines two distinct public trust marks:

Mark | What it means | How it's issued (3.x) | How it's issued (4.0, this RFC)
AAO Verified (Spec) | Wire format is right: the agent implements its declared supported_protocols and specialisms per the storyboard suite. | Storyboards pass. | Same: storyboards pass.
AAO Verified (Live) | The declared capability is actually implemented in the seller's live production stack: real impressions, real inventory, sustained over weeks. | Continuous observability of whatever campaigns the seller runs on a designated compliance account. | AAO-operated canonical test campaign for the specialism runs weekly and stays healthy.

Containment. Verified (Live) ⇒ Verified (Spec). A seller cannot hold AAO Verified (Live) without AAO Verified (Spec): a storyboard regression blocks AAO Verified (Live) issuance even when live campaigns look fine. A seller can hold Verified (Spec) without Verified (Live) (storyboards passing but no live traffic yet), but can never hold Verified (Live) without Verified (Spec).
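The containment rule can be sketched as a check over the verification_modes array. This is a minimal illustration, not the registry's implementation: the function names are hypothetical, and the helper ignores the 7-day grace period described later in this RFC.

```python
from typing import Iterable

# Qualifier strings follow the verification_modes values named in this RFC.
SPEC = "spec"
LIVE = "live"


def modes_are_contained(verification_modes: Iterable[str]) -> bool:
    """Return True when the modes satisfy Verified (Live) => Verified (Spec)."""
    modes = set(verification_modes)
    # Holding "live" without "spec" violates containment.
    return LIVE not in modes or SPEC in modes


def effective_modes(storyboards_pass: bool, live_healthy: bool) -> list[str]:
    """Hypothetical helper: derive publishable modes from the two signals."""
    if not storyboards_pass:
        # A storyboard regression blocks (Live) even if live campaigns look fine.
        return []
    return [SPEC, LIVE] if live_healthy else [SPEC]
```

A payload carrying ["live"] alone would be rejected by modes_are_contained, while ["spec"] and ["spec", "live"] both pass.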

Why two marks, not one. The earlier single-bar framing dropped storyboards from the public-badge path entirely. Buyers got a stronger signal, but pre-production agents, non-revenue pilots, and specialisms without canonical flows lost any public signal. The two-mark model keeps AAO Verified (Spec) as a useful lower-weight mark (searchable registry, "storyboards passing") and reserves AAO Verified (Live) for in-market sellers with canonical-campaign health.

Per-check mapping: 3.x enrollment-based → 4.0 canonical-campaign-based

The same eight checks apply at both versions; the locus of observation moves from "the compliance account's ongoing activity" to "AAO's canonical flight."

# | Check | 3.x (enrollment) | 4.0 (canonical campaign)
1 | Liveness | At least one active buy in the compliance account ≥ 80% of the 7–14 day rolling window | At least one healthy completed canonical flight per scheduled cadence (weekly cadence = ≥ 1 healthy flight per rolling 7-day window; missed flights count as failures, not as quiet periods)
2 | Freshness | Same get_media_buy_delivery query on day N vs N+1 returns different numbers | Same query during a canonical flight's flight window returns changing numbers across the flight
3 | Plausibility | Monotonic impressions, by_package sums, non-zero where expected, pacing_index consistency | Same checks applied to the canonical flight's reported delivery
4 | Filter correctness | start_date / end_date narrow results against account history | Same applied to the canonical flight
5 | Reporting-surface cross-consistency | All declared reporting surfaces agree across the compliance account window | All declared surfaces agree for the canonical flight; skipped if polling is the only declared surface
6 | Lifecycle correctness | Completed / paused / canceled buys behave correctly across the account | The canonical flight completes, pauses, and cancels correctly when AAO induces those transitions
7 | Introspection consistency | authorization object on sync_accounts / list_accounts matches actual enforcement | Same: runner verifies authorization before and during each canonical flight
8 | Seller-initiated state-transition propagation | Trafficker / finance / lifecycle changes surface within seller's declared status-freshness tolerance | Same applied to the canonical flight; AAO induces seller-initiated transitions in some weeks to test propagation
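The 4.0 liveness rule (check 1) can be sketched as a rolling-window test. This is an illustrative sketch under assumptions: the function name and the representation of flights as a set of completion dates are hypothetical, and only flights already judged healthy are passed in, so a missed or failed flight simply does not appear and therefore counts as a failure.

```python
from datetime import date, timedelta


def liveness_ok(healthy_flight_days: set[date], start: date, end: date,
                window_days: int = 7) -> bool:
    """Check that every rolling window ending on a day in [start, end]
    contains at least one healthy completed flight."""
    day = start
    while day <= end:
        # The window is the window_days days ending on (and including) `day`.
        window = {day - timedelta(days=i) for i in range(window_days)}
        if not (window & healthy_flight_days):
            return False  # no healthy flight in this window
        day += timedelta(days=1)
    return True
```

One 24-hour flight per week covers only 24 of 168 hours (about 14%) of the window, which is why the 3.x "≥ 80% of the rolling window" threshold is restated as a per-cadence count.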

The 4.0 proposal: canonical test campaigns

AAO Verified (Live) in 4.0 = an AAO-operated canonical test campaign for this specialism is currently running through your agent and reporting is flowing.

One trigger. One bar. You earn it by letting AAO run real buys through you; you lose it when those buys stop working.

Specialism as contract

Each specialism that supports canonical campaigns gets exactly one canonical test flow — a real end-to-end interaction that exercises the production code path for that capability. AAO runs it weekly. If every step works, the AAO Verified (Live) mark stays green for that specialism; if any step fails, it degrades.

"Canonical test campaign" is a shorthand; the real idea is "AAO runs an end-to-end canonical flow through your agent that exercises the same code path buyers would hit." The flow shape varies by specialism — a media-buy sales specialism runs a PSA flight; a governance specialism runs a compliant/non-compliant brief pair; a signals specialism runs a discover-then-activate flow.

The common thread: AAO is an operator participating in the ecosystem. Every AAO Verified (Live) agent has AAO as an active counterparty exercising their real code paths weekly. Breakage on any seller surfaces within a week.

Canonical campaign spec format

Each specialism with a canonical flow gets a sibling YAML alongside its storyboard:

/compliance/{version}/specialisms/{id}/index.yaml          (storyboards — AAO Verified (Spec))
/compliance/{version}/specialisms/{id}/canonical-campaign.yaml  (AAO Verified (Live), 4.0)

canonical-campaign.yaml declares: the brief, budget envelope, flight length, creative shape, expected per-check coverage, and attribution rules (which checks contribute to which specialism marks for composed flights). AAO authors and maintains these alongside the storyboards; sellers don't author them.
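To make the declared contents concrete, the sketch below models them as Python dataclasses. Every field name here is an assumption chosen for illustration; the RFC commits only to the list of things the file declares, not to a schema.

```python
from dataclasses import dataclass, field


@dataclass
class CanonicalCheck:
    check_id: int              # 1..8, matching the eight checks (assumed numbering)
    contributes_to: list[str]  # specialism ids this check counts toward


@dataclass
class CanonicalCampaign:
    # All field names are hypothetical stand-ins for the declared contents.
    specialism: str
    brief: str
    budget_usd_max: float      # budget envelope
    flight_days: int           # flight length
    creative_shape: str
    checks: list[CanonicalCheck] = field(default_factory=list)

    def specialisms_covered(self) -> set[str]:
        """Union of specialisms that any check in this flight contributes to."""
        return {s for check in self.checks for s in check.contributes_to}


campaign = CanonicalCampaign(
    specialism="sales-guaranteed",
    brief=":30 PSA, 7-day guaranteed flight",
    budget_usd_max=100.0,
    flight_days=7,
    creative_shape="video-30s",
    checks=[
        CanonicalCheck(1, ["sales-guaranteed"]),
        CanonicalCheck(3, ["sales-guaranteed", "creative-template"]),
    ],
)
```

Here specialisms_covered() returns both sales-guaranteed and creative-template, which is the information the attribution rules need for a composed flight.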

Composition attribution rules

A single canonical flight that exercises composed specialisms (e.g., sales-guaranteed + creative-template + governance-aware-seller) needs deterministic attribution when checks fail. Rules:

  • Each check in canonical-campaign.yaml declares which specialism(s) it contributes to.
  • A single-specialism check failure degrades only that specialism's mark.
  • A check contributing to multiple specialisms uses most-specific-specialism-first resolution: a failure on a check tagged [sales-guaranteed, creative-template] defaults to the more-specific specialism if the failure mode unambiguously implicates one (e.g., trafficked-creative success + delivery failure → sales-guaranteed); when the failure is ambiguous, all tagged specialisms degrade.
  • Sellers MAY appeal an attribution after-the-fact via the dispute path (see Open questions).
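The attribution rules above can be sketched as a small resolution function. This is a minimal sketch: the function name is hypothetical, and the `implicated` argument stands in for whatever logic decides that a failure mode unambiguously points at one specialism, which this RFC does not specify.

```python
from typing import Optional


def degraded_specialisms(tagged: list[str],
                         implicated: Optional[str] = None) -> set[str]:
    """Resolve which specialism marks degrade when one check fails.

    tagged: specialisms the failing check contributes to.
    implicated: the single specialism the failure unambiguously implicates,
        or None when the failure is ambiguous (hypothetical input).
    """
    if not tagged:
        raise ValueError("a check must contribute to at least one specialism")
    if len(tagged) == 1:
        # Single-specialism check: only that mark degrades.
        return {tagged[0]}
    if implicated is not None:
        if implicated not in tagged:
            raise ValueError("implicated specialism must be one of the tagged ones")
        # Unambiguous failure mode: degrade only the implicated specialism.
        return {implicated}
    # Ambiguous failure: every tagged specialism degrades.
    return set(tagged)
```

For the example in the rules, a check tagged [sales-guaranteed, creative-template] with trafficked-creative success and delivery failure would be called with implicated="sales-guaranteed" and degrade only that mark.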

Runner request-signing integration

Sellers that advertise request_signing.required_for covering tasks the canonical runner uses (create_media_buy, sync_creatives, update_media_buy, etc.) MUST be invoked through AAO's signed-request runner. The runner honors the seller's declared signing profile; failure to sign breaks the canonical flow but is attributed to the runner (an AAO-side bug), not the seller. Conversely, a seller that requires signing but accepts unsigned canonical-runner requests fails the canonical campaign for the relevant signing-required specialism.

This is implicit verification of signed-requests (see Per-specialism coverage) — every canonical flight against a signing-required seller exercises the signing pipeline.

Per-specialism coverage

All 20 stable specialisms are walked through below. The Target column indicates when AAO plans to stand up the canonical flow; "Conformant-only" specialisms keep the AAO Verified (Spec) mark but have no path to AAO Verified (Live).

The "Why" column states the access blocker or design constraint when complexity is High; for Low-complexity specialisms, the canonical flow shape is short.

Media-buy specialisms

Specialism | Canonical flow | Why (Low/Medium/High) | Target
sales-non-guaranteed | Programmatic PSA, 24-hour pacing check, AAO as buyer | Low: auction inventory is universally available; fastest feedback loop | Pilot (phase 1)
sales-guaranteed | :30 PSA, 7-day guaranteed flight, delivery heartbeat | Low–Medium: requires guaranteed-inventory partner | Phase 2
sales-proposal-mode | AAO issues brief → seller returns proposal → AAO accepts → buy runs 7 days | Medium: exercises proposal negotiation loop end-to-end | Phase 2
sales-catalog-driven | Catalog-driven PSA (e.g., retail-media test SKU) with conversion-tracking ping | Medium: requires catalog seed and attribution test surface | Phase 3
sales-broadcast-tv | :30 PSA in primetime, watch C3 → C7 maturation | High: broadcast inventory access at scale is the blocker; PSA slots already mostly committed (Ad Council holds most primetime); C7 settlement is 15–22 days. May be reclassified to Conformant-only if PSA partnership doesn't land. | Phase 4 (aspirational)
sales-social | Platform-native PSA flow per platform | High: each platform (Meta / TikTok / Snap / X / LinkedIn) needs its own integration; tenant-isolation makes "AAO probes targeting" a security review for some platforms; the universe of social sellers is small. May be reclassified to per-platform opt-in only. | Phase 4 (aspirational)
governance-aware-seller | Canonical campaign with governance hooks set; seller propagates approvals/conditions/denials unchanged | Low: composes with whatever sales specialism the seller also holds; same flight, additional checks | Phase 2 (alongside sales)
audience-sync | Sync synthetic-but-resolvable test list; verify reflection in delivery | Medium: privacy surface (synthetic identifiers owned by AAO; documented consent posture) | Phase 3

Creative specialisms

Specialism | Canonical flow | Why | Target
creative-template | Weekly brief → trafficked creative → served on AAO's canonical sales partner | Low: composes with a sales specialism; same flight | Phase 2
creative-generative | Weekly brief → generated spot → serves on AAO's canonical sales partner | Low–Medium: composes with sales; generation cost is the only delta | Phase 2
creative-ad-server | PSA tagged with known macros; verify click/impression via macro callbacks; ad-server health probe | Low: single canonical PSA per channel family covers it | Phase 2

Signals specialisms

Specialism | Canonical flow | Why | Target
signal-marketplace | Discover canonical test signal → activate → confirm targeting takes effect in a sales flow | Medium: requires sales-flow partner | Phase 3
signal-owned | AAO exposes canonical first-party test segment → seller resolves it on the buy path → verify reflection | Medium: synthetic test segment plumbing | Phase 3

Governance specialisms

Specialism | Canonical flow | Why | Target
content-standards | Submit one brand-safe and one unsafe brief weekly; verify approval/rejection behavior | Low: decision-check, not delivery-check | Phase 2
property-lists | Sync canonical test property list; verify targeting narrows per the list during a canonical flight | Low: small synthetic property list, easy to author | Phase 2
collection-lists | Sync canonical test collection list; verify inclusion/exclusion on content programs | Low | Phase 3
governance-delivery-monitor | Active canonical campaign under monitoring; AAO induces drift; verify alert fires within threshold | Medium: composes with a sales flow | Phase 3
governance-spend-authority | Submit conditional-approval brief weekly; watch human-in-loop approval flow complete within SLA | Medium–High: latency-sensitive (requires human response within SLA); requires AAO-side human-in-loop partner | Phase 4

Brand specialisms

Specialism Canonical flow Why Target
brand-rights The synthetic license acquisition flow doesn't exercise the real code path (real licenses are negotiated, priced, countersigned). Re-evaluate once brand-rights matures past preview status. Conformant-only (4.0)

Security / transport specialisms

Specialism Canonical flow Why Target
signed-requests Implicitly exercised by every canonical campaign — every runner request to a signing-required seller tests the signing pipeline (see Runner request-signing integration above). A standalone canonical flow would be redundant. Taxonomy note: signed-requests is classified as a specialism in 3.x by historical accident; it is a cross-protocol transport-layer concern, not a media-buy specialism. #3075 tracks reclassification as a universal capability-gated storyboard; the deprecation landed as a patch in #3076. Conformant-only (independent verification implicit in other canonical flows)

Capabilities without a specialism (Conformant-only)

These are declared in get_adcp_capabilities but are not specialisms — listed here so readers don't expect canonical flows for them. Their behavior is verified by storyboards and (where applicable) implicitly exercised by every canonical campaign.

Capability | Why | Verification
webhook_signing | Implicitly exercised: every canonical-flight webhook tests signing | Storyboards + implicit during canonical flights
idempotency | Storyboard-provable; no ongoing live-data aspect | Storyboards
compliance_testing (test controller) | Self-describing surface for the compliance runner itself | Storyboards

Pilot and phase plan

The RFC commits AAO to running canonical campaigns at ecosystem scale. That's a substantial operational commitment. The pilot exists to test that AAO can staff it before 4.0 locks in the trigger flip — if AAO can't staff it, the trigger flip doesn't happen and AAO Verified (Live) stays at the 3.x enrollment-based mechanism indefinitely.

Phase 1 — Pilot (target: 30–60 days)

  • Scope: one specialism (sales-non-guaranteed), 5–10 participating sellers (start at 5, expand to 8–10 by week 3 if onboarding holds)
  • Cadence: weekly 24-hour flights — ~20–40 flights total
  • Staffing: one on-call engineer (part-time, best-effort response)
  • Budget: ~$500/month for paid inventory if remnant/PSA options are unavailable
  • Learn: observed failure rate, time-to-diagnosis, tooling gaps, seller friction in onboarding
  • Phase-2 transition criteria (all required):
    • Median seller enrollment time < 2 hours; no seller exceeds 8 hours
    • Every incident root-caused and documented within 48 hours (frequency is OK; opacity is not)
    • False-positive revocation rate (canonical-campaign failure attributed to seller when actually AAO-side) < 5% across the pilot window
    • At least one seller publicly committed to keep participating post-pilot

Pilot sellers should be a mix of seller shapes, not the highest count: one SSP with mature remnant/PSA operations, one direct publisher, one ad network, one DSP-side partner if one wants in, one self-hosted reference implementation. Spread of seller archetypes > number of sellers.

Phase 2 — Broaden to clean flows (target: 90 days after Phase 1 exit)

Add: sales-guaranteed, sales-proposal-mode, governance-aware-seller, creative-ad-server, creative-template, content-standards, property-lists.

  • ~7 specialisms × ~10 sellers × weekly = ~70 flights/week
  • Staffing: 1 FTE ops engineer + part-time TPM
  • Budget: $2–3K/month inventory + content sourcing
  • Phase-3 transition criteria: all Phase-2 specialisms have ≥ 5 verified sellers; PSA content partnership (e.g., Ad Council MOU) signed for at least 2 channel families; runner uptime ≥ 99.5% over 60 days

Phase 3 — Design-moderate flows (target: 6 months after Phase 2 exit)

Add: sales-catalog-driven, audience-sync, signal-owned, signal-marketplace, collection-lists, governance-delivery-monitor, creative-generative.

  • ~14 specialisms × ~20 sellers × weekly = ~280 flights/week
  • Staffing: 1 FTE ops + 1 FTE runner engineer + part-time TPM; on-call rotation
  • Budget: $5–8K/month; synthetic-identity and test-segment infrastructure built
  • Phase-4 transition criteria: composition attribution rules (canonical-campaign.yaml) tested in production; seller appeal/dispute path exercised at least twice end-to-end; legal sign-off on synthetic-identity posture for audience-sync
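The flight-volume estimates for Phases 2 and 3 are simple products of specialism count, seller count, and cadence. The sketch below reproduces that arithmetic; the function name is hypothetical and the inputs are the approximate figures quoted in the phase bullets.

```python
def flights_per_week(specialisms: int, sellers: int,
                     flights_per_seller_week: int = 1) -> int:
    """Weekly flight count, assuming every seller runs every specialism."""
    return specialisms * sellers * flights_per_seller_week


phase2 = flights_per_week(7, 10)    # Phase 2 estimate
phase3 = flights_per_week(14, 20)   # Phase 3 estimate
phase3_per_day = phase3 / 7         # average daily load at Phase 3

print(phase2, phase3, phase3_per_day)
```

The assumption that every seller holds every specialism makes these upper-bound figures; real seller declarations would lower them.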

Phase 4 — Design-heavy flows (target: 6 months after Phase 3 exit)

Aspirational. Any of the four Phase-4 specialisms may fail to land; the spec acknowledges this and accepts Conformant-only as the steady state for any specialism whose canonical flow can't be staffed or funded.

  • sales-broadcast-tv — gated on Ad Council partnership or equivalent licensed PSA inventory
  • sales-social — gated on per-platform Marketing Partner relationships; expect 2–3 of 5 platforms, not all
  • governance-spend-authority — gated on AAO-side human-in-loop partner and SLA commitment
  • brand-rights — reclassified to Conformant-only unless real-license flow becomes feasible

Honest maximum

This is a real institutional commitment. The pilot exists to test that AAO can staff it before 4.0 locks in the trigger flip. If the answer is no, AAO Verified (Live) stays at the 3.x enrollment-based mechanism — the spec is forward-compatible.

At Phase 4 steady state, AAO is running ~50 canonical flights per day across the ecosystem. The expert-review estimate of the fully-loaded annual cost is in the range of $900K–$1.2M (~3.5–4 FTE + inventory + legal + tooling), not the $400K implied by the per-phase numbers above. Roles undersized in the per-phase view that the operational-plan issue should size honestly: seller-success/onboarding (1 FTE), content/partnerships lead (0.5 FTE), legal review ($40K/yr external), comms / customer-marketing (0.25 FTE). Inventory at steady state likely runs $25–35K/month including CTV CPMs, not $10–15K.

Operational plan (separate issue)

A separate "AAO canonical-campaign runner: operational plan" issue should cover service ownership (in-house vs. contracted), monitoring stack, PSA content partnerships (Ad Council MOU first), 24/7 on-call policies, budget approval, legal review cadence, and seller-success staffing. That issue should close before 4.0 ships; this RFC's spec content can land independently.

How the mark works (4.0 end state)

Issuance

AAO creates the canonical test campaign for each declared specialism via standard AdCP (create_media_buy → sync_creatives → delivery observation). If every step succeeds and reporting looks healthy after 7 days, the AAO Verified (Live) mark issues for that specialism.

The eight checks from the 3.x transitional spec apply per canonical campaign rather than per compliance account; see the per-check mapping table.

Maintenance

Weekly refresh flight per specialism. Reporting heartbeats continuously. AAO re-runs storyboards against the seller's live agent endpoint daily to confirm AAO Verified (Spec) remains current — the canonical-campaign cadence and the storyboard cadence are independent, so wire-format regressions are caught within 24 hours regardless of the canonical-flight schedule.

Degradation and grace

  • Any check fails → mark enters 7-day grace (extended from the v1 draft's 48 hours, in light of seller-side commercial harm risk; transient ad-server issues need real ops response time, including weekends)
  • 7 days continuous failure → revoke for that specialism (other specialisms unaffected)
  • Storyboard regression → enters the same 7-day grace, applies to all of that seller's AAO Verified (Live) specialisms (containment); a seller has 7 days to fix the wire-format regression before AAO Verified (Live) revokes
  • Membership lapses → revoke immediately (existing AAO-membership behavior)

During grace, the public mark renders as "AAO Verified (Live) — Monitoring" rather than disappearing. This avoids flap-induced commercial harm, where a seller loses a deal because of a transient lapse that gets fixed within hours.
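The degradation rules reduce to a small state function per specialism. This is a sketch under assumptions: the names are hypothetical, failure is measured in whole days of continuous failure, and the storyboard-regression case (which applies the same grace across all of a seller's specialisms) is not modeled separately.

```python
from enum import Enum

GRACE_DAYS = 7  # grace period stated in this RFC


class MarkState(Enum):
    VERIFIED = "verified"
    MONITORING = "monitoring"  # public mark still shown, flagged as in grace
    REVOKED = "revoked"


def mark_state(days_failing: int, membership_active: bool = True) -> MarkState:
    """State of one specialism's mark given days of continuous failure."""
    if not membership_active:
        # Membership lapse revokes immediately, with no grace.
        return MarkState.REVOKED
    if days_failing <= 0:
        return MarkState.VERIFIED
    if days_failing < GRACE_DAYS:
        return MarkState.MONITORING
    # 7 days of continuous failure revokes the mark for this specialism.
    return MarkState.REVOKED
```

Recovery fits the same function: a successful flight resets days_failing to 0, which returns the mark to VERIFIED.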

Recovery

Next successful canonical flight (or storyboard re-run for storyboard regressions) → mark reissues for that specialism.

Per-specialism independence

A seller declaring multiple specialisms gets multiple independent AAO Verified (Live) specialisms that degrade independently. If one canonical flow breaks, only that specialism lapses; the others stay green.

Buyer-facing presentation. For UX legibility at scale, the registry / brand.json renders specialisms grouped into four buyer-legible categories: Sales (all sales-*), Creative (creative-*), Signals (signal-*), Governance (governance-*, content-standards, property-lists, collection-lists, audience-sync). The category mark renders as Verified only if all of the seller's claimed specialisms in that category are currently Verified; one lapsing specialism flips the category to Verified — Monitoring with a tooltip listing the specific lapsed specialism. This keeps the buyer-facing signal clean while preserving the per-specialism technical reality underneath.
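The category roll-up can be sketched as follows. The function names and the returned dictionary shape are assumptions for illustration; the category membership is taken from the paragraph above. Specialisms with no buyer-facing category (for example brand-rights or signed-requests) are rejected here rather than guessed at.

```python
GOVERNANCE_EXTRAS = {"content-standards", "property-lists",
                     "collection-lists", "audience-sync"}


def category_of(specialism: str) -> str:
    """Map a specialism id to one of the four buyer-legible categories."""
    if specialism.startswith("sales-"):
        return "Sales"
    if specialism.startswith("creative-"):
        return "Creative"
    if specialism.startswith("signal-"):
        return "Signals"
    if specialism.startswith("governance-") or specialism in GOVERNANCE_EXTRAS:
        return "Governance"
    raise ValueError(f"no buyer-facing category for {specialism!r}")


def roll_up(status: dict[str, bool]) -> dict[str, dict]:
    """status maps each claimed specialism to True if currently Verified."""
    result: dict[str, dict] = {}
    for specialism, verified in status.items():
        entry = result.setdefault(category_of(specialism),
                                  {"state": "verified", "lapsed": []})
        if not verified:
            # One lapsed specialism flips the whole category to monitoring;
            # the lapsed list backs the tooltip.
            entry["state"] = "monitoring"
            entry["lapsed"].append(specialism)
    return result
```

A seller with sales-guaranteed healthy and sales-social lapsed would see its Sales category in the monitoring state, with sales-social listed for the tooltip.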

What storyboards become

Storyboards remain essential and publicly visible — they are the AAO Verified (Spec) mark.

  • AAO Verified (Spec) is storyboard-issued. Published, searchable in the registry, useful for pre-production agents, beta sellers, specialisms without canonical flows, and developers checking their implementation.
  • Pre-req for AAO Verified (Live). A seller cannot be AAO Verified (Live) without being AAO Verified (Spec).
  • Daily re-run against the seller's live agent endpoint is what keeps AAO Verified (Spec) current. AAO's runner authoritatively decides whether storyboards are passing; sellers can dispute a result via the appeal path.
  • Regression test. Wire-format violations that live data might paper over still get caught.
  • Local dev-CI. Storyboards run fast against @adcp/client's test controller during development.

A seller whose storyboards are 100% green still doesn't get AAO Verified (Live) until the canonical campaign runs cleanly for 7 days. A seller whose canonical campaign is healthy but who fails a storyboard enters the same 7-day grace as a canonical-campaign failure.

What the seller provides (4.0)

Same as 3.x enrollment — none of the seller-facing obligations change at the trigger flip:

  1. One compliance account on their platform (sandbox or real).
  2. PSA inventory access (or an equivalent zero-cost path — remnant, house, or a small real budget under $100/week).
  3. The attestation_verifier scope (from #2964, "Capability introspection task + attestation_verifier scope + RBAC error codes", and #2994, "spec(accounts): per-account authorization on sync/list + RBAC error codes", shipping in 3.x) granted to AAO's compliance identity.

No new seller obligation at 4.0. The trigger flips on AAO's side; sellers don't re-enroll or re-plumb.

Why this is better than the 3.x enrollment-based model

  1. AAO-as-operator. AAO is in-market — placing real buys, accepting real reporting, exercising real signal flows. The mark is a byproduct of AAO's real operator activity, not a separate verification pipeline.
  2. Teach-to-test is infeasible at this surface area. Stubbing a 7-day canonical flight across create → traffic → delivery → maturation while handling an ambiguous mix of canonical and probe traffic over weeks is essentially building a working ad server. The canonical-campaign trigger combined with secondary-identity probing closes the most plausible attack surfaces.
  3. Predictable canonical content. AAO sources one canonical PSA per channel family; sellers plug in to a known flow rather than inventing one.
  4. Continuous reality check with ecosystem CI. Every AAO Verified (Live) agent has a known-good counterparty exercising their real code paths weekly. Breakage on any seller surfaces within a week across the ecosystem.
  5. Per-specialism independence with buyer-legible category roll-up. A seller declaring multiple specialisms can have some lapse while others stay green; the registry's category roll-up keeps the buyer signal legible at scale.

Open questions

  1. PSA content sourcing. Ad Council partnership is the highest-confidence path; AAO-branded "Running on AAO" PSAs are a fallback (own production cost ~$200–300K/year for a full multi-channel rotation; Ad Council MOU is ~6 months of BD). First-year content commitment lives in the operational-plan issue.
  2. Sellers that genuinely can't take PSAs. Retail media (auction integrity, merchandising contracts), some social platforms (tenant-isolation security review). Fallback paths: small real budget, slower cadence (monthly), or stay Conformant-only.
  3. Buyer demand commitment. AAO Verified (Live) as a "filter" is aspirational until a tier-1 holdco (WPP / Publicis / Omnicom / IPG) publicly requires it in an agentic-buying RFP. Without that, AAO Verified (Live) is a nice-to-have. This is a gating question for the 4.0 trigger flip and should be tracked separately, not as a spec question.
  4. Privacy / legal posture for audience-sync. Synthetic identifiers owned by AAO with documented opt-in; or AAO staff MAID/HEM list with consent on file. Whichever AAO's legal review supports. Lives in operational-plan issue.
  5. Seller appeal / dispute path. When a canonical flight fails for AAO-side reasons (PSA inventory outage, runner bug, attribution misfire), how does the seller dispute the lapse? AAO publishes a dispute SLA (e.g., 48h to first response, 5 business days to resolution); appeals don't extend grace, they reverse revocations after the fact.
  6. AAO's own runner audit posture. Who audits AAO's canonical-campaign runner? Self-attested today; longer-term, an MRC partnership where MRC accredits the runner's process is worth exploring (AAO operates, MRC accredits the operation — strategic complement, not competitor).
  7. Strategic positioning vs. MRC / TAG / IAB / BPA. AAO-as-operator is genuinely new in the ad-tech attestation landscape. The strategic case is "agent-native" (existing bodies assume human ad ops; AAO is the only body that understands AdCP wire format). MRC partnership is worth exploring before Phase 2.
  8. Multi-specialism flight design. A canonical flight can exercise composed specialisms (e.g., sales-guaranteed + creative-template + governance-aware-seller) under a single flight. AAO SHOULD reuse canonical flights to minimize flight count; canonical-campaign.yaml declares which specialisms a flight covers. Attribution rules above resolve partial failures.

Failure-mode contingencies

Five realistic failure modes named to make the operational-plan issue concrete:

  1. Seller serves PSAs cleanly to AAO, fakes data to probes. Mitigation: third-party measurement cross-check on canonical flights (DV / IAS impressions vs. AAO's reporting). Adds budget line ~$2–3K/month at Phase 3 and beyond. Not in the per-phase numbers; should be added to the operational-plan issue.
  2. PSA content partnership falls through mid-year. Mitigation: pre-produce one fallback PSA per channel family during Phase 2 (~$60K one-time) as insurance.
  3. Integration breaks; seller loses revenue from "formerly Verified" status. Mitigation: 7-day grace (above) plus public "Monitoring" state during grace. May warrant insurance for commercially-harmed wrongly-lapsed sellers; legal review.
  4. Buyers conflate AAO Verified (Live) with brand-safety / viewability. Mitigation: standardized disclaimer on every surface — "AAO Verified (Live) attests to AdCP capability conformance and live ad-server integration. It is not a brand-safety, viewability, or measurement-quality claim."
  5. 24/7 on-call misses an incident; dozens lapse. Mitigation: staged-degradation rule (no mass-lapse exceeding N sellers within any 6-hour window without TPM review); commercial-harm insurance (legal line item).
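The staged-degradation rule in item 5 can be sketched as a sliding-window count over lapse timestamps. This is an illustration only: the function name is hypothetical, each timestamp is assumed to represent one seller's lapse, and N (max_sellers) is left as a parameter because this RFC does not fix it.

```python
from datetime import datetime, timedelta


def needs_tpm_review(lapse_times: list[datetime], max_sellers: int,
                     window: timedelta = timedelta(hours=6)) -> bool:
    """True if more than max_sellers lapses fall inside any `window` span."""
    times = sorted(lapse_times)
    start = 0
    for end, current in enumerate(times):
        # Shrink the window from the left until it spans at most `window`.
        while current - times[start] > window:
            start += 1
        if end - start + 1 > max_sellers:
            return True
    return False
```

A runner applying this rule would hold further lapses for TPM review as soon as the function returns True, rather than revoking them automatically.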

Impact on shipped PRs

Phased migration path

The 3.x ship and the Phase 1 pilot deliberately overlap. The 3.x enrollment-based AAO Verified (Live) ships in late June 2026; the canonical-campaign pilot can run concurrently because the two don't conflict: a seller in the pilot is dual-instrumented during the overlap.

  1. 3.x ships (late June 2026). AAO Verified (Live) available via enrollment-based continuous observation. AAO Verified (Spec) available via storyboard pass. Containment relationship enforced. All machinery (attestation_verifier scope, eight checks, Path A/B, webhook-ownership contract) in place.
  2. Pilot (June–August 2026, runs concurrently with 3.x). AAO stands up the canonical-campaign runner for sales-non-guaranteed only, 5–10 sellers. Pilot is a non-committal trial; AAO Verified (Live) still issues via 3.x enrollment-based observation during this period. Learn from failures.
  3. Phased rollout (Q4 2026 onward). Per-specialism, AAO flips AAO Verified (Live) issuance from 3.x enrollment-based → canonical-campaign-based. Phase 2 / 3 / 4 as above. Both issuance paths coexist as OR during the transition window per specialism — either path grants the mark. AAO publishes the per-specialism cutover date at least 30 days in advance; on the cutover date, enrollment-based issuance retires for that specialism and only canonical-campaign-based issuance grants the mark going forward. Storyboards remain mandatory throughout.
  4. 4.0 ships (target: early 2027 or early 2028, depending on pilot outcomes). Enrollment-based AAO Verified (Live) issuance retires for any specialism where a canonical campaign is operational. Specialisms without canonical flows (signed-requests, brand-rights if it stays preview, transport capabilities) stay AdCP-Conformant-only.
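The per-specialism OR-then-cutover rule from step 3 can be sketched as a single predicate. The function and parameter names are hypothetical, and the sketch ignores the 7-day grace period; it only shows which issuance path counts on a given date.

```python
from datetime import date


def live_mark_granted(today: date, cutover: date, storyboards_pass: bool,
                      enrollment_ok: bool, canonical_ok: bool) -> bool:
    """Whether AAO Verified (Live) is granted for one specialism on `today`."""
    if not storyboards_pass:
        # Storyboards remain mandatory throughout the transition.
        return False
    if today < cutover:
        # Transition window: either issuance path grants the mark.
        return enrollment_ok or canonical_ok
    # On and after the cutover date, only the canonical campaign counts.
    return canonical_ok
```

A seller healthy only on the enrollment path keeps the mark up to the day before cutover and loses it on the cutover date unless the canonical campaign is also healthy.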

The expert-review feedback flagged the late-June 2026 → early 2027 timeline as compressed (Ad Council MOU alone is 6 months; FTE hires 4–5 months). Realistic 4.0 ship target may be early 2028 instead of early 2027, depending on Phase 1 / 2 outcomes. The spec is forward-compatible either way; the trigger flip is per-specialism and per-cutover-date, not a single big-bang event.

3.x machinery is the foundation; 4.0 is the trigger flip per specialism; the spec's normative seller obligations don't change between versions.

Relation to other 4.0 work

The 4.0 Tier-2-equivalent check expansions filed separately are additive to the canonical-campaign machinery:

  • #3017 — creative-approval pipeline liveness (applies to creative-template, creative-generative, social sales specialisms)
  • #3018 — cancellation-propagation timing (applies to all sales specialisms)
  • #3019 — billing reconciliation touch (opt-in, applies to sales specialisms)
  • #3020 — IO / JWS signing workflow liveness (applies to sales-guaranteed)
  • #3009 — multi-subscriber reporting_webhooks[] (relaxes Path B2's dedicated-tenant requirement)

These become additional checks within each canonical campaign once the runner lands. #3020 (JWS signing) composes naturally with sales-guaranteed's canonical flow.


Earlier draft notes (v2 changes)

This v2 incorporates expert-review feedback on the v1 revision:

  • Specialism count corrected. v1 said "19 stable specialisms"; the canonical enum has 20 (added signed-requests to the Security/transport subsection as Conformant-only).
  • Per-check mapping table added. v1 said "the eight checks apply per canonical campaign rather than per compliance account" without showing the mapping; v2 walks each check with 3.x → 4.0 behavior side-by-side.
  • Composition attribution rules promoted from "open question" to main body. v1 left this as a question; v2 specifies most-specific-specialism-first resolution and locates the rules in canonical-campaign.yaml.
  • Canonical-campaign spec format committed. v1 left canonical-campaign.yaml as a question; v2 commits to the path and contents.
  • Runner request-signing integration spelled out. v1 placed signed-requests as Conformant-only without explaining how the runner itself complies with seller signing requirements.
  • AND vs OR resolved. v1's phased migration said both issuance paths "coexist" without saying whether the mark needs both; v2 commits to OR with dated per-specialism cutover ≥ 30 days in advance.
  • Storyboard regression revocation specified. v1 said "automatic" without a cadence or grace; v2 commits to daily storyboard re-runs and same-7-day-grace as canonical-campaign failures.
  • Grace extended from 48h to 7 days. v1's 48h grace was commercially harsh (weekend incidents, ad-ops response times); v2 extends to 7 days with public "Monitoring" state.
  • Per-specialism UX rendering specified. v1 left buyer-facing UX as ambiguous; v2 commits to category-grouped rendering with tooltip drill-down.
  • Liveness check (check 1) reinterpreted for canonical cadence. v1's "80% of rolling window" doesn't translate to weekly flights (~14% liveness); v2 restates as ≥1 healthy flight per scheduled cadence.
  • brand-rights reclassified to Conformant-only. v1 had it in Phase 4 with synthetic license acquisition; v2 acknowledges that flow doesn't exercise the real code path.
  • Phase-transition criteria added per phase. v1 only had Phase-1 exit criteria; v2 has criteria gating Phase 2 → 3 → 4 transitions.
  • Failure-mode contingencies added. Five realistic failure modes named explicitly.
  • Honest-maximum framing moved before the budget numbers. v1 buried "the pilot exists to test that AAO can staff it" after the cost estimates.
  • Tone hedged. "Teach-to-test structurally impossible" → "infeasible at this surface area" (matches merged aao-verified.mdx).
  • Timeline acknowledged as compressed. v1's early-2027 4.0 target was unrealistic per expert review; v2 acknowledges early-2028 as the realistic alternative depending on pilot outcomes.
  • Buyer demand and MRC positioning surfaced as open questions (Q3, Q7), not assumptions.
