diff --git a/.changeset/identitymatch-fcap-architecture-spec.md b/.changeset/identitymatch-fcap-architecture-spec.md new file mode 100644 index 0000000000..fe04a4b888 --- /dev/null +++ b/.changeset/identitymatch-fcap-architecture-spec.md @@ -0,0 +1,36 @@ +--- +"adcontextprotocol": patch +--- + +IdentityMatch & frequency capping architecture, with the wire-spec change and the data-flow boundary contract landing as authoritative protocol docs. Counting and policy live in the buyer's impression tracker; the IdentityMatch service consumes only cap-fire events at the boundary. + +**Wire spec changes** (`identity-match-response.json`): +- Adds `serve_window_sec` (integer, 1–300, default 60) — per-package single-shot fcap window. After serving the user one impression on each eligible package within this window, the publisher MUST re-query Identity Match before serving from those packages again. Not a router response cache TTL. +- Removes `ttl_sec`. Originally documented as a router cache TTL but operationally functioned as a per-package serve throttle. TMP is pre-launch (experimental, pre-3.0.0 GA) and not subject to deprecation cycles, so the field is removed outright. + +**Doc updates:** +- `docs/trusted-match/specification.mdx` — adds `serve_window_sec` field, removes `ttl_sec`, adds normative conformance invariants for IdentityMatch eligibility (audience intersection; cap-state presence check; active state; audience freshness). Updates the caching section for the new contract. +- `docs/trusted-match/identity-match-implementation.mdx` (new page) — frequency-cap data flow (boundary contract): the cap-fire event the impression tracker writes into the IdentityMatch cap-state store, and how the IdentityMatch service consumes it at query time. The protocol does not constrain how the impression tracker counts impressions, evaluates windows, or decides when a cap fires — those concerns live entirely in the buyer's impression-tracking pipeline. 
+- `docs/trusted-match/buyer-guide.mdx` — updates frequency-cap management to reflect the impression-tracker / IdentityMatch split, and the serve-window contract section. +- `docs/trusted-match/migration-from-axe.mdx` — adds OpenRTB 2.6 `User.eids[]` cross-walk for buyers bridging from OpenRTB-shaped pipelines. + +**Three-layer model:** +- Wire spec (normative) — what crosses an agent boundary. +- Conformance invariants (normative) — backend-agnostic eligibility logic, including a presence check against cap-state. +- Boundary contract (normative for the cap-state store API) — what events flow from the impression tracker into the IdentityMatch cap-state store. Storage backend is implementer choice; the reference store ships in `adcp-go/targeting/fcap` (Valkey 9 hashes with HSETEX). + +**Cap-state store surface:** `RecordCap(userIdentity, fields, expireAt)` and `IsCapped(userIdentity, field)`, where `field` is `{seller_agent_url, package_id}`. v1 keys cap-state at `(user_identity, seller_agent_url, package_id)`; broader-dimension caps (advertiser, campaign, creative, line item) are a future extension to the boundary contract. + +**Architecture history** preserved at `specs/identitymatch-fcap-architecture.md` — captures design decisions, deferred security/privacy follow-ups, the rollout plan, and consolidated Slack/PR-review threads. Earlier iterations of the design (counter-based exposure tracking, log-based tracking with `impression_id` dedup, `fcap_keys` label model) were unwound — counting, dedup, and policy evaluation depend on buyer-internal concerns the protocol shouldn't constrain. + +All TMP surfaces remain `x-status: experimental`. Per the experimental-status contract, fields on this surface are not subject to deprecation cycles until 3.0.0 GA. 
+ +**Tracked deferred follow-ups** (not in this PR): +- TMPX harvest → competitor-suppression attack +- Eligibility-as-audience-membership oracle (honeypot package_ids) +- Consent revocation between IdentityMatch and impression +- Side-channel via eligibility deltas +- `hashed_email` in TMPX leak surface +- DoS amplification via large `package_ids[]` +- Cap-state extensions for advertiser/campaign/creative dimensions +- Identity-graph plug-point in the impression tracker diff --git a/CHANGELOG.md b/CHANGELOG.md index 03049fb5b3..dd99f52fec 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,11 @@ # Changelog +## Upcoming + +### Notices — experimental surfaces + +- **TMP `identity-match-response.ttl_sec` is removed; replaced by `serve_window_sec`.** The `ttl_sec` field was documented as a router response cache TTL but operationally functioned as a per-package single-shot fcap, conflating two distinct concerns and silently breaking either when tuned. Replacement field `serve_window_sec` (integer, 1–300, default 60) carries the corrected semantic — *after serving the user one impression on each eligible package within this window, the publisher MUST re-query Identity Match before serving from those packages again.* This is **not** a router response cache. Multi-impression frequency capping is a separate concern handled by the buyer's impression tracker, which writes cap-fire events to the IdentityMatch cap-state store at the boundary regardless of this window. TMP is pre-launch (experimental, pre-3.0.0 GA) and not subject to deprecation cycles, so `ttl_sec` is removed outright rather than going through a deprecation window. Tracked in `specs/identitymatch-fcap-architecture.md` and [Frequency-Cap Data Flow](docs/trusted-match/identity-match-implementation.mdx). 
+ ## 3.0.6 ### Patch Changes diff --git a/docs/trusted-match/buyer-guide.mdx b/docs/trusted-match/buyer-guide.mdx index e0534f059c..1dc0b7fa6b 100644 --- a/docs/trusted-match/buyer-guide.mdx +++ b/docs/trusted-match/buyer-guide.mdx @@ -16,7 +16,7 @@ A buyer agent exposes two HTTP/2 endpoints under a single base URL — `POST /co | Message type | Receives | Returns | |---|---|---| | `context_match_request` | Page/content signals, placement, geo | Offers with creative manifests | -| `identity_match_request` | Seller agent URL, identity tokens, optional package ID list | Eligible package IDs + TTL | +| `identity_match_request` | Seller agent URL, identity tokens, optional package ID list | Eligible package IDs + `serve_window_sec` | Each endpoint handles one message type. Both must respond in under 50ms. The router enforces this budget and will skip slow providers. @@ -121,11 +121,11 @@ The router sends you the seller's `seller_agent_url` and one or more identity to "type": "identity_match_response", "request_id": "id-9c4e", "eligible_package_ids": ["acme-outdoor-q2", "acme-loyalty-retarget"], - "ttl_sec": 60 + "serve_window_sec": 60 } ``` -Return only the package IDs that pass your eligibility checks. Packages not in the list are treated as ineligible. The `ttl_sec` tells the router how long to cache this response — during that window, the router returns cached eligibility without re-querying you. The publisher uses cached eligibility to allocate across whatever placements exist. Set the TTL based on how quickly your eligibility state changes (frequency caps, audience updates, etc.). +Return only the package IDs that pass your eligibility checks. Packages not in the list are treated as ineligible. The `serve_window_sec` is a **per-package single-shot fcap**: after the publisher serves the user one impression on each eligible package within this window, the publisher MUST re-query Identity Match before serving from those packages again. Default 60s, max 300s. 
This is not a router response cache TTL — see [The serve-window contract](#the-serve-window-contract). **What you never receive** in Identity Match: page URLs, content topics, keywords, article text, or any content signal. You cannot determine what the user is looking at. @@ -144,21 +144,25 @@ You have no role in this step. The publisher controls activation. ## Frequency Cap Management -Cross-publisher frequency capping is the primary use case for Identity Match. Your agent maintains frequency state per user token: +Cross-publisher frequency capping is the primary use case for Identity Match. Cap policy and counting live in your **impression tracker**; the Identity Match service consumes only cap-fire signals at query time. The split: -- **Count impressions** by user token + package ID -- **Track recency** — when was the last impression for this token? -- **Apply caps** from the media buy: `max_impressions` per `window`, minimum `recency` between exposures -- **Exclude the package** from `eligible_package_ids` when a cap is hit -- **Set `ttl_sec`** to reflect how long this eligibility is valid — a shorter TTL means the router re-checks sooner, which is useful when a cap is close to being reached +- **Impression tracker** receives pixel fires, decodes the TMPX token, and applies whatever fcap policies you maintain — counting impressions across whatever dimensions you cap on (package, campaign, advertiser, creative, line item) for each resolved user identity, with whatever windowing and dedup logic your policy engine uses. +- **On the impression that exhausts a cap**, the impression tracker writes a cap-fire entry — `(user_identity, package) capped until ` — into the Identity Match cap-state store. +- **Identity Match service** at query time excludes any package with a cap-fire entry against any of the request's identities from `eligible_package_ids`. 
+ +The protocol does not constrain how you count impressions, where policies live, or how you dedup across identities. It only defines the boundary: cap-fire events flow into the cap-state store; the IdentityMatch service checks presence at query time. See [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation) for the boundary contract and the reference cap-state store. + +When an fcap rule changes — a window shortens or lengthens, a `max_count` rises or falls, a policy is paused or removed, a package is reassigned — you MUST re-evaluate the affected `(user_identity, package)` cap-state entries against the new policy and push the appropriate updates: **delete** entries for users no longer over-cap, **extend** (overwrite with a new `expire_at`) entries that are still over-cap but whose window changed. The cap-state store doesn't store counts and can't re-evaluate on its own; the buyer's policy owner is the source of truth. See [Policy updates and cap-state re-evaluation](/docs/trusted-match/identity-match-implementation#policy-updates-and-cap-state-re-evaluation) for the event shapes. Because Identity Match runs across all publishers using TMP, a user who saw your ad on Publisher A will correctly show as over-frequency on Publisher B — even though you can't see which publisher sent the request. ### How Buyers Learn About Exposures -The `tmpx` field on the Identity Match response carries a TMPX token — an HPKE-encrypted blob containing the user's resolved identity tokens. The publisher substitutes `{TMPX}` into creative tracking URLs. When the ad serves, your impression pixel receives the encrypted token. Your cluster master decrypts it, logs the exposure against the user, and replicates updated frequency state to read replicas. This gives you real-time per-user exposure signals without the publisher seeing user identity. 
+The `tmpx` field on the Identity Match response carries a TMPX token — an HPKE-encrypted blob containing the user's resolved identity tokens. The publisher substitutes `{TMPX}` into creative tracking URLs. When the ad serves, your impression pixel receives the encrypted token. Your impression tracker decrypts it, applies your fcap policy logic against the resolved identities, and (when a cap fires) writes a cap-fire entry to the Identity Match cap-state store. Most production deployments separate decode (synchronous, at intake) from policy evaluation and cap-state writes (asynchronous, behind a queue) for buffering. + +This gives you real-time per-user exposure signals without the publisher seeing user identity. -See [TMPX Exposure Tokens](/docs/trusted-match/specification#tmpx-exposure-tokens) for the encryption format and binary token structure. +See [TMPX Exposure Tokens](/docs/trusted-match/specification#tmpx-exposure-tokens) for the encryption format and binary token structure, and [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation) for the cap-state store boundary contract. ## Provider Registration @@ -201,16 +205,18 @@ Common scenarios: - **Internal failure**: Return an error response. The router skips your provider and proceeds with other providers. - **Timeout**: If you can't respond within the latency budget, the router skips you. No error response needed — the router handles this. -## The TTL Caching Contract +## The serve-window contract + +The `serve_window_sec` field on Identity Match responses is a **per-package single-shot fcap** between the buyer and the publisher: + +- For each package in `eligible_package_ids`, the publisher MAY serve the user **at most one impression** on that package within `serve_window_sec` seconds. +- After the publisher has served one impression on each eligible package, the publisher MUST re-query Identity Match before serving any of those packages to the same user again. 
+- Multi-impression frequency capping (5/day, 100/month, etc.) is separate. It lives in your buyer-side state and is updated out-of-band via TMPX impression callbacks regardless of `serve_window_sec`. The serve window is the protocol-level throttle; multi-impression caps are buyer-internal policy. -The `ttl_sec` field on Identity Match responses is a caching contract between the buyer and the router: +The router MAY apply an internal deduplication cache keyed by `{identities_hash, provider_id, package_ids_hash, consent_hash}` (see spec for canonical bytes), but the publisher's binding contract is the serve-window throttle, not the router's cache window. -- The router caches the response for `ttl_sec` seconds, keyed by `{identities_hash, provider_id, package_ids_hash, consent_hash}` (see spec for canonical bytes). `identities_hash` is computed over the per-provider filtered subset you received — your cache partition is scoped to the identity types you resolve. -- During that window, the router returns cached eligibility without re-querying the buyer -- The publisher uses cached eligibility to allocate across whatever placements exist — a single pre-roll, a CTV ad pod, or a web page with multiple ad units -- The buyer doesn't need to know how many placements exist or how the publisher allocates +**Choosing a serve_window_sec value**: Default 60 seconds. Range 1–300. Anything longer than 300 makes per-package fcap too coarse for typical campaigns. Anything shorter than your IdentityMatch round-trip just adds load. 60 is a good default; tune downward if eligibility state shifts faster (close to a cap, audience just changed) or upward (max 300) if your IdentityMatch service is at load and the campaigns are tolerant of coarser fcap. -**Choosing a TTL**: Set the TTL based on how quickly your eligibility state changes. If frequency caps reset hourly, a 300-second TTL is reasonable. 
If a user is close to a cap limit, return a shorter TTL (e.g., 30 seconds) so the router re-checks sooner. ## Performance Requirements @@ -234,7 +240,7 @@ Buyers receive real-time per-user exposure signals via the `{TMPX}` macro. The I | | OpenRTB | TMP | |---|---|---| | **You receive** | Full bid request (user + content + device) | Either content OR identity, never both | -| **You return** | Bid price | Offer (creative manifest) or eligible package IDs + TTL | +| **You return** | Bid price | Offer (creative manifest) or eligible package IDs + serve window | | **Auction** | Exchange runs auction | No auction — publisher joins locally | | **Frequency** | Per-DSP only | Cross-publisher via Identity Match | | **Integration** | Per-exchange SSP adapter | Two endpoints (context + identity), any surface | diff --git a/docs/trusted-match/identity-match-implementation.mdx b/docs/trusted-match/identity-match-implementation.mdx new file mode 100644 index 0000000000..d333d80162 --- /dev/null +++ b/docs/trusted-match/identity-match-implementation.mdx @@ -0,0 +1,116 @@ +--- +title: Identity Match Frequency-Cap Data Flow +sidebarTitle: Frequency-Cap Data Flow +description: "Boundary contract between the impression tracker and the Identity Match service for frequency capping — the data flow only. Internal counting, policy evaluation, and storage layout are buyer-internal concerns." +"og:title": "AdCP TMP Identity Match Frequency-Cap Data Flow" +--- + +# Identity Match Frequency-Cap Data Flow + +This page describes how frequency-cap state reaches the Identity Match service and how Identity Match consumes it at eligibility time. It defines **the data flow only** — what crosses the boundary between the impression tracker and the Identity Match service. 
Internal mechanics (how the impression tracker counts impressions, where policies live, what storage layout the Identity Match service uses, how identities are deduplicated upstream) are buyer-internal concerns and are out of scope here. + +The wire spec lives in the [TMP specification](/docs/trusted-match/specification); the conformance invariants the Identity Match service must satisfy are also normative there. The reference implementation of the Identity Match cap-state store ships in [`adcp-go/targeting/fcap`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting/fcap). + +## Roles + +| Component | Responsibility | +|---|---| +| **Identity Match service** | At query time, returns `eligible_package_ids` — the subset of requested packages the user is not currently capped on (and that pass other eligibility checks). It does not count impressions and does not own fcap policies. | +| **Impression tracker** | Receives pixel fires, decodes TMPX, applies the buyer's fcap policies (counting, windowing, multi-identity dedup, whatever the buyer's policy logic does), and signals "cap fired" to the Identity Match cap-state store on the impression that exhausts a cap. | +| **Identity Match cap-state store** | Records `(user_identity, package) → cap-until` entries with TTL. Queried by the Identity Match service at eligibility time. Written by the impression tracker (or a downstream service in its pipeline). | + +The split is deliberate: counting impressions, evaluating windows, and deciding when a cap fires are buyer-internal policy concerns that vary across buyers and across campaigns. The Identity Match service stays narrow — it answers "is this user currently capped on this package?" and nothing more. New cap dimensions (advertiser, campaign, creative — see [extensions](#future-extensions)) plug into the same boundary contract without changing the service. + +## End-to-end flow + +``` +1. 
Identity Match query + publisher → router → Identity Match service + Identity Match looks up cap state for each (identity, package) pair + returns eligible_package_ids + tmpx (HPKE-encrypted resolved identities) + +2. Ad serves; creative tracking URL fires pixel with {TMPX} + publisher's player/page → impression tracker + +3. Impression tracker decodes TMPX + → resolved identities + signed package context (seller_agent_url, package_id) + +4. Impression tracker applies the buyer's fcap policies + → counts this exposure against whatever dimensions the buyer caps on + (package, campaign, advertiser, creative, line item, …) for each + resolved identity, using whatever policy logic and storage the buyer + runs internally + +5. If this impression exhausts a cap (i.e., it is the last allowed exposure + under one of the buyer's policies), the impression tracker (or a + downstream service in its pipeline) writes a cap-fire entry to the + Identity Match cap-state store: + (user_identity, package) capped until + +6. Subsequent Identity Match queries for that user see the cap-state entry + and exclude the package from eligible_package_ids until the entry expires +``` + +Steps 1, 2, and 6 cross the wire and are normatively defined in the [TMP specification](/docs/trusted-match/specification). Steps 3 and 5 cross the impression-tracker → cap-state-store boundary and are defined on this page. Step 4 is buyer-internal — the protocol does not constrain it. + +## The cap-fire event + +When a buyer's policy evaluation determines that an impression has exhausted a cap, the impression tracker writes a cap-fire entry to the Identity Match cap-state store. Each entry consists of: + +| Field | Description | +|---|---| +| `user_identity` | The resolved identity token (e.g., `rampid:abc`, `id5:def`, `maid:ghi`) the cap fired on. If a single impression resolved to multiple identities and the policy fired on all of them, the impression tracker writes one entry per identity. 
| +| `seller_agent_url` | The seller agent the package belongs to. Disambiguates identical `package_id` strings across sellers. | +| `package_id` | The package the cap fired on. | +| `expire_at` | Wall-clock time at which the cap expires. The cap-state store enforces this as a TTL — entries are absent after `expire_at`. | + +A single cap-fire event typically corresponds to one entry; a cap that fires on multiple resolved identities or multiple packages produces one entry per `(identity, package)` pair, all sharing the same `expire_at` if the buyer's policy is the same. + +The cap-state store does not record per-impression counts, policy definitions, or window configurations. Its only job is to answer "is this `(user_identity, package)` currently capped?" The buyer's policy logic — counting, windowing, choosing dimensions to cap on, deciding when to fire — lives entirely in the impression tracker. + +## The eligibility query + +At query time, the Identity Match service receives a list of identities and a list of candidate packages. For each candidate package, it checks the cap-state store for any matching `(identity, package)` entry across the user's identities. If any entry exists, the package is excluded from `eligible_package_ids`. This is a presence check, not a count. + +Cap state is one input to eligibility. The Identity Match service also evaluates audience membership, package active state, audience freshness, and any other inputs the buyer cares about — see the [conformance invariants](/docs/trusted-match/specification#conformance-invariants-for-identitymatch-eligibility). The cap-state portion of that evaluation is the part this page defines. + +## Policy updates and cap-state re-evaluation + +Cap-state entries are written under whatever fcap policy was in force at cap-fire time. 
When the buyer's fcap policies change — a window shortens or lengthens, a `max_count` rises or falls, a policy is paused or removed, a package is reassigned to a different policy — the existing cap-state entries written under the old policy can become stale. Stale entries either suppress users who should now be eligible (over-suppression) or fail to suppress users who should now be capped (under-suppression). + +When a fcap rule changes, the buyer's policy owner (typically the impression tracker or a service in its pipeline) MUST re-evaluate every cap-state entry the rule applied to and push the appropriate update to the IdentityMatch cap-state store. Two event shapes cover the cases: + +| Event | When to push | Effect on cap-state | +|---|---|---| +| **Delete cap-state** | A user's exposure count under the new policy is below the new `max_count`, or the policy was removed/disabled, or the package was reassigned away from the policy. | Remove the `(user_identity, package)` entry — the user is no longer suppressed on that package. | +| **Extend cap-state** | A user is still over-cap under the new policy, but the new `expire_at` differs from the existing entry — for example, the window was lengthened (push a later `expire_at`) or shortened (push an earlier `expire_at`). | Overwrite the entry with the new `expire_at`. | + +Re-evaluation runs over the buyer's own counting state (where impression history lives), not over the cap-state store — the cap-state store doesn't carry counts. The output is the set of delete-or-extend events to apply. + +The reference store in [`adcp-go/targeting/fcap`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting/fcap) implements extend natively (a second `RecordCap` for the same `(user_identity, field)` overwrites the prior `expire_at` via `HSETEX`). 
Delete is a future extension — today, the simplest workaround is to extend with an `expire_at` already in the past, which causes the entry to be treated as absent at the next query and to be reaped by the backend's TTL machinery. + +Re-evaluation can be expensive when a policy applies to many users. Buyers typically run it asynchronously: enqueue the policy-change event, sweep the affected user population in batches, push delete/extend events incrementally. The protocol does not constrain the cadence — only the eventual consistency requirement that cap-state must converge to what the current policies imply. + +## Reference implementation + +The cap-state store API in [`adcp-go/targeting/fcap`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting/fcap) is the reference shape. It exposes two operations: + +```go +RecordCap(ctx, userIdentity string, fields []Field, expireAt time.Time) error +IsCapped(ctx, userIdentity string, field Field) (bool, error) +``` + +— plus batch variants for both. `Field` is `{SellerAgentURL, PackageID}`. The reference store is backed by Valkey 9 hashes, hashed by user identity, with one hash field per `(seller_agent_url, package_id)` tuple and a TTL set to `expire_at`. Other backends (Aerospike, DynamoDB, in-memory, anything) are conformant if they satisfy the boundary contract above. + +## Future extensions + +Today the cap-state store is keyed at `(user_identity, seller_agent_url, package_id)`. Future protocol versions may extend the field to additional dimensions — advertiser, campaign, creative, line item — so a buyer can express caps that span multiple packages without writing N entries on every cap-fire. The boundary contract on this page is unchanged by such extensions: the impression tracker writes cap-fire entries; the Identity Match service checks presence at query time. 
+ +## See also + +- [TMP Specification](/docs/trusted-match/specification) — wire spec, TMPX format, conformance invariants +- [Buyer Guide](/docs/trusted-match/buyer-guide) — buyer agent integration, Context Match + Identity Match flows +- [Migration from AXE](/docs/trusted-match/migration-from-axe) — for buyers transitioning from AXE-shaped pipelines, including the OpenRTB User.eids cross-walk +- [Privacy architecture](/docs/trusted-match/privacy-architecture) — what each party learns +- [Router architecture](/docs/trusted-match/router-architecture) — provider registration, fan-out, latency +- [`adcp-go/targeting/fcap`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting/fcap) — reference cap-state store in Go diff --git a/docs/trusted-match/migration-from-axe.mdx b/docs/trusted-match/migration-from-axe.mdx index 673cdbd3f6..829ad4b04a 100644 --- a/docs/trusted-match/migration-from-axe.mdx +++ b/docs/trusted-match/migration-from-axe.mdx @@ -85,3 +85,21 @@ New media buys should omit AXE fields entirely. 
The buyer agent's Context Match - **`sync_creatives`** — Same creative sync - **GAM as the ad server** — TMP still sets key-values that GAM evaluates - **Geographic and other targeting overlays** — These are media buy fields, not execution-layer concerns + +## OpenRTB User.eids cross-walk + +For buyers bridging from OpenRTB-shaped pipelines, the TMP Identity Match `identities[]` shape maps to OpenRTB 2.6 `User.eids[]` as follows: + +| AdCP TMP `identities[].uid_type` | OpenRTB 2.6 `User.eids[].source` | Notes | +|---|---|---| +| `rampid` / `rampid_derived` | `liveramp.com` | `atype: 3` (person-based, per [IAB AdCOM Agent Types](https://github.com/InteractiveAdvertisingBureau/AdCOM/blob/main/AdCOM%20v1.0%20FINAL.md#list_agenttypes)) | +| `id5` | `id5-sync.com` | | +| `uid2` | `uidapi.com` | `atype: 3` | +| `euid` | `euid.eu` | | +| `pairid` | `iabtechlab.com/pair` | | +| `maid` | `adid` (Android) / `idfa` (iOS) | Typically carried on `Device.ifa` rather than `User.eids` in OpenRTB | +| `hashed_email` | `liveintent.com` or buyer-specific | `atype: 3` | +| `publisher_first_party` | publisher-defined `source` URL | | +| `other` | buyer-defined `source` URL | | + +The TMP `user_token` field corresponds to `User.eids[].uids[].id`. AdCP carries up to 3 identities per Identity Match request (HPKE size budget — see [TMPX size budget](/docs/trusted-match/specification#size-budget)); OpenRTB has no such limit, so a buyer bridging from OpenRTB into TMP must apply a buyer-configured priority order to truncate (typically: deterministic graphs first — UID2, RampID — then probabilistic or publisher-scoped IDs). 
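The cross-walk and the truncation to 3 identities can be sketched as follows. This is a rough illustration under stated assumptions: the `EID` struct is a simplified stand-in for one `User.eids[]` entry with a single uid, only the unambiguous single-source rows of the table are mapped, unknown sources fall back to `other`, and the `priority` ordering is a hypothetical buyer configuration (the text above only says deterministic graphs such as UID2 and RampID come first).

```go
package main

import (
	"fmt"
	"sort"
)

// Identity mirrors one TMP identities[] entry.
type Identity struct {
	UserToken string
	UIDType   string
}

// EID is a simplified stand-in for an OpenRTB 2.6 User.eids[] entry with one uid.
type EID struct {
	Source string // User.eids[].source
	ID     string // User.eids[].uids[].id
}

// sourceToUIDType covers only the single-source rows of the table above.
var sourceToUIDType = map[string]string{
	"liveramp.com":        "rampid",
	"id5-sync.com":        "id5",
	"uidapi.com":          "uid2",
	"euid.eu":             "euid",
	"iabtechlab.com/pair": "pairid",
}

// priority is a hypothetical buyer-configured order; lower values sort first.
var priority = map[string]int{
	"uid2": 0, "rampid": 1, "euid": 2, "id5": 3, "pairid": 4, "other": 5,
}

// maxIdentities is the per-request identity limit stated in the text above.
const maxIdentities = 3

// FromEIDs maps eids to TMP identities, orders them by priority, and truncates.
func FromEIDs(eids []EID) []Identity {
	out := make([]Identity, 0, len(eids))
	for _, e := range eids {
		uidType, ok := sourceToUIDType[e.Source]
		if !ok {
			uidType = "other" // assumption: unmapped sources are sent as "other"
		}
		out = append(out, Identity{UserToken: e.ID, UIDType: uidType})
	}
	// Stable sort keeps the original order among identities of equal priority.
	sort.SliceStable(out, func(i, j int) bool {
		return priority[out[i].UIDType] < priority[out[j].UIDType]
	})
	if len(out) > maxIdentities {
		out = out[:maxIdentities]
	}
	return out
}

func main() {
	eids := []EID{
		{Source: "id5-sync.com", ID: "i1"},
		{Source: "unknown.example", ID: "x1"},
		{Source: "liveramp.com", ID: "r1"},
		{Source: "uidapi.com", ID: "u1"},
	}
	for _, id := range FromEIDs(eids) {
		fmt.Println(id.UIDType, id.UserToken)
	}
}
```

A real bridge would also need to handle `maid` (read from `Device.ifa`) and `hashed_email`, which do not map to a single `source` value and are left out of this sketch.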
diff --git a/docs/trusted-match/specification.mdx b/docs/trusted-match/specification.mdx index 472e633ab3..138b0c1837 100644 --- a/docs/trusted-match/specification.mdx +++ b/docs/trusted-match/specification.mdx @@ -7,7 +7,7 @@ description: Authoritative message type definitions, field tables, privacy requi # Trusted Match Protocol Specification -**Experimental.** The Trusted Match Protocol is part of AdCP 3.0 as an experimental surface — it may change between 3.x releases with at least 6 weeks' notice. Sellers implementing TMP MUST declare `trusted_match.core` in `experimental_features`. See [experimental status](/docs/reference/experimental-status) for the full contract. +**Experimental.** The Trusted Match Protocol is part of AdCP 3.0 as an experimental surface — it may change between 3.x releases with at least 6 weeks' notice. Sellers implementing TMP MUST declare `trusted_match.core` in `experimental_features`. See [experimental status](/docs/reference/experimental-status) for the full contract. Fields on this surface are not subject to deprecation cycles until 3.0.0 GA. This is the authoritative reference for the Trusted Match Protocol (TMP). For conceptual introductions, see the [overview](/docs/trusted-match/) and [core concepts](/docs/trusted-match/context-and-identity). @@ -24,7 +24,7 @@ Specific areas expected to evolve include TMPX exposure tokens, country-partitio | **Offer** | A buyer's response to a context match request. Ranges from simple activation (package_id only) to rich proposals with brand, price, summary, and creative manifest. | | **Available package** | A package from an active media buy that is eligible for evaluation on a given placement. Package metadata — including the originating seller agent — is synced at media buy time. See [Package Sync](#package-sync). | | **Seller agent** | The buyer-side agent that sold the package into a publisher. 
Identified by the agent URL declared in the publisher's `adagents.json` `authorized_agents[].url`. Every `AvailablePackage` is bound to exactly one seller agent at sync time. | -| **Eligibility** | List of eligible package IDs returned by Identity Match, plus a TTL caching contract. The buyer computes eligibility from frequency caps, audience membership, and other signals; the reasons are opaque to the publisher. | +| **Eligibility** | List of eligible package IDs returned by Identity Match, plus a serve-window throttle. The buyer computes eligibility from frequency caps, audience membership, and other signals; the reasons are opaque to the publisher. | | **Artifact** | A typed content reference associated with a publisher property (article URL, episode EIDR, show Gracenote ID, music ISRC, product GTIN, conversation turn). Each artifact has a `type` and `value`. Referenced in context match requests. | | **Temporal decorrelation** | Random delay and random ordering between Context Match and Identity Match requests, preventing timing- and order-based correlation. | @@ -196,19 +196,34 @@ Each entry in `identities` is an `{user_token, uid_type}` pair: ### IdentityMatchResponse -Returned by the buyer agent. A list of eligible package IDs with a caching TTL. +Returned by the buyer agent. A list of eligible package IDs with a serve-window throttle. | Field | Type | Required | Description | |---|---|---|---| | `type` | string | Yes | `"identity_match_response"`. Message type discriminator for deserialization. | | `request_id` | string | Yes | Echo of the request's `request_id`. | | `eligible_package_ids` | List\ | Yes | Package IDs the user is eligible for. Packages not listed are ineligible. | -| `ttl_sec` | integer | Yes | How long the router should cache this response, in seconds. A value of `0` means do not cache — re-query on every request. | +| `serve_window_sec` | integer | Yes | Per-package single-shot fcap window, in seconds. Range: 1–300. Default: 60. 
After serving the user one impression on each eligible package within this window, the publisher MUST re-query Identity Match before serving from those packages again. This is **not** a router response cache TTL — it is a buyer-asserted serve throttle. Multi-impression frequency caps are handled separately by the buyer's impression tracker, which writes cap-fire events to the IdentityMatch cap-state store at the boundary regardless of this window — see [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation). | | `tmpx` | string | No | HPKE-encrypted exposure token containing resolved user identity tokens. The publisher substitutes this into creative tracking URLs as `{TMPX}`. The buyer's impression pixel receives the token, enabling real-time per-user frequency state updates. Wire format: `kid.base64url_nopad(ciphertext)` (unpadded, no `=` characters). Publishers MUST treat this value as opaque pass-through data. | -The response includes eligible package IDs, a TTL, and an optional `tmpx` field. The TMPX token is an HPKE-encrypted exposure token that flows through creative tracking URLs to the buyer's impression pixel, enabling real-time per-user frequency state updates without exposing user identity to the publisher. The buyer computes eligibility from whatever identity signals they have (frequency caps, audience membership, purchase history) and returns only the packages that pass. The publisher does not need to know why a package was excluded — just which packages are eligible. +The response includes eligible package IDs, a serve-window throttle, and an optional `tmpx` field. The TMPX token is an HPKE-encrypted exposure token that flows through creative tracking URLs to the buyer's impression pixel, enabling real-time per-user frequency state updates without exposing user identity to the publisher. 
The buyer computes eligibility from whatever identity signals they have (frequency caps, audience membership, purchase history) and returns only the packages that pass. The publisher does not need to know why a package was excluded — just which packages are eligible. -The `ttl_sec` field is a caching contract. The buyer is saying: "Cache this for N seconds." The router caches the `eligible_package_ids` list and returns it for subsequent requests during the window — it does not track which packages have been served. The publisher enforces allocation rules (at most one ad per package, competitive separation, pod composition) using the cached eligibility as input. This eliminates the need for pod-specific or batch-specific protocol semantics — the router has cached eligibility and the publisher allocates across whatever placements exist during the TTL window (a CTV ad pod, a web page with 20 slots, a single pre-roll). The buyer doesn't need to know the allocation details. +The `serve_window_sec` field is a **per-package single-shot fcap**, not a router cache TTL. The buyer is saying: "After you serve the user one impression on each eligible package, re-query me before serving from those packages again." The router MAY still cache the response for an internal deduplication/cost-saving window, but the binding contract on the publisher side is "one impression per eligible package per window." Multi-impression frequency caps (5 per day per campaign, 100 per month per advertiser, etc.) live in the buyer's impression tracker and surface to the IdentityMatch service as cap-fire events at the boundary regardless of `serve_window_sec`. + +The publisher enforces allocation rules (competitive separation, pod composition) using the eligibility list as input. 
This eliminates the need for pod-specific or batch-specific protocol semantics — the publisher allocates across whatever placements exist during the serve window (a CTV ad pod, a web page with 20 slots, a single pre-roll), honoring the one-impression-per-package contract. + +#### Conformance invariants for IdentityMatch eligibility + +A conformant IdentityMatch service MUST compute `eligible_package_ids` such that, for each `package_id ∈ request.package_ids`, the package is included in `eligible_package_ids` if and only if **all** of the following hold: + +1. **Audience eligibility.** Either the package has no audience requirement, OR there exists at least one audience identifier `a` such that `a` is in the package's required audience set AND `a` is in the audience-membership of at least one identity `i ∈ request.identities` (the union across the user's resolved identities intersects the package's required audiences). +2. **Frequency cap eligibility.** No `(identity, package)` cap-state entry exists for any identity `i ∈ request.identities` against the package. Cap-state entries are written by the buyer's impression tracker when it determines an impression has exhausted a cap and carry an expiration timestamp; an entry is "present" until that timestamp. The protocol does not constrain how the impression tracker counts impressions, evaluates windows, or decides when a cap fires — only the boundary contract (cap-fire entries flow into the cap-state store; the IdentityMatch service checks presence at query time). See [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation) for the boundary contract. +3. **Active state.** Packages or policies marked inactive MUST be treated as if absent. +4. **Audience freshness.** If the buyer's audience pipeline publishes a freshness deadline and the current time is past it, that audience-membership entry MUST NOT contribute to (1). 
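
The four invariants above can be sketched as a pure function over in-memory state. This is a minimal illustration, not the reference implementation: the `Package`, `Membership`, `State`, and `capKey` types are hypothetical, the cap key omits `seller_agent_url` for brevity, and treating a package the service has no record of as ineligible is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// Package is a hypothetical buyer-side record for a synced package.
type Package struct {
	Active            bool
	RequiredAudiences []string // empty: no audience requirement
}

// Membership is one audience-membership entry for one identity.
type Membership struct {
	Audience   string
	FreshUntil time.Time // zero value: no freshness deadline published
}

// capKey omits seller_agent_url for brevity (assumption of this sketch).
type capKey struct{ Identity, PackageID string }

// State is an in-memory stand-in for whatever backend a buyer uses.
type State struct {
	Packages    map[string]Package
	Memberships map[string][]Membership // identity -> entries
	CapExpiry   map[capKey]time.Time    // cap-fire entries and their expiration
}

// audienceOK covers invariants 1 and 4: some fresh membership of some
// identity must intersect the package's required audiences.
func (s State) audienceOK(p Package, identities []string, now time.Time) bool {
	if len(p.RequiredAudiences) == 0 {
		return true
	}
	required := map[string]bool{}
	for _, a := range p.RequiredAudiences {
		required[a] = true
	}
	for _, id := range identities {
		for _, m := range s.Memberships[id] {
			stale := !m.FreshUntil.IsZero() && now.After(m.FreshUntil)
			if required[m.Audience] && !stale {
				return true
			}
		}
	}
	return false
}

// capped covers invariant 2: a live entry for any identity excludes the package.
func (s State) capped(pkgID string, identities []string, now time.Time) bool {
	for _, id := range identities {
		if exp, ok := s.CapExpiry[capKey{id, pkgID}]; ok && now.Before(exp) {
			return true
		}
	}
	return false
}

// Eligible returns the requested packages that pass every invariant, in request order.
func (s State) Eligible(identities, packageIDs []string, now time.Time) []string {
	eligible := []string{}
	for _, id := range packageIDs {
		p, ok := s.Packages[id]
		if !ok || !p.Active { // invariant 3; unknown packages are skipped (assumption)
			continue
		}
		if s.audienceOK(p, identities, now) && !s.capped(id, identities, now) {
			eligible = append(eligible, id)
		}
	}
	return eligible
}

func demo() []string {
	now := time.Date(2026, 1, 1, 12, 0, 0, 0, time.UTC)
	s := State{
		Packages: map[string]Package{
			"p1": {Active: true},
			"p2": {Active: true, RequiredAudiences: []string{"sports"}},
			"p3": {Active: true},
			"p4": {Active: false},
			"p5": {Active: true, RequiredAudiences: []string{"autos"}},
		},
		Memberships: map[string][]Membership{
			"id-a": {{Audience: "sports"}},
			"id-b": {{Audience: "autos", FreshUntil: now.Add(-time.Hour)}},
		},
		CapExpiry: map[capKey]time.Time{{"id-b", "p3"}: now.Add(10 * time.Minute)},
	}
	return s.Eligible([]string{"id-a", "id-b"}, []string{"p1", "p2", "p3", "p4", "p5"}, now)
}

func main() {
	fmt.Println(demo()) // prints [p1 p2]
}
```

In the demo, `p3` is excluded by a live cap-state entry, `p4` by inactive state, and `p5` by a stale audience membership. Because the function reads only its inputs, two backends holding the same logical state should produce the same list, which is the property the invariants are meant to guarantee.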
+ +The TMPX returned with the response MUST encode the resolved identities so the out-of-band impression tracker can update fcap policy state and signal cap-fire events to the IdentityMatch cap-state store — see § TMPX tokens and [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation). + +Storage backend (Valkey, Aerospike, DynamoDB, in-memory, anything) is an implementation choice. Two services with different storage backends that satisfy these invariants MUST return the same eligibility output for the same inputs. #### Consent @@ -599,9 +614,9 @@ The 8-byte random nonce enables deduplication at the master. The master stores n ### Caching behavior -The TMPX token is generated once per Identity Match evaluation and cached alongside the eligibility response for `ttl_sec` seconds. All impressions within the TTL window share the same TMPX value (same nonce, same tokens). +The TMPX token is generated once per Identity Match evaluation and accompanies the eligibility response for the `serve_window_sec` window. All impressions on eligible packages within that window share the same TMPX value (same nonce, same tokens). -The buyer's master MUST NOT deduplicate by TMPX value or nonce within a TTL window — each pixel fire is one impression. Multiple ads served to the same user in a CTV pod or a web page with multiple ad units all produce distinct pixel fires with the same TMPX token. The nonce deduplication only prevents replay of the same TMPX token *after* the TTL window expires — if the same nonce appears outside its original TTL window, it is a replay and MUST be rejected. +The buyer's master MUST NOT deduplicate by TMPX value or nonce within a serve window — each pixel fire is one impression. Multiple ads served to the same user in a CTV pod or a web page with multiple ad units all produce distinct pixel fires with the same TMPX token. 
The nonce deduplication only prevents replay of the same TMPX token *after* the serve window expires — if the same nonce appears outside its original window, it is a replay and MUST be rejected. ### Publisher obligations @@ -648,9 +663,9 @@ Context Match responses are cacheable because the same packages are evaluated fo - Routers SHOULD cache Context Match responses with a TTL of **5 minutes**. - Providers MAY include a `cache_ttl` field (integer, seconds) in Context Match responses to override the default. Routers MUST respect this value when present. -- Identity Match responses are cached per the `ttl_sec` value in the response. Cache key: `{identities_hash, provider_id, package_ids_hash, consent_hash}`, where `identities_hash` is the SHA-256 of the canonical `identities` bytes defined in [Identity Match signed fields](#identity-match-signed-fields) (computed over the per-provider filtered subset); `package_ids_hash` is SHA-256 over the JCS serialization of the sorted `package_ids` array; `consent_hash` is SHA-256 over the JCS serialization of the request's `consent` object (or JCS `null` when the field is absent — this distinguishes "consent unknown" from an explicit-empty consent object). JCS framing prevents delimiter-injection: raw consent strings or package IDs containing `|`, `,`, or `\n` cannot collide two distinct inputs. Including the identity set ensures that adding or removing tokens produces a distinct cache entry. Including the package list hash ensures cached responses are invalidated when the active package set changes (e.g., a new media buy activates). Including the consent hash prevents eligibility decisions taken under one consent state from being served under another. -- When a provider's targeting configuration changes (new packages, updated targeting rules), the provider SHOULD return `"cache_ttl": 0` until the change has propagated, then resume normal caching. 
-- Both `ttl_sec` and `cache_ttl` have a schema-enforced maximum of 86400 seconds (24 hours). Routers SHOULD clamp buyer-provided values to a configured maximum (recommended: 3600 seconds) to limit the blast radius of stale caches. +- Identity Match responses are bound by `serve_window_sec` (per-package single-shot fcap, max 300s, default 60s). Routers MAY apply an internal deduplication cache keyed on `{identities_hash, provider_id, package_ids_hash, consent_hash}`, where `identities_hash` is the SHA-256 of the canonical `identities` bytes defined in [Identity Match signed fields](#identity-match-signed-fields) (computed over the per-provider filtered subset); `package_ids_hash` is SHA-256 over the JCS serialization of the sorted `package_ids` array; `consent_hash` is SHA-256 over the JCS serialization of the request's `consent` object (or JCS `null` when the field is absent — this distinguishes "consent unknown" from an explicit-empty consent object). JCS framing prevents delimiter-injection: raw consent strings or package IDs containing `|`, `,`, or `\n` cannot collide two distinct inputs. Including the identity set ensures that adding or removing tokens produces a distinct cache entry. Including the package list hash ensures cached responses are invalidated when the active package set changes (e.g., a new media buy activates). Including the consent hash prevents eligibility decisions taken under one consent state from being served under another. The publisher's binding contract is the serve-window throttle, not the router's internal cache window. +- When a provider's targeting configuration changes (new packages, updated targeting rules), the provider SHOULD return `"cache_ttl": 0` (Context Match) or `"serve_window_sec": 1` (Identity Match) until the change has propagated, then resume normal values. +- `cache_ttl` (Context Match) has a schema-enforced maximum of 86400 seconds. 
`serve_window_sec` is bounded at 300 seconds: longer windows make per-package fcap too coarse for typical campaigns, while a window shorter than the IdentityMatch round-trip wastes the throttle. ## Conformance Levels diff --git a/specs/identitymatch-fcap-architecture.md b/specs/identitymatch-fcap-architecture.md new file mode 100644 index 0000000000..b73c84a93b --- /dev/null +++ b/specs/identitymatch-fcap-architecture.md @@ -0,0 +1,143 @@ +# IdentityMatch & Frequency Capping — Architecture Spec + +**Status**: landed (architecture decisions). +**Target release**: 3.0.1 (additive wire change). + +This spec captures the architecture decisions behind the buyer-side IdentityMatch surface in TMP. It is a **design-history document**, not an implementation reference — the authoritative spec lives in: + +- [`docs/trusted-match/specification.mdx`](../docs/trusted-match/specification.mdx) — wire spec (normative): `serve_window_sec` field, conformance invariants for IdentityMatch eligibility, TMPX binary format. +- [`docs/trusted-match/identity-match-implementation.mdx`](../docs/trusted-match/identity-match-implementation.mdx) — frequency-cap data flow (boundary contract): the cap-fire event the impression tracker writes into the IdentityMatch cap-state store, and how the IdentityMatch service consumes it at query time. Internal counting / policy / storage layout are buyer-internal and out of scope. +- [`docs/trusted-match/buyer-guide.mdx`](../docs/trusted-match/buyer-guide.mdx) — buyer-agent integration walkthrough; updated for `serve_window_sec` semantic. +- [`docs/trusted-match/migration-from-axe.mdx`](../docs/trusted-match/migration-from-axe.mdx) — adds OpenRTB 2.6 `User.eids` cross-walk for buyers bridging from OpenRTB-shaped pipelines. + +Read this doc when you want to understand **why** the design landed where it did. Read the docs above when you want to **implement** against it. 
+ +## Problem + +The TMP IdentityMatch wire spec defines what flows on the wire: identity tokens in, eligible package IDs and an HPKE-encrypted exposure token (`tmpx`) out. It did not previously define: + +1. **Where fcap policy and counting live.** Originally implied to be inside the IdentityMatch service. Settled here as buyer-internal in the impression tracker; the IdentityMatch service consumes only cap-fire events at the boundary. +2. **Boundary contract between impression tracker and IdentityMatch service** — what events flow from the impression-tracking pipeline into the IdentityMatch cap-state store. +3. **Audience freshness vs. response throttle** — `ttl_sec` was documented as a router cache TTL but operationally functioned as a per-package serve throttle, conflating two distinct concerns. +4. **Conformance** — how a third party validates that an IdentityMatch implementation is correct. + +Without these decisions, the open-source IdentityMatch reference impl risked shipping with Go-shaped assumptions baked into wire-adjacent surfaces, or with policy logic baked into the service that should sit in the buyer's impression-tracking pipeline. + +## Architectural decisions + +### 1. Three layers, with explicit normative status + +| Layer | Status | What it covers | +|---|---|---| +| **Wire spec** | Normative | HTTP JSON, `serve_window_sec` semantic, TMPX binary format. Anything crossing an agent boundary. | +| **Conformance invariants** | Normative | The eligibility logic an IdentityMatch service MUST compute, expressed in terms of inputs (identities, packages, audiences, cap-state) and outputs (eligible_package_ids). Storage-agnostic. | +| **Boundary contract for cap-fire events** | Normative for the cap-state store API | What events flow from the impression tracker into the IdentityMatch cap-state store, and what state IdentityMatch consumes at query time. The store interface (e.g. `RecordCap` / `IsCapped` in `adcp-go/targeting/fcap`) is the reference shape. 
Storage backend is implementer choice. | + +The protocol describes **what** the service must compute and **what** events flow into it, not how the impression tracker counts impressions or where its policy state lives. + +### 2. Counting and policy live in the impression tracker, not in IdentityMatch + +The IdentityMatch service does not count impressions. It does not own fcap policies. It does not evaluate windows. Those concerns live entirely in the buyer's impression-tracking pipeline, where they vary across buyers and across campaigns. + +The IdentityMatch service maintains a narrow **cap-state store** keyed at `(user_identity, seller_agent_url, package_id)` with a TTL-bound expiration. The impression tracker writes a cap-fire entry on the impression that exhausts a cap; the IdentityMatch service checks presence at query time and excludes the package from `eligible_package_ids` while the entry is live. + +This split keeps the IdentityMatch service narrow and makes new cap dimensions (advertiser, campaign, creative, line item — see [Future extensions](#future-extensions)) extensions of the boundary contract rather than rewrites of the service. Earlier iterations of this design proposed an exposure-log model inside the IdentityMatch service, with cross-identity dedup via `impression_id`, label-model fcap keys, and the IdentityMatch service evaluating windows at read time. That design was unwound — counting, dedup, and policy evaluation all depend on buyer-internal concerns the protocol shouldn't constrain. The reference store in [`adcp-go/targeting/fcap`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting/fcap) implements the simpler boundary contract. + +### 3. Cross-identity dedup is a buyer-internal concern + +A single impression resolved to multiple identity tokens may produce multiple cap-fire entries — one per `(identity, package)` pair the cap fired on — but how the impression tracker decides "this is one impression vs. 
three" is buyer-internal. Buyers running their own identity graph can canonicalize before counting; buyers that don't get whatever counting their impression tracker is configured to do. The protocol does not require an `impression_id` and does not constrain dedup logic. + +### 4. `serve_window_sec` replaces `ttl_sec` + +The original `ttl_sec` field was documented as a router cache TTL but operationally functioned as a per-package single-shot fcap. Two distinct concerns sharing one knob meant tuning for cost (long cache) silently broke fcap, and tuning for fcap (short cache) wasted IdentityMatch round-trips. + +Replacement: `serve_window_sec` (1–300, default 60) with the corrected semantic — *after serving the user one impression on each eligible package within this window, the publisher MUST re-query Identity Match before serving from those packages again.* + +`ttl_sec` is removed. No deprecation window: TMP is pre-launch (experimental, pre-3.0.0 GA) and not subject to deprecation cycles. The field is not present in the 3.0.1 schema. + +### 5. Cap-fire events as the impression-handling primitive + +The impression tracker decodes TMPX, applies the buyer's policy logic, and (when a cap fires) writes a cap-fire entry to the IdentityMatch cap-state store. The cap-state store API ([`adcp-go/targeting/fcap`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting/fcap)) exposes: + +``` +RecordCap(ctx, userIdentity, fields[]Field, expireAt) // write cap-fire +IsCapped(ctx, userIdentity, field Field) (bool) // query cap-state +``` + +— plus batch variants. `Field` is `{SellerAgentURL, PackageID}`. Production deployments separate decode (synchronous, at intake) from policy evaluation and cap-state writes (asynchronous, behind a queue) for buffering — bundling would force synchronous topology and break the pattern. + +### 6. TMP IdentityMatch service is a downstream consumer of cap-state + +The IdentityMatch service reads cap-state on each `/identity` call. 
Writes come from the impression tracker (or a downstream service in its pipeline) on cap-fire. No new wire endpoints for impressions or policies. The IdentityMatch service stays narrow. + +### 7. Policy updates trigger cap-state re-evaluation at the buyer + +Cap-state entries are written under whatever fcap policy was in force at cap-fire time. When policies change (window length, `max_count`, activation, package reassignment), the buyer's policy owner MUST re-evaluate every affected `(user_identity, package)` entry against the new policy and push delete-or-extend events to the cap-state store. The cap-state store carries no counts and can't re-evaluate on its own — the buyer's counting state is the source of truth. The protocol does not constrain re-evaluation cadence; only that cap-state must converge to what the current policies imply. See [docs/trusted-match/identity-match-implementation.mdx § Policy updates and cap-state re-evaluation](../docs/trusted-match/identity-match-implementation.mdx#policy-updates-and-cap-state-re-evaluation) for the event shapes. + +### 8. `sync_audiences` is the audience on-ramp + +The existing wire `sync_audiences` task has `add[]`/`remove[]` deltas of audience-member objects — exactly the CRUD shape the IdentityMatch backend needs for the audience side of eligibility. No schema extension required. + +## Future extensions + +Today the cap-state store is keyed at `(user_identity, seller_agent_url, package_id)`. Future protocol versions may extend the field to additional dimensions — advertiser, campaign, creative, line item — so a buyer can express caps that span multiple packages without writing N entries on every cap-fire. The boundary contract is unchanged by such extensions: the impression tracker writes cap-fire entries; the IdentityMatch service checks presence at query time. + +## Open questions + +1. **Cap-state extensions for advertiser/campaign/creative.** v1 keys at `(user_identity, seller_agent_url, package_id)`. 
Extending to broader cap dimensions without forcing the impression tracker to write N entries on each cap-fire is a follow-up workstream. +2. **Explicit delete primitive on the cap-state store.** The reference impl exposes `RecordCap` (write/extend) and `IsCapped` (presence) but no explicit delete. Re-evaluation today expresses "delete" as "extend with an `expire_at` already in the past." A first-class `DeleteCap` operation is a candidate primitive, especially as policy-change re-evaluation becomes a hot path. +3. **Identity-graph plug-point.** Whether the impression tracker canonicalizes identities before writing cap-state, or writes per-resolved-identity, is buyer-internal. The protocol does not require the IdentityMatch service to know about identity graphs. +4. **Audience strength scores.** Per-segment scores are an open extension on the audience side of eligibility, separate from cap-state. +5. **Production-deployment perf benchmarks.** Cap-state lookups are hash-field presence checks (HEXISTS), but real-world latency depends on backend choice, network co-location, and cluster sharding under load. Tracked as a rollout-plan deliverable. + +## Deferred security & privacy issues (follow-up) + +These came out of pre-merge review. Each warrants a focused follow-up rather than blocking this design landing. + +1. **TMPX harvest → competitor-suppression attack.** TMPX in publisher creative URLs is harvestable. Without per-impression binding (creative_id, slot_id, ts) inside the AEAD AAD, an attacker fires harvested tokens at the buyer's impression endpoint to drive cap-fire signals and starve a target user out of a campaign. Mitigation: bind TMPX to per-impression context, or rate-limit-per-token at the impression handler. +2. **Eligibility-as-audience-membership oracle.** A malicious publisher submits honeypot `package_ids` and observes which return eligible to reconstruct the user's audience profile. 
The "publishers don't see audience records" privacy claim is wire-correct but functionally false. Mitigation: package-ownership check at IdentityMatch ingress, or k-anonymity floor on eligibility responses. +3. **Consent revocation between IdentityMatch and impression.** TMPX has no consent fingerprint; if consent is revoked during the serve window, the impression tracker may still process the exposure. GDPR/TCF problem. +4. **Side-channel via eligibility deltas.** A router observing two responses for the same user 30s apart sees `eligible_package_ids` shrink as caps trip — fingerprinting fcap state per-user. +5. **`hashed_email` in TMPX widens identity-leak surface.** Putting unsalted SHA-256 email inside a creative URL macro re-identifies on token leak. Either prohibit `hashed_email` in TMPX plaintext or require salting. +6. **DoS amplification via large `package_ids[]`.** Per-IdentityMatch cap-state reads scale `O(|identities| × |candidate_packages|)` — at 25k packages from a busy publisher, this is an amplification primitive. Cap candidate_packages at IdentityMatch ingress. +7. **Rollout work plan ownership gaps.** No named owner for the eligibility-evaluator hot path, observability/SLO, key-rotation drill, or load testing. Address before SDK ships. + +## Rollout plan + +### What this PR landed + +- Wire spec change (additive): `serve_window_sec` field on `identity-match-response.json`. `ttl_sec` removed (pre-launch, no deprecation cycle needed). +- Doc updates to `docs/trusted-match/specification.mdx`, `buyer-guide.mdx`, `migration-from-axe.mdx`. +- New page: `docs/trusted-match/identity-match-implementation.mdx` — frequency-cap data flow (boundary contract). +- This architecture-rationale doc. + +### Next workstreams (not in this PR) + +1. **`adcp-go/targeting/fcap` cap-state store** — landed upstream as the reference cap-state store backed by Valkey 9 hashes (`fcap:{hash}` keys, one HSETEX field per `(seller_agent_url, package_id)`). +2. 
**`@adcp/client` (TS) and `adcp` (Python) parity** — same `RecordCap` / `IsCapped` boundary in TS and Python. +3. **`adcp-go/identitymatch` reference TMP server** — open-source read path for `POST /identity` over the cap-state store. +4. **Scope3 hosted IdentityMatch** — public deployment for buyers who don't want to host their own service. +5. **Training agent integration** — hosts both AdCP MCP/A2A and TMP `/identity` surfaces, sharing the cap-state store internally. End-to-end IdentityMatch demo. +6. **Conformance harness** — runner script that seeds cap-state directly, runs `/identity` queries against the TMP server, and asserts eligibility responses. Lives as integration tests inside `adcp-go` and `@adcp/client`. +7. **TMP graduation (target: 3.1.0)** — TMP enters `supported_protocols` (currently in `experimental_features` as `trusted_match.core`). At that point AdCP storyboards can wrap the harness if cross-protocol integration testing becomes useful. + +## Threads consolidated from Slack 2026-04-26 + +- **Thread 1 (exposure struct location):** resolved by the three-layer model. Cross-language interop is at the cap-state store API level (`RecordCap` / `IsCapped`); no proto, no JSON Schema for buyer-internal records. TMPX wire format stays as published in `docs/trusted-match/specification.mdx`. +- **Thread 2 (campaign isn't AdCP):** resolved — cap dimensions live in the impression tracker, not in the wire protocol. v1 cap-state keys at `(user_identity, seller_agent_url, package_id)`. Seller agent + package_id remains the seller-side identifier per `core/seller-agent-ref.json`. +- **Thread 3 (campaign logic in IdentityMatch):** resolved — counting and policy live in the impression tracker; IdentityMatch consumes cap-fire events at the boundary. +- **Thread 4 (campaign sync via Cerberus):** resolved — cap-fire events are written directly to the cap-state store from the impression tracker; no Cerberus. 
+ +## Threads consolidated from Slack 2026-04-30 (impression handling) + +Per discussion with @bhuo (Scope3 impression-tracker owner) and Brian: + +- Production deployments separate decode at intake (synchronous) from policy evaluation and cap-state writes (asynchronous, behind a queue) for buffering. The cap-state store API exposes the write-side primitive (`RecordCap`); the impression tracker decides when to call it. +- "JS for writers, Go for reader" framing was wrong — Brian's "JS" was shorthand for "the language the impression tracker runs in," currently Go at Scope3. Spec/SDK is language-neutral; the cap-state API ships in `adcp-go`, with TS and Python parity tracked as a follow-up. +- Pub/sub buffering, retries, dedup, observability, abuse protection are deployment concerns, not protocol concerns. The cap-state store ships the boundary primitives; topology is the implementer's choice. + +## Threads consolidated from PR #3359 review + +- **@oleksandr's normative/reference layering question:** the original spec called the buyer-side valkey schema "normative" while leaving an open question for a pluggable FrequencyStore interface. Inconsistent. Resolved by the three-layer model — wire spec + conformance invariants are normative; cap-state store interface is the boundary contract; storage backend is implementer choice. +- **Counter-vs-log debate (Brian):** earlier iterations explored a counter-based exposure model and a log-based exposure-log model with `impression_id` dedup, both inside the IdentityMatch service. Both unwound — counting and dedup are buyer-internal concerns the protocol shouldn't constrain. The IdentityMatch service consumes cap-fire events; whatever counting the impression tracker does to decide "this is the cap-firing impression" is up to the buyer. +- **Cap dimensions:** earlier iterations debated how the protocol should express advertiser/campaign/creative caps (label model, hierarchy, etc.). 
Resolved — the protocol does not enumerate cap dimensions at all. The cap-state store v1 keys at `(user_identity, seller_agent_url, package_id)`; broader-dimension caps are a follow-up extension to the boundary contract. diff --git a/static/schemas/source/index.json b/static/schemas/source/index.json index 52f8d23a48..15d6eb5f5d 100644 --- a/static/schemas/source/index.json +++ b/static/schemas/source/index.json @@ -1555,7 +1555,8 @@ "description": "Per-package eligibility — boolean eligible plus optional intent score" } } - } + }, + "implementation-guidance": "Conformance invariants and the boundary contract between the impression tracker and the IdentityMatch cap-state store are documented in specs/identitymatch-fcap-architecture.md and docs/trusted-match/identity-match-implementation.mdx. Storage backend is an implementation choice; conformant services may use any store that satisfies the invariants." }, "brand-protocol": { "description": "Brand protocol for identity retrieval, rights discovery, acquisition, and lifecycle management", diff --git a/static/schemas/source/tmp/identity-match-response.json b/static/schemas/source/tmp/identity-match-response.json index afac4a7128..871f0519e4 100644 --- a/static/schemas/source/tmp/identity-match-response.json +++ b/static/schemas/source/tmp/identity-match-response.json @@ -2,7 +2,7 @@ "$schema": "http://json-schema.org/draft-07/schema#", "$id": "/schemas/tmp/identity-match-response.json", "title": "Identity Match Response", - "description": "Response indicating which packages the user is eligible for. The ttl_sec field defines a caching contract: the router caches this response and returns cached eligibility without re-querying the buyer during the TTL window. Extension fields (ext, context) are intentionally omitted to prevent data leakage across the identity privacy boundary.", + "description": "Response indicating which packages the user is eligible for. 
The serve_window_sec field defines a per-package single-shot fcap: after serving the user one impression on each eligible package, the publisher MUST re-query Identity Match before serving from those packages again. Extension fields (ext, context) are intentionally omitted to prevent data leakage across the identity privacy boundary.", "type": "object", "allOf": [ { @@ -27,11 +27,11 @@ "type": "string" } }, - "ttl_sec": { + "serve_window_sec": { "type": "integer", - "description": "How long the router should cache this response, in seconds. The router returns cached eligibility without re-querying the buyer during this window. A value of 0 means do not cache.", - "minimum": 0, - "maximum": 86400 + "description": "Per-package single-shot fcap window, in seconds. After serving the user one impression on each eligible package within this window, the publisher MUST re-query Identity Match before serving from those packages again. This is NOT a router response cache TTL — it is a buyer-asserted serve throttle. Multi-impression frequency caps are handled separately by the buyer's impression tracker, which writes cap-fire events to the IdentityMatch cap-state store at the boundary regardless of this window. Maximum 300 — longer windows reduce IdentityMatch load but coarsen fcap granularity below what most campaigns require.", + "minimum": 1, + "maximum": 300 }, "tmpx": { "type": "string", @@ -42,7 +42,7 @@ "type", "request_id", "eligible_package_ids", - "ttl_sec" + "serve_window_sec" ], "additionalProperties": true }
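
The `serve_window_sec` contract described in the schema above can be sketched from the publisher side. This is a minimal illustration under stated assumptions: `ServeWindow`, `NewServeWindow`, `TryServe`, and `validServeWindow` are hypothetical names, not part of any AdCP SDK, and treating a lapsed window as requiring a fresh Identity Match response is a conservative reading of the contract rather than something the schema states.

```go
package main

import (
	"fmt"
	"time"
)

// validServeWindow mirrors the schema bounds (minimum 1, maximum 300).
func validServeWindow(sec int) bool { return sec >= 1 && sec <= 300 }

// ServeWindow is a hypothetical publisher-side record of one Identity Match response.
type ServeWindow struct {
	expiresAt time.Time
	unserved  map[string]bool // eligible packages whose single shot is unused
}

// NewServeWindow starts a window at receivedAt for the given eligible packages.
func NewServeWindow(eligible []string, serveWindowSec int, receivedAt time.Time) (*ServeWindow, error) {
	if !validServeWindow(serveWindowSec) {
		return nil, fmt.Errorf("serve_window_sec %d outside 1..300", serveWindowSec)
	}
	w := &ServeWindow{
		expiresAt: receivedAt.Add(time.Duration(serveWindowSec) * time.Second),
		unserved:  make(map[string]bool, len(eligible)),
	}
	for _, id := range eligible {
		w.unserved[id] = true
	}
	return w, nil
}

// TryServe reports whether packageID may be served at now and, if so, consumes
// its single shot. A false result means: re-query Identity Match first.
func (w *ServeWindow) TryServe(packageID string, now time.Time) bool {
	if !now.Before(w.expiresAt) || !w.unserved[packageID] {
		return false
	}
	delete(w.unserved, packageID)
	return true
}

func demoServe() []bool {
	t0 := time.Date(2026, 1, 1, 12, 0, 0, 0, time.UTC)
	w, err := NewServeWindow([]string{"p1", "p2", "p4"}, 60, t0)
	if err != nil {
		panic(err)
	}
	return []bool{
		w.TryServe("p1", t0.Add(1*time.Second)),  // first impression on p1
		w.TryServe("p1", t0.Add(2*time.Second)),  // p1 already served in this window
		w.TryServe("p2", t0.Add(3*time.Second)),  // first impression on p2
		w.TryServe("p3", t0.Add(4*time.Second)),  // p3 was never eligible
		w.TryServe("p4", t0.Add(61*time.Second)), // window has lapsed
	}
}

func main() {
	fmt.Println(demoServe()) // prints [true false true false false]
}
```

Keeping the served set per response, rather than per cache entry, is what separates this throttle from the removed `ttl_sec` cache semantics: a router may still deduplicate identical requests, but each eligible package gets at most one impression before the buyer is asked again.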