Skip to content

fix(decisioning): Tier 2 codes → spec-conformant PERMISSION_DENIED (#375)#393

Merged
bokelley merged 2 commits into
mainfrom
bokelley/issue-375-tier2-spec-conformance
May 3, 2026
Merged

fix(decisioning): Tier 2 codes → spec-conformant PERMISSION_DENIED (#375)#393
bokelley merged 2 commits into
mainfrom
bokelley/issue-375-tier2-spec-conformance

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

@bokelley bokelley commented May 3, 2026

Summary

Resolves #375 (the rename portion). Tier 2 commercial-identity gate now raises spec-conformant PERMISSION_DENIED (and BILLING_NOT_PERMITTED_FOR_AGENT for the billing-capability path) instead of four SDK-invented error codes that violated the AdCP spec's cross-tenant onboarding-oracle clamp.

The parity-contract refactor (latency / headers / side-effects parity between the four denial paths) is deferred to follow-up issue #392 per the issue's stop-rule — that's a larger dispatch-path refactor than fits here. This PR closes the wire-code mismatch only.

Code changes

  • _resolve_buyer_agent raises PERMISSION_DENIED on all four denial paths:
    • Recognized + suspendeddetails = {scope: "agent", status: "suspended", agent_url: ...}
    • Recognized + blockeddetails = {scope: "agent", status: "blocked", agent_url: ...}
    • Unrecognized (registry miss / no credential / unknown status) → details omitted per the spec's omit-on-unestablished-identity rule.
  • All four paths share a single _denied_message constant so error.message is not itself a side channel.
  • All four paths use recovery="correctable" per the spec's enumMetadata for PERMISSION_DENIED. The details.scope == "agent" discriminator (when present) is the signal callers surface to a human operator rather than auto-retry — wire-level recovery semantics stay aligned with the spec.
  • validate_billing_for_agent raises BILLING_NOT_PERMITTED_FOR_AGENT with details.rejected_billing (required) + optional details.suggested_billing (the alphabetically-first permitted mode), recovery="correctable" per spec. The full permitted_billing subset is not leaked — surfacing it on every rejection would let a misconfigured buyer probe and exfiltrate the billing matrix.

Verification

Both reviews (code-reviewer and protocol-expert) addressed:

  • code-reviewer raised no blockers.
  • protocol-expert flagged that recovery="terminal" on the denial paths conflicted with the spec's enumMetadata defaults for PERMISSION_DENIED (line 305-308 of error-code.json) and BILLING_NOT_PERMITTED_FOR_AGENT (line 349-352). Fixed in this PR — recovery is now correctable on all five denial paths. The human-vs-auto-retry semantic moved into details.scope, where it belongs.

BILLING_NOT_PERMITTED_FOR_AGENT is in the standard error-code enum at line 63 of error-code.json with full enumDescriptions and enumMetadata. The bundled details-payload schema is at error-details/billing-not-permitted-for-agent.json. No X_ vendor prefix is required and no upstream spec PR is needed.

Tests

  • tests/test_tier2_spec_conformance.py (12 tests):
    • All four removed codes are no longer raised.
    • Recognized-but-denied paths carry scope + status in details.
    • Unrecognized paths omit details.
    • Billing path carries rejected_billing + optional suggested_billing; never leaks permitted_billing or agent_url.
  • Updated tests/test_decisioning_buyer_agent_dispatch.py and tests/test_buyer_agent_registry.py to the spec-conformant wire shape and recovery="correctable".
  • Full suite: 3154 passed, 0 failed locally.

Backwards compatibility

This is a behavior change for adopters who match on the old codes on the wire. The codes were just introduced in PRs #364 / #372 (recent — small blast radius). Adopters need to migrate to:

# Before
if e.code == "AGENT_SUSPENDED":
    ...
elif e.code == "AGENT_BLOCKED":
    ...
elif e.code == "REQUEST_AUTH_UNRECOGNIZED_AGENT":
    ...

# After
if e.code == "PERMISSION_DENIED":
    scope = e.details.get("scope")
    status = e.details.get("status")
    if scope == "agent" and status == "suspended":
        ...
    elif scope == "agent" and status == "blocked":
        ...
    else:  # scope absent → unrecognized agent (or unknown status)
        ...

# Billing
if e.code == "BILLING_NOT_PERMITTED_FOR_AGENT":
    rejected = e.details["rejected_billing"]  # always present
    suggested = e.details.get("suggested_billing")  # optional

Test plan

  • ruff check src/ tests/ — clean (apart from 4 pre-existing E501 warnings unrelated to this PR).
  • mypy src/adcpSuccess: no issues found in 744 source files.
  • pytest tests/test_tier2_spec_conformance.py tests/test_decisioning_buyer_agent_dispatch.py tests/test_buyer_agent_registry.py — 63 passed.
  • pytest tests/ — 3154 passed, 0 failed.
  • All pre-commit hooks pass (black, ruff, mypy, bandit, file-checks).

Refs

🤖 Generated with Claude Code

…ENIED + parity (#375)

The Tier 2 commercial-identity gate previously raised four error codes
absent from the AdCP spec's `error-code.json` 51-entry vocabulary:
`AGENT_SUSPENDED`, `AGENT_BLOCKED`, `REQUEST_AUTH_UNRECOGNIZED_AGENT`,
`INVALID_BILLING_MODEL`. The cross-tenant onboarding-oracle clamp in
the spec requires the unrecognized-agent path and the recognized-but-
denied path to be observably indistinguishable to an external attacker
— distinct codes per status leak which `agent_url`s are onboarded with
which sellers, enabling enumeration of commercial relationships.

Code changes:

* `_resolve_buyer_agent` raises `PERMISSION_DENIED` on all four
  denial paths. Recognized-but-denied paths (suspended / blocked)
  carry `details.scope="agent"` + `details.status`; unrecognized
  paths (registry miss, no credential, unknown status) OMIT
  `details` per the spec's omit-on-unestablished-identity rule.
* All four paths share a single `_denied_message` constant so
  `error.message` is not a side channel.
* `validate_billing_for_agent` raises
  `BILLING_NOT_PERMITTED_FOR_AGENT` with `details.rejected_billing`
  (required) and an optional `details.suggested_billing` (the
  alphabetically-first permitted mode). The full `permitted_billing`
  subset is no longer leaked — surfacing it on every rejection let a
  misconfigured buyer probe and exfiltrate the billing matrix.

Recovery semantic: `PERMISSION_DENIED` is treated as `terminal` for
the commercial-identity gate (resolution path is operator-onboarding,
not request-side correction). This overrides the spec's `enumMetadata`
default of `correctable` for the code; the override is documented in
`_resolve_buyer_agent`'s docstring.

Spec status of `BILLING_NOT_PERMITTED_FOR_AGENT`: this code is not in
the 51-entry standard enum and lacks the `X_` vendor prefix required
by `vendor-error-codes.json`. The user-facing issue (#375) specifies
this code; raising the spec conflict in the PR body for sign-off.

Tests:

* `tests/test_tier2_spec_conformance.py` — pins the wire shape:
  - All four removed codes are no longer raised.
  - Recognized-but-denied paths carry `scope` + `status` in details.
  - Unrecognized paths omit `details`.
  - Billing path carries `rejected_billing` + optional
    `suggested_billing`; never leaks `permitted_billing` or
    `agent_url`.
* Existing `tests/test_decisioning_buyer_agent_dispatch.py` and
  `tests/test_buyer_agent_registry.py` updated to the new wire shape.

Deferred to follow-up:

The latency / headers / side-effects / observability parity contract
between the four denial paths is a larger dispatch-path refactor
(single emit point, deliberate latency padding, identical audit /
metric side-effects) that does not fit in this PR. Tracking issue to
follow. The eager-raise pattern in `_resolve_buyer_agent` still
completes the unrecognized path on a different code path than the
recognized one — this PR closes the wire-code mismatch only.

Backwards compatibility: this is a behavior change for adopters who
match on the old codes. The codes were just introduced in PRs
#364 / #372, so blast radius is small. Adopters need to migrate to
matching on `code == "PERMISSION_DENIED"` + reading `details.scope`
and `details.status` to discriminate recognized-but-denied paths.

Closes #375 (rename portion).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley changed the title fix(decisioning): rename Tier 2 codes to spec-conformant PERMISSION_DENIED (#375) fix(decisioning): Tier 2 codes → spec-conformant PERMISSION_DENIED (#375) May 3, 2026
…fix-pack)

Flip wire-level recovery from terminal to correctable on the four
PERMISSION_DENIED denial paths in _resolve_buyer_agent (handler.py)
and on BILLING_NOT_PERMITTED_FOR_AGENT in validate_billing_for_agent
(registry.py) to match the spec's enumMetadata defaults.

The details.scope == "agent" discriminator (when present) is the
signal callers surface to a human operator rather than auto-retry —
that semantic stays in details, not in the recovery hint.

Also annotates docs/proposals/v3-identity-bundle-design.md with a
status note pointing readers to the test suite for current behavior;
the proposal still uses the pre-rename code names but is preserved
as a historical doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit 870e98a into main May 3, 2026
12 checks passed
bokelley added a commit that referenced this pull request May 3, 2026
Three composable wrappers around the BuyerAgentRegistry Protocol that
the v3 seller's Tier 2 commercial-identity gate hits on every dispatch.
Without them, every resolve runs a SQL query — including negative paths
an enumeration probe will spam, making the lookup endpoint a
credential-stuffing oracle.

* CachingBuyerAgentRegistry — TTL + LRU cache, default 60s / 4096
  entries. Caches BOTH positive and negative resolutions; the negative
  cache closes the enumeration probe path so a probe walking arbitrary
  agent_url strings hits the DB at most once per (tenant, key) per TTL
  window. Hit-callback hook for Prometheus / OpenTelemetry counters.
* RateLimitedBuyerAgentRegistry — per-(tenant, lookup-key) token
  bucket, default 100 RPS. On exhaustion raises PERMISSION_DENIED with
  NO details and a generic message — wire-uniform with the registry-
  miss path from PR #393. A distinct RATE_LIMITED code or populated
  details would itself be an enumeration oracle.
* AuditingBuyerAgentRegistry — terminal wrapper emitting one
  AuditEvent per DB outcome (resolved / miss). The cache and rate-
  limit layers also accept the same audit_sink kwarg so cached_hit /
  cached_miss / rate_limited outcomes land in the same trail.

The wrappers stack outside-in. Adopters compose Caching(RateLimited(
Auditing(SQL-backed))) — the cache shortcuts repeated lookups before
the rate limiter or DB sees them; the rate limiter stops probe traffic
before the DB; the audit layer captures every actual lookup.

Wires the v3 reference seller's TenantScopedBuyerAgentRegistry into
the production stack with the same audit sink at every layer so SecOps
can reconstruct every resolve attempt. The make_registry factory
accepts ttl_seconds / rps_per_tenant / max_entries overrides for
adopters with different SLA / volume requirements.

Tests: tests/test_buyer_agent_registry_cache.py (23 tests covering
cache hit / miss / TTL expiry / LRU eviction / rate-limit threshold
+ refill / audit emission per outcome / sink-failure isolation /
end-to-end composition).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request May 3, 2026
…) (#407)

* feat(buyer-agent-registry): caching + rate-limit + audit emission (#380)

Three composable wrappers around the BuyerAgentRegistry Protocol that
the v3 seller's Tier 2 commercial-identity gate hits on every dispatch.
Without them, every resolve runs a SQL query — including negative paths
an enumeration probe will spam, making the lookup endpoint a
credential-stuffing oracle.

* CachingBuyerAgentRegistry — TTL + LRU cache, default 60s / 4096
  entries. Caches BOTH positive and negative resolutions; the negative
  cache closes the enumeration probe path so a probe walking arbitrary
  agent_url strings hits the DB at most once per (tenant, key) per TTL
  window. Hit-callback hook for Prometheus / OpenTelemetry counters.
* RateLimitedBuyerAgentRegistry — per-(tenant, lookup-key) token
  bucket, default 100 RPS. On exhaustion raises PERMISSION_DENIED with
  NO details and a generic message — wire-uniform with the registry-
  miss path from PR #393. A distinct RATE_LIMITED code or populated
  details would itself be an enumeration oracle.
* AuditingBuyerAgentRegistry — terminal wrapper emitting one
  AuditEvent per DB outcome (resolved / miss). The cache and rate-
  limit layers also accept the same audit_sink kwarg so cached_hit /
  cached_miss / rate_limited outcomes land in the same trail.

The wrappers stack outside-in. Adopters compose Caching(RateLimited(
Auditing(SQL-backed))) — the cache shortcuts repeated lookups before
the rate limiter or DB sees them; the rate limiter stops probe traffic
before the DB; the audit layer captures every actual lookup.

Wires the v3 reference seller's TenantScopedBuyerAgentRegistry into
the production stack with the same audit sink at every layer so SecOps
can reconstruct every resolve attempt. The make_registry factory
accepts ttl_seconds / rps_per_tenant / max_entries overrides for
adopters with different SLA / volume requirements.

Tests: tests/test_buyer_agent_registry_cache.py (23 tests covering
cache hit / miss / TTL expiry / LRU eviction / rate-limit threshold
+ refill / audit emission per outcome / sink-failure isolation /
end-to-end composition).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(buyer-agent-registry): hold lock in invalidate/clear (PR #407 fix-pack)

Code reviewer flagged that CachingBuyerAgentRegistry.invalidate() and
clear() mutated self._cache without holding self._lock. _store()'s
move_to_end / popitem(last=False) eviction races with concurrent
admin invalidate calls, risking RuntimeError or LRU-order corruption.

Convert both to async + acquire the lock before mutating. Test
updated to await the new coroutine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: retrigger

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request May 3, 2026
…enum (#429)

* feat(testing): spec-conformance test for AdcpError codes against canonical enum

Issue #375 / PR #393 demonstrated that four error codes
(AGENT_SUSPENDED, AGENT_BLOCKED, REQUEST_AUTH_UNRECOGNIZED_AGENT,
INVALID_BILLING_MODEL) shipped for months as non-spec-conformant
before being caught and migrated to PERMISSION_DENIED /
BILLING_NOT_PERMITTED_FOR_AGENT. Add an automated CI guard so the
next drift is caught at commit time.

The test AST-walks every .py file under src/adcp/ for AdcpError(...)
calls, extracts the literal first positional (or `code=` kwarg)
argument, and asserts each code is either:

* in the canonical enum bundled at
  src/adcp/types/generated_poc/enums/error_code.py (generated from
  schemas/cache/enums/error-code.json);
* prefixed with X_ per the AdCP vendor-extension convention; or
* in a small documented allowlist (KNOWN_NON_SPEC_CODES) covering
  INTERNAL_ERROR (universal exception wrap), AUTH_INVALID
  (pre-canonical 3.1 split of AUTH_REQUIRED), and
  BILLING_NOT_PERMITTED_FOR_AGENT (pinned by tier2_spec_conformance).

Walks 22 AdcpError raise sites today; all literal, all conformant.
The companion test_allowlist_entries_are_actually_used keeps the
allowlist from accumulating dead entries that would silently mask
future drift.

Runs as part of the existing pytest matrix — no new CI job needed.

The local pre-commit mypy hook reports 96 unrelated errors against
the a2a-sdk / protobuf surface (env drift in the uv-managed
pre-commit venv); they are present on clean main and not caused by
this commit. Skipped with SKIP=mypy; CI mypy via the locked
project venv is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(testing): refresh stale error-code.json cache + drop unneeded allowlist (PR #429 fix-pack)

The bundled schemas/cache/enums/error-code.json was stale at 45 entries
while the canonical upstream enum has 60. The generated ErrorCode enum
was regenerated from that stale cache, so the allowlist entry for
BILLING_NOT_PERMITTED_FOR_AGENT was masking a resync gap rather than a
real spec discrepancy.

Changes:
- schemas/cache/enums/error-code.json: refreshed from upstream (60 entries)
- src/adcp/types/generated_poc/enums/error_code.py: regenerated
- tests/test_error_code_conformance.py:
  - Drop BILLING_NOT_PERMITTED_FOR_AGENT from KNOWN_NON_SPEC_CODES
    (now in canonical enum)
  - Update sanity-check assertions: AGENT_SUSPENDED is now in spec, so
    swap the spot-check for REQUEST_AUTH_UNRECOGNIZED_AGENT (still non-spec)
  - Add length-guard: assert CANONICAL_CODES has 60 entries so a silent
    schema cache change surfaces here
  - Add TODO refs to remaining INTERNAL_ERROR + AUTH_INVALID allowlist entries

Pre-commit mypy hook is bypassed because the 96 errors it reports
(in src/adcp/{webhooks,server,protocols,client}.py) are preexisting
environmental issues with a2a-sdk version mismatch on this branch —
verified to predate this commit and unrelated to the changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tier 2 commercial-identity error codes don't conform to AdCP spec — cross-tenant oracle risk

1 participant