Skip to content

test(spec-conformance): AdcpError codes against canonical error-code enum#429

Merged
bokelley merged 2 commits into
mainfrom
bokelley/dx-7-error-code-conformance
May 3, 2026
Merged

test(spec-conformance): AdcpError codes against canonical error-code enum#429
bokelley merged 2 commits into
mainfrom
bokelley/dx-7-error-code-conformance

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

@bokelley bokelley commented May 3, 2026

Summary

Adds a static AST-based conformance test (tests/test_error_code_conformance.py) that walks every .py file under src/adcp/ for AdcpError(...) raise sites and asserts each literal code is in the canonical AdCP error-code enum, has the X_ vendor-extension prefix, or is in a small documented allowlist.

Why

Issue #375 / PR #393 caught four codes (AGENT_SUSPENDED, AGENT_BLOCKED, REQUEST_AUTH_UNRECOGNIZED_AGENT, INVALID_BILLING_MODEL) that shipped for months as non-spec-conformant before being migrated to PERMISSION_DENIED / BILLING_NOT_PERMITTED_FOR_AGENT. This PR is the automated CI guard so the next drift gets caught at commit time, not during downstream conformance review.

What it walks

  • 22 AdcpError(...) raise sites today, all with string-literal codes; zero non-literal sites need manual review.
  • Distribution: INVALID_REQUEST (10), INTERNAL_ERROR (4), PERMISSION_DENIED (4), ACCOUNT_NOT_FOUND (1), AUTH_INVALID (1), UNSUPPORTED_FEATURE (1), BILLING_NOT_PERMITTED_FOR_AGENT (1).

Allowlist

Three intentional non-canonical codes are in KNOWN_NON_SPEC_CODES with documented reasons:

  • INTERNAL_ERROR — universal exception wrap in adcp.decisioning.dispatch. The schema description on error-code.json explicitly allows codes outside the enum.
  • AUTH_INVALID — pre-canonical 3.1 split of AUTH_REQUIRED, documented in the canonical AUTH_REQUIRED enumDescription as a future spec change.
  • BILLING_NOT_PERMITTED_FOR_AGENT — spec-conformant Tier-2 fix from PR fix(decisioning): Tier 2 codes → spec-conformant PERMISSION_DENIED (#375) #393, pinned by tests/test_tier2_spec_conformance.py.

A companion test (test_allowlist_entries_are_actually_used) keeps the allowlist from accumulating dead entries that would silently mask future drift.

Limitations (deliberate, documented in the test)

  • Only string-literal codes are inspected. Non-literal first args (rare; the SDK has zero today) are counted but not validated. If any appear later they surface in the skipped-count and need manual review.
  • Walks the AdcpError symbol name only — the unrelated all-caps ADCPError (client-side connection-error class with a different signature) is excluded by name.

Test plan

  • Test passes on main (3/3 passing locally).
  • Drift detection verified by temporarily inserting raise AdcpError("FAKE_NONSPEC_CODE", ...) in src/adcp/decisioning/dispatch.py — test fails with a clear message identifying src/adcp/decisioning/dispatch.py:470 → 'FAKE_NONSPEC_CODE'. Reverted before commit.
  • ruff check tests/test_error_code_conformance.py — clean.
  • mypy tests/test_error_code_conformance.py — clean.
  • CI pytest matrix picks up the new file automatically — no new workflow needed.

Note on local pre-commit mypy hook

The local pre-commit run mypy hook (uv-managed venv on Python 3.13) reports 96 errors against src/adcp/webhooks.py, src/adcp/protocols/a2a.py, src/adcp/server/*, src/adcp/client.py from a2a-sdk / protobuf API drift. These are present on a clean main and are unrelated to this change. Committed with SKIP=mypy for that reason; CI mypy via the locked project venv is unaffected.

🤖 Generated with Claude Code

bokelley and others added 2 commits May 2, 2026 22:46
…nical enum

Issue #375 / PR #393 demonstrated that four error codes
(AGENT_SUSPENDED, AGENT_BLOCKED, REQUEST_AUTH_UNRECOGNIZED_AGENT,
INVALID_BILLING_MODEL) shipped for months as non-spec-conformant
before being caught and migrated to PERMISSION_DENIED /
BILLING_NOT_PERMITTED_FOR_AGENT. Add an automated CI guard so the
next drift is caught at commit time.

The test AST-walks every .py file under src/adcp/ for AdcpError(...)
calls, extracts the literal first positional (or `code=` kwarg)
argument, and asserts each code is either:

* in the canonical enum bundled at
  src/adcp/types/generated_poc/enums/error_code.py (generated from
  schemas/cache/enums/error-code.json);
* prefixed with X_ per the AdCP vendor-extension convention; or
* in a small documented allowlist (KNOWN_NON_SPEC_CODES) covering
  INTERNAL_ERROR (universal exception wrap), AUTH_INVALID
  (pre-canonical 3.1 split of AUTH_REQUIRED), and
  BILLING_NOT_PERMITTED_FOR_AGENT (pinned by tier2_spec_conformance).

Walks 22 AdcpError raise sites today; all literal, all conformant.
The companion test_allowlist_entries_are_actually_used keeps the
allowlist from accumulating dead entries that would silently mask
future drift.

Runs as part of the existing pytest matrix — no new CI job needed.

The local pre-commit mypy hook reports 96 unrelated errors against
the a2a-sdk / protobuf surface (env drift in the uv-managed
pre-commit venv); they are present on clean main and not caused by
this commit. Skipped with SKIP=mypy; CI mypy via the locked
project venv is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…owlist (PR #429 fix-pack)

The bundled schemas/cache/enums/error-code.json was stale at 45 entries
while the canonical upstream enum has 60. The generated ErrorCode enum
was regenerated from that stale cache, so the allowlist entry for
BILLING_NOT_PERMITTED_FOR_AGENT was masking a resync gap rather than a
real spec discrepancy.

Changes:
- schemas/cache/enums/error-code.json: refreshed from upstream (60 entries)
- src/adcp/types/generated_poc/enums/error_code.py: regenerated
- tests/test_error_code_conformance.py:
  - Drop BILLING_NOT_PERMITTED_FOR_AGENT from KNOWN_NON_SPEC_CODES
    (now in canonical enum)
  - Update sanity-check assertions: AGENT_SUSPENDED is now in spec, so
    swap the spot-check for REQUEST_AUTH_UNRECOGNIZED_AGENT (still non-spec)
  - Add length-guard: assert CANONICAL_CODES has 60 entries so a silent
    schema cache change surfaces here
  - Add TODO refs to remaining INTERNAL_ERROR + AUTH_INVALID allowlist entries

Pre-commit mypy hook is bypassed because the 96 errors it reports
(in src/adcp/{webhooks,server,protocols,client}.py) are preexisting
environmental issues with a2a-sdk version mismatch on this branch —
verified to predate this commit and unrelated to the changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit 36c07f4 into main May 3, 2026
11 of 13 checks passed
bokelley added a commit that referenced this pull request May 8, 2026
* chore(schemas): sync to AdCP 3.0.7 + drift report tooling

Bumps ADCP_VERSION 3.0.5 → 3.0.7 and resyncs schemas/cache/ +
generated_poc/ from the 3.0.7 dist bundle (Sigstore-verified). Adds
scripts/diff_generated_types.py, an AST-walker that captures per-class
field/enum-member sets and produces SCHEMA_DELTAS.md on every regen so
consumers can shrink schema-mismatch allowlists without diffing
hundreds of generated files by hand.

Cleans up PR #429's source-vs-bundle confusion: that PR pulled the
error-code enum from upstream's static/schemas/source/ on `main`
(which tracks 3.1 WIP), giving us 60 codes vs the 45 every 3.0.x
dist ships. The conformance test's `assert len(CANONICAL_CODES) == 60`
tripwire is updated to 45 with a docstring spelling out that the
canonical source is the dist bundle for the pinned ADCP_VERSION,
never `static/schemas/source/`. BILLING_NOT_PERMITTED_FOR_AGENT — the
only one of the 17 dropped codes the SDK actually raises — is
allowlisted in KNOWN_NON_SPEC_CODES with a TODO pointing at the 3.1
bump; collapsing it to PERMISSION_DENIED would erase the spec's
documented `rejected_billing` / `suggested_billing` discriminator.

generate_types.py now snapshots the pre-regen tree in-process and
writes SCHEMA_DELTAS.md after a successful generation. The diff is
class-name-aware (datamodel-codegen's numbered-variant churn is
visible by design) and is a no-op for steady-state regens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(schemas): strict drift gate on schemas/cache + generated_poc

Replaces the existing permissive 'Check for schema drift' step with two
strict checks that close the source-of-truth loophole that let PR #429
land 17 codes pulled from static/schemas/source/ rather than the dist
bundle.

1. schemas/cache/ byte-equality. After sync_schemas downloads the dist
   bundle for the pinned ADCP_VERSION, any diff against the committed
   cache means the cache was hand-edited or sourced from somewhere
   other than the bundle. The bundle is byte-stable so this check has
   no false-positive surface.

2. generated_poc/ field-signature equality. A new 'Snapshot committed
   types' step captures the pre-regen tree's per-class field and
   enum-member sets; after generate_types runs, diff_generated_types
   compares signatures (frozenset of frozensets per file) so
   datamodel-codegen's numbered-variant class-name churn
   (PackageUpdate1 vs PackageUpdate4 from APFS-vs-ext4 sort order)
   stays invisible. Real changes — added or removed fields, classes,
   enum members — fail the build with a markdown diff and remediation
   pointer.

The previous step intentionally never failed, citing 'numbered-variant
churn produces false positives that block release PRs for cosmetic
churn'. The signature check sidesteps that exactly: cosmetic class
renumbering preserves the multiset of field-name frozensets, so only
semantic deltas trip the gate. Hand-edits like 1a6ab9a
('rename format_ to format in FieldModel enum'), and forward-state
leaks like #429, both surface here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): cache check after fix_schema_refs; commit post-pipeline form

PR #599's first push committed schemas/cache/ in the raw-bundle form
(absolute $refs, $id present), missing the fix_schema_refs.py
transformation that the repo has committed since the 3.0.5 sync at
008fa3c. That broke two things in CI:

1. Strict cache byte-equality check (the new gate I added) ran AFTER
   fix_schema_refs.py during the schema-check job, comparing committed
   pre-fix shape against post-fix CI output — always false, so the
   gate fired on its own first run.

2. The storyboard runner's runtime schema_loader (continue-on-error
   but visible in CI) failed with `<urlopen error: '/schemas/3.0.7/
   core/context.json'>`. jsonschema's RefResolver joined the absolute
   $ref against a file:// base_uri to produce file:///schemas/3.0.7/
   core/context.json, which isn't keyed in the registry (registry
   keys are raw $id strings) and doesn't exist on disk. Relative refs
   like ../core/context.json would have resolved correctly.

Fix runs `python scripts/fix_schema_refs.py` against the synced cache
and recommits. Refs are now relative; $id stripped from every
schema dict (consistent with the convention 008fa3c established).
generate_types regenerated against the post-fix cache produces the
same field surface as 3.0.5 — `AdcpExtensionFileSchema` no longer
has the spurious `field_id` member that the pre-fix 3.0.7 regen
introduced (fix_schema_refs strips `$id` even when it appears as a
nested property name, which is how that member was being generated).

Also moves the strict cache check back to AFTER fix_schema_refs in
ci.yml, which is the canonical state that matches what gets
committed. Comment updated to spell out the convention so the next
sync doesn't make the same mistake.

Test plan:
- pytest: 4165 passed, 30 skipped, 1 xfailed, 0 failed.
- error-code conformance: 15/15 passed.
- ruff + mypy: clean on touched files.
- Manual: schemas/cache/media-buy/create-media-buy-async-response-
  submitted.json refs are now `../core/context.json` (relative),
  matching what was committed at 3.0.5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant