Registry: follow-ups from #3495 — stale types, registered/discovered conflation, docs IA gaps, non-member path #3538

@EmmaLouise2018

Follow-ups from #3495 (sales agents mis-classified as buying). The original three PRs (#3496, #3497, #3498) fixed the inference path, added server-side override, and backfilled the discovered_agents table — but several adjacent issues remain visible in the live registry and in the docs.

Status

P0 shipped — four PRs open against main:

| PR | Concern | Closes |
|---|---|---|
| #3540 | fix(registry): flip 6 type === 'buying' filter inversions to 'sales' + add sales to two by_type tallies | Problem 1b |
| #3541 | ops(registry): backfill stale member_profiles.agents types + crawler promote-on-unknown-only / log-on-disagreement policy | Problem 1 |
| #3542 | docs(registry): rename Lookups & Authorization → Authorization Lookups, annotate source/member/discovered_from, add /operator auth-aware note | Problem 4 + Problem 3 step 1 |
| #3543 | docs(registry): new registering-an-agent.mdx covering the four crawl paths + AAO-membership value story | Problem 5 option A |

P1 still open: Problem 2, Problem 3 steps 2–3, Problem 4 P1 (overview table), Problem 6. P2: Problem 3 step 4, Problem 5 option B, the type-enum tangent.

Live evidence (https://adcontextprotocol.org/registry, 22 agents)

type:    unknown=17   buying=2   creative=1   signals=1   sales=1
source:  registered=19            discovered=3

77% of agents have type: 'unknown'; 2 sales agents (Bidcliq, Swivel) are still stored as buying; 3 entries in the public registry are crawl-only discovered agents that never opted in.


Problem 1 — Member-registered agents stuck at the wrong type

Status: Shipped in #3541.

Symptom: Bidcliq and Swivel show type: 'buying' despite being sales agents.

Root cause: PR #3498's resolveAgentTypes() only runs on writes (POST/PUT to /api/me/member-profile). PR #3497's backfill only touched the discovered_agents table — not the member_profiles.agents JSONB column. Member-registered rows that were saved before the fix never get re-evaluated.

Compounding: server/src/crawler.ts:580 only writes back inferredType when the existing federated row has no type. Once any non-unknown value is set, the crawler will not reclassify it (crawler.ts:504-507).

Fix shipped (#3541):

  • New script server/scripts/backfill-member-agent-types.ts (with --dry-run) walks every member_profiles.agents[], calls resolveAgentTypes(), and writes back any rows where snapshot disagrees with stored type. Idempotent. resolveAgentTypes exported from member-profiles.ts so the script can reuse it.
  • crawler.ts:580 policy tightened. Final policy is more conservative than the original proposal: update-on-disagreement was deemed too aggressive (a single bad probe could corrupt good rows). Ships as: promote when stored is missing or 'unknown' AND inferred is non-unknown; log warning on disagreement (stored non-unknown ≠ inferred non-unknown); never auto-flip a known-good row. Operator runs the backfill explicitly to reconcile flagged disagreements.
  • Regression test server/tests/unit/crawler-type-update-policy.test.ts pins the promote/disagreement matrix (5/5 pass).
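
The promote / warn / keep policy described above can be sketched as a pure decision function. This is a minimal illustration under stated assumptions, not the code in crawler.ts: the names `decideTypeUpdate`, `TypeDecision`, and `isKnown` are hypothetical, and the `AgentType` members are taken from the live registry counts in this issue.

```typescript
// Hypothetical sketch of the #3541 update policy; not the actual crawler.ts code.
type AgentType = "unknown" | "buying" | "sales" | "creative" | "signals";
type KnownType = Exclude<AgentType, "unknown">;

type TypeDecision =
  | { action: "promote"; next: KnownType }
  | { action: "warn"; stored: KnownType; inferred: KnownType }
  | { action: "keep" };

// A type counts as "known" only when it is present and not 'unknown'.
function isKnown(t: AgentType | null | undefined): t is KnownType {
  return t != null && t !== "unknown";
}

function decideTypeUpdate(
  stored: AgentType | null | undefined,
  inferred: AgentType | null | undefined,
): TypeDecision {
  // No usable inference: leave the stored value alone.
  if (!isKnown(inferred)) return { action: "keep" };
  // Stored is missing or 'unknown' and inferred is known: promote.
  if (!isKnown(stored)) return { action: "promote", next: inferred };
  // Both known but different: log only, never auto-flip.
  if (stored !== inferred) return { action: "warn", stored, inferred };
  return { action: "keep" };
}
```

Keeping the decision separate from the database write makes the whole matrix testable without a crawler run; a "warn" result would be the signal for an operator to run the backfill.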

Problem 1b — #3495 sweep missed multiple type === 'buying' filter inversions, silently breaking sales-agent crawl + stats + enrichment paths

Status: Shipped in #3540.

Symptom: Multiple filter sites still gate on agent.type === 'buying' for logic that semantically operates on sales agents (the ones that call list_authorized_properties and have publisher authorizations). Pre-#3496, sales tools were mis-classified as 'buying', so the wrong-but-aligned filter accidentally worked. Post-#3496, sales agents are correctly tagged 'sales', so these filters now match zero real sales agents — every affected path is silently dead.

Sweep result: 6 inverted filter sites + 2 stat-tally omissions

The codebase grep for type [!=]==? "buying" surfaced more sites than the original two. All six filter sites have the same shape: filter says 'buying', the data being processed (publisher authorizations, properties, publisher_domains) belongs to sales agents. The two stat-tally omissions are a related but separate bug — sales agents are simply absent from by_type counts.

| File:line | What it does | Effect post-#3496 |
|---|---|---|
| crawler.ts:419 | reverse-crawl publishers from sales agents' list_authorized_properties | dead: no sales agent matches 'buying' |
| health.ts:131 | fetchStats populates property_count, publishers, publisher_count | sales agents return empty stats |
| publishers.ts:204 | trackPublishers builds publisher-domain → expected-agents map | publisher tracking misses sales agents entirely |
| http.ts:8784 | agent cache warmer calling propertiesService.getPropertiesForAgent | property cache cold for sales agents |
| registry-api.ts:3667 | withProperties enrichment dispatch in /api/registry/agents | publisher_domains / property_summary empty in API response |
| registry-api.ts:3677 | same enrichment block, result mapping | same path downstream |
| http.ts:2021 (stats) | /api/stats by_type count tally | sales is never counted (only creative, signals, buying) |
| http.ts:2087 (stats) | /agents JSON by_type count tally | same: sales omitted |

The receiving function at federated-index.ts:281 even has the docstring "Record a publisher discovered from a sales agent's list_authorized_properties" — the downstream call sites already know the source is a sales agent. The callers were the bug.
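
To make the failure mode concrete, here is a small sketch of why a stale 'buying' gate goes silently dead once sales agents are tagged correctly. The `Agent` shape, the `selectByType` helper, and the example URLs are hypothetical stand-ins for the real filter sites.

```typescript
// Hypothetical illustration of the filter inversion; not the real call sites.
type AgentType = "unknown" | "buying" | "sales" | "creative" | "signals";

interface Agent {
  url: string;
  type: AgentType;
}

// Stand-in for any of the six gates: pick the agents a sales-only path operates on.
function selectByType(agents: Agent[], gate: AgentType): Agent[] {
  return agents.filter((a) => a.type === gate);
}

// Post-#3496 data: the sales agent is correctly tagged 'sales'.
const agents: Agent[] = [
  { url: "https://sales.example", type: "sales" },
  { url: "https://creative.example", type: "creative" },
];

const preFix = selectByType(agents, "buying"); // stale gate: matches nothing, raises no error
const postFix = selectByType(agents, "sales"); // flipped gate: matches the sales agent
```

The stale gate does not throw or log; it just returns an empty list, which is why the affected paths failed without any visible error.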

Fix shipped (#3540):

  • All six filters flipped 'buying' → 'sales'.
  • crawler.ts:416-417 comment + log message updated to match.
  • Both stats tallies in http.ts gain a sales line so sales agents get counted.
  • Regression test server/tests/unit/health-stats-type-filter.test.ts pins fetchStats for sales/buying/empty cases (3/3 pass).
  • Codebase sweep confirmed no other inverted sites; the two type === "buying" matches that remain (http.ts:2021, 2087) are deliberate category counts, not filters.
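
One way to keep a category from being dropped from a by_type tally again is to derive the counters from the enum itself rather than listing them by hand. The sketch below assumes the five type values seen in the live registry; `AGENT_TYPES` and `tallyByType` are hypothetical names, and whether the real tallies include 'unknown' is not stated in this issue.

```typescript
// Hypothetical sketch: counters are derived from the enum, so no member can be omitted.
const AGENT_TYPES = ["unknown", "buying", "sales", "creative", "signals"] as const;
type AgentType = (typeof AGENT_TYPES)[number];

function tallyByType(agents: ReadonlyArray<{ type: AgentType }>): Record<AgentType, number> {
  const counts = {} as Record<AgentType, number>;
  // Seed every enum member with zero before counting.
  for (const t of AGENT_TYPES) counts[t] = 0;
  for (const a of agents) counts[a.type] += 1;
  return counts;
}
```

With this shape, adding a new member to `AGENT_TYPES` automatically adds its line to the tally.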

Problem 2 — 77% of agents are type: 'unknown'

Symptom: Most agents in the public registry render as 'Unclassified.'

Root cause: resolveAgentTypes() reads agent_capabilities_snapshot.inferred_type. If the probe failed, timed out (10s in crawler.ts:548), or the snapshot row is null, the override falls back to client value or 'unknown'. There is no retry/backoff for agents that fail probing — they stay unknown until they happen to respond on a future pass.
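
The fallback chain described here can be sketched for a single agent as follows. This is an illustration only: `resolveOneType` and `CapabilitiesSnapshot` are hypothetical names, not the real `resolveAgentTypes()` signature, and the sketch assumes a failed or timed-out probe leaves `inferred_type` null. How the real code treats an inferred value of 'unknown' is not stated in this issue.

```typescript
// Hypothetical single-agent sketch of the fallback chain; not the real resolveAgentTypes().
type AgentType = "unknown" | "buying" | "sales" | "creative" | "signals";

interface CapabilitiesSnapshot {
  inferred_type: AgentType | null; // assumed null when the probe failed or timed out
}

function resolveOneType(
  snapshot: CapabilitiesSnapshot | null,
  clientValue?: AgentType,
): AgentType {
  const inferred = snapshot ? snapshot.inferred_type : null;
  // Server-side inference wins when one exists.
  if (inferred !== null) return inferred;
  // Otherwise fall back to what the client submitted, then to 'unknown'.
  return clientValue !== undefined ? clientValue : "unknown";
}
```

An agent whose probe never succeeds and whose owner submitted no type lands on the last branch every time, which is how the registry accumulates 'unknown' rows.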

Fix:

  • Re-probe unknown agents on a faster cadence than the regular crawl, with longer timeout and bounded retry.
  • Track `last_probe_attempt_at` so dead agents do not get hammered.
  • Acceptance: `count(type='unknown') / count(*)` < 5% in the public registry.
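
A rough sketch of the proposed re-probe scheduling and the acceptance check follows. Everything here is an assumption for illustration: the `ProbeState` shape, the delay and attempt constants, and the function names are hypothetical, and exponential backoff is just one way to satisfy "bounded retry".

```typescript
// Hypothetical sketch of re-probe scheduling; constants are illustrative, not proposed values.
type AgentType = "unknown" | "buying" | "sales" | "creative" | "signals";

interface ProbeState {
  type: AgentType;
  attempts: number; // failed probe attempts so far
  lastProbeAttemptAt: number | null; // epoch ms, mirrors last_probe_attempt_at
}

const BASE_DELAY_MS = 5 * 60 * 1000; // 5 minutes
const MAX_DELAY_MS = 6 * 60 * 60 * 1000; // cap at 6 hours
const MAX_ATTEMPTS = 8; // bounded retry: give up after this many attempts

// Exponential backoff, capped so the delay cannot grow without bound.
function backoffDelayMs(attempts: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** attempts, MAX_DELAY_MS);
}

function isDueForReprobe(state: ProbeState, nowMs: number): boolean {
  if (state.type !== "unknown") return false; // only unknown agents are re-probed
  if (state.attempts >= MAX_ATTEMPTS) return false; // stop hammering dead agents
  if (state.lastProbeAttemptAt === null) return true; // never probed yet
  return nowMs - state.lastProbeAttemptAt >= backoffDelayMs(state.attempts);
}

// Acceptance metric: count(type='unknown') / count(*).
function unknownRatio(types: AgentType[]): number {
  if (types.length === 0) return 0;
  return types.filter((t) => t === "unknown").length / types.length;
}
```

Applied to the live counts above (17 unknown out of 22), `unknownRatio` gives roughly 0.77, well above the 5% acceptance threshold.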

Problem 3 — Registered vs discovered are conflated in the API surface

Symptom: `/api/registry/agents` returns registered (AAO-attested, opt-in) and discovered (crawled from a member publisher's adagents.json, no opt-in) agents in one merged list. The public UI at `server/public/agents.html` does not filter, segment, or even display the `source` field.

Why this matters: These are different trust levels. A registered agent has signed AAO terms; a discovered agent has only been mentioned in someone else's manifest and may not know they are listed.

Fix (cheap → correct):

  1. Annotate the response schema with descriptions: `source`, `discovered_from`, `member` in `server/src/schemas/registry.ts:354-356` need `.openapi({ description })`.
  2. Add a `?source=registered|discovered` query parameter; document the default.
  3. Segment the `/registry` UI: tabs or visible badges for 'Registered' vs 'Discovered via adagents.json,' with explainer copy on hover.
  4. Eventually split into `/api/registry/agents/registered` and `/api/registry/agents/discovered`. Keep the merged endpoint as deprecated for back-compat.
  5. Fix the example response on the docs page — Mintlify auto-populates every optional field, so a single example currently shows `source: 'registered'` AND `discovered_from: {...}` together (impossible in practice). Add per-`source` example overrides.
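
Step 2 could look roughly like the sketch below. It assumes an absent `source` parameter keeps today's merged behavior, which is a guess at the default this issue asks to have documented; `parseSourceParam`, `filterBySource`, and the `RegistryAgent` shape are hypothetical.

```typescript
// Hypothetical sketch of a ?source= filter; not the real /api/registry/agents handler.
type Source = "registered" | "discovered";

interface RegistryAgent {
  url: string;
  source: Source;
}

// Returns null when no filter was requested (assumed default: merged list).
function parseSourceParam(raw: string | undefined): Source | null {
  if (raw === undefined || raw === "") return null;
  if (raw === "registered" || raw === "discovered") return raw;
  throw new RangeError("source must be 'registered' or 'discovered', got: " + raw);
}

function filterBySource(agents: RegistryAgent[], source: Source | null): RegistryAgent[] {
  return source === null ? agents : agents.filter((a) => a.source === source);
}
```

Rejecting unrecognized values, rather than silently ignoring them, keeps a typo such as `?source=discoverd` from returning the merged list and being mistaken for a filtered one.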

Problem 4 — Docs IA hides `/api/registry/operator` and lacks an overview map

Symptom: The `/api/registry/operator` docs page exists but is at `https://docs.adcontextprotocol.org/docs/registry/api-reference/lookups-%26-authorization/operator-lookup`. The literal `&` in the tag name produces an unsearchable, unshareable URL slug.

Root cause: `server/src/routes/registry-api.ts:656` declares `tags: ["Lookups & Authorization"]`. Mintlify slugifies that as `lookups-%26-authorization`.
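
The observed slug can be reproduced with a naive lowercase, hyphenate, percent-encode pipeline. This is not Mintlify's actual slug algorithm, which is not shown in this issue; `naiveSlug` is a hypothetical helper that happens to yield the same output for this tag.

```typescript
// Hypothetical reproduction of the observed slug; not Mintlify's real implementation.
function naiveSlug(tag: string): string {
  // Lowercase, collapse whitespace runs to hyphens, then percent-encode what remains.
  return encodeURIComponent(tag.toLowerCase().replace(/\s+/g, "-"));
}

const before = naiveSlug("Lookups & Authorization"); // "lookups-%26-authorization"
const after = naiveSlug("Authorization Lookups"); // "authorization-lookups"
```

Any tag limited to letters, digits, and spaces avoids the percent-encoding entirely.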

Auth-aware behavior is invisible to docs readers: The endpoint returns different agents depending on caller identity (`registry-api.ts:5159-5177`):

  • Anonymous → `public` agents only
  • Authenticated AAO member with API access → adds `members_only` agents
  • Profile owner → adds `private` agents

This is literally where AAO membership unlocks more data — the value story we want to tell — and neither the OpenAPI description nor the docs page mentions it.
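
The tiering above can be sketched as a small visibility filter. The `Caller` and `OperatorAgent` shapes and both function names are hypothetical, and the sketch assumes the tiers are cumulative, so a profile owner also sees `members_only` agents; the real logic lives in `registry-api.ts`.

```typescript
// Hypothetical sketch of the auth-aware filtering; assumes tiers are cumulative.
type Visibility = "public" | "members_only" | "private";

interface Caller {
  isMemberWithApiAccess: boolean;
  isProfileOwner: boolean;
}

interface OperatorAgent {
  url: string;
  visibility: Visibility;
}

// null represents an anonymous caller.
function visibleTiers(caller: Caller | null): Visibility[] {
  const tiers: Visibility[] = ["public"];
  if (caller === null) return tiers;
  if (caller.isMemberWithApiAccess || caller.isProfileOwner) tiers.push("members_only");
  if (caller.isProfileOwner) tiers.push("private");
  return tiers;
}

function filterForCaller(agents: OperatorAgent[], caller: Caller | null): OperatorAgent[] {
  const tiers = visibleTiers(caller);
  return agents.filter((a) => tiers.indexOf(a.visibility) !== -1);
}
```

The same request therefore returns one, two, or three tiers depending on who is asking, which is the behavior the OpenAPI description needs to spell out.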

Three overlapping lookup endpoints, no overview:

| Endpoint | Question it answers |
|---|---|
| `/api/registry/agents` | 'Give me the catalog' |
| `/api/registry/operator?domain=X` | 'What does this entity operate?' |
| `/api/registry/publisher?domain=X` | 'What does this entity publish?' |

These live under two different tag groups in the docs with no top-level prose tying them together.

Status: Tag rename + auth-aware /operator note shipped in #3542. Overview-table expansion of docs/registry/index.mdx is P1 (separate PR).

Fix shipped (#3542):

  • All 8 occurrences of tags: ["Lookups & Authorization"] renamed → ["Authorization Lookups"]. URL slug now authorization-lookups (no %26).
  • /api/registry/operator description includes the auth-aware response-shape note explicitly.
  • source / member / discovered_from schemas in server/src/schemas/registry.ts now have .openapi({ description }) so the docs page renders explanatory copy on each field.
  • static/openapi/registry.yaml regenerated; pulled in pre-existing drift from earlier PRs that hadn't run build:openapi.

Still P1:

  • Expand docs/registry/index.mdx with the three-endpoint overview table (catalog vs operator vs publisher) and one paragraph per row.
  • Add equivalent auth-aware note to /publisher if it has the same visibility logic.

Problem 5 — Non-AAO-member registration path is undocumented (and partially broken)

Symptom: A non-member who wants to be in the registry cannot self-register. There is no public form, no community_agents table, no docs page explaining what is possible. They must rely on indirect crawl paths.

Root cause: All write paths in server/src/routes/member-profiles.ts (lines 96, 183, 439, 1696) sit behind requireAuth + AAO membership lookup.

The actual entry points (four exist; one is silently dead — see Problem 1b)

There are four crawl entry points that can put a domain (and thereby its adagents.json-listed agents) into the federated index:

| # | Path | Where | Auth required? |
|---|---|---|---|
| 1 | Member-publisher walk | crawler.ts:371-413, iterates member_profiles.publishers[] where is_public=true | n/a (server-side, scheduled) |
| 2 | Sales-agent reverse crawl | crawler.ts:417-465, for every sales agent, crawl any publisher_domain returned by its list_authorized_properties | n/a (server-side, scheduled) |
| 3 | Authenticated single-domain crawl request | registry-api.ts:6299, POST /api/registry/crawl-request | yes (requireAuth) |
| 4 | Catalog crawl queue | crawler.ts:925-1023, populated by bulk-property-check.ts:235 (anyone calling POST /api/properties/check) and workos-webhooks.ts:942 (org-domain claim events) | partially: /properties/check does not require auth |

Path 2 is currently dead because of the inversion bug in Problem 1b (filter at crawler.ts:419 matches type === 'buying', but post-#3496 sales agents are tagged 'sales'). So the practical entry points today are 1, 3, and 4 only.

This explains how mamamia.com.au, gatavo.com, wheelrandom.com ended up in the registry as discovered. None are member-claimed publishers; they entered via path 4 (catalog crawl queue), or via path 2 before #3496 merged when the inverted filter happened to align with the inverted classification.

Practical reality for a non-member today

  • A non-member's agent can end up in the registry as discovered if (a) their agent URL is listed in some publisher's valid adagents.json, AND (b) that publisher domain is crawled via paths 1, 3, or 4.
  • The publisher domain itself does not need to be AAO-member-claimed — paths 3 and 4 will crawl arbitrary domains.
  • The agent is still classified discovered (low trust), shows member: null, and there is no first-class registration for the agent itself.

Status: Option A shipped in #3543. Option B is P2 (separate product call once direction is set).

Fix shipped (#3543, option A — docs only):

  • New page docs/registry/registering-an-agent.mdx covers all four crawl paths, the trust-level difference (discovered ≠ registered, being in adagents.json ≠ being an AAO member), and what AAO membership unlocks (public / members_only / private visibility, Verified badge eligibility, storyboard testing, compliance reporting). Includes a curl/JS recipe for verifying how your own agent appears, plus a comparison table for member vs non-member capabilities.
  • Linked from docs.json Registry API nav (both version trees).
  • Mintlify broken-links check passed.

Fix (option B — P2, ships later):

  • Lightweight public registration that lands self-attested agents in a separate community_agents table or a source: 'self_attested' enum value.
  • Renders in the registry with an explicit 'Self-attested, not AAO-verified' badge so consumers can tell the trust level apart.

Problem 6 — Discovered agents do not surface their member-publisher linkage

Symptom: Discovered agents always show member: null even when the publisher domain that listed them in adagents.json is owned by an AAO member.

Live evidence — all three discovered agents return member: null:

agent.mamamia.com.au   ← from publisher_domain mamamia.com.au   member: null
gatavocom.sales-agent.setupad.ai   ← from gatavo.com   member: null
wheelrandom.sales-agent.setupad.ai   ← from wheelrandom.com   member: null

Root cause: When the crawler records a discovered agent via recordAgentFromAdagentsJson(), it stores publisher_domain in discovered_from but does not check whether any member profile claims that domain. The /api/registry/agents response builder (registry-api.ts:1517-1550) populates member only from the agent_url's own member registration — never from the publisher_domain's member ownership.

Why it matters: A discovered agent endorsed by a member publisher's adagents.json carries meaningful trust (the publisher staked their reputation on the listing). Surfacing member: null hides that signal and makes member publishers' endorsements invisible to consumers.

Fix (decision needed first):

  • Option A: if the publisher_domain is claimed by a member, surface that linkage in the discovered agent's response — a new endorsed_by_publisher_member field, or extend member to optionally carry the publisher-side reference. The agent stays source: 'discovered' (it didn't opt in itself); membership is shown via the publisher relationship.
  • Option B: if the agent_url is registered by a member elsewhere, prefer that membership in the response. Verify whether recordAgentFromAdagentsJson already collapses to the registered row in this case; if not, fix the merge.

Both options can ship together. The change is in the response-builder, no schema migration required for option A if added as a new optional field.
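
Option A might look something like the response-builder step below. It is a sketch under assumptions: the `MemberRef` shape, the `claimedDomains` lookup, and `withPublisherEndorsement` are hypothetical, and `endorsed_by_publisher_member` is the field name floated above rather than an existing schema field.

```typescript
// Hypothetical sketch of option A; field and helper names are illustrative only.
interface MemberRef {
  slug: string;
  display_name: string;
}

interface DiscoveredAgent {
  agent_url: string;
  source: "discovered";
  member: null;
  discovered_from: { publisher_domain: string };
  endorsed_by_publisher_member?: MemberRef; // new optional field, no migration needed
}

// claimedDomains maps a lowercase publisher domain to the member that claims it.
function withPublisherEndorsement(
  agent: DiscoveredAgent,
  claimedDomains: Record<string, MemberRef>,
): DiscoveredAgent {
  const domain = agent.discovered_from.publisher_domain.toLowerCase();
  // Own-property check so inherited keys never count as a claim.
  if (!Object.prototype.hasOwnProperty.call(claimedDomains, domain)) return agent;
  // The agent stays source: 'discovered' with member: null; only the linkage is added.
  return { ...agent, endorsed_by_publisher_member: claimedDomains[domain] };
}
```

Because the field is optional and only added when a claim exists, existing consumers that ignore unknown fields would see no change.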


Type enum still includes `buying` (tangential)

`server/src/types.ts:1` and `server/src/schemas/registry.ts:512` still list `buying` in the `AgentType` enum. Per #3495, sales-tool inference should never produce `buying`. Worth deciding whether `buying` agents should appear in the public registry at all, or whether the enum should be tightened. Out of scope for this issue but flagging.


Proposed plan / sequencing

P0 (shipped):

  1. ✅ Backfill script for stale member_profiles.agents types, shipped in ops(registry): backfill stale member_profiles.agents types + crawler reclassify-on-disagreement #3541.
  2. ✅ crawler.ts:580 policy update (promote on unknown only, log on disagreement, never auto-flip), shipped in ops(registry): backfill stale member_profiles.agents types + crawler reclassify-on-disagreement #3541.
  3. ✅ Flip type === 'buying' filter inversions to 'sales': six sites confirmed (not two), shipped in fix(registry): flip type==='buying' filter inversions to 'sales' + add sales to by_type tally #3540.
  4. ✅ Codebase sweep for type === "buying" / type !== "buying" sites: sweep complete; six filter sites flipped, two stat tallies fixed (http.ts:2021, 2087), shipped in fix(registry): flip type==='buying' filter inversions to 'sales' + add sales to by_type tally #3540.
  5. ✅ Tag rename "Lookups & Authorization" → "Authorization Lookups", shipped in docs(registry): rename Lookups tag, annotate source/member/discovered_from, /operator auth-aware note #3542.
  6. ✅ OpenAPI description annotations for source, discovered_from, member plus auth-aware note on /operator, shipped in docs(registry): rename Lookups tag, annotate source/member/discovered_from, /operator auth-aware note #3542.
  7. ✅ docs/registry/registering-an-agent.mdx, shipped in docs(registry): registering-an-agent — four crawl paths + AAO-membership value story #3543.

P1 (next sprint):
  8. ?source= query param + UI segmentation on /registry (Problem 3 steps 2–3).
  9. Re-probe loop for type: 'unknown' agents with retry/backoff (Problem 2).
  10. Expand docs/registry/index.mdx with the three-endpoint overview table (Problem 4).
  11. Surface member-publisher endorsement on discovered agents (Problem 6, option A and/or B once decided).

P2 (3.x):
  12. Split into /api/registry/agents/registered and /api/registry/agents/discovered with deprecation of the merged endpoint (Problem 3 step 4).
  13. Decide on Problem 5 option B (separate product issue once direction is set).
  14. Decide whether to remove buying from the public type enum.


Refs: #3495, #3496, #3497, #3498
