fix(registry): canonicalize agent_url on the registered path (#3573)#4551
Open
brandonling27 wants to merge 1 commit into
Open
Conversation
…le) path Closes adcontextprotocol#3573. The crawler/discovered path already routed agent_url through `canonicalizeAgentUrl` (write site in `crawler.ts:reconcileLegacyAdagentsAgents`, read-side SQL safety net in `federated-index-db.ts`). The member-profile registered path did not — POST/PATCH/DELETE on `/api/me/agents`, the bulk `PUT /api/me/member-profile` handler, and Addie's `save_agent` all wrote raw URLs into `member_profiles.agents` JSONB and `agent_registry_metadata`. Two writes for the same logical agent differing only in case or trailing slash landed as separate rows, and the JS map joins in `FederatedIndexService.lookupDomain` / `listAllAgents` enriched on raw string equality — so the member badge silently dropped off any discovered authorization whose URL differed from the registered one only in case or trailing slash. Forward-only application-layer fix (no DB migration, per adcontextprotocol#3573's "out of scope"). Canonicalize at every member-side write boundary; canonicalize both the map key and the lookup key in every JS read site so legacy non-canonical rows continue to match canonical inputs until the member next saves. Write sites (canonicalize, reject ?/#, 400 on null): - POST/PATCH/DELETE in `routes/member-agents.ts` (handler boundary) - bulk `PUT /api/me/member-profile` in `routes/member-profiles.ts` - Addie `save_agent` and `remove_saved_agent` in `addie/mcp/member-tools.ts` - defense-in-depth in `applyMemberAgentMutation`'s seed loop so any future writer that forgets is still caught Read sites (canonical key with `?? raw` fallback): - `FederatedIndexService.listAllAgents` map key - `FederatedIndexService.lookupDomain` map key plus the two lookup sites (`auth.agent_url` and `claim.discovered_by_agent`) - `FederatedIndexService.getAllAgentDomainPairs` (bulk snapshot reader) Tests: - `member-agents-api.test.ts` — POST collapses mixed-case/trailing-slash to canonical; idempotency across forms yields one JSONB row + one metadata row; PATCH/DELETE match canonical rows from non-canonical url-encoded paths; PATCH accepts case/slash-only body.url differences; 400 on query/fragment/wildcard. - `registry-crawler-cache.test.ts` — Pin 1: registered `https://agent.example/` + discovered `https://agent.example` collapse with member-badge enrichment. Pin 2: scheme mismatch (http vs https) intentionally does NOT collapse — different security posture per adcontextprotocol#3573. Trade-offs: - Path-case is lowercased (existing `canonicalizeAgentUrl` behavior) rather than scheme+host-only as adcontextprotocol#3573 phrased it — chosen for consistency with the discovered side; switching both sides to a stricter canonicalizer would have a much larger blast radius. - Existing non-canonical rows in `member_profiles.agents` and `agent_registry_metadata` are not backfilled. Read-side canonicalization in `federated-index.ts` papers over this for the registry surface; direct readers of `agent_registry_metadata` (heartbeat, dashboards) will continue to see un-collapsed legacy rows until the member next saves. Verification: - `npm run typecheck` ✓ - Affected unit tests ✓ - Integration tests require a running Postgres; run with `DATABASE_URL=… npx vitest run server/tests/integration/member-agents-api.test.ts server/tests/integration/registry-crawler-cache.test.ts`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
IPR Policy Agreement Required@brandonling27 — thanks for the contribution. Before this PR can be merged, the AgenticAdvertising.Org IPR Policy requires your agreement. To agree, post a new comment on this PR with the exact phrase: Your signature is recorded once and covers all contributions to AAO repositories. See |
Author
|
I have read the IPR Policy |
IPR Policy — signedThanks, @brandonling27. Your agreement to the IPR Policy is recorded at |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #3573.
The crawler/discovered path already canonicalized
agent_urlviacanonicalizeAgentUrl(write site incrawler.ts:reconcileLegacyAdagentsAgents, read-side SQL safety net infederated-index-db.ts). The member-profile registered path did not — POST/PATCH/DELETE on/api/me/agents, the bulkPUT /api/me/member-profile, and Addie'ssave_agent/remove_saved_agentall wrote raw URLs intomember_profiles.agentsJSONB andagent_registry_metadata. Two writes for the same logical agent differing only in case or trailing slash landed as separate rows, and the JS map joins inFederatedIndexService.lookupDomain/listAllAgentsenriched on raw string equality — so the member badge silently dropped off any discovered authorization whose URL differed from the registered one only in case or trailing slash.This change is forward-only, application-layer only (no DB migration, per #3573's "out of scope"). Canonicalize at every member-side write boundary; canonicalize both the map key and the lookup key in every JS read site so legacy non-canonical rows continue to match canonical inputs until the member next saves.
Write sites (canonicalize, reject
?/#, 400 on null)server/src/routes/member-agents.ts(handler boundary, mirroring the precedent atroutes/registry-api.ts:7370-7377)PUT /api/me/member-profileinserver/src/routes/member-profiles.tssave_agentandremove_saved_agentinserver/src/addie/mcp/member-tools.tsapplyMemberAgentMutation'sagent_registry_metadataseed loop so any future writer that forgets is still caughtRead sites (canonical key with
?? rawfallback)FederatedIndexService.listAllAgentsmap keyFederatedIndexService.lookupDomainmap key plus the two lookup sites (auth.agent_urlandclaim.discovered_by_agent)FederatedIndexService.getAllAgentDomainPairs(bulk snapshot reader)Tests
server/tests/integration/member-agents-api.test.ts— POST collapses mixed-case/trailing-slash to canonical; idempotency across forms yields one JSONB row + oneagent_registry_metadatarow; PATCH/DELETE match canonical rows from non-canonical url-encoded paths; PATCH accepts case/slash-onlybody.urldifferences; 400 on query/fragment/wildcard.server/tests/integration/registry-crawler-cache.test.ts— Pin 1: registeredhttps://agent.example/+ discoveredhttps://agent.examplecollapse with member-badge enrichment. Pin 2: scheme mismatch (http://vshttps://) intentionally does NOT collapse — different security posture per fix(registry): canonicalize agent_url before registered/discovered collapse (trailing slash + scheme) #3573.Trade-offs
canonicalizeAgentUrlbehavior) rather than scheme+host-only as fix(registry): canonicalize agent_url before registered/discovered collapse (trailing slash + scheme) #3573 phrased it. Chosen for consistency with the discovered side; switching both sides to a stricter canonicalizer would have a much larger blast radius (every existing caller inregistry-api.ts, the SQL safety net infederated-index-db.ts, etc.).member_profiles.agentsandagent_registry_metadatapersist until the member next saves. Read-side canonicalization infederated-index.tspapers over this for the registry surface; direct readers ofagent_registry_metadata(compliance heartbeat, lifecycle dashboards) will continue to see un-collapsed legacy rows until they're rewritten. The issue explicitly defers cleanup ("write a one-off cleanup if a sweep finds collisions").brand.jsonwill publish the canonical form for any public agent whose stored URL gets rewritten on next save. Believed harmless (same URL, normalized) but worth flagging for downstream consumers.Test plan
npm run typecheck— passes locallyDATABASE_URL=… npx vitest run server/tests/integration/member-agents-api.test.ts— new canonicalization casesDATABASE_URL=… npx vitest run server/tests/integration/registry-crawler-cache.test.ts— Pin 1 and Pin 2https://Example.com/Agent/to/api/me/agents; confirm response and DB row both showhttps://example.com/agent. Crawl a fixture authorizinghttps://EXAMPLE.com/agent; confirmlookupDomainreturns the member badge on the discovered authorization.🤖 Generated with Claude Code