fix(addie): owner evaluate_agent_quality writes to canonical compliance state#4250
fix(addie): owner evaluate_agent_quality writes to canonical compliance state#4250
Conversation
|
Reviewing this against the updated #4247 plan. Right shape for PR 1 — owner-only canonical writes + 1.
|
…ns test Address review feedback from @EmmaLouise2018 on PR #4250: 1. `verdict_source` field on /api/registry/agents/:url/compliance — `AgentComplianceDetailSchema` gains optional `verdict_source`: 'heartbeat' | 'owner_test' | 'manual' | 'webhook' | null — `getComplianceStatus` and `bulkGetComplianceStatus` join `agent_compliance_runs` via LATERAL subquery (dry_run=false, ORDER BY tested_at DESC LIMIT 1) to surface the triggered_by of the most recent run. No migration needed. — Endpoint response emits `verdict_source: status.last_triggered_by`. — `AgentComplianceStatus` interface gets `last_triggered_by` field. 2. Last-write-wins contract test — New `compliance-db-last-write-wins.test.ts` pins the ON CONFLICT DO UPDATE semantics: every recordComplianceRun call overwrites agent_compliance_status regardless of triggered_by source. A future change to first-write-wins or priority ordering would break these tests. https://claude.ai/code/session_01NVVqgeSGevUGXgDbMw1XKZ
|
Both asks addressed in the fixup commit ( 1.
verdict_source: 'heartbeat' | 'owner_test' | 'manual' | 'webhook' | null
2. Last-write-wins contract test New
A future switch to first-write-wins or source-priority filtering breaks these tests. All 3302 unit tests pass. Session: https://claude.ai/code/session_01NVVqgeSGevUGXgDbMw1XKZ Generated by Claude Code |
|
Code review (expert pass): solid root, ready to merge after stripping committed Block:
Nits (non-blocking):
After dist strip, mark ready and the chain unblocks (#4263 is clean too — see #4264 for the real chain blocker). |
|
Blocker addressed — pushed
Nits noted (not fixed):
Ready to mark out of draft when you are. Triaged by Claude Code. Session: https://claude.ai/code/session_01WrFMahjaHU7y4JWqPqxTbx Generated by Claude Code |
…ce state Closes the 12-hour gap between owner-triggered storyboard runs and the public /api/registry/agents/:url/compliance endpoint (issue #4247, PR 1 of 4). When evaluate_agent_quality is triggered by the agent owner, the result is now written to agent_compliance_status + agent_compliance_runs + agent_storyboard_status with triggered_by = 'owner_test'. Non-owner runs continue writing to agent_test_history (deprecated in PR 3). Migration 471 adds 'owner_test' to both triggered_by CHECK constraints. notifyComplianceChange is intentionally suppressed for owner runs to prevent iteration-loop Slack spam. https://claude.ai/code/session_01UNHkGhBXk9XD2dpzvSLdhb
…ns test Address review feedback from @EmmaLouise2018 on PR #4250: 1. `verdict_source` field on /api/registry/agents/:url/compliance — `AgentComplianceDetailSchema` gains optional `verdict_source`: 'heartbeat' | 'owner_test' | 'manual' | 'webhook' | null — `getComplianceStatus` and `bulkGetComplianceStatus` join `agent_compliance_runs` via LATERAL subquery (dry_run=false, ORDER BY tested_at DESC LIMIT 1) to surface the triggered_by of the most recent run. No migration needed. — Endpoint response emits `verdict_source: status.last_triggered_by`. — `AgentComplianceStatus` interface gets `last_triggered_by` field. 2. Last-write-wins contract test — New `compliance-db-last-write-wins.test.ts` pins the ON CONFLICT DO UPDATE semantics: every recordComplianceRun call overwrites agent_compliance_status regardless of triggered_by source. A future change to first-write-wins or priority ordering would break these tests. https://claude.ai/code/session_01NVVqgeSGevUGXgDbMw1XKZ
…rtifacts
Generated JS/TS files don't belong in source control. Also adds
.gitignore rules for dist/schemas/*.{js,d.ts,*.map} to prevent recurrence.
https://claude.ai/code/session_01WrFMahjaHU7y4JWqPqxTbx
…tion_queue on main)
c71809a to
42e7f37
Compare
PR 2 of the #4247 unification stack. Reads two fields PR #4250 added to the compliance API but the dashboard wasn't yet rendering: - compliance tile: appends "(your test)" / "(heartbeat)" / "(manual)" / "(webhook)" after Last checked, so operators see whether the current verdict came from their own evaluate_agent_quality run or the scheduled heartbeat. - history panel: per-run badge with the same source label, info-blue for owner_test and neutral for the rest. Pre-PR-1 rows render with neutral — no regression. No backend changes; pure UI surfacing of fields already in the API. Stacked on PR #4250.
Refs #4247 — PR 1 of 4 in the compliance-state unification initiative.
Summary
Closes the 12-hour gap between an owner running
evaluate_agent_quality(via Addie chat or thetest_adcp_agentMCP tool) and the public/api/registry/agents/:url/complianceendpoint reflecting the result.Root cause:
evaluate_agent_qualitywrote toagent_test_history/agent_contexts.last_test_*(not visible to the compliance API). The heartbeat is the only prior writer to the canonical tables, so owner-triggered results were invisible until the next 12-hour cron.What this PR does (owner path only):
member_profiles.agents @> $1::jsonb JOIN organization_memberships— identical toresolveAgentOwnerOrginregistry-api.ts:4733.compliance_opt_outis not set: callscomplianceResultToDbInput()+complianceDb.recordComplianceRun()withtriggered_by = 'owner_test'anddry_run = false.'owner_test'to theTriggeredBytype and bothtriggered_byCHECK constraints (migration 471 — coversagent_compliance_runsandagent_storyboard_status).notifyComplianceChangeto prevent Slack spam on iterative owner test runs.agentContextDb.recordTest()write for backward compatibility (deprecated in PR 3).Known gap documented for follow-up: AAO Verified badge state is still updated only by the heartbeat (badge issuance calls
processAgentBadgeswhich is in the heartbeat loop, not inrecordComplianceRun). Compliance status reflects owner test results immediately; badge re-issuance still requires the next heartbeat. Tracked in #4247.Non-breaking justification: Adds an optional write path to existing canonical tables with a new
triggered_by = 'owner_test'value. No fields removed, no types changed, no existing writes modified. The new enum value is additive; the constraint drop-and-recreate is non-destructive DDL. Existing consumers of the compliance API gain data, not schema changes.Pre-PR review
organization_memberships,compliance_opt_outguard added,dry_run: falsecomment added, log level on owner-check failure raised towarnmember_profiles.agents @> $1::jsonb) matches established codebase pattern;notifyComplianceChangesuppression is correct per notification subscriber scope (Slack + external consumers via Scope3); badge-issuance gap acknowledged as follow-upSession: https://claude.ai/code/session_01UNHkGhBXk9XD2dpzvSLdhb
Generated by Claude Code