feat(dashboard): surface verdict_source + per-run triggered_by badge #4263
Open
EmmaLouise2018 wants to merge 1 commit into claude/issue-4247-owner-test-canonical-write from
PR 2 of the #4247 unification stack. Reads two fields PR #4250 added to the compliance API but the dashboard wasn't yet rendering:
- compliance tile: appends "(your test)" / "(heartbeat)" / "(manual)" / "(webhook)" after Last checked, so operators see whether the current verdict came from their own evaluate_agent_quality run or the scheduled heartbeat.
- history panel: per-run badge with the same source label, info-blue for owner_test and neutral for the rest. Pre-PR-1 rows render with neutral, so no regression.
No backend changes; pure UI surfacing of fields already in the API. Stacked on PR #4250.
PR 2 of the #4247 unification stack. Stacked on #4250.
Summary
PR #4250 added verdict_source to /api/registry/agents/:url/compliance and triggered_by to each row returned by /api/registry/agents/:url/compliance/history. The dashboard wasn't rendering either. This PR surfaces both so operators can distinguish their own on-demand runs from scheduled heartbeat verdicts at a glance.
Why
Brian's review on #4250 (and the #4247 plan): the public compliance contract is shifting from "last scheduled verdict" to "last verdict from any source." Telling the caller, including the operator viewing their own dashboard, that the current verdict is from your test vs heartbeat is the load-bearing UX clarification. Without it, an operator runs evaluate_agent_quality, sees the dashboard tile flip to passing, and has no signal that the public registry now reflects their on-demand run rather than the cron's verdict.
What changes
- Compliance tile: appends (your test) / (heartbeat) / (manual) / (webhook) after Last checked: 3m ago. Empty string when verdict_source is null (never run).
- History panel: per-run badge with the same source label. Info-blue for triggered_by = 'owner_test'; neutral for heartbeat / manual / webhook so pre-PR-1 rows render without regression.
No backend changes. Pure UI on fields the API already emits.
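The compliance-tile half of this change can be sketched roughly as below. This is an illustration only, not the code in the diff: the names sourceSuffix and lastCheckedLine are hypothetical, and it assumes verdict_source carries the same values as triggered_by (owner_test, heartbeat, manual, webhook). Returning an empty suffix for an unrecognised value is also an assumption; the description only specifies the null case.

```typescript
// Hypothetical sketch; function and constant names are illustrative only.
// Assumes verdict_source uses the same values as triggered_by.
const SOURCE_LABELS = new Map<string, string>([
  ["owner_test", "your test"],
  ["heartbeat", "heartbeat"],
  ["manual", "manual"],
  ["webhook", "webhook"],
]);

// Returns the parenthesised suffix, or "" when there is nothing to show.
function sourceSuffix(source: string | null | undefined): string {
  // null means the agent has never been checked: no suffix.
  if (source == null) return "";
  const label = SOURCE_LABELS.get(source);
  // Assumption: unknown values also get no suffix rather than raw text.
  return label === undefined ? "" : ` (${label})`;
}

// Builds the tile line from an already-formatted relative time.
function lastCheckedLine(
  relativeTime: string,
  source: string | null | undefined,
): string {
  return `Last checked: ${relativeTime}${sourceSuffix(source)}`;
}
```

With these definitions, lastCheckedLine("3m ago", "owner_test") yields "Last checked: 3m ago (your test)", and passing null leaves the line as "Last checked: 3m ago".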
Stacked on
- #4250: adds verdict_source and triggered_by to the API responses
This PR's diff is read against #4250's branch. Merge order: #4250 → this PR.
Out of scope (later PRs in the #4247 stack)
- agent_test_history, backfill owner-triggered rows into agent_compliance_runs, S3-export third-party rows. Destructive migration; soaks behind PR 2.
- agent_contexts.last_test_* columns into a derived view. Pure schema cleanup.
Test plan
- tsc --noEmit -p server/tsconfig.json clean
- Run evaluate_agent_quality, observe dashboard compliance tile shows (your test). Wait for next heartbeat, observe it flips to (heartbeat).
- History panel: owner_test runs render with the Your test badge, heartbeat runs render with neutral badge.
- Pre-PR-1 rows (no triggered_by value, or only 'heartbeat') render cleanly without empty/garbage labels.
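The history-panel checks above could be exercised against a small pure function like the following. It is a sketch under assumptions, not the PR's implementation: badgeForRow, HistoryRow, and Badge are hypothetical names, the capitalised labels other than "Your test" are guesses, and rendering no badge at all for a row with a missing or unrecognised triggered_by is one possible reading of "without empty/garbage labels".

```typescript
// Hypothetical sketch; types and names are illustrative only.
type BadgeTone = "info" | "neutral";

interface HistoryRow {
  // Rows written before #4250 may lack this field entirely.
  triggered_by?: string | null;
}

interface Badge {
  label: string;
  tone: BadgeTone;
}

// Label capitalisation beyond "Your test" is an assumption.
const BADGE_LABELS = new Map<string, string>([
  ["owner_test", "Your test"],
  ["heartbeat", "Heartbeat"],
  ["manual", "Manual"],
  ["webhook", "Webhook"],
]);

// Returns the badge for a history row, or null when none should render.
function badgeForRow(row: HistoryRow): Badge | null {
  const source = row.triggered_by;
  // Assumption: legacy rows with no value get no badge, never an empty one.
  if (source == null) return null;
  const label = BADGE_LABELS.get(source);
  if (label === undefined) return null;
  // Only owner-triggered runs get the info-blue tone; the rest stay neutral.
  return { label, tone: source === "owner_test" ? "info" : "neutral" };
}
```

Keeping the mapping in a pure function means the legacy-row case can be checked without rendering the dashboard.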