feat(compliance): derive agent_context.last_test_* from canonical runs #4268

Open
EmmaLouise2018 wants to merge 2 commits into EmmaLouise2018/unification-pr3-backfill-and-drop-test-history from EmmaLouise2018/unification-pr4-collapse-last-test-columns

@EmmaLouise2018
Contributor

PR 4 of the #4247 unification stack. Stacked on #4264 → #4263 → #4250.

Summary

Replaces direct reads of `agent_contexts.last_test_*` with a view that derives them from `agent_compliance_runs` — the canonical source PR #4250 unified onto. Adds `triggered_org_id` so the view's per-org scope is accurate.

Why

After PR 1-3, owner test runs land in `agent_compliance_runs` with `triggered_by = 'owner_test'`. But that table only carries `agent_url`, no org dimension. Today's `agent_contexts.last_test_*` columns ARE org-scoped (`agent_contexts` PK is `(organization_id, agent_url)`). To collapse one into the other without losing semantics, runs need a triggering-org dimension.

`triggered_org_id` (nullable; populated only for `triggered_by = 'owner_test'`) closes the gap. Heartbeat / manual / webhook writes have no org, so NULL is correct.
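The population rule above can be sketched as a small helper. This is only an illustration: the function name `triggeredOrgIdFor` and the literal values other than `'owner_test'` are assumptions, not code from this PR.

```typescript
// Hypothetical helper: decides what to store in triggered_org_id.
// Only 'owner_test' is a literal confirmed by the PR text; the other
// union members are assumed spellings of the heartbeat/manual/webhook writers.
type TriggeredBy = "owner_test" | "heartbeat" | "manual" | "webhook";

function triggeredOrgIdFor(
  triggeredBy: TriggeredBy,
  organizationId: string | undefined,
): string | null {
  // Non-owner writers have no org dimension, so they always store NULL.
  if (triggeredBy !== "owner_test") return null;
  // Owner tests carry the caller's org; a missing org degrades to NULL.
  return organizationId ?? null;
}
```

Keeping the rule in one place would make it harder for a non-owner write path to attach an org by accident.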

What changes

  • Migration 473. Adds `agent_compliance_runs.triggered_org_id TEXT`. Partial index on `(triggered_org_id, agent_url, tested_at DESC) WHERE triggered_org_id IS NOT NULL` supports the view's per-org `DISTINCT ON` as a single index scan.
  • View `agent_context_with_latest_test`. `agent_contexts.*` joined via `LEFT JOIN LATERAL` to the latest non-dry-run `agent_compliance_runs` row scoped by `(triggered_org_id, agent_url)`. Surfaces derived fields as `canonical_last_test_*`.
  • `AgentContextDatabase` readers (`getByOrganization`, `getById`, `getByOrgAndUrl`) now SELECT from the view, aliasing `canonical_last_test_*` → `last_test_*` so callers see no shape change.
  • Owner-test write in `evaluate_agent_quality` populates `triggered_org_id` from the caller's `organizationId`.
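As a rough in-memory model of what the view computes, the sketch below picks, for each context row, the latest non-dry-run run with a matching `(triggered_org_id, agent_url)`. It is not the migration's SQL. The field names `dry_run`, `track`, and `canonical_last_test_at` are assumptions, and whether `total_tests_run` excludes dry runs is also assumed here.

```typescript
// In-memory stand-in for the LEFT JOIN LATERAL in agent_context_with_latest_test.
interface ComplianceRun {
  agent_url: string;
  triggered_org_id: string | null; // NULL for heartbeat/manual/webhook writes
  tested_at: Date;
  dry_run: boolean; // assumed column name for the dry-run flag
  track: string;    // stand-in for tracks_json[0].track
}

interface AgentContext {
  organization_id: string;
  agent_url: string;
}

interface ContextWithLatestTest extends AgentContext {
  canonical_last_test_at: Date | null;       // assumed derived field
  canonical_last_test_scenario: string | null;
  total_tests_run: number;
}

function withLatestTest(
  contexts: AgentContext[],
  runs: ComplianceRun[],
): ContextWithLatestTest[] {
  return contexts.map((ctx) => {
    // Scope by org AND url, so two orgs sharing a URL never see each other's runs.
    const scoped = runs.filter(
      (r) =>
        !r.dry_run &&
        r.triggered_org_id === ctx.organization_id &&
        r.agent_url === ctx.agent_url,
    );
    // Latest by tested_at, mirroring ORDER BY tested_at DESC LIMIT 1.
    let latest: ComplianceRun | null = null;
    for (const r of scoped) {
      if (latest === null || r.tested_at.getTime() > latest.tested_at.getTime()) {
        latest = r;
      }
    }
    return {
      ...ctx,
      canonical_last_test_at: latest ? latest.tested_at : null,
      canonical_last_test_scenario: latest ? latest.track : null,
      total_tests_run: scoped.length,
    };
  });
}
```

A context with no matching run keeps its row with null derived fields, which is the behaviour a `LEFT JOIN` gives.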

Backward compat

The legacy `agent_contexts.last_test_*` columns stay. Third-party (non-owner) `recordTest()` writes still update them — that's the session-scoped audit trail PR 3 retained for non-owner runs. The columns become dead-letter once:

  1. `agent_test_history` is dropped (gated on the soak windows in #4247, "Unify compliance state: every storyboard run writes to one canonical path (heartbeat + Addie + dashboard tests)")
  2. `recordTest()` retires (follow-up "final cleanup" PR)

This PR is the prep work that makes that final cleanup a no-op for readers.

Stacked on

Merge order: #4250 → #4263 → #4264 → this PR.

Test plan

  • `tsc --noEmit -p server/tsconfig.json` clean
  • Migration 473 applies cleanly on staging (`ALTER TABLE ADD COLUMN IF NOT EXISTS` is non-destructive; `CREATE OR REPLACE VIEW` is idempotent)
  • Owner test run populates `triggered_org_id` correctly; heartbeat run leaves it NULL
  • `getByOrganization` returns `last_test_*` derived from the view (verify via test that compares pre-migration column values to post-migration view-derived values for an owner-triggered run)
  • Cross-org isolation: two orgs with the same agent URL see distinct `last_test_*` derivations from the view
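The pre/post comparison in the plan could look roughly like the sketch below. It models the `canonical_last_test_*` → `last_test_*` aliasing in plain TypeScript rather than in the SELECT list the readers actually use. `aliasCanonical` and the `last_test_passed` field are hypothetical names for illustration.

```typescript
type Row = Record<string, unknown>;

const CANONICAL_PREFIX = "canonical_last_test_";
const CALLER_PREFIX = "last_test_";

// Hypothetical helper: reshapes a view row into the shape callers already expect.
function aliasCanonical(viewRow: Row): Row {
  const out: Row = {};
  // Copy every non-canonical column, including the legacy last_test_* ones.
  for (const [key, value] of Object.entries(viewRow)) {
    if (!key.startsWith(CANONICAL_PREFIX)) out[key] = value;
  }
  // Overwrite last_test_* with the view-derived values.
  for (const [key, value] of Object.entries(viewRow)) {
    if (key.startsWith(CANONICAL_PREFIX)) {
      out[CALLER_PREFIX + key.slice(CANONICAL_PREFIX.length)] = value;
    }
  }
  return out;
}

// Returns the names of caller-visible columns whose values differ.
function changedColumns(before: Row, after: Row): string[] {
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  return [...keys].filter((k) => before[k] !== after[k]).sort();
}
```

For an owner-triggered run, a comparison like this would be expected to flag `last_test_scenario`, since its value moves from the legacy literal to a track name.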

PR 4 of the #4247 unification stack.

Adds triggered_org_id to agent_compliance_runs so per-org scoping of
the new agent_context_with_latest_test view is accurate. Without it,
two orgs that own the same agent URL would conflate test history.
Owner-test write path in evaluate_agent_quality populates it from the
caller's organizationId; heartbeat/manual/webhook leave it NULL.

agent_context_with_latest_test view: agent_contexts.* joined LATERAL
to the latest non-dry-run agent_compliance_runs row scoped by
(triggered_org_id, agent_url), plus COUNT for total_tests_run.

agent-context-db.ts readers (getByOrganization, getById, getByOrgAndUrl)
SELECT from the view and alias canonical_last_test_* → last_test_* so
callers see no shape change.

Legacy columns stay for backward compat — third-party recordTest()
writes still hit them (session-scoped audit retained per PR 3). The
columns + recordTest retire in the follow-up "final cleanup" PR
that drops agent_test_history.

Stacked on #4264 → #4263 → #4250.
@bokelley
Contributor

bokelley commented May 9, 2026

Code review (expert pass): solid, minor nits — but blocked by #4264 in the chain.

Nits (non-blocking):

  • agent_compliance_runs.triggered_org_id is TEXT, not FK to organizations.id/workos_organization_id. No referential integrity; an org delete leaves dangling rows. WorkOS IDs are foreign-system so a hard FK isn't appropriate, but a comment or partial CHECK on shape would help.
  • The `last_test_scenario` derivation (`tracks_json -> 0 ->> 'track'`) is a semantic shift from the literal `'quality_evaluation'` to a track name. Grep confirms no callers branch on the string, so it is safe, but call it out in the changeset for downstream consumers.

View is plain CREATE OR REPLACE VIEW (not SECURITY DEFINER), inherits caller's RLS, per-org scoping correct. No security concern.

@bokelley
Contributor

bokelley commented May 9, 2026

Both nits addressed in 1103d56:

  • triggered_org_id comment — added COMMENT ON COLUMN in migration 473 explaining the TEXT/no-FK decision (WorkOS IDs are foreign-system keys; DB-layer referential integrity isn't appropriate here).
  • Changeset — added a **Semantic shift (last_test_scenario).** paragraph calling out the tracks_json[0].track vs. legacy literal-string difference for downstream consumers.

Blocked-by #4264 noted — this PR stays parked until the stack merges in order.


Generated by Claude Code
