AgentAssert Type-C provides mathematical guarantees for AI agent behavior. This page explains the three mathematical foundations.
JSD measures how much your agent's action distribution has drifted from its established baseline. If your agent normally uses Read 50% of the time, Write 20%, Bash 30%, and suddenly starts using Bash 80% — JSD catches it.
The action distribution is maintained with a Welford-style incremental update: O(1) per new action, with no batch recomputation.
For distributions P and Q over action categories:
JSD(P || Q) = 0.5 × KL(P || M) + 0.5 × KL(Q || M)
where M = 0.5 × (P + Q)
and KL(P || Q) = Σ P(i) × log(P(i) / Q(i))
- JSD ∈ [0, 1] when the logarithm is taken base 2, where 0 = identical distributions, 1 = maximally different
- JSD is symmetric and bounded (unlike KL divergence)
- Square root of JSD is a metric
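The definition above can be sketched in a few lines of Python. This is a minimal illustration, not AgentAssert's implementation: the function names `kl_divergence` and `jsd` are hypothetical, distributions are assumed to be plain dicts mapping action category to probability, and the logarithm is base 2 so the result stays in [0, 1].

```python
import math
from typing import Iterable, Mapping


def kl_divergence(p: Mapping[str, float], q: Mapping[str, float],
                  keys: Iterable[str]) -> float:
    """KL(P || Q) in bits; terms with P(i) = 0 contribute nothing."""
    total = 0.0
    for key in keys:
        p_i = p.get(key, 0.0)
        if p_i > 0.0:
            total += p_i * math.log2(p_i / q[key])
    return total


def jsd(p: Mapping[str, float], q: Mapping[str, float]) -> float:
    """Jensen-Shannon divergence between two action distributions."""
    keys = set(p) | set(q)
    # M is the pointwise average of P and Q, so M(i) > 0 wherever P(i) or Q(i) > 0.
    m = {key: 0.5 * (p.get(key, 0.0) + q.get(key, 0.0)) for key in keys}
    return 0.5 * kl_divergence(p, m, keys) + 0.5 * kl_divergence(q, m, keys)


# Baseline from the example above; the Read/Write split after the shift is assumed.
baseline = {"Read": 0.5, "Write": 0.2, "Bash": 0.3}
shifted = {"Read": 0.1, "Write": 0.1, "Bash": 0.8}
print(jsd(baseline, shifted))
```

Because M averages P and Q, the ratio inside each logarithm is never a division by zero, which is what keeps JSD finite where plain KL divergence would not be.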
A baseline distribution is frozen after window_size observations. Thereafter, each new window is compared against this frozen baseline. This catches slow drift that a rolling comparison might miss.
    - process_drift:
        window_size: 10
        jsd_threshold: 0.30
        action: log

| Parameter | What It Means |
|---|---|
| window_size = 10 | Take a distribution snapshot every 10 actions |
| jsd_threshold = 0.30 | Trigger if JSD between current window and baseline exceeds 0.30 |
| action: log | Record the drift event (other options: warn, theta_penalty) |
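A frozen-baseline tracker along these lines might look like the sketch below. The class name `FrozenBaselineTracker` and its methods are hypothetical, and it assumes non-overlapping windows of `window_size` actions, which is one reading of "a snapshot every 10 actions". Counting each action is O(1); the JSD itself is only computed when a window closes.

```python
import math
from collections import Counter
from typing import Dict, Optional


def jsd(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Base-2 Jensen-Shannon divergence between two distributions."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a: Dict[str, float]) -> float:
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0.0)

    return 0.5 * kl(p) + 0.5 * kl(q)


class FrozenBaselineTracker:
    """Compares each completed window against the first (frozen) window."""

    def __init__(self, window_size: int = 10, jsd_threshold: float = 0.30):
        self.window_size = window_size
        self.jsd_threshold = jsd_threshold
        self.baseline: Optional[Dict[str, float]] = None
        self._counts: Counter = Counter()
        self._seen = 0

    def observe(self, tool: str) -> Optional[float]:
        """Record one action; return the JSD when a post-baseline window closes."""
        self._counts[tool] += 1  # O(1) incremental update
        self._seen += 1
        if self._seen < self.window_size:
            return None
        snapshot = {t: c / self._seen for t, c in self._counts.items()}
        self._counts, self._seen = Counter(), 0
        if self.baseline is None:
            self.baseline = snapshot  # frozen: never updated again
            return None
        return jsd(self.baseline, snapshot)

    def drifted(self, score: Optional[float]) -> bool:
        return score is not None and score > self.jsd_threshold
```

Keeping the baseline fixed means a sequence of small shifts accumulates against the same reference instead of being absorbed window by window.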
KL divergence is asymmetric and unbounded. JSD is:
- Symmetric: JSD(P, Q) = JSD(Q, P), so drift from A→B equals drift from B→A
- Bounded: always in [0, 1], normalized
- A true metric (its square root): satisfies the triangle inequality
Θ = 0.35 × C̄ + 0.25 × (1 - D̄) + 0.20 / (1 + E) + 0.20 × S
| Symbol | Term | Weight | Meaning |
|---|---|---|---|
| C̄ | Compliance rate | 0.35 | Fraction of hard constraints that pass |
| D̄ | Average drift | 0.25 | Average JSD over session (penalized: we use 1-D̄) |
| E | Event frequency | 0.20 | Number of violation events (penalized: we use 1/(1+E)) |
| S | Recovery success rate | 0.20 | Fraction of soft violations recovered within k turns |
Θ ∈ [0, 1]. Higher is better.
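As a worked example of the formula, the sketch below evaluates Θ with the default weights. The function name `theta` and its argument names are hypothetical; only the weights and the shape of each term come from the formula above, and any per-event penalties are left out.

```python
def theta(compliance: float, avg_drift: float, events: int, recovery: float,
          weights: tuple = (0.35, 0.25, 0.20, 0.20)) -> float:
    """Reliability score: w_c*C + w_d*(1 - D) + w_e/(1 + E) + w_s*S."""
    w_c, w_d, w_e, w_s = weights
    return (w_c * compliance
            + w_d * (1.0 - avg_drift)
            + w_e / (1.0 + events)
            + w_s * recovery)


# 0.35*0.98 + 0.25*0.90 + 0.20/2 + 0.20*0.90 = 0.343 + 0.225 + 0.100 + 0.180
score = theta(compliance=0.98, avg_drift=0.10, events=1, recovery=0.90)
print(round(score, 3))  # 0.848
```

With these illustrative inputs, a single violation event halves the event term from 0.20 to 0.10, which is enough to pull an otherwise well-behaved session under a 0.90 threshold.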
Default: Θ ≥ 0.90 for deployment. Agents below this threshold should have their contracts reviewed or constraints tightened.
- process_drift with action: theta_penalty applies -0.03 to Θ per drift event
- judge_predicate with action_on_fail: theta_penalty applies -0.03 per failed judge sample
- must_precede violations apply -0.01 per violation
    reliability:
      weights:
        compliance: 0.40  # Higher weight on hard constraint satisfaction
        drift: 0.30
        stress: 0.15
        recovery: 0.15
      deployment_threshold: 0.85

Weights must sum to ≤ 1.0.
Formal satisfaction guarantee for your behavioral contract.
A contract is (p, δ, k)-satisfied if:
- Hard constraints are met with probability at least p (per turn)
- Soft constraint deviations are bounded by δ
- Recovery from any violation occurs within k turns
    satisfaction:
      p: 0.95      # 95% probability hard constraints hold
      delta: 0.10  # Soft deviations ≤ 0.10
      k: 3         # Recover within 3 turns

| (p, δ, k) | Meaning |
|---|---|
| (0.95, 0.10, 3) | 95% of tool calls comply. Deviations are small. Agent self-corrects within 3 turns. |
| (0.99, 0.05, 1) | Extremely strict. 99% compliance. Near-zero deviation. Immediate recovery. |
| (0.90, 0.20, 5) | Relaxed. Suitable for non-critical agents. |
The SessionMonitor tracks actual (p, δ, k) values over a session and reports them at SessionEnd via the drift report.
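One way to compare observed session statistics against a (p, δ, k) target is sketched below. This does not reproduce SessionMonitor: the names `Satisfaction`, `observed`, and `is_satisfied` are hypothetical, and the estimators are assumptions (empirical pass frequency for p, the largest soft deviation for δ, the longest recovery for k).

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Satisfaction:
    p: float      # minimum per-turn probability that hard constraints hold
    delta: float  # maximum allowed soft deviation
    k: int        # maximum turns allowed for recovery


def observed(hard_ok: Sequence[bool], soft_deviations: Sequence[float],
             recovery_turns: Sequence[int]) -> Tuple[float, float, int]:
    """Estimate (p, delta, k) from one session's logs (assumed estimators)."""
    p_hat = sum(hard_ok) / len(hard_ok) if hard_ok else 1.0
    delta_hat = max(soft_deviations, default=0.0)
    k_hat = max(recovery_turns, default=0)
    return p_hat, delta_hat, k_hat


def is_satisfied(target: Satisfaction, p_hat: float, delta_hat: float,
                 k_hat: int) -> bool:
    """All three bounds must hold at once."""
    return (p_hat >= target.p
            and delta_hat <= target.delta
            and k_hat <= target.k)


# 20 turns with one hard-constraint failure, two soft deviations, two recoveries.
stats = observed([True] * 19 + [False], [0.02, 0.08], [1, 3])
print(is_satisfied(Satisfaction(0.95, 0.10, 3), *stats))  # True
print(is_satisfied(Satisfaction(0.99, 0.05, 1), *stats))  # False
```

The same session passes the default target and fails the strict one, which is the practical difference between the first two rows of the table.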
    evaluate(request, contract):
        result = ALLOW

        # 1. Tool blocklist check (Hard → DENY)
        if any(tool matches blocklist.pattern):
            return DENY("tool_blocklist", matched_pattern)

        # 2. Tool allowlist check (Hard → DENY)
        if allowlist exists AND tool NOT in allowlist:
            return DENY("tool_allowlist", tool)

        # 3. must_state check (Hard → DENY)
        for each must_state rule:
            if tool matches before_tool_pattern AND field not in state:
                return DENY("must_state", field)

        # 4. context_budget check (Configurable)
        if token_count > max_tokens_per_turn:
            if action_on_breach == "deny": return DENY
            if action_on_breach == "warn": log_warning

        return ALLOW

    evaluate_post(response, contract):
        # 5. Update drift distribution
        drift_tracker.observe(tool)

        # 6. Check must_precede (Soft → Θ penalty)
        if the "after" tool was seen before the "before" tool:
            violation_log.record_soft("must_precede")

        # 7. JSD drift check (Soft)
        if jsd > threshold:
            action: log | warn | theta_penalty

        # 8. asyncio: schedule judge_predicate (Soft + Sampled)
        if random() < sample_rate:
            schedule_judge_evaluation(rubric, output)

        # Update (p, δ, k) statistics
        # Update Θ score
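The pre-call half of this flow can be approximated in runnable Python as follows. This is a sketch under stated assumptions, not the shipped evaluator: the contract is modeled as a plain dict with hypothetical keys, patterns are assumed to be shell-style globs matched with `fnmatch.fnmatchcase`, and decisions are returned as simple tuples.

```python
import logging
from fnmatch import fnmatchcase
from typing import Any, Dict, Optional, Tuple

Decision = Tuple[str, Optional[str], Any]  # (verdict, operator, detail)


def evaluate(tool: str, state: Dict[str, Any], token_count: int,
             contract: Dict[str, Any]) -> Decision:
    """Run the hard checks in order; the first failing check short-circuits."""
    # 1. Blocklist: any matching pattern denies the call.
    for pattern in contract.get("tool_blocklist", []):
        if fnmatchcase(tool, pattern):
            return ("DENY", "tool_blocklist", pattern)

    # 2. Allowlist: only enforced when one is configured.
    allowlist = contract.get("tool_allowlist")
    if allowlist is not None and tool not in allowlist:
        return ("DENY", "tool_allowlist", tool)

    # 3. must_state: a required field must exist before a matching tool runs.
    for rule in contract.get("must_state", []):
        if fnmatchcase(tool, rule["before_tool_pattern"]) and rule["field"] not in state:
            return ("DENY", "must_state", rule["field"])

    # 4. context_budget: deny or warn depending on configuration.
    budget = contract.get("context_budget")
    if budget and token_count > budget["max_tokens_per_turn"]:
        if budget.get("action_on_breach") == "deny":
            return ("DENY", "context_budget", token_count)
        logging.warning("context budget exceeded: %d tokens", token_count)

    return ("ALLOW", None, None)
```

Ordering the checks from blocklist to budget means the cheapest and most safety-critical test runs first, and a warn-only budget breach still ends in ALLOW, matching the pseudocode.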
| Operator | Type | Why |
|---|---|---|
| tool_blocklist | Hard DENY | Destructive: must block, can't just warn |
| tool_allowlist | Hard DENY | Unauthorized tools must be gated upstream |
| must_state | Hard DENY | Cost rules must gate before spend |
| context_budget | Configurable | Context bloat is a cost concern, not always safety |
| must_precede | Soft | Process violations hurt quality, not safety |
| process_drift | Soft | Drift means "behavior changed" not "behavior is dangerous" |
| judge_predicate | Soft | Qualitative: subjective assessment, not deterministic |