Understanding these core concepts will help you get the most out of Agent Control.
A Policy is a named collection of controls assigned to an agent. Policies enable you to:
- Bundle multiple controls together as a cohesive protection strategy
- Audit your safety rules
- Reuse security and compliance configurations across agents
- Apply different policies to different environments (dev/staging/prod)
# Policy for PROD deployment
{
  "id": 42,
  "name": "production-safety",
  "controls": [  # collection of rules
    {....}
  ]
}
A Control is a protection rule that evaluates agent interactions (inputs/outputs) and takes action based on configured criteria. It defines when to check, what to check, how to evaluate it, and what to do with the results.
Control = Scope (When) + Selector (What) + Evaluator (How) + Action (Decision)
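One way to picture this composition is as a small set of typed records. The sketch below is illustrative only: the class names are hypothetical and are not part of the Agent Control SDK, and the field names simply mirror the JSON keys used in this document.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Scope:  # When: which steps and stages trigger the control
    step_types: Optional[List[str]] = None
    step_names: Optional[List[str]] = None
    step_name_regex: Optional[str] = None
    stages: Optional[List[str]] = None


@dataclass
class Selector:  # What: which part of the step data to inspect
    path: str


@dataclass
class Evaluator:  # How: which check to run on the selected data
    name: str
    config: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Action:  # Decision: what to do when the evaluator matches
    decision: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Control:  # hypothetical container mirroring the JSON layout
    name: str
    scope: Scope
    selector: Selector
    evaluator: Evaluator
    action: Action
    enabled: bool = True


control = Control(
    name="block-ssn-output",
    scope=Scope(step_types=["tool"], stages=["post"]),
    selector=Selector(path="output"),
    evaluator=Evaluator(name="regex", config={"pattern": r"\b\d{3}-\d{2}-\d{4}\b"}),
    action=Action(decision="deny"),
)

# asdict() yields a nested dict shaped like the JSON control definitions
print(asdict(control)["action"])
```

Keeping the four parts as separate records makes it easy to reuse, for example, one scope with several evaluators.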
The Scope defines which steps trigger the control. It acts as a filter that selects the parts of your agent workflow to monitor.
Fields:
- step_types: List of step types (['tool', 'llm_inference']). If null, applies to all types.
- step_names: List of exact step names to target (e.g., ['search_db', 'send_email'])
- step_name_regex: Regex pattern for step name matching (e.g., "^db_.*" matches all steps whose names start with "db_")
- stages: When to evaluate - ['pre', 'post']
  - 'pre': Check before execution (block bad inputs, prevent prompt injection)
  - 'post': Check after execution (filter bad outputs, redact PII)
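The filtering described by these fields can be sketched as a small matching function. This is a minimal illustration, not Agent Control's implementation. It rests on several assumptions: a missing or null field matches everything, the type and stage filters must both pass, a step name qualifies if it appears in step_names or matches step_name_regex, and the regex is applied with Python's `re.search`. The function name `scope_matches` is hypothetical.

```python
import re


def scope_matches(scope: dict, step_type: str, step_name: str, stage: str) -> bool:
    """Return True if the step falls inside the scope (assumed semantics)."""
    step_types = scope.get("step_types")
    if step_types is not None and step_type not in step_types:
        return False

    stages = scope.get("stages")
    if stages is not None and stage not in stages:
        return False

    names = scope.get("step_names")
    pattern = scope.get("step_name_regex")
    if names is None and pattern is None:
        return True  # no name filter configured

    # Assumption: exact names and the regex are alternatives (logical OR)
    by_name = names is not None and step_name in names
    by_regex = pattern is not None and re.search(pattern, step_name) is not None
    return by_name or by_regex


scope = {"step_types": ["tool"], "step_name_regex": "^db_.*", "stages": ["pre"]}
print(scope_matches(scope, "tool", "db_lookup", "pre"))           # True
print(scope_matches(scope, "tool", "db_lookup", "post"))          # False: wrong stage
print(scope_matches(scope, "llm_inference", "db_lookup", "pre"))  # False: wrong type
```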
Example 1: Apply to all tool steps before execution
{
  "scope": {
    "step_types": ["tool"],
    "stages": ["pre"]
  }
}
Example 2: Target specific database operations
{
  "scope": {
    "step_names": ["query_database", "execute_sql", "db_lookup"],
    "stages": ["pre"]
  }
}
Example 3: Use regex to match multiple steps
{
  "scope": {
    "step_name_regex": "^db_.*",  // Matches all steps starting with "db_"
    "stages": ["pre", "post"]
  }
}
Example 4: Full scope configuration
{
  "scope": {
    "step_types": ["tool", "llm_inference"],
    "step_names": ["search_db", "query_api"],
    "step_name_regex": "^db_.*",
    "stages": ["pre", "post"]
  }
}
The Selector specifies which portion of the step's data to extract and pass to the evaluator for analysis. It uses a path specification to navigate the step object.
Field:
path: Dot-notation path to the data you want to evaluate
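Dot-notation lookup of this kind can be sketched as a walk over nested dictionaries. The function below is a hypothetical illustration, not the library's resolver. It assumes the step is available as a plain dict, that "*" returns the whole payload, and that a missing key yields None rather than an error.

```python
from typing import Any


def select(step: dict, path: str) -> Any:
    """Resolve a dot-notation path against a step payload (assumed semantics)."""
    if path == "*":
        return step  # assumption: "*" means the whole payload
    current: Any = step
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # assumption: missing paths resolve to None
        current = current[key]
    return current


# Hypothetical step payload used only for this example
step = {
    "name": "execute_sql",
    "input": {"sql_query": "SELECT * FROM users"},
    "output": {"rows": 3},
    "context": {"user_id": "u-123"},
}

print(select(step, "input.sql_query"))  # SELECT * FROM users
print(select(step, "context.user_id"))  # u-123
print(select(step, "input.missing"))    # None
```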
Common Paths:
- "input" - Entire input to the step
- "output" - Step's output/result
- "input.query" - Specific parameter within input
- "input.user_message" - User message field
- "name" - The step/tool name itself
- "context.user_id" - Context metadata
- "*" - All available payload data
Example 1: Check tool output for PII
{
  "selector": {
    "path": "output"
  }
}
Example 2: Validate specific input parameter
{
  "selector": {
    "path": "input.sql_query"
  }
}
Example 3: Check user message in LLM input
{
  "selector": {
    "path": "input.user_message"
  }
}
Example 4: Access context metadata
{
  "selector": {
    "path": "context.user_id"
  }
}
The Evaluator receives the data extracted by the selector and evaluates it against configured rules, returning whether the data matches specified criteria.
Components:
- name: Evaluator identifier (e.g., "regex", "list", "galileo-luna2")
- config: Evaluator-specific configuration, validated against the evaluator's schema
- metadata: Optional evaluator identification (name, version, description)
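To make the name/config pairing concrete, here is a minimal sketch of how a dispatcher might turn an evaluator definition into a match result. It is not the library's code. The `evaluate` function is hypothetical, the "list" evaluator is assumed to do substring matching, and `case_sensitive` is assumed to default to true when omitted.

```python
import re


def evaluate(evaluator: dict, data: str) -> bool:
    """Return True when the data matches the evaluator's criteria (assumed semantics)."""
    name = evaluator["name"]
    config = evaluator.get("config", {})

    if name == "regex":
        return re.search(config["pattern"], data) is not None

    if name == "list":
        # Assumption: a value matches if it occurs anywhere in the data
        case_sensitive = config.get("case_sensitive", True)
        haystack = data if case_sensitive else data.lower()
        for value in config["values"]:
            needle = value if case_sensitive else value.lower()
            if needle in haystack:
                return True
        return False

    raise ValueError(f"unknown evaluator: {name}")


ssn = {"name": "regex", "config": {"pattern": r"\b\d{3}-\d{2}-\d{4}\b"}}
banned = {"name": "list", "config": {"values": ["DROP TABLE"], "case_sensitive": False}}

print(evaluate(ssn, "SSN on file: 123-45-6789"))  # True
print(evaluate(banned, "drop table users;"))      # True
print(evaluate(ssn, "no identifiers here"))       # False
```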
Example 1: Regex evaluator to detect PII
{
  "evaluator": {
    "name": "regex",
    "config": {
      "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b"  // Social Security Number pattern
    }
  }
}
Example 2: List evaluator for banned terms
{
  "evaluator": {
    "name": "list",
    "config": {
      "values": ["DROP TABLE", "DELETE FROM", "TRUNCATE"],
      "case_sensitive": false
    }
  }
}
Example 3: AI-powered toxicity detection with Luna-2
{
  "evaluator": {
    "name": "galileo-luna2",
    "config": {
      "metric": "input_toxicity",
      "operator": "gt",
      "target_value": 0.5
    }
  }
}
Example 4: Complex regex for multiple PII types
{
  "evaluator": {
    "name": "regex",
    "config": {
      "pattern": "(?:\\b\\d{3}-\\d{2}-\\d{4}\\b|\\b\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}\\b|[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,})"
    }
  }
}
Custom Evaluators: Agent Control supports custom evaluators for domain-specific requirements. See examples/deepeval for a complete example of creating and integrating custom evaluators.
The Action defines what happens when the evaluator matches/detects an issue.
Fields:
- decision: The enforcement action to take
  - "allow" - Continue execution (useful for allowlist controls)
  - "deny" - Hard stop execution (blocks the request)
  - "warn" - Log a warning but continue execution
  - "log" - Log the match for observability only
- metadata: Optional additional context (JSON object)
Important: Priority Semantics
Agent Control uses priority-based logic:
- deny wins - If any deny control matches, execution is blocked
- steer second - If any steer control matches (and no deny), steering context is provided for correction
- allow/warn/log - Observability actions that don't block
This ensures fail-safe behavior while allowing corrective workflows.
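The priority order can be sketched as a function that reduces the decisions of all matching controls to a single outcome. This is an illustration of the rule as stated here, not the library's implementation. The function name `resolve` and the "continue" outcome label are hypothetical.

```python
from typing import Iterable


def resolve(decisions: Iterable[str]) -> str:
    """Collapse the decisions of all matching controls into one outcome."""
    matched = set(decisions)
    if "deny" in matched:
        return "deny"      # any deny blocks execution
    if "steer" in matched:
        return "steer"     # no deny, so steering context is provided
    return "continue"      # allow/warn/log never block


print(resolve(["log", "deny", "steer"]))  # deny
print(resolve(["warn", "steer"]))         # steer
print(resolve(["allow", "log"]))          # continue
print(resolve([]))                        # continue
```

Because deny is checked first, adding more controls to a policy can only make the outcome stricter, never weaker.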
Example 1: Block on match
{
  "action": {
    "decision": "deny"
  }
}
Example 2: Warn without blocking
{
  "action": {
    "decision": "warn"
  }
}
Example 3: Action with metadata
{
  "action": {
    "decision": "deny",
    "metadata": {
      "reason": "PII detected in output",
      "severity": "high",
      "compliance": "GDPR"
    }
  }
}
Putting it all together - a control that blocks Social Security Numbers in tool outputs:
{
  "name": "block-ssn-output",
  "description": "Prevent SSN leakage in tool responses",
  "enabled": true,
  "scope": {
    "step_types": ["tool"],
    "stages": ["post"]
  },
  "selector": {
    "path": "output"
  },
  "evaluator": {
    "name": "regex",
    "config": {
      "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b"
    }
  },
  "action": {
    "decision": "deny",
    "metadata": {
      "reason": "SSN detected",
      "compliance": "PII protection"
    }
  }
}
Controls are defined via the API or SDK. Each control follows the Control = Scope + Selector + Evaluator + Action structure explained above.
Via Python SDK:
import asyncio
from agent_control import AgentControlClient, controls

async def main():
    # `async with` is only valid inside a coroutine, so the call is wrapped in main()
    async with AgentControlClient() as client:
        await controls.create_control(
            client,
            name="block-ssn-output",
            data={
                "description": "Block Social Security Numbers in responses",
                "enabled": True,
                "execution": "server",
                "scope": {"step_names": ["generate_response"], "stages": ["post"]},
                "selector": {"path": "output"},
                "evaluator": {
                    "name": "regex",
                    "config": {"pattern": r"\b\d{3}-\d{2}-\d{4}\b"}
                },
                "action": {"decision": "deny"}
            }
        )

asyncio.run(main())
Via curl:
# Step 1: Create control
CONTROL_ID=$(curl -X PUT http://localhost:8000/api/v1/controls \
  -H "Content-Type: application/json" \
  -d '{"name": "block-ssn-output"}' | jq -r '.control_id')
# Step 2: Set control configuration
curl -X PUT "http://localhost:8000/api/v1/controls/$CONTROL_ID/data" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "description": "Block Social Security Numbers in responses",
      "enabled": true,
      "execution": "server",
      "scope": {"step_names": ["generate_response"], "stages": ["post"]},
      "selector": {"path": "output"},
      "evaluator": {
        "name": "regex",
        "config": {"pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b"}
      },
      "action": {"decision": "deny"}
    }
  }'
Via Python SDK:
# Run this inside an `async with AgentControlClient() as client:` block
await controls.create_control(
    client,
    name="block-toxic-input",
    data={
        "description": "Block toxic or harmful user messages",
        "enabled": True,
        "execution": "server",
        "scope": {"step_names": ["process_user_message"], "stages": ["pre"]},
        "selector": {"path": "input"},
        "evaluator": {
            "name": "galileo.luna2",
            "config": {
                "metric": "input_toxicity",
                "operator": "gt",
                "target_value": 0.5
            }
        },
        "action": {"decision": "deny"}
    }
)
Via curl:
# Create control with Luna-2 evaluator
CONTROL_ID=$(curl -X PUT http://localhost:8000/api/v1/controls \
  -H "Content-Type: application/json" \
  -d '{"name": "block-toxic-input"}' | jq -r '.control_id')
curl -X PUT "http://localhost:8000/api/v1/controls/$CONTROL_ID/data" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "description": "Block toxic or harmful user messages",
      "enabled": true,
      "execution": "server",
      "scope": {"step_names": ["process_user_message"], "stages": ["pre"]},
      "selector": {"path": "input"},
      "evaluator": {
        "name": "galileo.luna2",
        "config": {
          "metric": "input_toxicity",
          "operator": "gt",
          "target_value": 0.5
        }
      },
      "action": {"decision": "deny"}
    }
  }'
Note: For the Luna-2 evaluator, set the GALILEO_API_KEY environment variable. See docs/REFERENCE.md for all available evaluators.

