agent-control/CONCEPTS.md at main · crewAIInc/agent-control

Concepts

Understanding these core concepts will help you get the most out of Agent Control.

Policy

A Policy is a named collection of controls assigned to an agent. Policies enable you to:

Bundle multiple controls together as a cohesive protection strategy
Audit your safety rules
Enables reusable security/compliance configurations across agents
Apply different policies to different environments (dev/staging/prod)

# Policy for PROD deployment
{
  "id": 42,
  "name": "production-safety",
  "controls": [ # collection of rules.
    {....}
  ]
}

Controls

A Control is a protection rule that evaluates agent interactions (inputs/outputs) and takes action based on configured criteria. It defines when to check, what to check, how to evaluate it, and what to do with the results.

Control = Scope (When) + Selector (What) + Evaluator (How) + Action (Decision)

1. Scope (When to Check)

The Scope defines which steps trigger the control—acting as a filter to select specific sections of your agent workflow to monitor.

Fields:

step_types: List of step types (['tool', 'llm_inference']). If null, applies to all types.
step_names: List of exact step names to target (e.g., ['search_db', 'send_email'])
step_name_regex: Regex pattern for step name matching (e.g., "^db.*" matches all "db_" prefixed steps)
stages: When to evaluate - ['pre', 'post']
- 'pre': Check before execution (block bad inputs, prevent prompt injection)
- 'post': Check after execution (filter bad outputs, redact PII)

Example 1: Apply to all tool steps before execution

{
  "scope": {
    "step_types": ["tool"],
    "stages": ["pre"]
  }
}

Example 2: Target specific database operations

{
  "scope": {
    "step_names": ["query_database", "execute_sql", "db_lookup"],
    "stages": ["pre"]
  }
}

Example 3: Use regex to match multiple steps

{
  "scope": {
    "step_name_regex": "^db_.*",  // Matches all steps starting with "db_"
    "stages": ["pre", "post"]
  }
}

Example 4: Full scope configuration

{
  "scope": {
    "step_types": ["tool", "llm_inference"],
    "step_names": ["search_db", "query_api"],
    "step_name_regex": "^db_.*",
    "stages": ["pre", "post"]
  }
}

2. Selector (What to Check)

The Selector specifies which portion of the step's data to extract and pass to the evaluator for analysis. It uses a path specification to navigate the step object.

Field:

path: Dot-notation path to the data you want to evaluate

Common Paths:

"input" - Entire input to the step
"output" - Step's output/result
"input.query" - Specific parameter within input
"input.user_message" - User message field
"name" - The step/tool name itself
"context.user_id" - Context metadata
"*" - All available payload data

Example 1: Check tool output for PII

{
  "selector": {
    "path": "output"
  }
}

Example 2: Validate specific input parameter

{
  "selector": {
    "path": "input.sql_query"
  }
}

Example 3: Check user message in LLM input

{
  "selector": {
    "path": "input.user_message"
  }
}

Example 4: Access context metadata

{
  "selector": {
    "path": "context.user_id"
  }
}

3. Evaluator (How to Check)

The Evaluator receives the data extracted by the selector and evaluates it against configured rules, returning whether the data matches specified criteria.

Components:

name: Evaluator identifier (e.g., "regex", "list", "galileo-luna2")
config: Evaluator-specific configuration, validated against the evaluator's schema
metadata: Optional evaluator identification (name, version, description)

Example 1: Regex evaluator to detect PII

{
  "evaluator": {
    "name": "regex",
    "config": {
      "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b"  // Social Security Number pattern
    }
  }
}

Example 2: List evaluator for banned terms

{
  "evaluator": {
    "name": "list",
    "config": {
      "values": ["DROP TABLE", "DELETE FROM", "TRUNCATE"],
      "case_sensitive": false
    }
  }
}

Example 3: AI-powered toxicity detection with Luna-2

{
  "evaluator": {
    "name": "galileo-luna2",
    "config": {
      "metric": "input_toxicity",
      "operator": "gt",
      "target_value": 0.5
    }
  }
}

Example 4: Complex regex for multiple PII types

{
  "evaluator": {
    "name": "regex",
    "config": {
      "pattern": "(?:\\b\\d{3}-\\d{2}-\\d{4}\\b|\\b\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}[\\s-]?\\d{4}\\b|[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,})"
    }
  }
}

Custom Evaluators: Agent Control supports custom evaluators for domain-specific requirements. See examples/deepeval for a complete example of creating and integrating custom evaluators.

4. Action (What to Do)

The Action defines what happens when the evaluator matches/detects an issue.

Fields:

decision: The enforcement action to take
- "allow" - Continue execution (useful for allowlist controls)
- "deny" - Hard stop execution (blocks the request)
- "warn" - Log a warning but continue execution
- "log" - Log the match for observability only
metadata: Optional additional context (JSON object)

Important: Priority Semantics

Agent Control uses priority-based logic:

deny wins - If any deny control matches, execution is blocked
steer second - If any steer control matches (and no deny), steering context is provided for correction
allow/warn/log - Observability actions that don't block

This ensures fail-safe behavior while allowing corrective workflows.

Example 1: Block on match

{
  "action": {
    "decision": "deny"
  }
}

Example 2: Warn without blocking

{
  "action": {
    "decision": "warn"
  }
}

Example 3: Action with metadata

{
  "action": {
    "decision": "deny",
    "metadata": {
      "reason": "PII detected in output",
      "severity": "high",
      "compliance": "GDPR"
    }
  }
}

Complete Control Example

Putting it all together - a control that blocks Social Security Numbers in tool outputs:

{
  "name": "block-ssn-output",
  "description": "Prevent SSN leakage in tool responses",
  "enabled": true,
  "scope": {
    "step_types": ["tool"],
    "stages": ["post"]
  },
  "selector": {
    "path": "output"
  },
  "evaluator": {
    "name": "regex",
    "config": {
      "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b"
    }
  },
  "action": {
    "decision": "deny",
    "metadata": {
      "reason": "SSN detected",
      "compliance": "PII protection"
    }
  }
}

Defining Controls

Controls are defined via the API or SDK. Each control follows the Control = Scope + Selector + Evaluator + Action structure explained above.

Example: Block PII in Output (Regex)

Via Python SDK:

from agent_control import AgentControlClient, controls

async with AgentControlClient() as client:
    await controls.create_control(
        client,
        name="block-ssn-output",
        data={
            "description": "Block Social Security Numbers in responses",
            "enabled": True,
            "execution": "server",
            "scope": {"step_names": ["generate_response"], "stages": ["post"]},
            "selector": {"path": "output"},
            "evaluator": {
                "name": "regex",
                "config": {"pattern": r"\b\d{3}-\d{2}-\d{4}\b"}
            },
            "action": {"decision": "deny"}
        }
    )

Via curl:

# Step 1: Create control
CONTROL_ID=$(curl -X PUT http://localhost:8000/api/v1/controls \
  -H "Content-Type: application/json" \
  -d '{"name": "block-ssn-output"}' | jq -r '.control_id')

# Step 2: Set control configuration
curl -X PUT "http://localhost:8000/api/v1/controls/$CONTROL_ID/data" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "description": "Block Social Security Numbers in responses",
      "enabled": true,
      "execution": "server",
      "scope": {"step_names": ["generate_response"], "stages": ["post"]},
      "selector": {"path": "output"},
      "evaluator": {
        "name": "regex",
        "config": {"pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b"}
      },
      "action": {"decision": "deny"}
    }
  }'

Example: Block Toxic Input (Luna-2 AI)

Via Python SDK:

await controls.create_control(
    client,
    name="block-toxic-input",
    data={
        "description": "Block toxic or harmful user messages",
        "enabled": True,
        "execution": "server",
        "scope": {"step_names": ["process_user_message"], "stages": ["pre"]},
        "selector": {"path": "input"},
        "evaluator": {
            "name": "galileo.luna2",
            "config": {
                "metric": "input_toxicity",
                "operator": "gt",
                "target_value": 0.5
            }
        },
        "action": {"decision": "deny"}
    }
)

Via curl:

# Create control with Luna-2 evaluator
CONTROL_ID=$(curl -X PUT http://localhost:8000/api/v1/controls \
  -H "Content-Type: application/json" \
  -d '{"name": "block-toxic-input"}' | jq -r '.control_id')

curl -X PUT "http://localhost:8000/api/v1/controls/$CONTROL_ID/data" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "description": "Block toxic or harmful user messages",
      "enabled": true,
      "execution": "server",
      "scope": {"step_names": ["process_user_message"], "stages": ["pre"]},
      "selector": {"path": "input"},
      "evaluator": {
        "name": "galileo.luna2",
        "config": {
          "metric": "input_toxicity",
          "operator": "gt",
          "target_value": 0.5
        }
      },
      "action": {"decision": "deny"}
    }
  }'

Note: For Luna-2 evaluator, set GALILEO_API_KEY environment variable. See docs/REFERENCE.md for all available evaluators.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Concepts

Policy

Controls

1. Scope (When to Check)

2. Selector (What to Check)

3. Evaluator (How to Check)

4. Action (What to Do)

Complete Control Example

Defining Controls

Example: Block PII in Output (Regex)

Example: Block Toxic Input (Luna-2 AI)

FilesExpand file tree

CONCEPTS.md

Latest commit

History

CONCEPTS.md

File metadata and controls

Concepts

Policy

Controls

1. Scope (When to Check)

2. Selector (What to Check)

3. Evaluator (How to Check)

4. Action (What to Do)

Complete Control Example

Defining Controls

Example: Block PII in Output (Regex)

Example: Block Toxic Input (Luna-2 AI)