External Action Gates

Problem

Autonomous agents with access to external tools (email, APIs, production systems) can take actions that are difficult or impossible to undo. A misinterpreted request can send an email to the wrong person, deploy broken code, or delete production data.

Solution

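Classify every agent action by whether it changes anything outside the agent's own workspace. Internal actions and external reads are safe for autonomous execution; external writes, modifications, and deletions require human approval before execution.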

Implementation

1. Classify Actions

Category	Examples	Gate
Internal read	Read files, search web, check logs	Autonomous
Internal write	Write to workspace, update memory	Autonomous
Internal compute	Analyze data, generate reports	Autonomous
External read	Check email, query APIs	Autonomous
External write	Send email, post to service	Approval required
External modify	Deploy code, modify server	Approval required
External delete	Remove data, terminate service	Double approval
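
The table above can be encoded as a lookup, sketched here in Python. The names `Gate`, `GATES`, and `gate_for` are hypothetical. Reading "double approval" as two separate approvals, and giving unknown actions the strictest gate, are assumptions of this sketch rather than part of the pattern as stated.

```python
from enum import Enum


class Gate(Enum):
    AUTONOMOUS = 0       # runs without human approval
    APPROVAL = 1         # one human approval
    DOUBLE_APPROVAL = 2  # assumed here to mean two approvals


# (scope, operation) -> gate, mirroring the classification table.
GATES = {
    ("internal", "read"): Gate.AUTONOMOUS,
    ("internal", "write"): Gate.AUTONOMOUS,
    ("internal", "compute"): Gate.AUTONOMOUS,
    ("external", "read"): Gate.AUTONOMOUS,
    ("external", "write"): Gate.APPROVAL,
    ("external", "modify"): Gate.APPROVAL,
    ("external", "delete"): Gate.DOUBLE_APPROVAL,
}


def gate_for(scope: str, operation: str) -> Gate:
    """Return the gate for an action; unclassified actions get the strictest gate."""
    return GATES.get((scope, operation), Gate.DOUBLE_APPROVAL)
```

For example, `gate_for("external", "read")` returns `Gate.AUTONOMOUS`, while `gate_for("external", "delete")` returns `Gate.DOUBLE_APPROVAL`.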

2. Approval Flow

When an agent needs to perform a gated action, it previews the action and waits:

Agent prepares the action (drafts the email, stages the deployment)
Agent presents the full preview to the human
Human approves, modifies, or rejects
Only on approval does the agent execute

Key: the preview must show exactly what will happen. "I'll send an email" is insufficient. Show the recipient, subject, and body.
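
A minimal sketch of this flow for the email case follows. `EmailDraft`, `run_gated`, and the `ask_human` and `send` callables are hypothetical names, not an existing API. The sketch assumes the human replies with `"approve"`, `"modify"` (plus an edited draft), or anything else to reject, and that a modified draft is previewed again before it is sent.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class EmailDraft:
    recipient: str
    subject: str
    body: str

    def preview(self) -> str:
        # Show exactly what will be sent, not a summary of it.
        return f"To: {self.recipient}\nSubject: {self.subject}\n\n{self.body}"


Reply = Tuple[str, Optional[EmailDraft]]  # (decision, edited draft or None)


def run_gated(draft: EmailDraft,
              ask_human: Callable[[str], Reply],
              send: Callable[[EmailDraft], None]) -> bool:
    """Preview the draft, wait for a decision, and send only on approval."""
    while True:
        decision, edited = ask_human(draft.preview())
        if decision == "approve":
            send(draft)
            return True
        if decision == "modify" and edited is not None:
            draft = edited  # loop so the edited draft is previewed again
            continue
        return False  # rejected: nothing is executed
```

Passing `send` in as a parameter keeps the side effect in one place, so the only path to execution runs through an explicit approval.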

3. Trust Escalation

Over time, frequently approved actions can be promoted to automatic approval:

Week 1: Every email send requires approval
Month 2: Emails to known internal recipients are auto-approved
Month 3: Only emails to new external contacts need approval

This balances safety with productivity. Start restrictive, relax as trust builds.
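
One way to sketch promotion is to count approvals per action class and drop the gate once a threshold is reached. Everything here is an assumption for illustration: the `TrustPolicy` name, the class labels such as `"email:known-internal"`, the threshold value, and the choice to reset trust on any rejection. The timeline above is time-based; this sketch swaps in a count-based rule because it is simpler to show.

```python
from collections import Counter

APPROVALS_BEFORE_PROMOTION = 20  # hypothetical threshold


class TrustPolicy:
    def __init__(self, threshold: int = APPROVALS_BEFORE_PROMOTION):
        self.threshold = threshold
        self.approved = Counter()  # action class -> consecutive approvals

    def record(self, action_class: str, approved: bool) -> None:
        if approved:
            self.approved[action_class] += 1
        else:
            # Assumption: a rejection resets trust for that class.
            self.approved[action_class] = 0

    def needs_approval(self, action_class: str) -> bool:
        # Start restrictive: an unseen class has zero approvals.
        return self.approved[action_class] < self.threshold
```

Keeping separate classes for known internal recipients and new external contacts lets the first be promoted while the second stays gated.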

4. Audit Trail

Log every gated action with:

What was requested
Whether it was approved or rejected
Who approved it
What was actually executed

This is essential for debugging ("why did the agent send that email?") and for building the trust history that enables escalation.
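
The four fields above map naturally onto one JSON line per gated action. This is a sketch under assumptions: the function names, field names, and file layout are hypothetical, and the timestamp is an addition not listed above.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(requested: str, decision: str,
                 approver: Optional[str], executed: Optional[str]) -> str:
    """Build one JSON line covering the four audit fields."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),  # added for ordering
        "requested": requested,  # what the agent asked to do
        "decision": decision,    # "approved" or "rejected"
        "approver": approver,    # who made the decision
        "executed": executed,    # what actually ran; None if nothing did
    })


def append_audit(path: str, line: str) -> None:
    # Append-only: earlier entries are never rewritten.
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

Recording `requested` and `executed` separately makes it visible when a human modified the action before approving it.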

5. Prefer Reversible Over Destructive

When possible, use reversible operations:

trash instead of rm
Soft delete instead of hard delete
Feature flags instead of code removal
Draft/preview instead of direct send
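
As an illustration of the soft-delete item, the sketch below marks records as deleted instead of removing them, so the operation can be undone. The `deleted_at` field and the function names are hypothetical, and records are plain dictionaries to keep the example self-contained.

```python
from datetime import datetime, timezone


def soft_delete(record: dict) -> None:
    """Mark a record as deleted instead of removing it."""
    record["deleted_at"] = datetime.now(timezone.utc).isoformat()


def restore(record: dict) -> None:
    """Undo a soft delete."""
    record["deleted_at"] = None


def visible(records: list) -> list:
    """Return only the records that have not been soft-deleted."""
    return [r for r in records if r.get("deleted_at") is None]
```

A mistaken delete then costs one call to `restore` rather than a recovery from backup.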

Trade-offs

Friction: Approval gates slow down autonomous operation. That's the point, but it is frustrating when the action is obviously safe.
Human bottleneck: The system is only as fast as the human approver. Batch approvals help.
False security: A human rubber-stamping approvals without reading them defeats the purpose.

When to Skip

Purely internal agents that never interact with external systems.
Sandboxed environments where all actions are safely reversible.
Emergency operations where speed matters more than caution (define these in advance).

Related Patterns

Isolated Workspaces - credential scoping limits what agents can access
Identity as Architecture - identity helps define which agents need stricter gates
