The deterministic SOAR layer that intercepts, judges, and contains rogue AI agents before they do damage.
- 🎯 What is PLAYBOOK?
- 🚀 Live Demo
- ⚡ The Problem
- 🏗️ Architecture
- ✨ Key Features
- 🛡️ Judge Layer — Zero LLM Enforcement
- 📸 Screenshots
- ⚙️ Quick Start
- 🔌 SDK Quick Start
- 🧪 Running Tests
- 📚 Tech Stack
- 🗺️ Roadmap
- 🤝 Contributing
- 📄 License
PLAYBOOK is an automated incident response system designed specifically for AI agent deployments. It integrates with Veea Lobster Trap DPI (Deep Packet Inspection for LLM traffic) to:
| Stage | Action | Latency |
|---|---|---|
| 🔍 DETECT | Normalize & inspect every agent action | < 2ms |
| ⚖️ JUDGE | Deterministic rule evaluation — zero LLM calls | < 5ms |
| 🛡️ ENFORCE | ALLOW / DENY / QUARANTINE / ESCALATE | < 1ms |
| 📦 FORENSICS | Auto-generate evidence packages + compliance reports | Async |
End-to-end p95: < 40ms 🚀
🌐 URL: https://playbooksoar.aiproofofconcept.in
| Field | Value |
|---|---|
| 👤 Username | demo |
| 🔑 Password | demo123 |
- 🎯 Dashboard: See live KPIs, critical alerts, and agent health scores
- 📡 DPI Live Feed (`/dpi-live`): Watch real-time Lobster Trap intercepts with risk scores and verdict badges
- 🐝 Agent Swarm (`/swarm`): Launch a 3-agent attack. Toggle Misbehavior Mode for 100% malicious actions
- ⚖️ Judge Layer (`/judge`): Inspect deterministic rule evaluation. Zero LLM in the enforcement path
- 📋 Incidents (`/incidents`): Review auto-generated incidents with full forensics packages
- 🏛️ Policy Builder (`/policy-builder`): Customize NIST SP 800-53 ODPs per incident type
Enterprises deployed thousands of autonomous AI agents in 2025. Each has access to databases, APIs, and sensitive data. But nobody built the security layer to watch them.
When an agent goes rogue — exfiltrating customer data 💳, injecting malicious prompts 💉, or escalating privileges 🔓 — you find out after the damage is done.
| Stat | Value |
|---|---|
| 💰 Avg. Data Breach Cost | $4.5M |
| 🏢 Orgs Lacking Agent Security | 73% |
| ⏱️ Days to Identify Breach | 287 days |
| 🛡️ Real-Time Defense | < 1% |
PLAYBOOK fixes this. We don't just block: we capture the agent's reasoning, document the evidence, execute the playbook, and generate audit-ready compliance reports. Detection through enforcement completes in under 40 milliseconds (p95), and evidence packaging runs asynchronously.
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│ 🤖 LLM App      │─────▶│ 🦞 Veea Lobster │─────▶│ 📕 PLAYBOOK     │
│ (Agent Swarm)   │      │ Trap DPI Proxy  │      │ Judge + SOAR    │
└─────────────────┘      └─────────────────┘      └─────────────────┘
                                                           │
                             ┌─────────────────────────────┼─────────────────────────────┐
                             ▼                             ▼                             ▼
                     ┌───────────────┐             ┌───────────────┐             ┌───────────────┐
                     │ 🚨 INCIDENT   │             │ 📦 FORENSICS  │             │ 🏛️ COMPLIANCE │
                     │ CREATE        │             │ PACKAGE       │             │ MAPPING       │
                     │ + ROUTE       │             │ BUILDER       │             │ EU/NIST       │
                     └───────────────┘             └───────────────┘             └───────────────┘
Stage 1: DETECT         Stage 2: CLASSIFY/JUDGE  Stage 3: ENFORCE         Stage 4: FORENSICS
+----------------+      +----------------+       +----------------+       +----------------+
| 📝 Log Tailer   |─────▶| ⚖️ Local Judge  |──────▶| 🛡️ Playbook     |──────▶| 📦 Evidence     |
| + Pre-Screen   |      | + ODP Resolver |       | + Judge Gate   |       | Package        |
+----------------+      +----------------+       +----------------+       +----------------+
                                ↓
                        🏛️ Policy Builder
                      (NIST Baseline + ODPs)
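The four stages in the diagram can be read as a synchronous hot path (detect, judge, enforce) followed by evidence packaging in the background. The sketch below is a minimal illustration of that split, not PLAYBOOK's actual code: `detect`, `judge_event`, `build_evidence`, and `handle` are hypothetical names, and the severity threshold is a placeholder rather than the real rule set.

```python
import asyncio
import unicodedata


def detect(raw: dict) -> dict:
    """Stage 1: normalize a raw agent action into a flat event."""
    return {
        "agent_id": str(raw.get("agent_id", "unknown")),
        "action": unicodedata.normalize("NFKC", str(raw.get("action", ""))).lower(),
        "severity": raw.get("severity"),
    }


def judge_event(event: dict) -> str:
    """Stage 2: deterministic check (placeholder threshold, no LLM call)."""
    severity = event["severity"]
    if not isinstance(severity, int):
        return "ESCALATE"  # fail closed on malformed input
    return "DENY" if severity >= 9 else "ALLOW"


async def build_evidence(event: dict, verdict: str, sink: list) -> None:
    """Stage 4: evidence packaging, kept off the hot path."""
    sink.append({"event": event, "verdict": verdict})


async def handle(raw: dict, sink: list) -> tuple[str, asyncio.Task]:
    event = detect(raw)                # Stage 1
    verdict = judge_event(event)       # Stage 2
    # Stage 3: the caller enforces the verdict as soon as it is returned.
    # Stage 4: packaging is scheduled as a background task, not awaited here.
    task = asyncio.create_task(build_evidence(event, verdict, sink))
    return verdict, task
```

Returning the task instead of awaiting it lets the caller act on the verdict immediately, which is one way to keep forensics from adding to enforcement latency.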
| Feature | Description |
|---|---|
| ⚖️ Deterministic Judge Layer | Zero LLM in enforcement path. Rule-based decisions in < 5ms. Immune to prompt-injection bypasses. |
| 📡 DPI Live Feed | Real-time dashboard powered by Veea Lobster Trap. See every intercepted prompt, matched rule, and verdict. |
| 🐝 Agent Swarm Simulator | Launch multi-agent attacks (FX Swap fraud, Data Exfiltration, Prompt Injection) with live WebSocket streaming. |
| 🧠 Agent Playground | Test LLM providers (OpenAI, Gemini, Claude, Azure, Ollama) with human-in-the-loop approvals. |
| 📋 16 Incident Types | Full taxonomy from Data Destruction to Regulatory Trigger. |
| 🏛️ Policy Builder | NIST SP 800-53 ODPs with 6 industry templates (Finance, Healthcare, Gov, Retail, Manufacturing, Energy). |
| 🗺️ Compliance Mapping | EU AI Act Art. 9/15/73, NIST AI RMF Agentic Profile, SOC 2 Type II, HIPAA, GDPR. |
| 🕵️ Bypass Detection | Immune to 4 known LLM-judge bypass patterns (55 test vectors). |
| 📦 Forensics Packages | Tamper-evident packages with SHA-256 manifest + HMAC signature. |
| 📧 Multi-Channel Alerts | Slack, Email (Resend), PagerDuty notifications. |
| 🔌 SDK + Middleware | Python SDK with @guard decorator, LangChain & CrewAI middleware. |
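The Forensics Packages row mentions a SHA-256 manifest with an HMAC signature. As a rough sketch of how such a tamper-evident package could be built and checked with the Python standard library (the function names and the canonical JSON encoding are assumptions for illustration, not PLAYBOOK's actual format):

```python
import hashlib
import hmac
import json


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """SHA-256 hex digest per evidence file, keyed by file name."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in sorted(files.items())}


def sign_manifest(manifest: dict[str, str], key: bytes) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the manifest."""
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_package(files: dict[str, bytes], manifest: dict[str, str],
                   signature: str, key: bytes) -> bool:
    """Check the signature in constant time, then recompute every digest."""
    if not hmac.compare_digest(sign_manifest(manifest, key), signature):
        return False
    return build_manifest(files) == manifest
```

Changing any evidence file changes its digest, and changing the manifest invalidates the signature, so both kinds of tampering are detectable by anyone holding the key.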
The Judge Layer is architecturally separated from any LLM-based classification. It defends against four known LLM-judge bypass patterns:
| Bypass Pattern | Defense |
|---|---|
| 📝 Context Window Displacement | Regex indicators (ignore previous, DAN mode, jailbreak) |
| 🔗 Indirect Tool Chaining | Suspicious tool sequences (read_file → send_email) |
| 🔤 Unicode Homoglyph Substitution | NFKC normalization + confusable character map |
| 🎭 Confidence Hijacking | Social engineering pattern detection |
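Three of the defenses in the table (regex indicators, tool-sequence checks, and NFKC normalization with a confusables map) can be sketched in a few lines. This is an illustrative sketch only: the indicator regex is seeded from the phrases listed above, and the confusables map and tool pairs are tiny hypothetical samples, not the project's real rule set.

```python
import re
import unicodedata

# Hypothetical indicator list, seeded from the phrases in the table.
INJECTION_RE = re.compile(r"ignore\s+previous|dan\s+mode|jailbreak", re.IGNORECASE)

# Tiny illustrative confusables map (Cyrillic look-alikes to Latin).
# NFKC alone does not fold these, so they need an explicit map.
CONFUSABLES = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o",
    "\u0440": "p", "\u0441": "c", "\u0456": "i",
})

# Hypothetical suspicious tool pairs: (earlier tool, later tool).
SUSPICIOUS_SEQUENCES = {("read_file", "send_email")}


def normalize(text: str) -> str:
    """Apply NFKC, then fold known look-alike characters."""
    return unicodedata.normalize("NFKC", text).translate(CONFUSABLES)


def has_injection_indicator(text: str) -> bool:
    return INJECTION_RE.search(normalize(text)) is not None


def has_suspicious_chain(tools: list[str]) -> bool:
    """True if a suspicious (earlier, later) pair occurs in that order."""
    for first, second in SUSPICIOUS_SEQUENCES:
        if first in tools and second in tools[tools.index(first) + 1:]:
            return True
    return False
```

Normalizing before matching is the important ordering: a regex applied to raw text would miss fullwidth or Cyrillic spellings of the same phrase.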
Verdicts follow a fixed matrix of severity and whether auth is present:
| Severity | Auth Present | Verdict |
|---|---|---|
| 1–3 | Any | ✅ ALLOW |
| 4–6 | Yes | ✅ ALLOW |
| 4–6 | No | 🟡 QUARANTINE |
| 7–8 | Any | 🟡 QUARANTINE |
| 9–10 | Any | 🔴 DENY |
Fail-closed: any exception returns `ESCALATE(severity 10)`.
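The verdict matrix and the fail-closed rule translate directly into a small pure function. The sketch below assumes severity is an integer from 1 to 10; `Verdict`, `Decision`, and `judge` are hypothetical names chosen for illustration, not the project's actual API.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    ALLOW = "ALLOW"
    QUARANTINE = "QUARANTINE"
    DENY = "DENY"
    ESCALATE = "ESCALATE"


@dataclass(frozen=True)
class Decision:
    verdict: Verdict
    severity: int


def judge(severity: int, auth_present: bool) -> Decision:
    """Map (severity, auth) to a verdict per the matrix; fail closed on errors."""
    try:
        if isinstance(severity, bool) or not isinstance(severity, int):
            raise TypeError(f"severity must be an int: {severity!r}")
        if not 1 <= severity <= 10:
            raise ValueError(f"severity out of range: {severity!r}")
        if severity <= 3:
            verdict = Verdict.ALLOW
        elif severity <= 6:
            verdict = Verdict.ALLOW if auth_present else Verdict.QUARANTINE
        elif severity <= 8:
            verdict = Verdict.QUARANTINE
        else:
            verdict = Verdict.DENY
        return Decision(verdict, severity)
    except Exception:
        # Fail-closed: any exception escalates at maximum severity.
        return Decision(Verdict.ESCALATE, 10)
```

Because the function has no I/O, randomness, or model call, the same inputs always produce the same decision, which is the property a repeat-variance test would check.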
🎥 [Demo video coming soon]
Real-time KPIs, critical alerts, agent health %, and live pulse indicators.
Watch Lobster Trap intercepts streaming in with verdict badges, risk-score bars, and detected pattern chips.
Launch attacks, watch live event stream with colored verdicts, and see incidents auto-generate in real time.
Inspect deterministic rule evaluation with bypass pattern detection and latency histograms.
Customize NIST baselines, edit ODPs per incident type, detect conflicts, and apply industry templates.
- 🐍 Python 3.11+
- 📦 Node.js 18+
- 🐘 PostgreSQL 15 (or SQLite for local dev)
cd backend
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
alembic upgrade head
uvicorn app.main:app --reload --port 8000

cd frontend
npm install
npm run dev

Create backend/.env:
# 🌍 Environment
ENVIRONMENT=development
DEMO_MODE=true
# 🗄️ Database
DATABASE_URL=sqlite:///./data/playbooks.db
# DATABASE_URL=postgresql+asyncpg://user:pass@localhost/playbook
# 🔐 Security
SECRET_KEY=change-me-in-production-min-64-characters-long
ACCESS_TOKEN_EXPIRE_MINUTES=60
# 🤖 Gemini (optional — for AI analysis overlay)
GEMINI_API_KEY=
# 🦞 Lobster Trap
LOBSTERTRAP_BINARY_PATH=./bin/lobstertrap
# 📧 Notifications (optional)
RESEND_API_KEY=
SMTP_HOST=
SMTP_PORT=587
SMTP_USER=
SMTP_PASSWORD=
NOTIFICATION_DEFAULT_RECIPIENTS=["security@company.com"]

pip install playbook-guard

from playbook_sdk import guard
@guard(agent_id="my-agent")
async def risky_action(data):
    """This function is automatically protected by PLAYBOOK."""
    return await process_data(data)

from playbook_sdk.middleware.langchain import PlaybookCallbackHandler
handler = PlaybookCallbackHandler(agent_id="my-agent")
# Pass handler to your LangChain agent

from crewai import Agent  # assumed: Agent is CrewAI's Agent class
from playbook_sdk.middleware.crewai import crewai_guard

@crewai_guard
class MyAgent(Agent):
    role = "Security Analyst"

cd backend
pytest -v # Full suite
pytest -v -m "integration" # Integration tests only
pytest --cov=app --cov-report=html # Coverage report

pytest tests/unit/test_bypass_detection.py # 55/55 bypass vectors
pytest tests/unit/test_determinism.py # 1000-repeat variance
pytest tests/unit/test_enforcement_accuracy.py # True positive rate

cd frontend
npm run lint
npm run typecheck
npm test

| Layer | Technology | Purpose |
|---|---|---|
| ⚡ Backend | Python 3.11, FastAPI, SQLAlchemy 2.0, Alembic | REST API + WebSocket |
| 🎨 Frontend | React 18.2, TypeScript, Tailwind CSS, Recharts | Dashboard UI |
| 🗄️ Database | SQLite (dev) / PostgreSQL 15 (prod) | Persistence |
| 🦞 DPI | Veea Lobster Trap | LLM traffic inspection |
| 🤖 LLM Overlay | Gemini 3.1 Flash Lite (async reasoning only) | Post-hoc analysis |
| 📧 Email | Resend SMTP | Alert notifications |
| 🔄 Reverse Proxy | Caddy | Auto HTTPS |
- ✅ Deterministic Judge Layer with 4 bypass patterns
- ✅ Policy Builder with NIST SP 800-53 ODPs
- ✅ Agent Swarm Simulator + Playground
- ✅ DPI Live Feed integration
- ✅ Forensics Package Builder (SHA-256 + HMAC)
- ✅ Compliance Mapping (EU AI Act, NIST, SOC 2)
- ✅ Python SDK with LangChain / CrewAI middleware
- 🔄 SupraWall competitive integration
- 🔄 Multi-tenant policy isolation
- 🔄 Terraform / Pulumi deployment modules
Contributions are welcome! Please read our Development Plan and open an issue before submitting PRs.
Apache 2.0 — see LICENSE for details.
Built for the lablab.ai TechEx Intelligent Enterprise Solutions hackathon. Track: Agent Security & AI Governance — Veea 🦞
⭐ Star us on GitHub if you find PLAYBOOK useful!
