Hard spending limits for AI agents on AWS - outside the runtime, at cloud scale.
AWS makes it surprisingly difficult to set hard budget limits. CloudWatch billing alarms are delayed by hours. AWS Budgets can send notifications but can't stop a running workload. There's no native way to say "this workload can spend $5 on Bedrock, Anthropic, or OpenAI and then stop." For human operators that's annoying. For autonomous AI agents that make their own API calls, loop on failures, and spawn sub-agents - it's a real problem.
Klanker Maker is an open-source platform that puts enforceable spending limits between your AI agents and your AWS bill. It turns declarative YAML profiles into budget-capped, policy-locked AWS sandboxes - each with its own Security Group boundary, IAM role, network allowlists, and a dollar ceiling that actually stops the workload when the money runs out. The enforcement lives in the infrastructure (proxy layer + IAM revocation), not in the agent runtime, so no agent can spend past its budget regardless of how it's built or what SDK it uses.
Define what an agent is allowed to do. Set how much it can spend on compute and AI tokens. Walk away.
Art by Mike Wigmore (@mikewigmore)
$ km create profiles/goose.yaml
✓ Profile validated
✓ Budget: compute $0.50, AI $1.00, warning at 80%
✓ Budget enforcer Lambda deployed
✓ Metadata stored (alias: goose)
──────────────────────────────────────────────────
Sandbox goose-e6c7d024 created successfully. (43s)
TTL: 4h (expires 11:42:15 PM EDT)
$ km list
# ALIAS SANDBOX ID STATUS TTL
1 goose goose-e6c7d024 running 3h42m
$ km status goose-e6c7d024
Sandbox ID: goose-e6c7d024
Profile: goose
Substrate: ec2
Region: us-east-1
Status: running
Created At: 2026-03-31 7:42:15 PM EDT
TTL Expiry: 2026-03-31 11:42:15 PM EDT
Budget:
Compute: $0.0312 / $0.5000 (6.2%)
AI: $0.4200 / $1.0000 (42.0%)
The agent ecosystem is exploding. My GitHub stars tell the story - in the last few months alone I've starred:
Coding agents that need real compute, real network, and real credentials:
- Goose - Block's autonomous agent that installs deps, edits files, runs tests, orchestrates workflows
- Aider - AI pair programming in your terminal with automatic git commits
- OpenDev - open-source coding agent in the terminal
- open-swe - LangChain's asynchronous coding agent
- DeepCode - agentic coding for Paper2Code, Text2Web, Text2Backend
- deepagents - LangGraph harness with planning, filesystem, and sub-agent spawning
Multi-agent orchestrators that spawn fleets of workers:
- agent-orchestrator - parallel coding agents with autonomous CI fixes and code reviews
- nanoclaw - lightweight agent on Anthropic's Agent SDK, runs in containers
- openclaw - personal AI assistant across platforms
- pi-mono - coding agent CLI, unified LLM API, Slack bot, vLLM pods
- gobii-platform - always-on AI workforce
- autoresearch - Karpathy's agents running research on single-GPU training automatically
Security and red-team agents that definitely need containment:
- redamon - AI-powered red team framework, recon to exploitation, zero human intervention
- raptor - turns Claude Code into an offensive/defensive security agent
- hexstrike-ai - 150+ cybersecurity tools orchestrated by AI agents
- strix - open-source AI hackers that find and fix vulnerabilities
- shannon - autonomous white-box AI pentester for web apps and APIs
- EVA - AI-assisted penetration testing agent
Sandbox platforms solving the same problem from different angles:
- agent-sandbox - Kubernetes SIG for isolated agent runtimes
- E2B - secure cloud environments for enterprise agents
- OpenSandbox - Alibaba's sandbox platform with Docker/K8s runtimes
- void-box - composable agent runtime with enforced isolation
- monty - Pydantic's minimal, secure Python interpreter in Rust for AI
Every one of these projects needs somewhere safe to run. The common pattern is either "trust the agent" (bad) or "containerize it locally" (insufficient - no real cloud resources, no real credentials, no real network). What's missing is cloud-native physical isolation - a real VPC, real IAM boundaries, real network controls, with a budget ceiling that prevents a $10 experiment from becoming a $10,000 AWS bill.
That's what Klanker Maker builds. The sandbox is a compiled policy object - you declare the constraints and the infrastructure is the compiled artifact:
- Budget ceiling - set a dollar cap for compute and AI API spend per sandbox. At 80% you get a warning email. At 100% the sandbox is suspended, not destroyed - top it up and resume.
- Network enforcement (three modes) - choose your enforcement layer per profile:
  - Proxy mode (default) - iptables DNAT redirects traffic to userspace proxy sidecars for MITM inspection. Traditional approach, works everywhere.
  - eBPF mode - Cilium-style cgroup BPF programs enforce DNS/HTTP/TLS-SNI allowlists directly in the kernel. No iptables, no DNAT bypass possible (fixes the root-user escape). Four BPF programs (connect4, sendmsg4, sockops, egress) with an LPM trie for CIDR allowlists, a userspace DNS resolver that populates BPF maps on the fly, and a ring buffer for structured audit events. E2E verified on AL2023 kernel 6.18 across 14 iterations.
  - Both mode (gatekeeper) - eBPF `connect4` as the primary enforcer in block mode, with selective DNAT rewrite to a transparent proxy for L7-required hosts. The proxy recovers original destinations via pinned BPF maps (`src_port_to_sock` → `sock_to_original_ip`). Non-L7 traffic flows direct — never touches the proxy. E2E verified: allowed repos clone, blocked repos get 403, evil.com gets EPERM, non-proxy hosts go direct.
- Scoped identity - each sandbox gets its own IAM role session, region-locked, time-limited, with only the permissions the profile declares.
- Automatic lifecycle - TTL auto-destroy, idle timeout, artifact upload on exit (including on spot interruption), and email notifications for every lifecycle event.
- Spot-first economics - EC2 Spot and Fargate Spot by default. A `t3.medium` spot instance in `us-east-1` costs ~$0.01/hr. Run 10 agent sandboxes for a full workday for under $1 in compute.
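The arithmetic behind that "under $1" figure, as a quick sanity check (the spot rate is the approximate number quoted above, not a live price):

```python
# Back-of-the-envelope cost of a fleet of spot sandboxes.
sandboxes = 10
workday_hours = 8
spot_rate_per_hr = 0.01  # ~t3.medium spot in us-east-1 (approximate)

total = sandboxes * workday_hours * spot_rate_per_hr
print(f"${round(total, 2)}")  # $0.8 in compute for the full workday
```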
The difference between Klanker Maker and the other sandbox platforms: this is pure AWS infrastructure - no orchestration layer to trust, no shared runtime, no container escape surface. Each region has a shared VPC (provisioned once by km init), and every sandbox gets its own Security Groups, IAM role, and sidecar enforcement. The isolation is at the network policy and identity layer, backed by real AWS primitives.
# Profile-level enforcement selection
spec:
  network:
    enforcement: "ebpf"   # or "proxy" (default) or "both"
    egress:
      allowedDNSSuffixes: [".amazonaws.com", ".github.com", ...]
      allowedHosts: ["api.anthropic.com", ...]

Klanker Maker follows AWS Organizations best practices, supporting either a three-account or two-account topology. In both models, sandboxes run in a dedicated application account - completely separated from the account that owns the domain and applies SCP policies.
| Account | Role | What Lives Here | Why Separate |
|---|---|---|---|
| Management | DNS, identity, org root | Route53 hosted zone, domain registration, AWS SSO, Organizations root | Domain and identity are org-wide - they don't belong in a sandbox blast radius |
| Terraform | State and provisioning | S3 state buckets, DynamoDB lock tables, cross-account provisioning role | Terraform state contains every resource ARN and secret path - isolating it limits exposure if the application account is compromised |
| Application | Sandbox execution | Regional VPCs, EC2/ECS instances, IAM sandbox roles, SES, Lambda handlers, DynamoDB budget table, S3 artifacts, CloudWatch Logs | This is where agents run - if an agent escapes its sandbox, it can only reach resources in this account, not state or DNS |
In a two-account topology, the Terraform and Application accounts are the same - set both account IDs to the same value during km configure. This is simpler for development while keeping the management account separate for SCP policies and DNS.
The operator authenticates via AWS SSO with named profiles:
# Authenticate to all three accounts
aws sso login --profile klanker-management # DNS, domain
aws sso login --profile klanker-terraform # State, provisioning
aws sso login --profile klanker-application # VPC/network init

The km CLI selects the right AWS profile per command. Commands are grouped by workflow stage:
Setup (once)
| Command | AWS Profile | What it does |
|---|---|---|
| `km configure` | - | Set domain, account IDs, SSO URL, region, `--max-sandboxes` |
| `km configure github` | klanker-terraform | Configure GitHub App token integration |
| `km bootstrap` | klanker-terraform | Deploy SCP containment policy + KMS key + artifacts bucket |
| `km init` | klanker-application | Build Lambdas/sidecars, provision shared VPC/network |
| `km doctor` | klanker-terraform | Validate platform health across all accounts (20 checks) |
| `km info` | - | Show platform config, accounts, SES quota, AWS spend, DynamoDB tables |
Sandbox lifecycle
| Command | AWS Profile | What it does |
|---|---|---|
| `km validate <profile>` | - | Check a profile YAML against the schema |
| `km create <profile>` | klanker-terraform | Provision a sandbox from a profile (`--no-bedrock`, `--docker`, `--alias`, `--on-demand`, `--ttl`, `--idle`) |
| `km clone <sandbox>` | klanker-terraform | Duplicate a running sandbox (`--alias`, `--count`, `--no-copy`) |
| `km list` (alias: `ls`) | klanker-terraform | List sandboxes with live status (DynamoDB scan; `--wide`, `--json`, `--tags`) |
| `km status <sandbox>` | klanker-terraform | Budget, identity, idle countdown, resources |
| `km shell <sandbox>` (alias: `sh`) | klanker-terraform | SSM session (`--root`, `--ports`, `--no-bedrock`, `--learn`) |
| `km agent <sandbox>` | klanker-terraform | Launch AI agent in sandbox (`--claude`, `--codex`) |
Observability
| Command | AWS Profile | What it does |
|---|---|---|
| `km logs <sandbox>` | klanker-terraform | Tail CloudWatch audit logs |
| `km otel <sandbox>` | klanker-terraform | AI spend summary by provider + OTEL S3 data |
| `km otel --prompts` | klanker-terraform | User prompts with timestamps |
| `km otel --timeline` | klanker-terraform | Conversation turns with per-turn cost |
| `km otel --events` | klanker-terraform | Full event stream (API calls, tool calls) |
| `km otel --tools` | klanker-terraform | Tool call history with parameters and duration |
Budget and lifecycle management
| Command | AWS Profile | What it does |
|---|---|---|
| `km budget add <sandbox>` | klanker-terraform | Top up compute or AI budget |
| `km extend <sandbox> <dur>` | klanker-terraform | Add time before TTL expires |
| `km pause <sandbox>` | klanker-terraform | Pause (hibernate) instance, preserve infrastructure |
| `km stop <sandbox>` | klanker-terraform | Stop instance, preserve infrastructure |
| `km lock <sandbox>` | klanker-terraform | Lock sandbox to prevent accidental destroy/stop/pause |
| `km unlock <sandbox>` | klanker-terraform | Unlock sandbox, re-enable lifecycle commands |
| `km rsync save/load <sandbox>` | klanker-terraform | Save/restore sandbox home directory snapshots |
| `km resume <sandbox>` | klanker-terraform | Resume a paused or stopped sandbox |
| `km destroy <sandbox>` | klanker-terraform | Teardown sandbox (`--remote` by default; forced local for Docker substrate) |
| `km kill <sandbox>` | klanker-terraform | Alias for `km destroy` |
| `km at '<time>' <cmd>` | klanker-terraform | Schedule a deferred/recurring operation (create, destroy, pause, resume, budget-add, etc.) |
| `km at list` | klanker-terraform | List scheduled operations |
| `km at cancel <name>` | klanker-terraform | Cancel a scheduled operation |
| `km roll` | klanker-terraform | Rotate platform and sandbox credentials (`--platform`, `--sandbox`, `--dry-run`) |
Email (operator-side)
| Command | AWS Profile | What it does |
|---|---|---|
| `km email send` | klanker-terraform | Send signed email between sandboxes or to/from operator (`--cc`, `--use-bcc`, `--reply-to`) |
| `km email read <sandbox>` | klanker-terraform | Read a sandbox mailbox with signature verification and auto-decryption (`--json`, `--mark-read`) |
Teardown (reverse of setup)
| Command | AWS Profile | What it does |
|---|---|---|
| `km uninit` | klanker-application | Destroy all shared regional infrastructure |
Klanker Maker is forkable. All platform-specific values - domain, account IDs, SSO URL, region preferences - are configurable via km configure:
km configure
Domain: mysandboxes.example.com
Management account ID: 111111111111
Terraform account ID: 222222222222
Application account ID: 333333333333
SSO start URL: https://myorg.awsapps.com/start
Primary region: us-east-1

No hardcoded account IDs. No hardcoded domains. A fork with a different domain works end-to-end after km configure.
Klanker Maker uses explicit allowlists everywhere - if it's not in the policy, it's denied. There is no "default allow."
Sandboxes are accessed exclusively through AWS SSM Session Manager:
- Zero open inbound ports - Security Groups have no SSH ingress rules. Port 22 doesn't exist.
- No SSH keys to manage - no generation, rotation, distribution, or leaked keys on GitHub.
- IAM-gated access - who can connect is controlled by IAM policy, not by who has a `.pem` file.
- Full session audit - every session and every command is logged to CloudTrail and CloudWatch. There is no "off the record."
- No bastion hosts - no jump boxes, no VPN. SSM connects through the agent, even in private subnets with no internet access.
Even if a sandbox IAM role is misconfigured - or an agent finds a way to escalate within the application account - the Service Control Policy (SCP) acts as an org-level backstop that cannot be bypassed from within the account. SCPs are enforced by AWS Organizations at the API layer, before IAM policy evaluation.
The km-sandbox-containment SCP is deployed to the management account and attached to the application account. It contains 6 deny statements:
| Statement | What It Blocks | Why It Matters |
|---|---|---|
| DenyInfraAndStorage | SG mutation, VPC/subnet/route/IGW/NAT creation, VPC peering, Transit Gateway, snapshot/image creation and export | A compromised sandbox cannot open new network paths, create escape routes to the internet, peer with other VPCs, or exfiltrate data via EBS snapshots or AMI copies |
| DenyInstanceMutation | RunInstances, ModifyInstanceAttribute, ModifyInstanceMetadataOptions | Prevents launching rogue EC2 instances or disabling IMDSv2 (which would enable SSRF credential theft via the metadata service) |
| DenyIAMEscalation | CreateRole, AttachRolePolicy, DetachRolePolicy, PassRole, AssumeRole | Blocks the classic IAM privilege escalation chain: create a new admin role → attach AdministratorAccess → assume it |
| DenySSMPivot | SendCommand, StartSession | Prevents a compromised sandbox from using SSM to pivot laterally into other sandbox instances |
| DenyOrgDiscovery | organizations:List*, organizations:Describe* | Prevents enumeration of the org structure, other accounts, and OUs - information useful for targeting lateral movement |
| DenyOutsideRegion | All regional actions outside allowed regions | Region-locks the entire account to prevent resource creation in regions where there's no monitoring or VPC infrastructure |
Each statement uses ArnNotLike conditions to carve out trusted operator roles (SSO, provisioner, lifecycle handlers). The carve-outs are minimal - for example, the budget enforcer Lambda only gets an IAM carve-out (it needs AttachRolePolicy/DetachRolePolicy to revoke Bedrock access), not a network or instance carve-out.
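To make the carve-out pattern concrete, here is a sketch of what one such deny statement could look like. The action list is abbreviated and the exempted role ARNs are placeholders (using the example application account ID from `km configure` above), not the shipped policy:

```json
{
  "Sid": "DenyIAMEscalation",
  "Effect": "Deny",
  "Action": [
    "iam:CreateRole",
    "iam:AttachRolePolicy",
    "iam:DetachRolePolicy",
    "iam:PassRole",
    "sts:AssumeRole"
  ],
  "Resource": "*",
  "Condition": {
    "ArnNotLike": {
      "aws:PrincipalArn": [
        "arn:aws:iam::333333333333:role/aws-reserved/sso.amazonaws.com/*",
        "arn:aws:iam::333333333333:role/km-budget-enforcer-example"
      ]
    }
  }
}
```

Because SCPs are evaluated before IAM, the deny applies to every principal in the account except those matching the `ArnNotLike` patterns.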
The SCP is deployed via km bootstrap --dry-run=false. Run km bootstrap --show-prereqs to see the exact IAM role and trust policy that must be created in the management account first.
| Layer | Control | Enforcement |
|---|---|---|
| Organization | SCP sandbox containment | Org-level deny on SG/network/IAM/instance/SSM/region - cannot be bypassed from within the account |
| Account | Three-account isolation | Sandbox blast radius limited to Application account; state and DNS unreachable |
| Network | VPC Security Groups | Primary boundary - blocks all egress except proxy paths |
| DNS | DNS proxy sidecar / eBPF resolver | Allowlisted suffixes only; non-matching → NXDOMAIN |
| HTTP | HTTP proxy sidecar / eBPF connect4 | Allowlisted hosts only; non-matching → 403 / EPERM |
| eBPF | Cgroup BPF programs (connect4, sendmsg4, sockops, egress) | Kernel-level enforcement; LPM trie allowlist; ring buffer audit; no root bypass |
| Identity | Scoped IAM sessions | Region-locked, time-limited, minimal permissions |
| Email | Ed25519 signed email | Per-sandbox key pairs; profile-controlled signing, verification, and encryption policies |
| Secrets | SSM Parameter Store + KMS | Allowlisted refs only; per-sandbox encryption key with auto-rotation |
| Metadata | IMDSv2 enforced | Token-required; blocks SSRF credential theft via instance metadata |
| Source | GitHub App scoped tokens | Per-repo, per-ref, per-permission; short-lived installation tokens refreshed via Lambda |
| Filesystem | Path-level enforcement | Writable vs read-only directories at OS level |
| Audit | Command + network logging | Secret-redacted; delivered to CloudWatch/S3 |
| TLS Observability | eBPF SSL uprobes (OpenSSL, Go, BoringSSL) | Passive plaintext capture without MITM certs; independent audit trail |
| Telemetry | OTEL observability | Claude Code prompts, tool calls, API requests, cost metrics → OTel Collector → S3 |
| Budget | Compute + AI spend tracking | DynamoDB real-time metering; proxy 403 + IAM revocation at ceiling |
When spec.network.enforcement is set to "ebpf" or "both", the sandbox uses Cilium-style cgroup BPF programs instead of (or alongside) iptables DNAT. This is the same approach used by Cilium in Kubernetes - attaching BPF programs to a cgroup to intercept all network syscalls from processes in that group. E2E verified across 14+ iterations on AL2023 kernel 6.18.
Four BPF programs, one cgroup:
Sandbox Cgroup (/sys/fs/cgroup/km.slice/km-{id}.scope)
│
├── cgroup/connect4 — TCP connect() hook
│ ├── Dual-PID exemption (enforcer + proxy sidecar)
│ ├── LPM trie lookup: is dest IP in allowed_cidrs?
│ ├── If denied → return EPERM (connection refused)
│ ├── If allowed + proxy-marked → stash original dest, rewrite to 127.0.0.1:3128
│ └── Emit structured audit event to ring buffer
│
├── cgroup/sendmsg4 — UDP sendmsg() hook
│ ├── Intercept DNS (port 53)
│ └── Redirect to local resolver (127.0.0.1:53)
│
├── sockops — TCP state transitions
│ └── Map source_port → socket_cookie (transparent proxy recovers real dest)
│
└── cgroup_skb/egress — Packet-level backstop
├── Parse IPv4 header, check allowed_cidrs
└── Drop packets to non-allowlisted IPs (L3 defense-in-depth)
How the allowlist stays fresh: A userspace DNS resolver (127.0.0.1:53) checks every DNS query against the profile's allowedDNSSuffixes. Allowed queries are forwarded to VPC DNS; resolved IPs are injected into the BPF allowed_cidrs LPM trie map with TTL-based expiry. For L7-required hosts (GitHub, Bedrock), IPs are also inserted into http_proxy_ips for selective proxy redirect. The allowlist is dynamic — it grows as the agent resolves new hosts and shrinks as DNS TTLs expire.
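The suffix check that gates each DNS query can be sketched in a few lines. This is an illustrative minimal version (not the shipped resolver), assuming the suffix semantics described above: a suffix like `.github.com` matches the apex domain and any subdomain.

```python
# Sketch of the resolver's allowlist gate: allowed queries are forwarded
# to VPC DNS; everything else is answered with NXDOMAIN.
ALLOWED_DNS_SUFFIXES = [".amazonaws.com", ".github.com", ".anthropic.com"]

def dns_query_allowed(qname: str, suffixes=ALLOWED_DNS_SUFFIXES) -> bool:
    """True if the queried name matches an allowlisted suffix."""
    name = qname.rstrip(".").lower()  # normalize trailing dot and case
    for suffix in suffixes:
        bare = suffix.lstrip(".")
        # ".github.com" matches github.com itself and any subdomain of it
        if name == bare or name.endswith("." + bare):
            return True
    return False

print(dns_query_allowed("api.github.com."))  # True  -> forward to VPC DNS
print(dns_query_allowed("evil.com"))         # False -> answer NXDOMAIN
```

Only names that pass this gate ever resolve, so only their IPs ever enter the `allowed_cidrs` trie.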
Why cgroups? The BPF programs are scoped to the sandbox cgroup, not the whole instance. The enforcer process, SSM agent, and sidecars run outside the cgroup and are unaffected. This is the same isolation model that makes this approach portable to EKS pods, Docker cgroups, and other container runtimes in future substrates.
Transparent proxy (both mode): When connect4 rewrites a connection's destination to the local proxy, the sandbox app sends raw TLS (not HTTP CONNECT). A TransparentListener in the HTTP proxy peeks the first byte (0x16 = TLS ClientHello), then recovers the original destination via a three-step BPF map lookup chain: src_port_to_sock[peer_port] → sock_to_original_ip[cookie] → sock_to_original_port[cookie]. This enables L7 inspection (GitHub repo filtering, Bedrock token metering) without HTTP_PROXY environment variable cooperation from the client.
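The first-byte peek is the key trick: the listener inspects the leading byte without consuming it, so the TLS handshake still arrives intact at whatever handler takes over. A minimal Python sketch of just that step (illustrative only; the real TransparentListener lives in the proxy sidecar):

```python
import socket

TLS_HANDSHAKE = 0x16  # first byte of a TLS record carrying a ClientHello

def peek_is_tls(conn: socket.socket) -> bool:
    """Peek one byte without consuming it; 0x16 means raw TLS,
    anything else (e.g. 'G' for GET) means plaintext HTTP."""
    first = conn.recv(1, socket.MSG_PEEK)
    return len(first) == 1 and first[0] == TLS_HANDSHAKE

# A local socketpair stands in for a DNAT-redirected connection.
client, server = socket.socketpair()
client.sendall(b"\x16\x03\x01\x00\xc5")  # start of a TLS ClientHello record
print(peek_is_tls(server))   # True - and the bytes are still unread
print(server.recv(5)[0] == 0x16)  # True - record head intact after the peek
client.close(); server.close()
```

After classifying the connection, the proxy consults the BPF map chain to learn where the client was actually trying to go.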
Editable diagram: docs/diagrams/ebpf-architecture.excalidraw
Alongside kernel-level enforcement, eBPF uprobes provide passive TLS plaintext capture for audit and observability — without MITM certificates. E2E verified on AL2023 with 8 probes attaching to OpenSSL 3.2.2:
| TLS Library | Used By | Uprobe Target | Status |
|---|---|---|---|
| OpenSSL (libssl.so.3) | curl, wget, Python, Ruby | SSL_write / SSL_read / SSL_write_ex / SSL_read_ex | E2E verified (8 probes) |
| Go crypto/tls | Goose (if Go) | writeRecordLocked / Read | Schema-ready (per-RET offsets, no uretprobe) |
| BoringSSL (Bun) | Claude Code | SSL_write | Schema-ready (byte-pattern offset discovery) |
| rustls | Future Rust agents | rustls_connection_write_tls | Schema-ready |
What uprobes add that MITM can't: Visibility into traffic that bypasses the proxy (if any), an audit trail independent of proxy logs, and plaintext capture without certificate trust issues. The observer logs structured JSON events with HTTP method, URL, host, and response status for every TLS connection. Git smart-HTTP (clone/push) uses HTTP/1.1 and is captured correctly.
What uprobes can't replace: Active request blocking (uprobes are passive — they observe but cannot deny), HTTP/2 body parsing (GitHub API and Bedrock use HTTP/2 — uprobe captures HPACK-compressed binary, not parseable HTTP/1.1), and the transparent proxy's active enforcement (repo filtering, budget 403s).
The two eBPF layers work together:
- Phase 40 (enforcement): cgroup BPF programs decide allow/deny/redirect at the kernel level
- Phase 41 (observability): SSL uprobes capture TLS plaintext for audit logging alongside the MITM proxy
Editable source: docs/sandbox-architecture.excalidraw - open in excalidraw.com or the VS Code Excalidraw extension.
- Configure with `km configure` (alias: `km conf`) - set your domain, account IDs, SSO URL, region (once)
- Bootstrap with `km bootstrap` - deploys SCP, KMS key, artifacts bucket (once)
- Initialize with `km init --region us-east-1` - builds Lambdas/sidecars, provisions VPC, DynamoDB (budgets, identities, sandboxes), SES, TTL handler (once per region)
- Check with `km doctor` - validates all 12 platform health checks, assumes cross-account role for SCP verification
- Define a SandboxProfile in YAML - budget, lifecycle, network policy, identity, sidecars
- Validate with `km validate <profile.yaml>`
- Create with `km create <profile>` - compiles to Terragrunt inputs, provisions infrastructure, shows elapsed time
- Monitor with `km list` (alias-first, lock icons, narrow default, `--wide` for all columns) / `km status` (budget, identity, idle countdown with color) / `km logs` / `km otel` (telemetry + spend)
- Connect with `km shell 1` (restricted user) or `km shell 1 --root` (operator access)
- Port forward with `km shell 1 --ports 8080:80,3000` (Docker-style syntax)
- Extend with `km extend 1 2h` - add time before TTL expires
- Stop with `km stop 1` - stop instance, preserve infrastructure for restart
- Top-up with `km budget add 1 --compute 5.00` - add compute or AI budget
- Destroy with `km destroy 1` (Lambda-dispatched by default) or `km destroy 1 --remote=false` (local terragrunt)
- Teardown with `km uninit --region us-east-1` - reverse of init, destroys all regional infrastructure
Profiles use a Kubernetes-style schema at klankermaker.ai/v1alpha1. Here's the goose profile - a working example that provisions a Goose agent sandbox with Bedrock access, budget enforcement, OTEL telemetry, hibernation support, EFS shared storage, and GitHub repo allowlisting:
apiVersion: klankermaker.ai/v1alpha1
kind: SandboxProfile
metadata:
  name: goose
  labels:
    tier: development
    tool: goose
  prefix: gebpfgk
spec:
  lifecycle:
    ttl: "4h"
    idleTimeout: "1h"
    teardownPolicy: stop
  runtime:
    substrate: ec2
    spot: false
    instanceType: t3.medium
    region: us-east-1
    rootVolumeSize: 15
    hibernation: true        # preserve RAM state on pause (on-demand only)
    mountEFS: true           # mount regional EFS shared filesystem
    efsMountPoint: /shared   # EFS mount path (default: /shared)
    additionalVolume:        # extra EBS volume for data
      size: 20               # GB
      mountPoint: /data
  execution:
    shell: /bin/bash
    workingDir: /workspace
    useBedrock: true         # route Anthropic API via AWS Bedrock (SigV4 auth)
    privileged: false
    env:
      SANDBOX_MODE: goose-ebpf-gatekeeper
      GOOSE_PROVIDER: aws_bedrock
      GOOSE_MODEL: us.anthropic.claude-opus-4-6-v1
      GOOSE_MODE: auto
      GOOSE_TELEMETRY_ENABLED: "false"
      CODEX_CA_CERTIFICATE: /usr/local/share/ca-certificates/km-proxy-ca.crt
      OPENAI_API_KEY: ""
    configFiles:
      "/home/sandbox/.claude/settings.json": |
        {"trustedDirectories":["/home/sandbox","/workspace"]}
    rsyncPaths:
      - ".gitconfig"
      - ".config/goose"
      - ".claude"
      - ".claude.json"
      - ".codex"
    initCommands:
      - "yum install -y git nodejs npm python3 python3-pip bzip2 jq tar gzip unzip tmux"
      - "HOME=/root curl -fsSL https://github.com/block/goose/releases/download/stable/download_cli.sh | HOME=/root CONFIGURE=false bash"
      - "npm install -g @anthropic-ai/claude-code@2.1.108"
      - "mkdir -p /workspace && chown -R sandbox:sandbox /workspace"
  budget:
    compute:
      maxSpendUSD: 0.50
    ai:
      maxSpendUSD: 1.00
    warningThreshold: 0.80
  network:
    enforcement: both   # "proxy" (default), "ebpf", or "both"
    egress:
      allowedDNSSuffixes:
        - ".amazonaws.com"
        - ".anthropic.com"
        - ".claude.ai"
        - ".claude.com"
        - ".sentry.io"
        - ".cloudfront.net"
        - ".github.com"
        - ".githubusercontent.com"
        - ".npmjs.org"
        - ".npmjs.com"
        - ".nodejs.org"
        - ".npmmirror.com"
        - ".openai.com"
        - ".chatgpt.com"
        - ".pypi.org"
        - ".pythonhosted.org"
        - ".pulsemcp.com"
        - ".google.com"
        - ".google-analytics.com"
        - ".googletagmanager.com"
        - ".googleapis.com"
        - ".featuregates.org"
        - ".statsig.com"
      allowedHosts:
        - "statsig.com"
  sourceAccess:
    mode: allowlist
    github:
      allowedRepos:
        - "whereiskurt/meshtk"
        - "whereiskurt/defcon.run.34"
        - "whereiskurt/klanker-maker"
      allowedRefs:
        - "main"
        - "develop"
        - "feature/*"
        - "fix/*"
  identity:
    roleSessionDuration: "1h"
    allowedRegions:
      - us-east-1
    sessionPolicy: minimal
  sidecars:
    dnsProxy:
      enabled: true
      image: km-dns-proxy:latest
    httpProxy:
      enabled: true
      image: km-http-proxy:latest
    auditLog:
      enabled: true
      image: km-audit-log:latest
    tracing:
      enabled: true
      image: km-tracing:latest
  observability:
    commandLog:
      destination: cloudwatch
      logGroup: /klankrmkr/sandboxes
    networkLog:
      destination: cloudwatch
      logGroup: /klankrmkr/network
    claudeTelemetry:
      enabled: true
      logPrompts: true
      logToolDetails: true
      learnMode: false
    tlsCapture:              # eBPF SSL uprobe plaintext capture (Phase 41)
      enabled: true
      libraries: [openssl]
      capturePayloads: false
  artifacts:
    paths:
      - /workspace
    maxSizeMB: 500
  email:
    signing: required
    verifyInbound: required
    encryption: required
  cli:
    noBedrock: true

| Profile | TTL | Network | Budget | Use Case |
|---|---|---|---|---|
| `hardened` | 4h | eBPF+proxy (both), AWS services only | No budget section | Production-adjacent testing |
| `sealed` | 1h | Proxy, .anthropic.com + .npmjs.org only | $5 compute / $10 AI | Minimal egress, short-lived execution |
| `goose` | 4h | eBPF+proxy (both), Anthropic, GitHub, npm, PyPI, OpenAI, Goose extensions | $0.50 compute / $1 AI | Goose agent (Block) with Bedrock, MCP extensions |
| `codex` | 4h | Proxy, OpenAI, GitHub | $2 compute / $5 AI | OpenAI Codex agent |
| `ao` | 8h | eBPF+proxy (both), Anthropic, GitHub, npm, OpenAI | $4 compute / $10 AI | Multi-agent orchestration (Claude + Codex + AO) |
| `learn` | 2h | eBPF+proxy (both), wide-open TLD suffixes | $2 compute / $0 AI | Traffic observation for profile generation |
Profiles support inheritance via extends - start from a base and override what you need.
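A minimal sketch of what an inheriting profile could look like. The exact `extends` syntax and its placement in the schema are assumptions for illustration, not taken from the documented schema:

```yaml
# Hypothetical derived profile: inherit everything from goose,
# override only the AI ceiling and TTL.
apiVersion: klankermaker.ai/v1alpha1
kind: SandboxProfile
metadata:
  name: goose-tight
extends: goose            # assumed field name/location
spec:
  lifecycle:
    ttl: "1h"
  budget:
    ai:
      maxSpendUSD: 0.25   # everything else comes from the base profile
```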
Run Claude (or any agent) non-interactively inside a sandbox via km agent run. Prompts are dispatched via SSM SendCommand, agents run in persistent tmux sessions that survive disconnects, and output is stored on disk + S3 for fast retrieval.
# Fire-and-forget — agent runs in tmux, returns immediately
km agent run sb-abc123 --prompt "fix the failing tests"
# Wait for completion — blocks until done, prints JSON result
km agent run sb-abc123 --prompt "What model are you?" --wait
# Interactive — attach to tmux, watch Claude work live (Ctrl-B d to detach)
km agent run sb-abc123 --prompt "refactor auth module" --interactive
# Attach to a running agent's tmux session
km agent attach sb-abc123
# Fetch results (S3 fast path, ~3s)
km agent results sb-abc123
km agent results sb-abc123 | jq '.result'
# List all runs with status
km agent list sb-abc123
# Schedule a future agent run (resumes sandbox if paused)
km at '5pm tomorrow' agent run sb-abc123 --prompt "nightly tests" --auto-start
# Use direct Anthropic API instead of Bedrock
km agent run sb-abc123 --prompt "..." --no-bedrock --wait

Profile defaults: Set `spec.cli.noBedrock: true` to default to direct API. Use `spec.execution.configFiles` to pre-seed Claude settings (trusted directories, etc.).
Klanker Maker is workload-agnostic - any agent that runs on Linux works inside a sandbox. Here's how the controls map to real agent workloads:
| Agent | What It Does | Which Controls Matter |
|---|---|---|
| Goose | Installs deps, edits files, runs tests, orchestrates workflows | Budget cap - prevents runaway AI API costs when Goose loops |
| Aider | AI pair programming with auto git commits | Source access - controls which repos it can push to |
| agent-orchestrator | Spawns parallel coding agents, handles CI fixes autonomously | Budget + TTL - caps fleet cost; each spawned worker inherits the sandbox ceiling |
| deepagents | Planning + filesystem + sub-agent spawning via LangGraph | Network allowlist - limits where sub-agents can reach |
| open-swe | Async coding agent that clones, patches, and PRs | Source access - allowlist repos + refs; block push to protected branches |
| redamon | Automated red team: recon → exploitation → post-exploitation | Sealed profile - air-gapped, no egress, full audit trail |
| raptor | Claude Code as an offensive security agent | Hardened profile - minimal egress, short TTL, every command logged |
| autoresearch | Agents running research on GPU training | Compute budget - prevents a runaway training loop from burning hours of GPU |
| nanoclaw | Anthropic Agent SDK agent connected to messaging apps | HTTP proxy - controls which external APIs the agent can call |
| gobii-platform | Always-on AI workforce | Idle timeout - shuts down workers that stop producing; artifact upload preserves state |
Sandboxes communicate through digitally signed email (SES + Ed25519). Each sandbox gets a unique address derived from its ID (e.g., sb-a1b2c3d4@sandboxes.klankermaker.ai) and an Ed25519 key pair at creation time.
- Signing - outbound emails are signed with the sender's Ed25519 private key (stored in SSM, KMS-encrypted). The signature and sender ID are attached as `X-KM-Signature` and `X-KM-Sender-ID` headers.
- Verification - the receiver fetches the sender's public key from the `km-identities` DynamoDB table and verifies the signature. When `verifyInbound: required`, unsigned or invalid emails are rejected.
- Encryption - optional X25519 key exchange (NaCl box). When `encryption: required`, the sender encrypts the body with the recipient's public key. When `encryption: optional`, it encrypts if the recipient has a published key, plaintext otherwise.
Profile controls (`spec.email.signing`, `spec.email.verifyInbound`, `spec.email.encryption`) govern policy per sandbox. The hardened, sealed, and goose profiles all default to `signing: required`.
This enables multi-agent pipelines where each worker is physically isolated but logically connected - with cryptographic proof of sender identity and optional confidentiality.
| Substrate | How It Works | Cost |
|---|---|---|
| EC2 Spot (default) | Shared regional VPC, per-sandbox SG, spot instance, SSM access, sidecar systemd services | ~$0.01/hr for t3.medium |
| EC2 On-Demand | Same as above, guaranteed capacity | ~$0.04/hr for t3.medium |
| ECS Fargate Spot | Fargate task with sidecar containers, service discovery | ~$0.01/hr for 1 vCPU / 2GB |
| ECS Fargate | Same as above, guaranteed capacity | ~$0.04/hr for 1 vCPU / 2GB |
| Docker (local) | Docker Compose on local machine, sidecar containers, IAM roles via STS | Free (local compute) |
Spot interruption handlers automatically upload artifacts to S3 before instances are reclaimed.
Budget enforcement tracks two spend pools per sandbox, stored in a DynamoDB global table replicated to every region where agents run. Reads from within the sandbox hit the local regional replica with sub-millisecond latency.
Compute spend is tracked as spot rate × elapsed minutes, sourced from the AWS Price List API at sandbox creation. When the compute budget is exhausted, the sandbox is suspended - not destroyed:
- EC2: `StopInstances` preserves the EBS volume. No compute charges accrue while stopped.
- ECS Fargate: artifacts are uploaded, then the task is stopped. Re-provision from the stored S3 profile on top-up.
The HTTP proxy sidecar intercepts every AI API response - Bedrock (invoke-with-response-stream), Anthropic direct (api.anthropic.com, for Claude Code Max/API key users), and OpenAI-compatible endpoints. A tee-reader streams data through to the client without blocking, captures the full response, then extracts token counts asynchronously:
- Bedrock streaming: base64-decodes `{"bytes":"<b64>"}` event-stream wrappers to find `message_start`/`message_delta` payloads
- Anthropic SSE: parses `data:` lines for the same event types
- Non-streaming: reads `usage` from the JSON response body
Tokens are priced against static model rates and atomically incremented in the DynamoDB spend counter.
Dual-layer enforcement at 100%:
- Proxy layer (immediate) - HTTP proxy returns 403 for subsequent AI calls
- IAM layer (backstop) - a Lambda revokes the sandbox IAM role's Bedrock permissions, catching calls that bypass the proxy
km status shows per-model AI spend grouped by provider:
$ km status goose-e6c7d024
Sandbox ID: goose-e6c7d024
Profile: goose
...
Budget:
Compute: $0.0312 / $0.5000 (6.2%)
AI: $0.4200 / $1.0000 (42.0%)
anthropic.claude-sonnet-4-6: $0.85 (89K in / 34K out) # Bedrock
claude-opus-4-6: $0.55 (12K in / 8K out) # Max/API
Claude Code running inside sandboxes exports telemetry (prompts, tool calls, API requests, token usage, cost metrics) via OpenTelemetry through an OTel Collector sidecar to S3. Profile-controlled via `spec.observability.claudeTelemetry`:
observability:
claudeTelemetry:
enabled: true # master switch
logPrompts: true # include actual prompt text
logToolDetails: true # include tool parameters (bash commands, file paths)

km otel provides five views into this data:
$ km otel claude-e6c7d024 # summary: budget + S3 + metrics
$ km otel claude-e6c7d024 --prompts # user prompts with timestamps
$ km otel claude-e6c7d024 --events # full event stream
$ km otel claude-e6c7d024 --tools # tool calls with params + duration
$ km otel claude-e6c7d024 --timeline # conversation turns with per-turn cost
At 80% (configurable via spec.budget.warningThreshold) of either pool, the operator receives an email via SES.
$ km budget add claude-e6c7d024 --ai 3.00
AI budget: $5.00 → $8.00
Proxy: unblocked
IAM: restored
Status: running
Top-up unblocks the proxy, restores IAM permissions, and restarts suspended compute - all in one command.
km CLI / ConfigUI
├── cmd/km/ CLI entry point
├── cmd/configui/ Web dashboard (Go + embedded HTML)
├── cmd/ttl-handler/ Lambda: TTL expiry + artifact upload
├── cmd/budget-enforcer/ Lambda: budget ceiling enforcement
├── cmd/create-handler/ Lambda: remote sandbox creation via EventBridge
├── cmd/email-create-handler/ Lambda: email-driven sandbox creation
├── cmd/github-token-refresher/ Lambda: GitHub App installation token refresh
├── internal/app/cmd/ Cobra commands (configure, bootstrap, init, uninit, validate, create, clone, destroy/kill, pause, resume, lock, unlock, stop, extend, roll, at/schedule, list, status, logs, budget, shell, agent, doctor, otel, info, rsync, email)
├── internal/app/config/ Configuration (config.yaml, env vars, CLI flags)
├── pkg/
│ ├── profile/ SandboxProfile schema, validation, inheritance
│ ├── compiler/ Profile → Terragrunt artifacts (EC2 + ECS paths)
│ ├── ebpf/ eBPF enforcer (cgroup BPF programs, DNS resolver, audit consumer, SSL uprobes)
│ ├── aws/ SDK helpers (S3, SES, CloudWatch, EC2 metadata, DynamoDB, EventBridge Scheduler, identity/signing)
│ ├── terragrunt/ Runner + per-sandbox state isolation
│ ├── lifecycle/ TTL scheduling, idle detection, teardown
│ ├── allowlistgen/ Allowlist generation from observed traffic
│ ├── at/ Deferred/recurring operation scheduling
│ ├── github/ GitHub App token management
│ ├── localnumber/ Persistent local sandbox numbering
│ └── version/ Build version info
├── sidecars/
│ ├── dns-proxy/ DNS allowlist filter (UDP/TCP:53)
│ ├── http-proxy/ HTTP allowlist filter (TCP:3128) + AI token metering (Bedrock, Anthropic, OpenAI)
│ ├── audit-log/ Command + network log router with secret redaction
│ └── tracing/ OTel Collector sidecar (logs, metrics → S3)
├── profiles/ Built-in YAML profiles (sealed, hardened, goose, codex, ao, learn)
└── infra/
├── modules/ Terraform modules
│ ├── network/ VPC, subnets, security groups
│ ├── ec2spot/ Spot + on-demand instances, IMDSv2, IAM
│ ├── ecs-cluster/ ECS cluster, Fargate Spot capacity provider
│ ├── ecs-task/ Task definitions with sidecar containers
│ ├── ecs-service/ Service deployment + service discovery
│ ├── ecs-spot-handler/ Lambda: Fargate Spot interruption → artifact upload
│ ├── efs/ Regional EFS shared filesystem for cross-sandbox data
│ ├── secrets/ SSM Parameter Store + KMS encryption
│ ├── ses/ SES domain, DKIM, inbound email → S3
│ ├── scp/ SCP sandbox containment (deployed to management account)
│ ├── dynamodb-budget/ Budget enforcement table
│ ├── dynamodb-identities/ Sandbox identity public key table
│ ├── dynamodb-sandboxes/ Sandbox metadata table (km-sandboxes)
│ ├── dynamodb-schedules/ Scheduled operations table
│ ├── budget-enforcer/ Lambda: budget ceiling enforcement
│ ├── create-handler/ Lambda: remote sandbox creation
│ ├── email-handler/ Lambda: email-driven operations
│ ├── github-token/ Lambda: GitHub App token refresh
│ ├── s3-replication/ Cross-region artifact replication
│ └── ttl-handler/ Lambda: TTL expiry → artifacts + email + self-cleanup
└── live/ Terragrunt hierarchy (site.hcl, per-sandbox isolation)
# Install
go install github.com/whereiskurt/klankrmkr/cmd/km@latest
# Configure your platform (once)
km configure # or: km conf
# See what's needed in the management account before bootstrap
km bootstrap --show-prereqs
# Bootstrap SCP + KMS + artifacts bucket (once)
km bootstrap --dry-run=false
# Initialize the region - builds Lambdas, sidecars, deploys infra (once per region)
km init --region us-east-1
# Check platform health (20 checks)
km doctor
# Create a sandbox (shows progress dots + elapsed time)
km create profiles/goose.yaml
km create --on-demand profiles/sealed.yaml # skip spot, use on-demand
km create profiles/goose.yaml --no-bedrock # disable Bedrock, use direct API keys
km create profiles/goose.yaml --docker # shortcut for --substrate=docker
km create profiles/goose.yaml --alias mybot # override the sandbox alias
# List sandboxes (narrow default — alias first, live status)
km list
km list --wide # show profile, substrate, region columns
# Status with budget, identity, idle countdown
km status 1
# Connect as restricted user (no sudo)
km shell 1
km shell 1 --root # operator access
# Port forward (Docker-style)
km shell 1 --ports 8080 # localhost:8080 → remote:8080
km shell 1 --ports 8080:80,3000 # multiple ports
# Launch an AI agent inside a sandbox
km agent 1 --claude # interactive Claude Code
km agent run 1 --prompt "fix tests" # headless with prompt
km agent run 1 --prompt "fix tests" --wait # wait for completion
# Extend TTL
km extend 1 2h
# Pause (hibernate) — preserves RAM state
km pause 1
# Resume a paused or stopped sandbox
km resume 1
# Lock to prevent accidental destroy/stop/pause
km lock 1
km unlock 1 --yes
# Stop without destroying
km stop 1
# View audit logs
km logs 1
# OTEL telemetry + AI spend
km otel 1 # summary
km otel 1 --timeline # conversation turns with cost
km otel 1 --prompts # user prompts
km otel 1 --tools # tool call history
# Destroy (remote by default, or local)
km destroy 1 # Lambda-dispatched (default)
km destroy 1 --yes # skip confirmation prompt
km destroy 1 --remote=false # local terragrunt destroy
# km kill is an alias for km destroy
km kill 1 --yes
# Schedule a deferred or recurring operation
km at '10pm tomorrow' create profiles/goose.yaml # one-shot
km at 'every thursday at 3pm' kill 1 # recurring
km at list # list scheduled ops
km at cancel my-schedule-name # cancel one
# km schedule is an alias for km at
# Teardown region infrastructure
km uninit --region us-east-1

| Document | Description |
|---|---|
| User Manual | Full command reference, walkthroughs (Claude Code, Goose, security agents), profile authoring |
| Operator Guide | AWS account setup, KMS, S3, SES, Lambda deployment - everything before km init |
| Profile Reference | Complete YAML schema with every field, type, default, and validation rule |
| Security Model | Deep dive on each security layer, from VPC to IMDSv2 to secret redaction |
| Budget Guide | DynamoDB schema, proxy metering, enforcement flow, threshold configuration |
| Docker Substrate | Running sandboxes locally via Docker Compose (km create --docker) |
| Sidecar Reference | Each sidecar's config, env vars, log formats, EC2 vs ECS deployment |
| Multi-Agent Email | SES setup, sandbox addressing, cross-sandbox orchestration patterns |
| ConfigUI Guide | Web dashboard setup, profile editor, secrets management |
| Phase | Description | Status |
|---|---|---|
| 1 | Schema, Compiler & AWS Foundation | Complete |
| 2 | Core Provisioning & Security Baseline | Complete |
| 3 | Sidecar Enforcement & Lifecycle Management | Complete |
| 4 | Lifecycle Hardening, Artifacts & Email | Complete |
| 5 | ConfigUI Web Dashboard | Complete |
| 6 | Budget Enforcement & Platform Configuration | Complete |
| 7 | Unwired Code Paths | Complete |
| 8 | Sidecar Build & Deployment Pipeline | Complete |
| 9 | Live Infrastructure & Operator Docs | Complete |
| 10 | SCP Sandbox Containment | Complete |
| 11 | Sandbox Auto-Destroy & Metadata Wiring | Complete |
| 12 | ECS Budget Top-Up & S3 Replication | Complete |
| 13 | GitHub App Token Integration | Complete |
| 14 | Sandbox Identity & Signed Email | Complete |
| 15 | km doctor - Platform Health Check | Complete |
| 16 | Documentation Refresh (Phases 6-15) | Complete |
| 17 | Sandbox Email Mailbox & Access Control | Complete |
| 18 | Loose Ends - km init, uninit, bootstrap KMS, github-token | Complete |
| 19 | Budget Enforcement Wiring - EC2 hard stop, IAM revocation | Complete |
| 20 | Anthropic API Metering & Terragrunt Output Suppression | Complete |
| 21 | Bug fixes and mini-features - budget precision, polish | Complete |
| 22 | Remote Sandbox Dispatch - km create/destroy/stop/extend --remote via Lambda | Complete |
| 23 | Email-Driven Operations - operator inbox, email-to-create, safe phrase auth, EventBridge | Complete |
| 24 | Documentation Refresh - docs for Phases 22-32 | Complete |
| 25 | GitHub Source Access Restrictions - repo allowlists, deny-by-default | Complete |
| 26 | Live Operations Hardening - bootstrap, init, TTL, idle, sidecars, CLI polish | Complete |
| 27 | Claude Code OTEL Integration - sandbox observability via built-in telemetry | Complete |
| 28 | OTEL Observability Hardening - timeline view, events, tools flags | Complete |
| 29 | EC2 Hibernation & MaxLifetime Enforcement | Complete |
| 30 | Sandbox Pause, Lock, Unlock & km list Enhancements | Complete |
| 31 | Transparent HTTPS & Audit Log Improvements | Complete |
| 32 | Profile-Scoped Rsync Paths & External File Lists | Complete |
| 33 | EC2 Storage Customization, Hibernation & AMI Selection | Complete |
| 34 | Agent Profiles - Agent Orchestrator, Goose, and Codex | Complete |
| 35 | MITM CA Trust for Python, Node, and Non-System SSL Libraries | Complete |
| 36 | km-sandbox Base Container Image | Complete |
| 37 | Docker Compose Local Substrate | Complete |
| 38 | EKS / Kubernetes Substrate | Planned |
| 39 | DynamoDB Metadata Migration (S3 to DynamoDB) | Complete |
| 40 | eBPF Cgroup Network Enforcement (connect4, sendmsg4, sockops, egress) | Complete |
| 41 | eBPF SSL Uprobe TLS Observability (OpenSSL, Go, BoringSSL) | Complete |
| 42 | eBPF Gatekeeper Mode — connect4 DNAT Rewrite for L7 Proxy | Complete |
| 43 | Regional EFS Shared Filesystem | Complete |
| 44 | km at / km schedule — Deferred & Recurring Operations | Complete |
| 45 | km-send/km-recv Sandbox Scripts & km email send/read CLI | Complete |
| 46 | AI Email-to-Command — Haiku Interprets Free-Form Operator Emails | Complete |
| 47 | Privileged Execution Mode & Learn Profile | Complete |
| 48 | Profile Override Flags for km create (--ttl, --idle) | Complete |
| 49 | Prebaked AMI Support | Planned |
| 50 | km agent Non-Interactive Execution (--prompt, results, list) | Complete |
| 51 | km agent Tmux Sessions (attach, --interactive) | Complete |
| 52 | km clone — Duplicate a Running Sandbox | Complete |
| 53 | Persistent Local Sandbox Numbering | Complete |
See .planning/ROADMAP.md for detailed phase breakdowns and success criteria.
TBD