New to headless agent runtimes? Start with the glossary first.
This guide reflects the current Agentbox runtime.
Agentbox is a self-contained Linux container that runs coding agents (Claude Code, Ruflo, Antigravity, Codex and friends) behind a single management API. Think of it as a shared workstation for agents: one image carries the CLIs, skills, MCP servers and durable-state adapters, and you drive it from your laptop, a remote VM, or a cloud provider. Compared to running agents directly on your machine, Agentbox keeps keys, state, skill trees and model endpoints behind one switchable configuration file.
graph TB
subgraph host["Your host machine"]
Laptop["Laptop / VM / Cloud"]
end
subgraph agentbox["Agentbox container"]
API["Management API :9090"]
subgraph agents["Agent CLIs"]
CC[Claude Code]
RF[Ruflo]
GM[Antigravity]
CX[Codex]
end
subgraph state["Durable state adapters"]
BD[Beads]
PD[Pods]
MM[Memory]
EV[Events]
OR[Orchestrator]
end
SK[Skills corpus]
MCP[MCP servers]
end
Laptop -->|"HTTP / docker exec"| API
API --> agents
API --> state
agents --> SK
agents --> MCP
What it solves
- Agents losing their memory, beads and pod state between sessions because each CLI stashes things in its own home directory.
- API keys leaking into shell history, Dockerfiles and git diffs instead of living in one .env.
- Boot cycles spending minutes downloading npm, pip and model weights every restart: Agentbox bakes them into the image.
- Swapping agent backends (local SQLite vs a federated host mesh) without rewriting orchestration code.
- Agent actions being invisible: when federated into a host mesh, agentbox emits a canonical agent_action signal over the bidirectional /wss/agent-events channel (ADR-014), so a host project can render each action as a live agent actor and feed user-interaction events back to the agents. Standalone, the same events stay local; nothing in this quickstart requires a host.
When to skip this: if you only need a single agent CLI on a single machine and are comfortable wiring its storage and keys by hand, running the CLI natively is simpler. Agentbox earns its keep once you want more than one agent, reproducible storage, or remote deployment.
Use the interactive launcher unless you specifically want to edit files by hand:
./scripts/start-agentbox.sh
The launcher opens a browser-based setup wizard (PRD-012 / ADR-024) that renders
every agentbox.toml section with schema-validated form controls (dropdowns,
toggles, text inputs) using the DreamLab glassmorphism design system. No
pre-installed dependencies are required beyond Python 3 (for the local HTTP server).
Browser-based configuration wizard with sidebar navigation and live editing
The wizard works in three tiers (ADR-024 D1):
- Full Rust binary: if the pre-built agentbox-setup binary exists, it serves the frontend, proxies the management API, and handles secret containment server-side.
- Python HTTP server: start-agentbox.sh copies agentbox.toml and the JSON schema alongside the frontend HTML and serves them via python3 -m http.server. The SPA auto-loads the co-located config.
- Pure browser: open setup/frontend/dist/index.html directly. Load your TOML via drag-and-drop or file picker. Save via download.
Pass --tui to use the legacy terminal wizard (gum/whiptail) instead.
Dual-mode UI. The same interface has a Dashboard tab that connects to the management API (port 9090) for real-time container monitoring once the stack is running — service health, active tasks, agent events, adapter status, and quick action launchers for Jupyter, Code Server, VNC, and more.
Operations dashboard with service grid and real-time status
Sections covered — the wizard renders all top-level agentbox.toml sections
including Core, Federation, Adapters, GPU, Desktop, Observability, Providers,
Skills, Toolchains, Integrations, Security, Sovereign Mesh, Linked Data,
Identity, Limits, Backup, Payment, Code-as-Harness, and Marketplace. Each section
is driven by the JSON schema at schema/agentbox.toml.schema.json — adding a new
section to the schema automatically surfaces it in the wizard.
Validate-only mode (CI / pre-commit use):
./scripts/start-agentbox.sh --validate-only
Runs the validator against the existing agentbox.toml and exits with the validator's
exit code (0 = clean, 1 = errors). No TUI is opened.
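For CI or pre-commit use, the exit-code contract above can be wrapped in a small Python helper. This is a minimal sketch, not part of Agentbox: the function names are hypothetical, and the only behaviour it relies on is the documented `--validate-only` flag and the 0 = clean / 1 = errors exit codes.

```python
import subprocess
from typing import Callable, Sequence

# Command taken from the guide; run it from the repository root.
VALIDATE_CMD = ["./scripts/start-agentbox.sh", "--validate-only"]


def run_validator(cmd: Sequence[str]) -> int:
    """Run the validator and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def manifest_is_clean(runner: Callable[[Sequence[str]], int] = run_validator) -> bool:
    """True only when the validator exits 0 (clean); any other code counts as errors."""
    return runner(VALIDATE_CMD) == 0
```

A hook script would call `manifest_is_clean()` and exit non-zero when it returns `False`. The `runner` parameter exists only so the logic can be exercised without the real script.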
Manual path:
Edit agentbox.toml before building. This file is the single manifest: the Nix build reads it, the compose generator reads it, and the runtime validator enforces it. Every feature you see in the wizard maps to a key here.
Key sections:
- [mesh]: mode = "standalone" (default; the container is complete on its own) or "client" (federates with an external host mesh through adapter endpoints).
- [adapters]: one per durable-state slot (beads, pods, memory, events, orchestrator). An adapter is the pluggable-backend pattern from ADR-005: each slot resolves to local-*, external, or off, so you can run fully self-hosted or delegate to a host mesh without changing code.
- [sovereign_mesh]: Nostr identity + NIP-98 auth
- [skills.*]: 96-skill catalogue gates
- [toolchains]: core CLIs (claude, ruflo, claude_flow, agentic_qe, antigravity_cli, etc.)
- [gpu]: none (default, no ollama sidecar) | ollama-rocm (ROCm/Vulkan via /dev/kfd + /dev/dri) | ollama-cuda (NVIDIA container runtime, sidecar only) | local-cuda (CUDA baked into image; required for gaussian_splatting)
- [desktop]: TigerVNC Xvnc desktop (access via SSH tunnel to port 5901)
- [observability]: metrics port, OTLP endpoint, log level
- [providers.*]: per-provider API-key gates
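To make the adapter pattern concrete, here is a rough Python sketch that sorts the values of an `[adapters]` table into the three families named above. It is an illustration only: `adapter_family` and `summarise_adapters` are hypothetical names, and the `other` bucket is an assumption added because the examples in this guide also use values such as `embedded-ruvector` and `stdio-bridge` that do not literally match `local-*`, `external`, or `off`.

```python
def adapter_family(value: str) -> str:
    """Classify one adapter value by its spelling (an assumption, not the real resolver)."""
    if value == "off":
        return "off"
    if value.startswith("local-"):
        return "local"
    if value.startswith("external"):
        return "external"
    # Values like "embedded-ruvector" or "stdio-bridge" land here.
    return "other"


def summarise_adapters(adapters: dict[str, str]) -> dict[str, str]:
    """Map each durable-state slot to the family its configured backend belongs to."""
    return {slot: adapter_family(value) for slot, value in adapters.items()}
```

For example, `summarise_adapters({"beads": "local-sqlite", "memory": "external-pg"})` yields `{"beads": "local", "memory": "external"}`.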
Minimal example (standalone, local fallbacks for everything):
[mesh]
mode = "standalone"
[adapters]
beads = "local-sqlite"
pods = "local-solid-rs" # only first-party impl (ADR-010); legacy local-jss removed 2026-04-25
memory = "embedded-ruvector"
events = "local-jsonl"
orchestrator = "local-process-manager"
[sovereign_mesh]
enabled = true
[skills.browser]
playwright = true
[toolchains]
claude = true
claude_code = true
ruflo = true
agentic_qe = true
[gpu]
backend = "none"
Federated example (drops into a host container mesh):
[mesh]
mode = "client"
peer_relays = ["wss://host-orchestrator:7070"]
[adapters]
beads = "external"
pods = "external"
memory = "external-pg"
events = "external"
orchestrator = "stdio-bridge"
[integrations.ruvector_external]
enabled = true
conninfo = "postgresql://ruvector@ruvector-postgres:5432/ruvector"
Always run agentbox config validate after editing: it checks semantic rules (e.g. gaussian_splatting = true requires gpu.backend = "local-cuda") before the build.
[skills.ontology]
enabled = false # default: ontology-core + ontology-enrich are not loaded
Set enabled = true to load the ontology-core and ontology-enrich skills into the agent's skill surface. These skills target Logseq OWL2 DL TBox workflows and are opt-in because they carry specific domain assumptions (Logseq graph conventions, OWL2 DL reasoner tooling). When enabled = false (the default) neither skill is registered and no extra tooling is pulled into the image.
This gate is a prepared placeholder — the MCP server and associated tooling for ontology operations will be fleshed out in a future milestone. Enabling the flag now has no runtime effect beyond advertising the skills in the manifest; downstream agents that check the manifest before loading skills will respect it once the implementation lands.
Agentbox is built with Nix (a reproducible package manager). The flake.nix file composes packages, skills and toolchains into a Docker image based on your manifest — no Dockerfile, no layer drift between rebuilds. nix build .#runtime produces a nix2container OCI manifest at ./result; the runtime exposes a copyToDockerDaemon helper that loads the image into the local Docker daemon via skopeo (no intermediate tarball, no layer copies).
flowchart LR
TOML["agentbox.toml"] --> FLAKE["flake.nix"]
LOCK["flake.lock<br/>pinned inputs"] --> FLAKE
FLAKE --> N2C["nix2container"]
N2C --> OCI["OCI image<br/>at ./result"]
OCI -->|"copyToDockerDaemon<br/>(skopeo)"| DAEMON["Local Docker daemon"]
DAEMON --> RUN["docker compose up"]
nix build .#runtime
nix run .#runtime.copyToDockerDaemon
Optional variants:
nix build .#desktop
nix build .#full
Manual path:
cp .env.example .env
Provider API keys are gated by [providers.*] sections in agentbox.toml.
Only set the env vars for providers you have enabled; the validator (E017) will
warn at boot for any enabled provider whose env var is missing.
- In agentbox.toml, set enabled = true for each provider you want:
  [providers.anthropic]
  enabled = true
  env_var = "ANTHROPIC_API_KEY"
- In .env, fill in the corresponding value:
  ANTHROPIC_API_KEY=sk-ant-...
Infrastructure vars (always required regardless of providers):
- MANAGEMENT_API_KEY: API key for the management HTTP API
- AGENTBOX_AGENT_ID: stable identity label for this instance
- NOSTR_RELAYS: comma-separated Nostr relay URLs
- WORKSPACE: shared workspace mount path
- SHARED_PROJECTS_ROOT: shared projects mount path
For the full provider reference, optional overrides, and instructions for adding
new providers see docs/guides/providers.md.
The preferred boot path uses agentbox.sh up, which starts the stack and blocks until the management API health endpoint responds (or times out after 60 s):
./agentbox.sh up
If you just rebuilt the Nix image and need to load it before starting:
./agentbox.sh up --build
Direct compose is also fine for simple cases, but you will need to poll health manually:
docker compose up -d
For a full dev-loop iteration (stop existing stack, rebuild image, restart):
./agentbox.sh rebuild
Use agentbox.sh health to get a per-service status summary:
./agentbox.sh health # pretty-print; exits non-zero if any service is degraded
./agentbox.sh health --json # raw JSON; always exits 0
Low-level Docker commands for deeper inspection:
docker compose ps
docker logs --tail 100 agentbox
docker inspect --format '{{json .State.Health}}' agentbox
If the container is using an older image or an older entrypoint, use agentbox.sh rebuild to rebuild and recreate it.
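If you start the stack with docker compose up -d directly, health has to be polled by hand. The Python sketch below imitates the wait that agentbox.sh up is described as doing: probe the management API health endpoint until it answers or 60 s pass. It is an approximation under stated assumptions, not the script's actual logic: that a healthy endpoint answers with HTTP 200 is assumed, and the 2-second interval and the function names are illustrative.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True when a GET on `url` answers with HTTP 200; any failure counts as not ready."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_for_health(
    url: str = "http://localhost:9090/health",
    deadline_s: float = 60.0,
    interval_s: float = 2.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `url` until the probe succeeds (True) or the deadline passes (False)."""
    stop_at = clock() + deadline_s
    while True:
        if probe(url):
            return True
        if clock() >= stop_at:
            return False
        sleep(interval_s)
```

Calling `wait_for_health()` right after `docker compose up -d` gives a boolean you can turn into an exit code. The `probe`, `sleep` and `clock` parameters are there so the loop can be tested without a running container.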
The runtime exposes a small set of HTTP endpoints for liveness, readiness and metrics. These replace the usual "did the container boot?" guesswork with concrete signals. /ready goes green only after every required programme reaches RUNNING and the bootstrap-seal sentinel writes /run/agentbox/bootstrap.done — see ADR-006 for the bootstrap contract.
graph TB
subgraph container["Agentbox container"]
SUP["supervisord (PID 1)"]
API["management-api :9090"]
SOLID["solid-pod-rs :8484"]
MCP["MCP servers"]
SEAL["bootstrap-seal"]
MET["metrics :9091"]
end
SUP --> API
SUP --> SOLID
SUP --> MCP
SUP --> SEAL
API --> MET
SEAL -->|"touches sentinel"| API
HOST["Host (localhost only)"] -->|"127.0.0.1:9090"| API
HOST -->|"127.0.0.1:9091"| MET
HOST -->|"127.0.0.1:8484"| SOLID
From the host:
# Via SSH tunnel (ports are localhost-only on host)
curl http://localhost:9090/health
curl http://localhost:9090/v1/meta # adapter contract versions + image hash
curl http://localhost:9091/metrics # Prometheus — scrape this
curl http://localhost:8484/health # solid-pod-rs
From inside the container:
docker exec agentbox supervisorctl status
docker exec agentbox tmux -V
docker exec -it agentbox tmux attach -t agentbox
docker exec agentbox ls -la /home/devuser/workspace/profiles
docker exec agentbox ls -la /projects
All agentbox ports bind to 127.0.0.1 on the host; they are not exposed to the network. Remote access uses SSH tunnels, which provide authentication and encryption without additional VNC passwords or TLS certificates.
graph LR
subgraph laptop["Your Laptop"]
VNC["TigerVNC Viewer<br/>localhost:5901"]
BROWSER["Browser<br/>localhost:9090"]
CLI["SSH Terminal"]
end
subgraph tunnel["SSH Tunnel (encrypted)"]
T1["L5901:localhost:5901"]
T2["L9090:localhost:9090"]
T3["L8080:localhost:8080"]
end
subgraph host["Host Machine"]
D5901["127.0.0.1:5901"]
D9090["127.0.0.1:9090"]
D8080["127.0.0.1:8080"]
end
subgraph agentbox["Agentbox Container"]
XVNC[":5901 Xvnc"]
API[":9090 Management API"]
CODE[":8080 Code Server"]
end
VNC --> T1 --> D5901 --> XVNC
BROWSER --> T2 --> D9090 --> API
CLI --> T3 --> D8080 --> CODE
Open all tunnels in one command:
ssh -L 5901:localhost:5901 \
-L 9090:localhost:9090 \
-L 8080:localhost:8080 \
-L 8484:localhost:8484 \
-N machinelearn@YOUR_HOST_IP
Or use the built-in helper:
./agentbox.sh all # opens VNC + code-server + API + CDP tunnels
./agentbox.sh vnc # VNC tunnel only
Once the tunnel is open, connect your VNC client to localhost:5901:
vncviewer localhost:5901 # TigerVNC
open vnc://localhost:5901 # macOS Screen Sharing
The desktop runs TigerVNC Xvnc with -SecurityTypes None (no VNC password) and -localhost (container-internal only). Security is provided by the SSH tunnel: no unauthenticated network access is possible.
| Service | Container Port | Host Binding | Access |
|---|---|---|---|
| Management API | 9090 | 127.0.0.1:9090 | SSH tunnel, NIP-98 auth |
| VNC Desktop | 5901 | 127.0.0.1:5901 | SSH tunnel |
| Code Server | 8080 | 127.0.0.1:8080 | SSH tunnel |
| Solid Pod | 8484 | 127.0.0.1:8484 | SSH tunnel, WAC auth |
| Agent Events | 9700 | 127.0.0.1:9700 | SSH tunnel |
| Prometheus | 9091 | 127.0.0.1:9091 | SSH tunnel |
All ports are localhost-only on the host. The only way in from the network is through SSH authentication to the host machine.
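Since every service in the table is reached the same way, the tunnel command can be generated from a list of ports. The Python sketch below builds the same `ssh -L ... -N` argument list as the manual command earlier in this section. `tunnel_command` is a hypothetical helper, not something agentbox.sh provides; the user and host values are the placeholders used above.

```python
import shlex


def tunnel_command(user: str, host: str, ports: list[int]) -> list[str]:
    """Build an ssh argument list that forwards each local port to the same port on the host's loopback."""
    args = ["ssh"]
    for port in ports:
        # -L local_port:localhost:remote_port, matching the host's 127.0.0.1 bindings.
        args += ["-L", f"{port}:localhost:{port}"]
    # -N: forward ports only, do not run a remote command.
    args += ["-N", f"{user}@{host}"]
    return args


def tunnel_command_line(user: str, host: str, ports: list[int]) -> str:
    """Render the command as a shell-safe string for printing or copy-paste."""
    return shlex.join(tunnel_command(user, host, ports))
```

For instance, `tunnel_command_line("machinelearn", "YOUR_HOST_IP", [9090, 9091])` produces a command that forwards the management API and Prometheus ports; the list can also be handed to `subprocess.run` unchanged.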
docker-compose.yml is auto-generated from flake.nix and agentbox.toml. Do not edit it by hand — changes will be overwritten the next time the compose output is regenerated, and hand-edits can desync the file from the manifest.
To regenerate it after changing agentbox.toml:
nix build .#compose && cp result/docker-compose.yml docker-compose.yml
The compose flake output is a pure text-generation pass: it does not build the container image, so it evaluates on any system Nix runs on (including macOS and Linux aarch64). The generated file is committed to the repo so that operators without Nix can still run the stack; treat it as a build artefact rather than a source file. Per-deployment customisation belongs in docker-compose.override.yml, which Docker Compose merges automatically.
Copy .env.example to .env and fill in your values before running ./agentbox.sh up.
cp .env.example .env
| Variable | Required | Purpose |
|---|---|---|
| ANTHROPIC_API_KEY | Yes (for Claude) | Claude Code and the claude toolchain |
| OPENAI_API_KEY | Optional | Set to ollama to route through Ollama locally |
| OPENAI_BASE_URL | Optional | Ollama endpoint: http://ollama:11434/v1 (sidecar) or http://host.docker.internal:11434/v1 (host, requires networking.host_gateway = true) |
| GOOGLE_GEMINI_API_KEY | Optional | Gemini CLI and Gemini provider |
| GITHUB_TOKEN | Optional | gh CLI and GitHub MCP tools |
| CRATES_TOKEN | Optional | cargo publish for Rust ecosystem crates |
| Variable | Purpose |
|---|---|
| TAILSCALE_AUTHKEY | Ephemeral auth key for Tailscale federation; leave blank to disable |
| TAILSCALE_HOSTNAME | MagicDNS name for this instance (default: agentbox) |
| NOSTR_RELAYS | Comma-separated Nostr relay URLs for the sovereign mesh |
| AGENTBOX_NSEC | Pre-generated bech32 nsec1... private key; leave blank to auto-generate at boot |
| AGENTBOX_NPUB | Operator bech32 pubkey; overrides [sovereign_mesh.operator].pubkey_hex in the manifest |
| Variable | Purpose |
|---|---|
| MANAGEMENT_API_KEY | Bearer token for the management HTTP API; auto-generated on first boot if empty |
| RUVECTOR_PG_PASSWORD | PostgreSQL password for the RuVector vector store (must match the ruvector-postgres container) |
| AGENTBOX_AGENT_ID | Stable identity label for this instance |
| Variable | Purpose |
|---|---|
| COMFYUI_API_ENDPOINT | ComfyUI API URL reachable from inside the container |
| AGENTBOX_CPU_LIMIT / AGENTBOX_MEM_LIMIT | Docker deploy resource caps (referenced by docker-compose.override.yml) |
| AGENTBOX_CPU_RESERVE / AGENTBOX_MEM_RESERVE | Docker deploy reservations |
Run ./agentbox.sh preflight after editing .env — it validates the compose merge and checks that required variables are present before the stack starts.
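To show the kind of check a preflight step performs, here is a small Python sketch that parses `.env` text and reports which variables from a required list are absent or empty. It does not reproduce agentbox.sh preflight, whose exact rules are not spelled out in this guide. The parser is deliberately simple (no `export` prefix, no inline comments), and `INFRA_VARS` simply reuses the infrastructure variables listed earlier as an example set.

```python
# Example set: the infrastructure variables named earlier in this guide.
INFRA_VARS = (
    "MANAGEMENT_API_KEY",
    "AGENTBOX_AGENT_ID",
    "NOSTR_RELAYS",
    "WORKSPACE",
    "SHARED_PROJECTS_ROOT",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments and trimming surrounding quotes."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env: dict[str, str], required: tuple[str, ...] = INFRA_VARS) -> list[str]:
    """Return the required names that are absent or empty, in the order they were required."""
    return [name for name in required if not env.get(name)]
```

A typical call is `missing_vars(parse_env(open(".env").read()))`; an empty list means every name in the example set has a value.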
The runtime creates these profile roots:
- /home/devuser/workspace/profiles/claude-core
- /home/devuser/workspace/profiles/ruflo-orchestrator
- /home/devuser/workspace/profiles/qe-fleet
- /home/devuser/workspace/profiles/nagual-qe
- /home/devuser/workspace/profiles/rust-builder
- /home/devuser/workspace/profiles/docs-latex
Each one should expose:
- .claude/settings.json
- .claude/skills -> /opt/agentbox/skills
- projects -> /projects
- workspace -> /home/devuser/workspace
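The expected layout of a profile root can be checked mechanically. The Python sketch below, meant to run inside the container, verifies the four entries listed above for one profile directory. It assumes `.claude/settings.json` is a regular file and that the other three entries are symlinks whose targets are exactly the paths shown; `check_profile` is a hypothetical helper, not an Agentbox command.

```python
import os
from pathlib import Path

# Relative entry -> expected symlink target (None means "regular file").
EXPECTED = {
    ".claude/settings.json": None,
    ".claude/skills": "/opt/agentbox/skills",
    "projects": "/projects",
    "workspace": "/home/devuser/workspace",
}


def check_profile(root: Path) -> list[str]:
    """Return a list of layout problems for one profile root; an empty list means it looks right."""
    problems = []
    for rel, target in EXPECTED.items():
        entry = root / rel
        if target is None:
            if not entry.is_file():
                problems.append(f"{rel}: expected regular file")
        elif not entry.is_symlink() or os.readlink(entry) != target:
            problems.append(f"{rel}: expected symlink to {target}")
    return problems
```

For example, `check_profile(Path("/home/devuser/workspace/profiles/claude-core"))` returns an empty list when the bootstrap created everything, and one message per missing or mislinked entry otherwise.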
- RuVector: /var/lib/ruvector
- Solid-style pod storage: /var/lib/solid
- Sovereign identities: /var/lib/agentbox/identities
- Shared workspace: /home/devuser/workspace
- Shared external projects: /projects
The container runs a tmux session (agentbox) with 10 pre-configured windows:
| Window | Name | Purpose |
|---|---|---|
| 0 | Claude | Primary development shell |
| 1 | Agent | Agent execution workspace |
| 2 | Services | supervisorctl status |
| 3 | Build | Build/compile workspace |
| 4 | Logs | Management API logs (split pane) |
| 5 | System | Resource monitor (btm / htop) |
| 6 | VNC | VNC connection info |
| 7 | Git | Project git status |
| 8 | OpenRouter | Claude Code via free OpenRouter models |
| 9 | ZAI | Claude Code via Z.AI GLM relay |
Attach from inside the container:
tmux attach -t agentbox # or use the alias: ta
The shell environment includes fish + starship prompt (Tokyo Night theme), modern CLI replacements (eza, bat, delta, dust, procs, btm), and the full agentbox alias set. Run agentbox-help for a quick reference or agentbox-pick for an interactive launcher.
Check whether the container is still an older image using the old keepalive-only supervisor config.
The management API may not be running in the current container image, or the container may be older than the repo state.
Check the entrypoint and logs:
docker logs agentbox
docker exec agentbox ls -la /home/devuser/workspace
Verify the volumes are mounted and the entrypoint bootstrap ran successfully.