Quick Start

New to headless agent runtimes? Start with the glossary first.

This guide reflects the current Agentbox runtime.

Why this exists

Agentbox is a self-contained Linux container that runs coding agents (Claude Code, Ruflo, Antigravity, Codex and friends) behind a single management API. Think of it as a shared workstation for agents: one image carries the CLIs, skills, MCP servers and durable-state adapters, and you drive it from your laptop, a remote VM, or a cloud provider. Compared to running agents directly on your machine, Agentbox keeps keys, state, skill trees and model endpoints behind one switchable configuration file.

graph TB
    subgraph host["Your host machine"]
        Laptop["Laptop / VM / Cloud"]
    end
    subgraph agentbox["Agentbox container"]
        API["Management API :9090"]
        subgraph agents["Agent CLIs"]
            CC[Claude Code]
            RF[Ruflo]
            GM[Antigravity]
            CX[Codex]
        end
        subgraph state["Durable state adapters"]
            BD[Beads]
            PD[Pods]
            MM[Memory]
            EV[Events]
            OR[Orchestrator]
        end
        SK[Skills corpus]
        MCP[MCP servers]
    end
    Laptop -->|"HTTP / docker exec"| API
    API --> agents
    API --> state
    agents --> SK
    agents --> MCP

What it solves

  • Agents losing their memory, beads and pod state between sessions because each CLI stashes things in its own home directory.
  • API keys leaking into shell history, Dockerfiles and git diffs instead of living in one .env.
  • Boot cycles spending minutes downloading npm, pip and model weights every restart — Agentbox bakes them into the image.
  • Swapping agent backends (local SQLite vs a federated host mesh) without rewriting orchestration code.
  • Agent actions being invisible: when federated into a host mesh, agentbox emits a canonical agent_action signal over the bidirectional /wss/agent-events channel (ADR-014), so a host project can render each action as a live agent actor and feed user-interaction events back to the agents. Standalone, the same events stay local; nothing in this quickstart requires a host.

When to skip this: if you only need a single agent CLI on a single machine and are comfortable wiring its storage and keys by hand, running the CLI natively is simpler. Agentbox earns its keep once you want more than one agent, reproducible storage, or remote deployment.

Recommended Path

Use the interactive launcher unless you specifically want to edit files by hand:

./scripts/start-agentbox.sh

The launcher opens a browser-based setup wizard (PRD-012 / ADR-024) that renders every agentbox.toml section with schema-validated form controls — dropdowns, toggles, text inputs — using the DreamLab glassmorphism design system. No pre-installed dependencies required beyond Python 3 (for the local HTTP server).

Figure: Setup Wizard. Browser-based configuration wizard with sidebar navigation and live editing.

The wizard works in three tiers (ADR-024 D1):

  1. Full Rust binary: if the pre-built agentbox-setup binary exists, it serves the frontend, proxies the management API, and handles secret containment server-side.
  2. Python HTTP server: start-agentbox.sh copies agentbox.toml and the JSON schema alongside the frontend HTML and serves them via python3 -m http.server. The SPA auto-loads the co-located config.
  3. Pure browser: open setup/frontend/dist/index.html directly. Load your TOML via drag-and-drop or file picker. Save via download.

Pass --tui to use the legacy terminal wizard (gum/whiptail) instead.

Dual-mode UI. The same interface has a Dashboard tab that connects to the management API (port 9090) for real-time container monitoring once the stack is running — service health, active tasks, agent events, adapter status, and quick action launchers for Jupyter, Code Server, VNC, and more.

Figure: Dashboard. Operations dashboard with service grid and real-time status.

Sections covered — the wizard renders all top-level agentbox.toml sections including Core, Federation, Adapters, GPU, Desktop, Observability, Providers, Skills, Toolchains, Integrations, Security, Sovereign Mesh, Linked Data, Identity, Limits, Backup, Payment, Code-as-Harness, and Marketplace. Each section is driven by the JSON schema at schema/agentbox.toml.schema.json — adding a new section to the schema automatically surfaces it in the wizard.

Validate-only mode (CI / pre-commit use):

./scripts/start-agentbox.sh --validate-only

Runs the validator against the existing agentbox.toml and exits with the validator's exit code (0 = clean, 1 = errors). Neither the browser wizard nor the TUI is opened.
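
For CI jobs that are already written in Python, the exit-code contract above can be wrapped in a few lines. This is a minimal sketch and not part of Agentbox: run_validator is a hypothetical helper name, and the only behaviour it relies on is the documented exit code (0 = clean, 1 = errors).

```python
import subprocess

# Command documented in this guide; the wrapper around it is illustrative.
VALIDATE_CMD = ["./scripts/start-agentbox.sh", "--validate-only"]


def run_validator(cmd=VALIDATE_CMD):
    """Run the validator and return (ok, exit_code).

    ok is True only for exit code 0, matching the documented contract.
    """
    completed = subprocess.run(cmd, check=False)
    return completed.returncode == 0, completed.returncode
```

A pre-commit hook could call run_validator() from the repository root and pass the returned exit code to sys.exit so the hook fails exactly when the validator does.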

1. Configure The Build

Manual path:

Edit agentbox.toml before building. This file is the single manifest: the Nix build reads it, the compose generator reads it, and the runtime validator enforces it. Every feature you see in the wizard maps to a key here.

Key sections:

  • [mesh]: mode = "standalone" (default; the container is complete on its own) or "client" (federates with an external host mesh through adapter endpoints).
  • [adapters]: one per durable-state slot (beads, pods, memory, events, orchestrator). An adapter is the pluggable-backend pattern from ADR-005: each slot resolves to local-*, external, or off, so you can run fully self-hosted or delegate to a host-mesh without changing code.
  • [sovereign_mesh]: Nostr identity + NIP-98 auth
  • [skills.*]: 96-skill catalogue gates
  • [toolchains]: core CLIs (claude, ruflo, claude_flow, agentic_qe, antigravity_cli, etc.)
  • [gpu]: backend = none (default, no ollama sidecar) | ollama-rocm (ROCm/Vulkan via /dev/kfd+/dev/dri) | ollama-cuda (NVIDIA container runtime, sidecar only) | local-cuda (CUDA baked into image; required for gaussian_splatting)
  • [desktop]: TigerVNC Xvnc desktop (access via SSH tunnel to port 5901)
  • [observability]: metrics port, OTLP endpoint, log level
  • [providers.*]: per-provider API-key gates

Minimal example (standalone, local fallbacks for everything):

[mesh]
mode = "standalone"

[adapters]
beads = "local-sqlite"
pods = "local-solid-rs"                # only first-party impl (ADR-010); legacy local-jss removed 2026-04-25
memory = "embedded-ruvector"
events = "local-jsonl"
orchestrator = "local-process-manager"

[sovereign_mesh]
enabled = true

[skills.browser]
playwright = true

[toolchains]
claude = true
claude_code = true
ruflo = true
agentic_qe = true

[gpu]
backend = "none"

Federated example (drops into a host container mesh):

[mesh]
mode = "client"
peer_relays = ["wss://host-orchestrator:7070"]

[adapters]
beads = "external"
pods = "external"
memory = "external-pg"
events = "external"
orchestrator = "stdio-bridge"

[integrations.ruvector_external]
enabled = true
conninfo = "postgresql://ruvector@ruvector-postgres:5432/ruvector"

Always run agentbox config validate after editing — it checks semantic rules (e.g. gaussian_splatting = true requires gpu.backend = "local-cuda") before the build.
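
To make the idea of a semantic rule concrete, here is a minimal Python sketch of the one rule named above. It is not the real validator. The guide does not say which table holds gaussian_splatting, so the sketch searches every nested table for that key, and the function names (find_flag, check_gaussian_splatting, load_manifest) are hypothetical.

```python
def find_flag(table, name):
    """Return True if any nested table sets `name = true`."""
    for key, value in table.items():
        if key == name and value is True:
            return True
        if isinstance(value, dict) and find_flag(value, name):
            return True
    return False


def check_gaussian_splatting(manifest):
    """Return a list of error strings for the gaussian_splatting rule.

    `manifest` is the parsed agentbox.toml as a plain dict.
    """
    # "none" is the documented default for gpu.backend.
    backend = manifest.get("gpu", {}).get("backend", "none")
    if find_flag(manifest, "gaussian_splatting") and backend != "local-cuda":
        return [
            'gaussian_splatting = true requires gpu.backend = "local-cuda" '
            f'(found "{backend}")'
        ]
    return []


def load_manifest(path="agentbox.toml"):
    """Parse the manifest; tomllib is in the standard library from Python 3.11."""
    import tomllib

    with open(path, "rb") as handle:
        return tomllib.load(handle)
```

Calling check_gaussian_splatting(load_manifest()) returns an empty list when the rule holds. The real command covers more rules than this one, so treat the sketch as an illustration of the kind of check involved.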

Ontology skill gate (prepared placeholder)

[skills.ontology]
enabled = false   # default — ontology-core + ontology-enrich are not loaded

Set enabled = true to load the ontology-core and ontology-enrich skills into the agent's skill surface. These skills target Logseq OWL2 DL TBox workflows and are opt-in because they carry specific domain assumptions (Logseq graph conventions, OWL2 DL reasoner tooling). When enabled = false (the default) neither skill is registered and no extra tooling is pulled into the image.

This gate is a prepared placeholder — the MCP server and associated tooling for ontology operations will be fleshed out in a future milestone. Enabling the flag now has no runtime effect beyond advertising the skills in the manifest; downstream agents that check the manifest before loading skills will respect it once the implementation lands.

2. Build The Image

Agentbox is built with Nix (a reproducible package manager). The flake.nix file composes packages, skills and toolchains into a Docker image based on your manifest — no Dockerfile, no layer drift between rebuilds. nix build .#runtime produces a nix2container OCI manifest at ./result; the runtime exposes a copyToDockerDaemon helper that loads the image into the local Docker daemon via skopeo (no intermediate tarball, no layer copies).

flowchart LR
    TOML["agentbox.toml"] --> FLAKE["flake.nix"]
    LOCK["flake.lock<br/>pinned inputs"] --> FLAKE
    FLAKE --> N2C["nix2container"]
    N2C --> OCI["OCI image<br/>at ./result"]
    OCI -->|"copyToDockerDaemon<br/>(skopeo)"| DAEMON["Local Docker daemon"]
    DAEMON --> RUN["docker compose up"]

Build the image and load it into the local Docker daemon:

nix build .#runtime
nix run .#runtime.copyToDockerDaemon

Optional variants:

nix build .#desktop
nix build .#full

3. Configure Environment

Manual path:

cp .env.example .env

Provider API keys are gated by [providers.*] sections in agentbox.toml. Only set the env vars for providers you have enabled — the validator (E017) will warn at boot for any enabled provider whose env var is missing.

  1. In agentbox.toml, set enabled = true for each provider you want:

    [providers.anthropic]
    enabled = true
    env_var = "ANTHROPIC_API_KEY"
  2. In .env, fill in the corresponding value:

    ANTHROPIC_API_KEY=sk-ant-...
    

Infrastructure vars (always required regardless of providers):

  • MANAGEMENT_API_KEY — API key for the management HTTP API
  • AGENTBOX_AGENT_ID — stable identity label for this instance
  • NOSTR_RELAYS — comma-separated Nostr relay URLs
  • WORKSPACE — shared workspace mount path
  • SHARED_PROJECTS_ROOT — shared projects mount path

For the full provider reference, optional overrides, and instructions for adding new providers see docs/guides/providers.md.
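
The provider gate described in this section pairs each enabled [providers.*] table with one environment variable. A rough Python sketch of that pairing follows, assuming the manifest has already been parsed into a dict and that each provider table carries the enabled and env_var keys shown above. The function name is hypothetical and this is not the E017 implementation.

```python
def missing_provider_vars(manifest, env):
    """Return (provider, env_var) pairs for enabled providers with no value set.

    `manifest` is the parsed agentbox.toml; `env` is any mapping of
    environment variables, for example os.environ.
    """
    missing = []
    for name, cfg in manifest.get("providers", {}).items():
        if not cfg.get("enabled", False):
            continue  # disabled providers need no key
        var = cfg.get("env_var")
        if var and not env.get(var):
            missing.append((name, var))
    return missing
```

Passing os.environ as env after loading .env into the process gives the list of keys still to be filled in; an empty list means every enabled provider has a value.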

4. Start The Stack

The preferred boot path uses agentbox.sh up, which starts the stack and blocks until the management API health endpoint responds (or times out after 60 s):

./agentbox.sh up

If you just rebuilt the Nix image and need to load it before starting:

./agentbox.sh up --build

Direct compose is also fine for simple cases, but you will need to poll health manually:

docker compose up -d
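
When starting with plain docker compose, the wait that agentbox.sh up performs has to be done by hand. A minimal polling sketch in Python follows, assuming the management API health endpoint is reachable at http://localhost:9090/health as shown elsewhere in this guide. The 60 s deadline mirrors the documented timeout; the 2 s interval and the function names are assumptions.

```python
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:9090/health"  # endpoint shown in this guide


def probe(url=HEALTH_URL, timeout=2.0):
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


def wait_for_health(check=probe, deadline_s=60.0, interval_s=2.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll `check` until it returns True or `deadline_s` elapses."""
    start = clock()
    while True:
        if check():
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(interval_s)
```

The check, clock and sleep parameters are injectable so the loop can be exercised without a running container.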

For a full dev-loop iteration (stop existing stack, rebuild image, restart):

./agentbox.sh rebuild

5. Verify Host-Level Container State

Use agentbox.sh health to get a per-service status summary:

./agentbox.sh health          # pretty-print; exits non-zero if any service is degraded
./agentbox.sh health --json   # raw JSON; always exits 0

Low-level Docker commands for deeper inspection:

docker compose ps
docker logs --tail 100 agentbox
docker inspect --format '{{json .State.Health}}' agentbox
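
The JSON printed by the last command is Docker's standard health record (Status, FailingStreak, and a Log list of recent probe results). A small Python sketch for condensing it is below; summarize_health is a hypothetical helper and the returned dict shape is an illustration only.

```python
import json


def summarize_health(raw):
    """Condense the output of `docker inspect --format '{{json .State.Health}}'`."""
    health = json.loads(raw)
    if health is None:
        # Docker prints null when the container defines no healthcheck.
        return {"status": "none", "failing_streak": 0, "last_output": ""}
    log = health.get("Log") or []
    last = log[-1] if log else {}
    return {
        "status": health.get("Status", "unknown"),
        "failing_streak": health.get("FailingStreak", 0),
        "last_output": (last.get("Output") or "").strip(),
    }
```

The last_output field is usually the quickest pointer to why a probe is failing, since it carries the output of the most recent healthcheck command.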

If the container is using an older image or an older entrypoint, use agentbox.sh rebuild to rebuild and recreate it.

6. Verify Runtime Services

The runtime exposes a small set of HTTP endpoints for liveness, readiness and metrics. These replace the usual "did the container boot?" guesswork with concrete signals. /ready goes green only after every required programme reaches RUNNING and the bootstrap-seal sentinel writes /run/agentbox/bootstrap.done — see ADR-006 for the bootstrap contract.
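
The readiness rule above combines two conditions: every required programme is RUNNING, and the sentinel file exists. A minimal Python sketch of that rule follows, assuming the usual supervisorctl status output (name, state, then details per line). How the runtime actually evaluates /ready is not described here, and the function names are hypothetical.

```python
from pathlib import Path

SENTINEL = Path("/run/agentbox/bootstrap.done")  # path named in this guide


def running_programs(status_text):
    """Parse `supervisorctl status` output into the set of RUNNING program names."""
    running = set()
    for line in status_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1] == "RUNNING":
            running.add(parts[0])
    return running


def is_ready(status_text, required, sentinel=SENTINEL):
    """True only if every required program is RUNNING and the sentinel exists."""
    all_running = set(required) <= running_programs(status_text)
    return all_running and Path(sentinel).exists()
```

The list of required programme names is left to the caller because this guide does not enumerate it.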

graph TB
    subgraph container["Agentbox container"]
        SUP["supervisord (PID 1)"]
        API["management-api :9090"]
        SOLID["solid-pod-rs :8484"]
        MCP["MCP servers"]
        SEAL["bootstrap-seal"]
        MET["metrics :9091"]
    end
    SUP --> API
    SUP --> SOLID
    SUP --> MCP
    SUP --> SEAL
    API --> MET
    SEAL -->|"touches sentinel"| API
    HOST["Host (localhost only)"] -->|"127.0.0.1:9090"| API
    HOST -->|"127.0.0.1:9091"| MET
    HOST -->|"127.0.0.1:8484"| SOLID

From the host:

# Via SSH tunnel (ports are localhost-only on host)
curl http://localhost:9090/health
curl http://localhost:9090/v1/meta        # adapter contract versions + image hash
curl http://localhost:9091/metrics        # Prometheus — scrape this
curl http://localhost:8484/health         # solid-pod-rs

From inside the container:

docker exec agentbox supervisorctl status
docker exec agentbox tmux -V
docker exec -it agentbox tmux attach -t agentbox
docker exec agentbox ls -la /home/devuser/workspace/profiles
docker exec agentbox ls -la /projects

7. Remote Access & Security

All agentbox ports bind to 127.0.0.1 on the host, so they are not exposed to the network. Remote access uses SSH tunnels, which provide authentication and encryption without additional VNC passwords or TLS certificates.

graph LR
    subgraph laptop["Your Laptop"]
        VNC["TigerVNC Viewer<br/>localhost:5901"]
        BROWSER["Browser<br/>localhost:9090"]
        CLI["SSH Terminal"]
    end
    subgraph tunnel["SSH Tunnel (encrypted)"]
        T1["L5901:localhost:5901"]
        T2["L9090:localhost:9090"]
        T3["L8080:localhost:8080"]
    end
    subgraph host["Host Machine"]
        D5901["127.0.0.1:5901"]
        D9090["127.0.0.1:9090"]
        D8080["127.0.0.1:8080"]
    end
    subgraph agentbox["Agentbox Container"]
        XVNC[":5901 Xvnc"]
        API[":9090 Management API"]
        CODE[":8080 Code Server"]
    end
    VNC --> T1 --> D5901 --> XVNC
    BROWSER --> T2 --> D9090 --> API
    CLI --> T3 --> D8080 --> CODE

Connect via SSH tunnel

Open all tunnels in one command:

ssh -L 5901:localhost:5901 \
    -L 9090:localhost:9090 \
    -L 8080:localhost:8080 \
    -L 8484:localhost:8484 \
    -N machinelearn@YOUR_HOST_IP
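
If the set of forwarded ports changes often, the command above can be generated rather than typed. A small Python sketch is below, assuming each port is forwarded to the same port number on the host's loopback interface, as in the example. tunnel_command and as_shell are hypothetical helper names.

```python
import shlex


def tunnel_command(user, host, ports):
    """Build an ssh argument list that forwards each port to localhost:<port>."""
    cmd = ["ssh"]
    for port in ports:
        cmd += ["-L", f"{port}:localhost:{port}"]
    cmd += ["-N", f"{user}@{host}"]  # -N: forward only, run no remote command
    return cmd


def as_shell(cmd):
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(cmd)
```

The argument list can be passed directly to subprocess.run, which avoids shell quoting issues; as_shell is only for printing.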

Or use the built-in helper:

./agentbox.sh all    # opens VNC + code-server + API + CDP tunnels
./agentbox.sh vnc    # VNC tunnel only

VNC desktop

Once the tunnel is open, connect your VNC client to localhost:5901:

vncviewer localhost:5901          # TigerVNC
open vnc://localhost:5901         # macOS Screen Sharing

The desktop runs TigerVNC Xvnc with -SecurityTypes None (no VNC password) and -localhost (container-internal only). Security is provided by the SSH tunnel — no unauthenticated network access is possible.

Port reference

| Service | Container Port | Host Binding | Access |
|---|---|---|---|
| Management API | 9090 | 127.0.0.1:9090 | SSH tunnel, NIP-98 auth |
| VNC Desktop | 5901 | 127.0.0.1:5901 | SSH tunnel |
| Code Server | 8080 | 127.0.0.1:8080 | SSH tunnel |
| Solid Pod | 8484 | 127.0.0.1:8484 | SSH tunnel, WAC auth |
| Agent Events | 9700 | 127.0.0.1:9700 | SSH tunnel |
| Prometheus | 9091 | 127.0.0.1:9091 | SSH tunnel |

All ports are localhost-only on the host. The only way in from the network is through SSH authentication to the host machine.
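
The localhost-only claim can be spot-checked from Docker's own view of the published ports. The Python sketch below parses the output of docker inspect --format '{{json .NetworkSettings.Ports}}' agentbox and reports any binding that is not on a loopback address. The helper name is hypothetical, and the container name agentbox is taken from the commands in this guide.

```python
import json


def non_local_bindings(ports_json):
    """Return (container_port, host_ip) pairs published on a non-loopback address."""
    ports = json.loads(ports_json) or {}
    exposed = []
    for container_port, bindings in ports.items():
        for binding in bindings or []:  # unpublished ports map to null
            host_ip = binding.get("HostIp")
            if host_ip not in ("127.0.0.1", "::1"):
                exposed.append((container_port, host_ip))
    return exposed
```

An empty list is the expected result for a stock deployment; any entry with 0.0.0.0 means a port has been published to the network, typically through a compose override.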

Compose File Generation

docker-compose.yml is auto-generated from flake.nix and agentbox.toml. Do not edit it by hand — changes will be overwritten the next time the compose output is regenerated, and hand-edits can desync the file from the manifest.

To regenerate it after changing agentbox.toml:

nix build .#compose && cp result/docker-compose.yml docker-compose.yml

The compose flake output is a pure text-generation pass — it does not build the container image, so it evaluates on any system Nix runs on (including macOS and Linux aarch64). The generated file is committed to the repo so that operators without Nix can still run the stack; treat it as a build artefact rather than a source file. Per-deployment customisation belongs in docker-compose.override.yml, which Docker Compose merges automatically.

Environment Variables

Copy .env.example to .env and fill in your values before running ./agentbox.sh up.

cp .env.example .env

API Keys

| Variable | Required | Purpose |
|---|---|---|
| ANTHROPIC_API_KEY | Yes (for Claude) | Claude Code and the claude toolchain |
| OPENAI_API_KEY | Optional | Set to ollama to route through Ollama locally |
| OPENAI_BASE_URL | Optional | Ollama endpoint: http://ollama:11434/v1 (sidecar) or http://host.docker.internal:11434/v1 (host, requires networking.host_gateway = true) |
| GOOGLE_GEMINI_API_KEY | Optional | Gemini CLI and Gemini provider |
| GITHUB_TOKEN | Optional | gh CLI and GitHub MCP tools |
| CRATES_TOKEN | Optional | cargo publish for Rust ecosystem crates |

Networking and Identity

| Variable | Purpose |
|---|---|
| TAILSCALE_AUTHKEY | Ephemeral auth key for Tailscale federation; leave blank to disable |
| TAILSCALE_HOSTNAME | MagicDNS name for this instance (default: agentbox) |
| NOSTR_RELAYS | Comma-separated Nostr relay URLs for the sovereign mesh |
| AGENTBOX_NSEC | Pre-generated bech32 nsec1... private key; leave blank to auto-generate at boot |
| AGENTBOX_NPUB | Operator bech32 pubkey; overrides [sovereign_mesh.operator].pubkey_hex in the manifest |

Infrastructure

| Variable | Purpose |
|---|---|
| MANAGEMENT_API_KEY | Bearer token for the management HTTP API; auto-generated on first boot if empty |
| RUVECTOR_PG_PASSWORD | PostgreSQL password for the RuVector vector store (must match the ruvector-postgres container) |
| AGENTBOX_AGENT_ID | Stable identity label for this instance |

Feature Configuration

| Variable | Purpose |
|---|---|
| COMFYUI_API_ENDPOINT | ComfyUI API URL reachable from inside the container |
| AGENTBOX_CPU_LIMIT / AGENTBOX_MEM_LIMIT | Docker deploy resource caps (referenced by docker-compose.override.yml) |
| AGENTBOX_CPU_RESERVE / AGENTBOX_MEM_RESERVE | Docker deploy reservations |

Run ./agentbox.sh preflight after editing .env — it validates the compose merge and checks that required variables are present before the stack starts.

8. Inspect Provisioned Profiles

The runtime creates these profile roots:

  • /home/devuser/workspace/profiles/claude-core
  • /home/devuser/workspace/profiles/ruflo-orchestrator
  • /home/devuser/workspace/profiles/qe-fleet
  • /home/devuser/workspace/profiles/nagual-qe
  • /home/devuser/workspace/profiles/rust-builder
  • /home/devuser/workspace/profiles/docs-latex

Each one should expose:

  • .claude/settings.json
  • .claude/skills -> /opt/agentbox/skills
  • projects -> /projects
  • workspace -> /home/devuser/workspace
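
The expected layout above is easy to check mechanically. The Python sketch below assumes the "->" entries are symlinks with exactly the listed targets, which the notation suggests but the guide does not state outright. check_profile is a hypothetical helper.

```python
import os
from pathlib import Path

# Expected entries from this guide: relative path -> symlink target
# (None means a regular file or directory that only has to exist).
EXPECTED = {
    ".claude/settings.json": None,
    ".claude/skills": "/opt/agentbox/skills",
    "projects": "/projects",
    "workspace": "/home/devuser/workspace",
}


def check_profile(root, expected=EXPECTED):
    """Return a list of problems found under one profile root."""
    problems = []
    for rel, target in expected.items():
        path = Path(root) / rel
        if target is None:
            if not path.exists():
                problems.append(f"missing: {rel}")
        elif not path.is_symlink():
            problems.append(f"not a symlink: {rel}")
        elif os.readlink(path) != target:
            problems.append(f"wrong target: {rel} -> {os.readlink(path)}")
    return problems
```

Run inside the container, looping check_profile over the six profile roots listed above gives a quick per-profile report; an empty list means the profile matches the documented layout.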

8b. Storage Paths

  • RuVector: /var/lib/ruvector
  • Solid-style pod storage: /var/lib/solid
  • Sovereign identities: /var/lib/agentbox/identities
  • Shared workspace: /home/devuser/workspace
  • Shared external projects: /projects

9. Terminal Workflow

The container runs a tmux session (agentbox) with 10 pre-configured windows:

| Window | Name | Purpose |
|---|---|---|
| 0 | Claude | Primary development shell |
| 1 | Agent | Agent execution workspace |
| 2 | Services | supervisorctl status |
| 3 | Build | Build/compile workspace |
| 4 | Logs | Management API logs (split pane) |
| 5 | System | Resource monitor (btm / htop) |
| 6 | VNC | VNC connection info |
| 7 | Git | Project git status |
| 8 | OpenRouter | Claude Code via free OpenRouter models |
| 9 | ZAI | Claude Code via Z.AI GLM relay |

Attach from inside the container:

tmux attach -t agentbox    # or use the alias: ta

The shell environment includes fish + starship prompt (Tokyo Night theme), modern CLI replacements (eza, bat, delta, dust, procs, btm), and the full agentbox alias set. Run agentbox-help for a quick reference or agentbox-pick for an interactive launcher.

Troubleshooting

Docker is running but the container is unhealthy

Check whether the container is still running an older image that uses the old keepalive-only supervisor config. If it is, ./agentbox.sh rebuild rebuilds the image and recreates the container.

9090 health checks fail

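The management API may not be running in the current container image, or the container may be older than the repo state. Check docker exec agentbox supervisorctl status first; if the image is stale, run ./agentbox.sh rebuild.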

Profile directories are missing

Check the entrypoint and logs:

docker logs agentbox
docker exec agentbox ls -la /home/devuser/workspace

Solid or RuVector paths do not exist

Verify the volumes are mounted and the entrypoint bootstrap ran successfully.