diff --git a/SECURITY.md b/SECURITY.md index 69ed5c6..da3ea20 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -1,22 +1,67 @@ # Security Policy +SimpleDeploy is a self-hosted application platform. We take security seriously and welcome reports from researchers, downstream operators, and the broader community. + ## Reporting a vulnerability -Email security@vazra.example with details. Please do not open public GitHub issues for security reports. +**Preferred channel:** GitHub's [private vulnerability reporting](https://github.com/vazra/simpledeploy/security/advisories/new) for this repo. This keeps the report between you and the maintainers until a fix ships. + +**Email:** `security@vazra.us` (PGP key on request) if you cannot use GitHub. + +Please do **not** open public GitHub issues, pull requests, or discussions for unfixed vulnerabilities. + +A useful report includes: + +- A description of the issue and the affected component (file path, function, or endpoint). +- The version (`simpledeploy version`) and deployment shape (binary, Docker, distro package). +- Steps to reproduce. A minimal proof-of-concept is appreciated but not required. +- Your assessment of impact and severity. +- Any required preconditions (e.g. authenticated user, specific config). + +We will: + +1. Acknowledge receipt within **3 business days**. +2. Confirm reproduction and triage within **7 business days**. +3. Aim to ship a fix or mitigation within **30 days** for High/Critical and **90 days** for Medium/Low. Extensions are coordinated with you when upstream changes are required. +4. Credit you in the release notes and CVE record unless you ask to remain anonymous. 
-Include: +## Safe harbor -- A description of the issue and affected component -- Steps to reproduce, proof-of-concept if available -- Your assessment of impact +We will not pursue legal action or report you to law enforcement for security research that: -We aim to acknowledge reports within 3 business days and ship a fix or mitigation within 30 days for high-severity issues. We will credit reporters in release notes unless you prefer to remain anonymous. +- Is conducted against your own deployment of SimpleDeploy (or one you have explicit permission to test). +- Stops once a vulnerability is identified — no data exfiltration, lateral movement, or denial-of-service. +- Avoids accessing, modifying, or destroying data that is not yours. +- Discloses to us privately first via the channels above. ## Supported versions -Only the latest minor release receives security fixes. +Only the latest **minor release** of the `main` branch receives security fixes. Older minors are not patched. Operators are expected to upgrade within a reasonable window after a security release. + +## Scope + +**In scope:** + +- The `simpledeploy` binary (REST API, dashboard, CLI, reconciler, embedded Caddy modules). +- The Svelte UI shipped in this repository. +- Build/release artifacts produced by this repository. + +**Out of scope (report to upstream):** + +- The Docker daemon and its supply chain. +- Caddy core and CertMagic (`caddyserver/caddy`, `caddyserver/certmagic`). +- The Go standard library and `golang-jwt/jwt`. +- SQLite (`modernc.org/sqlite`). +- Linux kernel, systemd, distro packaging. +- Compose files, Docker images, and recipes authored outside this repo. + +If unsure, send the report and we will route it. 
+ +## Auditing SimpleDeploy -## Further reading +For researchers and downstream auditors, see: -- [Security hardening](docs/operations/security-hardening.md) -- [Security audit](docs/operations/security-audit.md) +- [Security architecture](docs/operations/security-architecture.md) — design overview, cryptographic primitives, trust boundaries, and what's mitigated. +- [Threat model](docs/operations/threat-model.md) — trust assumptions, in/out of scope, and known design trade-offs. +- [Security hardening](docs/operations/security-hardening.md) — operator-facing controls. +- [Activity & Audit Log](docs/operations/security-audit.md) — what's recorded for forensic purposes. diff --git a/docs/operations/security-architecture.md b/docs/operations/security-architecture.md new file mode 100644 index 0000000..4660f04 --- /dev/null +++ b/docs/operations/security-architecture.md @@ -0,0 +1,162 @@ +--- +title: Security architecture +description: Design overview for security researchers and downstream auditors. Cryptographic primitives, trust boundaries, and the controls that enforce them. +--- + +This page describes how SimpleDeploy is built from a security standpoint. It is intended for researchers, downstream auditors, and operators who want to understand the design choices before deploying. + +It is **not** a vulnerability disclosure or an exploit guide. For reporting issues, see [SECURITY.md](https://github.com/vazra/simpledeploy/blob/main/SECURITY.md). + +## Components and process model + +A single Go binary that: + +- Hosts a REST API + Svelte SPA on `management_addr:management_port` (default `127.0.0.1:8443`). +- Embeds Caddy v2 to terminate TLS for app traffic on `:80` / `:443`. +- Drives Docker via the local socket (`/var/run/docker.sock`) and the `docker compose` CLI. +- Persists state in a single SQLite file (`$data_dir/simpledeploy.db`, mode `0600`, WAL). + +There is no second daemon, no message queue, no separate worker pool. Every action is in-process. 
+ +## Cryptographic primitives + +| Purpose | Algorithm | Parameters | +|---|---|---| +| Password hashing | bcrypt | cost 12 | +| Session token | JWT HS256 | 24h expiry; `iss=simpledeploy`, `aud=simpledeploy-dashboard`, custom `tv` (token version) claim | +| JWT signing key | HKDF-SHA256 from `master_secret` | info=`simpledeploy-jwt-v1`, 32-byte output | +| API key | random 32 bytes from `crypto/rand` | `sd_` prefix + 64 hex | +| API key storage | HMAC-SHA256 keyed by `master_secret` | constant-time compare via DB index lookup | +| Credential at rest (registry, S3, gitsync token) | AES-256-GCM | random 16-byte salt + random 12-byte nonce; key via PBKDF2-HMAC-SHA256, 600k iterations (legacy 100k accepted on read) | +| Git webhook signature | HMAC-SHA256 | `hmac.Equal` constant-time compare | +| TLS automation | ACME via Caddy + CertMagic | local CA mode also available | + +`master_secret` is operator-supplied at install time and persisted in `config.yaml` (mode `0600`). It is the single root of trust for all symmetric crypto in the binary. Different purposes derive subkeys via HKDF where backward-compat permits; existing AES-GCM ciphertexts and API key HMACs continue to use the master directly to keep stored data decryptable. + +## Authentication + +Two parallel paths reach the same `AuthUser` context: + +- `Authorization: Bearer sd_` → API-key path. The full key is hashed with the master HMAC and compared against `api_keys.key_hash` (UNIQUE indexed). Expired keys are rejected at the middleware. `last_used_at` is lazy-updated. +- `Cookie: session=` → JWT path. The token is verified (alg pinned to HMAC), issuer/audience checked, the user fetched, and `claims.tv` compared against `users.token_version`. + +Both paths populate an `audit.Ctx` (actor user id, name, source, IP) carried through the request context, so every recorded mutation attributes to a real principal. + +## Authorization + +Three roles: `super_admin`, `manage`, `viewer`. 
Per-app grants in `user_app_access` extend `manage`/`viewer` to specific apps. Middleware: + +- `authMiddleware` — required on every authenticated route. +- `appAccessMiddleware` — read access to `/api/apps/{slug}/…`. super_admin bypass. +- `mutatingAppMiddleware` — same as above but rejects viewers. +- `superAdminMiddleware` — super_admin only. + +For routes keyed by a body or referenced row id (e.g. `PUT /api/backups/configs/{id}`), the handler resolves the underlying app id and calls `canMutateForApp`. The router registration in `internal/api/server.go` is the source of truth for which middleware applies where. + +## Session invalidation + +`users.token_version` is bumped server-side on: + +- **Logout** (best-effort: the unauthenticated logout endpoint reads and validates the cookie before bumping). +- **Password change** (`UpdatePassword`). +- **Role change** (`UpdateUserRole`). + +JWTs minted before any of those events fail the `tv` check on the next request and are rejected. + +## Network exposure + +Default bindings: + +- `:80`, `:443` — Caddy. Public-facing reverse proxy + ACME. +- `127.0.0.1:8443` — dashboard. Local-only by default; operators front it under a `manage.` route through Caddy if external access is needed. +- App `ports:` mappings are rewritten at deploy time to bind `127.0.0.1:` so the published port cannot be used to bypass per-app Caddy controls. Operator-explicit interface bindings (`0.0.0.0:`, `127.0.0.1:`, `[::1]:`) are preserved verbatim. The rewrite can be disabled globally with `SIMPLEDEPLOY_DISABLE_PORT_LOOPBACK=true`. + +The Caddy admin API (default `:2019`) is **disabled** programmatically. There is no pprof, no `/debug` endpoint. 
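
The loopback rewrite described under "Network exposure" can be sketched as follows. This is a minimal illustration, not SimpleDeploy's actual implementation: `loopbackPort` is a hypothetical helper, it handles only the compose short syntax, and it assumes that a bare container port (`"80"`) is out of scope and returned unchanged.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// loopbackPort is a hypothetical helper (not taken from the SimpleDeploy source).
// It rewrites a compose short-syntax mapping such as "8080:80" to
// "127.0.0.1:8080:80" and leaves operator-explicit interface bindings alone.
func loopbackPort(mapping string, disabled bool) string {
	if disabled {
		return mapping // global opt-out, e.g. SIMPLEDEPLOY_DISABLE_PORT_LOOPBACK=true
	}
	// Split off an optional "/tcp" or "/udp" suffix.
	spec, proto, hasProto := strings.Cut(mapping, "/")
	// A bracketed IPv6 host, or anything that is not exactly "HOST:CONTAINER",
	// is preserved verbatim: either the operator chose an interface explicitly
	// or the form is outside this sketch (bare container port).
	if strings.HasPrefix(spec, "[") || strings.Count(spec, ":") != 1 {
		return mapping
	}
	spec = "127.0.0.1:" + spec
	if hasProto {
		spec += "/" + proto
	}
	return spec
}

func main() {
	disabled := os.Getenv("SIMPLEDEPLOY_DISABLE_PORT_LOOPBACK") == "true"
	for _, m := range []string{"8080:80", "5353:53/udp", "0.0.0.0:8080:80", "[::1]:8080:80"} {
		fmt.Printf("%-18s -> %s\n", m, loopbackPort(m, disabled))
	}
}
```

Keeping the explicit-binding cases as a pass-through means an operator who writes `0.0.0.0:8080:80` on purpose still gets what they asked for, which matches the "preserved verbatim" rule above.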
+ +## Outbound traffic from the dashboard + +| Destination | When | +|---|---| +| Configured `recipes_index_url` | UI catalog browsing (HTTPS, same-host enforcement on sub-resources) | +| Operator-configured webhook URLs | Alert dispatch — public IPs only, with DNS-rebind protection in the dialer | +| Configured registries | Compose deploy (image pulls happen via the Docker daemon, not the binary) | +| Configured S3 endpoint | Backup target (operator-supplied creds) | +| Configured git remote | git sync (operator-supplied creds) | + +The webhook dispatcher's HTTP client uses a custom `DialContext` that re-validates the resolved IP at connect time and rejects private, loopback, link-local, multicast, CGNAT, IETF-reserved, and class-E ranges. + +## Compose validation + +Compose files are validated on every code path that produces them: API deploy, bundle import, reconciler scan (catches gitsync / SSH side-channel writes), and rollback. Rejection rules are documented in [Compose labels](/reference/compose-labels/). The validator is in `internal/compose/validate.go` with unit-test coverage in `validate_test.go`. + +## Restore archive validation + +The `volume` and `sqlite` restore strategies pre-walk the uploaded tar (`internal/backup/tarsafe.go`) and reject: + +- absolute paths +- `..` segments +- symlinks and hardlinks +- block/char/fifo entries +- NUL in names + +After validation the stream is replayed verbatim into `docker exec ... tar -xzf -` with `--no-same-owner --no-overwrite-dir`. Decompressed size is capped at 8 GiB by default. Concurrent restores are capped server-side. + +## Audit trail + +Every mutating endpoint records a row in `audit_log` with the actor, IP, source, before/after JSON snapshots (secrets redacted), and a pre-rendered summary. Two tamper-resistance properties: + +- `DELETE /api/activity` (super_admin only) writes a sentinel `system/audit_purged` row immediately after the wipe, including the pre-purge row count and actor info. 
Anyone trying to wipe the trail leaves a row recording the wipe. +- App purge does **not** delete `audit_log` rows. The `app_id` FK is set to NULL while the denormalized `app_slug` is preserved, so the trail is intact even after the app is gone. + +A super_admin can still tamper at the SQLite level. The trail is operator-trust-bound, not Byzantine-fault-tolerant. + +## Logging + +Process stdout/stderr is teed into an in-process ring buffer (`internal/logbuf`). Buffered messages are sanitized: ANSI/OSC escape sequences are stripped, ASCII control characters except tab are dropped, and any single line is truncated at 8 KiB. The buffer is exposed at `GET /api/system/process-logs` to super_admin only. + +The api logger (`log.Printf("[api] …")`) writes structured-ish lines and is also captured by the buffer. Handler errors are routed through `httpError`, which logs server-side and returns generic `http.StatusText` to the client; `err.Error()` is not echoed. + +## DoS / resource controls + +- `http.Server.ReadHeaderTimeout = 10s`, `IdleTimeout = 120s` (read/write deadlines are per-handler so streaming WS is not killed). +- Per-path body limit: 32 MiB for `upload-restore`, 256 KiB for cert uploads, 1 MiB elsewhere. +- WS endpoints set `SetReadLimit(16 KiB)` and a 30s ping ticker; auth is rechecked every 60s. +- Login: dedicated 10/min/IP rate limiter. +- Account lockout: per-(username, IP) tuple, max 30 minute backoff. Locked-out attempts return `401 invalid credentials` (no enumeration tell). +- Webhook dispatcher: 10s overall timeout, 5s TLS handshake, 10s response-header. +- Restore concurrency: server-wide semaphore caps to 4. +- Decompression: 8 GiB cap on gzip readers in restore paths. + +## Build and release integrity + +The release pipeline is described in [`.github/workflows/release.yml`](https://github.com/vazra/simpledeploy/blob/main/.github/workflows/release.yml) and `.goreleaser.yml`. As of this writing: + +- Builds run on GitHub-hosted Ubuntu runners. 
+- Artifacts are produced by `goreleaser` and attached to the GitHub release. +- Container images are pushed to GHCR. + +Cryptographic signing of release artifacts (cosign), SBOM emission (syft), and SLSA provenance are tracked as roadmap items. Until they ship, downstream operators are expected to verify GitHub release commits against tag annotations and pin Docker images by digest after the first pull. + +## Auditing the source + +Recommended starting points for a code audit: + +- `internal/api/server.go` — full route table. +- `internal/api/middleware.go` — auth + audit context plumbing. +- `internal/auth/` — JWT, API keys, password, lockout, real-IP, AES-GCM. +- `internal/compose/validate.go` — compose security validator. +- `internal/backup/tarsafe.go` — restore archive validator. +- `internal/proxy/proxy.go` — Caddy config builder + custom modules. +- `internal/store/migrations/` — schema history. +- `internal/audit/` — audit recorder + render. + +Run `go test ./...` and `go test -race ./...` from a clean checkout. Run `cd ui && npm test` for the dashboard. + +## Known design trade-offs + +These are choices, not bugs. They are documented in [Threat model](/operations/threat-model/). + +- super_admin is host-root-equivalent because it can deploy arbitrary compose to a daemon SimpleDeploy talks to as root. +- A super_admin who controls the host can rewrite the SQLite file directly. The audit trail is operator-trust-bound. +- The recipes index is fetched over HTTPS (TOFU) and is not yet cryptographically signed. +- `master_secret` rotation requires re-encrypting stored credentials and forces re-issuance of API keys. 
diff --git a/docs/operations/threat-model.md b/docs/operations/threat-model.md new file mode 100644 index 0000000..0f46686 --- /dev/null +++ b/docs/operations/threat-model.md @@ -0,0 +1,141 @@ +--- +title: Threat model +description: What SimpleDeploy is designed to defend against, the trust assumptions it makes, and the design trade-offs operators should be aware of. +--- + +This page sets the scope for security analysis: what SimpleDeploy considers a threat, what it does not, and why. It pairs with [Security architecture](/operations/security-architecture/) (the *how*). + +## Trust principals + +| Principal | Trust level | Capability | +|---|---|---| +| Host root | Fully trusted | Owns the SQLite DB, `master_secret`, Docker socket, kernel. Anything below this is bounded by host root. | +| `simpledeploy` daemon | Fully trusted | Runs as root by design (docker.sock + privileged ports). Compromise of the daemon equals host root. | +| `super_admin` user | Equivalent to host root | Can deploy arbitrary compose to a daemon SimpleDeploy talks to as root. Treat super_admin as a privileged operator. | +| `manage` user (with grant) | Trusted within their app set | Can mutate their accessible apps, restore backups, change env, etc. Cannot reach platform-level config. | +| `viewer` user (with grant) | Read-only within their app set | Can view, download logs, fetch activity, but not mutate. | +| Authenticated client (cookie or API key) | Bounded by the user's role | The role+grants of the user the credential belongs to. | +| Unauthenticated network traffic | Untrusted | Reaches Caddy on `:80`/`:443`, the dashboard if exposed, public health/setup endpoints. Treated as adversarial. | + +## In-scope adversaries + +The following are explicitly part of the threat model. SimpleDeploy is designed to make these costly or impossible without legitimate credentials: + +- **External network attacker on the public internet** trying to reach the dashboard, an app, or the docker socket. 
+- **External attacker on the same LAN** trying to bypass TLS, ride session cookies, or reach loopback-bound services. +- **Compromised compose file or recipe** trying to escape the container onto the host. +- **Compromised gitsync remote or backup tarball** trying to deliver a privileged compose or write outside the container's volume. +- **Hostile DNS or compromised CA** trying to redirect a webhook dispatch to an internal endpoint (DNS rebinding / SSRF). +- **Authenticated `viewer` or `manage` user** trying to escalate to platform-level access, read another user's apps, exfiltrate audit history, or smuggle through ID parameters. +- **Stolen JWT cookie or API key** trying to outlive logout / password change / role change. + +## Out-of-scope adversaries + +These are explicit non-goals. If your threat model includes them, layer additional controls below SimpleDeploy: + +- **Host root compromise.** Once root, the attacker rewrites `config.yaml`, the SQLite DB, the systemd unit, or the daemon binary. SimpleDeploy is not a sandbox against root. +- **Hypervisor or kernel exploit** that escapes Docker's isolation. Mitigated by upstream Linux + Docker; we do not add a second layer. +- **Physical access to the host or backup target** without disk encryption. +- **Side-channel timing on bcrypt** beyond the dummy-hash equalization on user-not-found. +- **Side-channel power/EM analysis** of the AES-GCM implementation. +- **Long-running cryptanalytic attacks** on AES-256-GCM, HMAC-SHA256, HKDF-SHA256, or bcrypt cost 12. +- **Supply-chain compromise of upstream dependencies** (Go stdlib, Caddy, Docker SDK, modernc/sqlite). Tracked via `govulncheck` and Dependabot; not separately mitigated. +- **A super_admin acting maliciously.** super_admin is trusted to the level of host root. Use the audit trail to detect, not prevent. Forward audit events to an external sink for tamper-evidence beyond the local DB. +- **Compromise of operator-supplied secrets at rest** (e.g. 
the operator pastes `master_secret` into a chat). + +## Boundary diagram + +``` + ┌───────────────────────────────────────────┐ + │ Untrusted: public internet + LAN │ + └───────────────┬───────────────────────────┘ + │ TLS, ACME, app traffic + ▼ + ┌───────────────────────────────────────────┐ + │ Caddy (in-process) │ + │ - per-app IP allow / rate-limit │ + │ - HSTS, security headers │ + │ - HTTP→HTTPS redirect │ + └───────────────┬───────────────────────────┘ + │ reverse_proxy localhost:N + ▼ + ┌───────────────────────────────────────────┐ + │ App containers │ + │ - port mappings rewritten to 127.0.0.1 │ + │ - compose validator rejects host-escape │ + │ - shared bridge `simpledeploy-public` │ + └───────────────┬───────────────────────────┘ + │ docker.sock (root) + ▼ + ┌───────────────────────────────────────────┐ + │ Docker daemon │ + └───────────────────────────────────────────┘ + + ┌───────────────────────────────────────────┐ + │ Dashboard listener │ + │ default 127.0.0.1:8443 (local-only) │ + │ - JWT cookie (HttpOnly, Secure, Strict) │ + │ - Bearer API key │ + │ - per-IP login rate limit │ + │ - per-(user,IP) lockout │ + └───────────────────────────────────────────┘ +``` + +## Known design trade-offs + +These are deliberate choices, surfaced for transparency: + +### 1. super_admin == host root + +**Why:** Deploys go through the docker.sock as root. Even with the compose validator, a super_admin who chooses a permissive image effectively executes arbitrary code on the host. + +**Mitigation:** Restrict super_admin to break-glass operators; use `manage` for day-to-day. Forward audit events offsite for forensic continuity. + +### 2. master_secret is the single root of trust + +**Why:** Simplifies operator UX. A multi-secret model would require a key-management story (rotation, backup, restore) that adds operational risk for small deployments. + +**Mitigation:** Per-purpose subkeys are derived via HKDF where compatibility allows (JWT signing). 
AES-GCM credential encryption and API-key HMAC continue to use the master directly so existing data stays decryptable. Document a rotation procedure if the master is ever exposed. + +### 3. Recipes index is HTTPS TOFU, not signed + +**Why:** Catalog publishing pipeline cost. End-to-end signing of an index plus per-recipe content requires a key-management story we have not yet committed to. + +**Mitigation:** Same-host enforcement on sub-resource fetches; deploy-time compose validation catches privileged recipes; a malicious recipe still passes through the same security validator as a hand-written compose. + +### 4. The audit trail is operator-trust-bound + +**Why:** Audit lives in the same SQLite DB as everything else. A super_admin (or anyone with host root) can rewrite it. + +**Mitigation:** Sentinel rows on `audit_purged` and on app purge. For Byzantine-fault-tolerant audit, forward events to an external sink (webhook category, syslog forwarder, etc.). + +### 5. Default `tls.mode: off` is permitted + +**Why:** local development and behind-LB setups need it. + +**Mitigation:** Cookies are still `SameSite=Strict` and `HttpOnly` even when `Secure` is omitted. Operators are warned in the docs not to expose plain HTTP to the network. + +### 6. Released artifacts are not cryptographically signed (yet) + +**Why:** SLSA provenance + cosign keyless is a roadmap item, not yet shipped. + +**Mitigation:** Operators can pin Docker images by digest after first pull and verify GitHub release commits against tag annotations. Tracked as a release-engineering item. + +## Evidence we expect a researcher to look for + +If you are auditing SimpleDeploy and this list does not match what you find, that is a finding worth reporting: + +- Every mutating endpoint emits an `audit_log` row. +- Every authenticated route has either a role or an app-access middleware. +- Every `exec.Command` uses argv form (no shell interpolation). 
+- Every SQL query uses placeholders (the three `Sprintf`-built queries interpolate validated whitelists only). +- Every cookie is `HttpOnly` + `SameSite=Strict`. +- Every WebSocket upgrade either matches `Origin == Host` or holds a Bearer token. +- Every restore tar is pre-walked before extraction. +- Every JWT carries `iss`, `aud`, `tv`, and is HS256. + +If you find a place where the codebase deviates from these invariants, please [report it](https://github.com/vazra/simpledeploy/security/advisories/new). It is much more likely to be a regression than an intentional choice. + +## Versioning and changelog + +Security-relevant changes are tagged with `fix(security)` or `fix(auth)` in commit messages and surfaced in the [changelog](https://github.com/vazra/simpledeploy/blob/main/CHANGELOG.md). Operator-impacting defaults (e.g. `management_addr` becoming `127.0.0.1`) are called out in release notes.