
feat(worker): Tier 0 subprocess sandbox for untrusted media (Increment 1)#2

Open
oneshot2001 wants to merge 2 commits into feature/plateproof-phase0 from feat/tier0-subprocess-sandbox


Conversation

@oneshot2001
Owner

What & why

Adds a Tier 0 subprocess sandbox so that ffprobe + the SVF signed-video-validator C binary — which parse untrusted uploaded video — cannot let a memory-corruption exploit escalate beyond a confined, unprivileged, network-less, resource-capped child. Increment 1 (in-repo code + Dockerfile + rubric); the deploy topology is Increment 2 (below).

What's in this PR

  • app/sandbox.py: run_sandboxed() choke point + SandboxResult + a startup bwrap-vs-DEGRADED capability probe. Never raises (fails closed).
  • app/sandbox_launcher.py — exec-only rlimit launcher (AS/CPU/FSIZE/NOFILE/NPROC/CORE) + PR_SET_NO_NEW_PRIVS; sentinel exit 125 on setup failure. No preexec_fn.
  • All three spawn sites (2× ffprobe, SVF validator) routed through run_sandboxed: bwrap --unshare-{user,pid,ipc,cgroup,net}, ldd-derived ro-bind set, scrubbed env, --die-with-parent, --new-session.
  • Fail-closed verdict gating — timeout / launch-fail / rlimit-kill / output-overflow / nonzero-exit all → status=error before any verdict parse (svf_runner reads validation_results.txt only after the gates).
  • Streamed upload cap (drops unbounded file.read()), concurrency semaphore, DEGRADED mode refuses /verify (503) unless ALLOW_DEGRADED_SANDBOX=true, /health reports mode.
  • Dockerfile — non-root uid 10001, multi-stage slim runtime, bubblewrap; build guard fails if SVF_SHA/SVF_EXAMPLES_SHA are unpinned (master) unless ALLOW_UNPINNED_SVF=true.
  • tests/test_sandbox.py — 17-AC rubric (@pytest.mark.linux_sandbox for container-only ACs).
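
As a rough illustration of the exec-only launcher pattern listed above, a minimal sketch might look like the following. This is not the code in app/sandbox_launcher.py: the `LIMITS` table, its values, and the `apply_confinement` / `main` helpers are assumptions made for the example. Only the rlimit names, PR_SET_NO_NEW_PRIVS, and the sentinel exit code 125 come from the description.

```python
import ctypes
import os
import resource
from typing import List

SETUP_FAILURE_EXIT = 125   # numeric sentinel named in the PR description
PR_SET_NO_NEW_PRIVS = 38   # constant from <linux/prctl.h>

# Hypothetical caps; the real values would come from the worker's settings.
LIMITS = {
    resource.RLIMIT_AS: 1 << 30,      # 1 GiB of address space
    resource.RLIMIT_CPU: 30,          # seconds of CPU time
    resource.RLIMIT_FSIZE: 64 << 20,  # 64 MiB per written file
    resource.RLIMIT_NOFILE: 64,       # open file descriptors
    resource.RLIMIT_NPROC: 16,        # processes for this uid
    resource.RLIMIT_CORE: 0,          # no core dumps
}


def apply_confinement() -> None:
    """Cap resources, then forbid privilege gain across exec (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    for res, cap in LIMITS.items():
        # Soft and hard limits are set together so the child cannot raise them.
        resource.setrlimit(res, (cap, cap))
    if libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_NO_NEW_PRIVS) failed")


def main(argv: List[str]) -> int:
    """Confine this process, then replace it with the target tool."""
    if not argv:
        return SETUP_FAILURE_EXIT
    try:
        apply_confinement()
        os.execvp(argv[0], argv)  # does not return on success
    except Exception:
        # Any setup or exec failure maps to the sentinel, never to stderr text.
        return SETUP_FAILURE_EXIT
    return SETUP_FAILURE_EXIT
```

A real launcher script would finish with `sys.exit(main(sys.argv[1:]))`. Because the limits are applied in the launcher process itself before `exec`, the parent does not need a `preexec_fn`.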

Provenance & review

Built by Codex (GPT-5.5) from a Claude planner spec (worker/TIER0-SANDBOX-SPEC.md), then Claude-reviewed. The review caught that v1's containment guarantee was false on a rlimit-only host and that the rubric could pass while the goal failed; resolved by the move-host decision + 4 fixes:

  1. output-overflow now gated by design at all 3 sites (was held by json.loads accident);
  2. launcher-failure detection uses a numeric sentinel (125), not attacker-influenceable stderr;
  3. added --unshare-cgroup;
  4. Dockerfile build guard against unpinned SVF SHAs.
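
The fail-closed ordering behind fixes 1 and 2 can be sketched as a single gate that runs before any verdict parsing. This is an assumption-laden sketch, not the PR's implementation: the fields on this `SandboxResult` and the `gate` helper are hypothetical, and it assumes death by signal shows up as a negative return code, which is how Python's `subprocess` reports it for a direct child.

```python
from dataclasses import dataclass
from typing import Optional

LAUNCHER_SENTINEL = 125  # exit code the launcher uses for setup failure


@dataclass(frozen=True)
class SandboxResult:
    """Hypothetical result shape; the real class may differ."""
    returncode: Optional[int]      # None if the child never started
    timed_out: bool = False
    output_overflow: bool = False
    stdout: bytes = b""


def gate(result: SandboxResult) -> Optional[str]:
    """Return an error reason, or None when it is safe to parse a verdict."""
    if result.timed_out:
        return "timeout"
    if result.returncode is None or result.returncode == LAUNCHER_SENTINEL:
        return "launch-failed"
    if result.returncode < 0:
        return "killed-by-signal"   # e.g. an rlimit kill
    if result.output_overflow:
        return "output-overflow"    # checked explicitly, not via a parse error
    if result.returncode != 0:
        return "nonzero-exit"
    return None
```

A caller would map any non-None reason to status=error and only read validation_results.txt when `gate` returns None.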

Tests

  • 22 host tests pass (pytest -m "not linux_sandbox"), py_compile clean.
  • 15 containment ACs are CI-only (need the Linux container): docker build -t epworker . && docker run --rm epworker pytest -m linux_sandbox.

⚠️ Merge / deploy gates

  • Pin SVF_SHA + SVF_EXAMPLES_SHA to reviewed commit SHAs before any production build — the guard now refuses master.
  • bwrap is defense-in-depth, not the boundary, until Increment 2. Railway lacks unprivileged user namespaces, so the production target is a Fly.io Firecracker microVM (outer) + bwrap/seccomp/rlimits (inner). Do not treat this as production containment for public untrusted video until that lands.

Out of scope (Increment 2)

Fly.io microVM split + object-storage hop + queue + control-plane/worker split; the seccomp profile (AC7, deliberately deferred — not shipped fail-open); and the flagged trust bugs: verdict-from-text injection in parse_svf_output and callback SSRF / WORKER_API_KEY exfil.

Spec: worker/TIER0-SANDBOX-SPEC.md · build report: worker/TIER0-BUILD-REPORT.md.

🤖 Generated with Claude Code

…t 1)

Contain ffprobe + the SVF C validator (which parse untrusted uploaded
video) so a memory-corruption exploit in those C binaries cannot escalate
beyond a confined, unprivileged, network-less, resource-capped child.

- app/sandbox.py: run_sandboxed() choke point + SandboxResult + startup
  bwrap-vs-DEGRADED capability probe; never raises (fails closed).
- app/sandbox_launcher.py: exec-only rlimit launcher (AS/CPU/FSIZE/NOFILE/
  NPROC/CORE) + PR_SET_NO_NEW_PRIVS; sentinel exit 125 on setup failure.
- Route both ffprobe sites + the SVF validator through run_sandboxed:
  bwrap --unshare-{user,pid,ipc,cgroup,net}, ldd-derived ro-bind set,
  scrubbed env, --die-with-parent, --new-session.
- Fail-closed verdict gating: timeout / launch-fail / rlimit-kill /
  output-overflow / nonzero-exit all -> status=error BEFORE any verdict
  parse (svf_runner reads validation_results.txt only after the gates).
- Streamed upload cap (drop unbounded file.read()), concurrency semaphore,
  DEGRADED mode refuses /verify (503) unless ALLOW_DEGRADED_SANDBOX=true,
  /health reports sandbox mode.
- Dockerfile: non-root uid 10001, multi-stage slim runtime, bubblewrap;
  build guard FAILS if SVF SHAs are unpinned (master) unless
  ALLOW_UNPINNED_SVF=true.
- tests/test_sandbox.py: 17-AC rubric (@pytest.mark.linux_sandbox for
  container-only ACs). 22 host tests pass.

Built by Codex (GPT-5.5) from a Claude planner spec, then Claude-reviewed
and fix-passed (output-overflow gating, launcher sentinel, --unshare-cgroup,
SVF-SHA build guard). Spec: worker/TIER0-SANDBOX-SPEC.md.

Increment 2 deferred: Fly.io microVM split + object-storage hop + queue +
seccomp profile (AC7) + callback-SSRF and verdict-from-text fixes.
Before prod: pin SVF_SHA / SVF_EXAMPLES_SHA to reviewed commit SHAs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vercel

vercel Bot commented May 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
edgeproof Ready Ready Preview, Comment May 30, 2026 11:17pm
edgeproof-dev Ready Ready Preview, Comment May 30, 2026 11:17pm

@coderabbitai

coderabbitai Bot commented May 30, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 652e6515-90e5-4492-9465-271cb568f554

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 449975f0f0


# Detect if signatures are present
if "signature" in lower_output or "signed" in lower_output:
# Detect signed and verified
if "video is signed and verified" in lower_output:

P1: Parse the validator's real verdict strings

In deployments using the Axis validator built by this Dockerfile, the new validation_results.txt path will not classify real authentic clips because the validator summary uses VIDEO IS VALID! (and VIDEO IS INVALID! for tampering), not VIDEO IS SIGNED AND VERIFIED. Those clips fall through as inconclusive with signature_valid=False, so the verification API returns the wrong verdict for normal signed media.
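
A minimal sketch of a classifier keyed on the two summary strings quoted above, assuming they appear verbatim somewhere in validation_results.txt. The `classify_verdict` name and the returned labels are hypothetical, and the sketch does not address the separate verdict-from-text injection concern that the PR defers to Increment 2.

```python
VALID_MARKER = "VIDEO IS VALID!"
INVALID_MARKER = "VIDEO IS INVALID!"


def classify_verdict(results_text: str) -> str:
    """Map the validator summary to a verdict label, failing closed on ambiguity."""
    has_valid = VALID_MARKER in results_text
    has_invalid = INVALID_MARKER in results_text
    if has_valid and not has_invalid:
        return "authentic"
    if has_invalid and not has_valid:
        return "tampered"
    # Neither marker, or both at once: refuse to pick a side.
    return "inconclusive"
```

Matching the full phrase rather than a bare "valid" matters here, since "INVALID" contains "VALID" as a substring.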


Comment thread worker/Dockerfile Outdated
Comment on lines +7 to +8
ARG SVF_SHA=master
ARG SVF_EXAMPLES_SHA=master

P1: Provide pinned SVF args for default Docker builds

With worker/railway.json now selecting this Dockerfile and no build args configured there, Railway/default docker build uses these master defaults while ALLOW_UNPINNED_SVF remains false, so the guard immediately exits before the image can build. Any deploy path that does not inject both SVF SHAs is now unable to build the worker; use pinned SHA defaults or configure the deployment build args.
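
The guard's decision logic, as described in this thread, can be expressed in Python for illustration. The actual guard lives in the Dockerfile and is not reproduced here; `is_pinned` and `check_build_args` are hypothetical names, and treating any 7 to 40 character lowercase hex string as a commit SHA is an assumption (a branch that happens to be named in hex would slip through).

```python
import re

# Assumption: a pinned ref is an abbreviated or full lowercase hex commit SHA.
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def is_pinned(ref: str) -> bool:
    """True when the ref looks like a commit SHA rather than a branch name."""
    return _SHA_RE.fullmatch(ref) is not None


def check_build_args(svf_sha: str, svf_examples_sha: str,
                     allow_unpinned: bool = False) -> None:
    """Raise if either ref is floating and the override is not set."""
    refs = {"SVF_SHA": svf_sha, "SVF_EXAMPLES_SHA": svf_examples_sha}
    floating = [name for name, ref in refs.items() if not is_pinned(ref)]
    if floating and not allow_unpinned:
        raise ValueError("unpinned build args: " + ", ".join(floating))
```

With both defaults left at `master` and no override, this check fails, which mirrors the build failure described above.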


Comment thread worker/app/sandbox.py
Comment on lines +183 to +184
else:
    command = launcher_argv

P1: Fail closed instead of running unsandboxed

If the startup probe falls back to DEGRADED, /verify can still execute whenever ALLOW_DEGRADED_SANDBOX=true is set for local tests or emergency operation, but this branch runs the untrusted media tool directly as launcher_argv without bubblewrap filesystem/network isolation. Because allow_degraded_sandbox only gates the request handler and does not make these inputs trusted, any deployment that enables it silently loses the Tier 0 containment promised by this change; return a sandbox error here (or restrict degraded mode to non-user inputs) rather than executing the validator/ffprobe outside bwrap.
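
One way to express the suggested behavior is to make command selection refuse outright when bwrap is unavailable. This is a sketch under assumptions: `SandboxMode`, `select_command`, and `bwrap_prefix` are hypothetical names, and returning None (rather than raising) is chosen only to match the description's statement that run_sandboxed never raises.

```python
from enum import Enum
from typing import List, Optional


class SandboxMode(Enum):
    BWRAP = "bwrap"
    DEGRADED = "degraded"


def select_command(mode: SandboxMode,
                   bwrap_prefix: List[str],
                   launcher_argv: List[str]) -> Optional[List[str]]:
    """Return the argv to spawn, or None when no sandbox is available."""
    if mode is SandboxMode.BWRAP:
        return [*bwrap_prefix, *launcher_argv]
    # DEGRADED: do not fall back to running the tool outside bwrap.
    return None
```

The caller would turn a None result into a sandbox error for untrusted uploads, so enabling ALLOW_DEGRADED_SANDBOX could not silently drop filesystem and network isolation.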


Comment thread worker/app/main.py
Comment on lines +77 to +79
max_upload_bytes = min(settings.max_file_size_bytes, settings.sandbox_max_input_bytes)
if content_length is not None and content_length > max_upload_bytes:
    raise HTTPException(status_code=400, detail="Uploaded file is too large")

P2: Enforce the upload cap before multipart parsing

For oversized multipart uploads that omit Content-Length (or are sent through a proxy that strips it), this check runs only after FastAPI has already parsed the request into an UploadFile, so the body can be fully spooled to the server's temporary storage before _stream_upload_to_disk rejects it. That leaves the worker vulnerable to the large-upload disk exhaustion this cap is meant to prevent; enforce the limit in ASGI/proxy middleware before multipart parsing starts.
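
A pure ASGI middleware along these lines could count body bytes as they arrive, before any multipart parser sees them. This is a sketch, not a drop-in fix: `BodySizeLimitMiddleware` and `BodyTooLarge` are hypothetical names, and the 413 status is an assumption of the example.

```python
class BodyTooLarge(Exception):
    """Raised from receive() once the request body exceeds the cap."""


class BodySizeLimitMiddleware:
    def __init__(self, app, max_bytes: int):
        self.app = app
        self.max_bytes = max_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Fast path: reject on a declared Content-Length that is too large.
        headers = dict(scope.get("headers") or [])
        declared = headers.get(b"content-length")
        if declared is not None and declared.isdigit() and int(declared) > self.max_bytes:
            await self._reject(send)
            return

        received = 0
        response_started = False

        async def limited_receive():
            # Count bytes chunk by chunk so a missing header cannot bypass the cap.
            nonlocal received
            message = await receive()
            if message["type"] == "http.request":
                received += len(message.get("body", b""))
                if received > self.max_bytes:
                    raise BodyTooLarge()
            return message

        async def tracking_send(message):
            nonlocal response_started
            if message["type"] == "http.response.start":
                response_started = True
            await send(message)

        try:
            await self.app(scope, limited_receive, tracking_send)
        except BodyTooLarge:
            if response_started:
                raise  # too late to change the status line
            await self._reject(send)

    @staticmethod
    async def _reject(send):
        await send({"type": "http.response.start", "status": 413,
                    "headers": [(b"content-type", b"text/plain")]})
        await send({"type": "http.response.body", "body": b"Uploaded file is too large"})
```

In a full framework stack, an inner error handler may catch the exception and answer with a 500 before this middleware can send 413. The body is still bounded either way, since reading stops at the cap. A reverse-proxy body limit in front of the worker would be a complementary control.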


Replace floating `master` defaults with immutable commit SHAs (the
Dockerfile build guard now passes without ALLOW_UNPINNED_SVF):
- SVF_SHA          1ae9fed = signed-video-framework tag v2.3.5
                   (latest release; == current master HEAD)
- SVF_EXAMPLES_SHA e009c31 = signed-video-framework-examples master HEAD
                   (repo has no tags; links the system-installed lib so it
                   tracks v2.3.5)

Closes the supply-chain gap flagged in the Tier 0 review: the C binaries
that parse untrusted video are now fetched from fixed, reviewed commits
instead of a moving branch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>