diff --git a/architecture/sandbox-custom-containers.md b/architecture/sandbox-custom-containers.md index e44de29dd..1e4e2ab17 100644 --- a/architecture/sandbox-custom-containers.md +++ b/architecture/sandbox-custom-containers.md @@ -67,7 +67,7 @@ The server applies these transforms to every sandbox pod template (`sandbox/mod. 3. Overrides the agent container's `command` to `/opt/openshell/bin/openshell-sandbox`. 4. Sets `runAsUser: 0` so the supervisor has root privileges for namespace creation, proxy setup, and Landlock/seccomp. -These transforms apply to every generated pod template. +These transforms apply to every generated pod template. The image's Docker `ENTRYPOINT` is not the OpenShell startup path. To run image-specific startup logic on sandbox creation and pod restart, install a script at `/etc/openshell/boot.sh`; the supervisor launches it as a managed child process before starting the long-lived child process. ## CLI Usage @@ -97,7 +97,8 @@ openshell sandbox create --from ./my-sandbox/ # directory with Dockerfile The `openshell-sandbox` supervisor adapts to arbitrary environments: - **Log file fallback**: Attempts to open `/var/log/openshell.log` for append; silently falls back to stdout-only logging if the path is not writable. -- **Command resolution**: Executes the command from CLI args, then the `OPENSHELL_SANDBOX_COMMAND` env var (set to `sleep infinity` by the server), then `/bin/bash` as a last resort. +- **Boot hook**: Runs `/etc/openshell/boot.sh` with `/bin/sh` as a supervisor-managed child process on every pod start when that file exists. The script runs before the long-lived child process and must exit successfully for the sandbox to become ready. +- **Command resolution**: After the boot hook completes, executes the command from CLI args, then the `OPENSHELL_SANDBOX_COMMAND` env var (set to `sleep infinity` by the server), then `/bin/bash` as a last resort. - **Network namespace**: Requires successful namespace creation for proxy isolation; startup fails in proxy mode if required capabilities (`CAP_NET_ADMIN`, `CAP_SYS_ADMIN`) or `iproute2` are unavailable. If the `iptables` package is present, the supervisor installs OUTPUT chain rules (LOG + REJECT) inside the namespace to provide fast-fail behavior (immediate `ECONNREFUSED` instead of a 30-second timeout) and diagnostic logging when processes attempt direct connections that bypass the HTTP CONNECT proxy. If `iptables` is absent, the supervisor logs a warning and continues — core network isolation still works via routing. ## Design Decisions @@ -111,6 +112,7 @@ The `openshell-sandbox` supervisor adapts to arbitrary environments: | hostPath side-load | Supervisor binary lives on the node filesystem — no init container, no emptyDir, no extra image pull. Faster pod startup. | | Read-only mount in agent | Supervisor binary cannot be tampered with by the workload | | Command override | Ensures `openshell-sandbox` is the entrypoint regardless of the image's default CMD | +| `/etc/openshell/boot.sh` hook | Gives images a restart-safe startup hook even though Docker `ENTRYPOINT` is bypassed | | Clear `run_as_user/group` for custom images | Prevents startup failure when the image lacks the default `sandbox` user | | Non-fatal log file init | `/var/log/openshell.log` may be unwritable in arbitrary images; falls back to stdout | | `docker save` / `ctr import` for push | Avoids requiring a registry for local dev; images land directly in the k3s containerd store | diff --git a/architecture/sandbox.md b/architecture/sandbox.md index c5e212f85..3631a4a7e 100644 --- a/architecture/sandbox.md +++ b/architecture/sandbox.md @@ -65,8 +65,11 @@ flowchart TD L2 --> N{SSH enabled?} M --> N N -- Yes --> O[Spawn SSH server task] - N -- No --> P[Spawn child process] - O --> P + N -- No --> O1{`/etc/openshell/boot.sh`?} + O --> O1 + O1 -- Yes --> O2[Run boot hook] + O1 -- No --> P[Spawn child process] + O2 --> P P --> Q[Store entrypoint PID] Q --> R{gRPC mode?} R -- Yes --> T[Spawn policy poll task] @@ -111,16 +114,22 @@ flowchart TD 8. **SSH server** (optional): If `--ssh-listen-addr` is provided, spawn an async task running `ssh::run_ssh_server()` with the policy, workdir, netns FD, proxy URL, CA paths, and provider env. -9. **Child process spawning** (`ProcessHandle::spawn()`): +9. **Boot hook** (optional): If `/etc/openshell/boot.sh` exists, run it with `/bin/sh` as a supervisor-managed child process before spawning the long-lived child process. + - The hook runs through the normal child spawn path, not as in-process supervisor code. + - The hook runs with the same privilege drop, Landlock/seccomp policy, proxy environment, provider environment, TLS trust files, and network namespace setup as the normal child process. + - While the hook is running, its PID is temporarily published through `entrypoint_pid` so proxy identity binding can attribute any startup traffic correctly. + - A non-zero exit code fails sandbox startup so Kubernetes retries the pod. + +10. **Child process spawning** (`ProcessHandle::spawn()`): - Build `tokio::process::Command` with inherited stdio and `kill_on_drop(true)` - Set environment variables: `OPENSHELL_SANDBOX=1`, provider credentials, proxy URLs, TLS trust store paths - Pre-exec closure (async-signal-safe): `setpgid` (if non-interactive) -> `setns` (enter netns) -> `drop_privileges` -> `sandbox::apply` (Landlock + seccomp) -10. **Store entrypoint PID**: `entrypoint_pid.store(pid, Ordering::Release)` so the proxy can resolve TCP peer identity via `/proc`. +11. **Store entrypoint PID**: `entrypoint_pid.store(pid, Ordering::Release)` so the proxy can resolve TCP peer identity via `/proc`. -11. **Spawn policy poll task** (gRPC mode only): If `sandbox_id`, `openshell_endpoint`, and an OPA engine are all present, spawn `run_policy_poll_loop()` as a background tokio task. This task polls the gateway for policy updates and hot-reloads the OPA engine when a new version is detected. See [Policy Reload Lifecycle](#policy-reload-lifecycle) for details. +12. **Spawn policy poll task** (gRPC mode only): If `sandbox_id`, `openshell_endpoint`, and an OPA engine are all present, spawn `run_policy_poll_loop()` as a background tokio task. This task polls the gateway for policy updates and hot-reloads the OPA engine when a new version is detected. See [Policy Reload Lifecycle](#policy-reload-lifecycle) for details. -12. **Wait with timeout**: If `--timeout > 0`, wrap `handle.wait()` in `tokio::time::timeout()`. On timeout, kill the process and return exit code 124. +13. **Wait with timeout**: If `--timeout > 0`, wrap `handle.wait()` in `tokio::time::timeout()`. On timeout, kill the process and return exit code 124. ## Policy Model diff --git a/crates/openshell-policy/src/lib.rs b/crates/openshell-policy/src/lib.rs index 9cf543bdf..c02bde9d3 100644 --- a/crates/openshell-policy/src/lib.rs +++ b/crates/openshell-policy/src/lib.rs @@ -411,6 +411,12 @@ pub fn load_sandbox_policy(cli_path: Option<&str>) -> Result Option<(&'static str, Vec)> { + path.is_file() + .then(|| ("/bin/sh", vec![path.display().to_string()])) +} + +fn boot_hook_failed(path: &std::path::Path, exit_code: i32) -> miette::Report { + miette::miette!( + "Sandbox boot hook failed: {} exited with code {}", + path.display(), + exit_code + ) +} + +#[cfg(target_os = "linux")] +async fn run_boot_hook_at_path( + path: &std::path::Path, + workdir: Option<&str>, + policy: &SandboxPolicy, + netns: Option<&NetworkNamespace>, + ca_paths: Option<&(std::path::PathBuf, std::path::PathBuf)>, + provider_env: &std::collections::HashMap, + entrypoint_pid: Option<&AtomicU32>, +) -> Result { + let Some((program, args)) = boot_hook_command(path) else { + debug!(path = %path.display(), "No sandbox boot hook found"); + return Ok(false); + }; + + info!( + path = %path.display(), + "Running sandbox boot hook as supervisor-managed child process" + ); + let mut handle = ProcessHandle::spawn( + program, + &args, + workdir, + false, + policy, + netns, + ca_paths, + provider_env, + )?; + if let Some(pid_slot) = entrypoint_pid { + // Reuse the entrypoint PID slot while the boot hook runs so proxy + // identity binding can attribute startup traffic to this process. + pid_slot.store(handle.pid(), Ordering::Release); + } + + let wait_result = handle.wait().await; + if let Some(pid_slot) = entrypoint_pid { + pid_slot.store(0, Ordering::Release); + } + let status = wait_result.into_diagnostic()?; + + if status.code() != 0 { + return Err(boot_hook_failed(path, status.code())); + } + + info!( + path = %path.display(), + exit_code = status.code(), + "Sandbox boot hook completed" + ); + Ok(true) +} + +#[cfg(not(target_os = "linux"))] +async fn run_boot_hook_at_path( + path: &std::path::Path, + workdir: Option<&str>, + policy: &SandboxPolicy, + ca_paths: Option<&(std::path::PathBuf, std::path::PathBuf)>, + provider_env: &std::collections::HashMap, + entrypoint_pid: Option<&AtomicU32>, +) -> Result { + let Some((program, args)) = boot_hook_command(path) else { + debug!(path = %path.display(), "No sandbox boot hook found"); + return Ok(false); + }; + + info!( + path = %path.display(), + "Running sandbox boot hook as supervisor-managed child process" + ); + let mut handle = ProcessHandle::spawn( + program, + &args, + workdir, + false, + policy, + ca_paths, + provider_env, + )?; + if let Some(pid_slot) = entrypoint_pid { + pid_slot.store(handle.pid(), Ordering::Release); + } + + let wait_result = handle.wait().await; + if let Some(pid_slot) = entrypoint_pid { + pid_slot.store(0, Ordering::Release); + } + let status = wait_result.into_diagnostic()?; + + if status.code() != 0 { + return Err(boot_hook_failed(path, status.code())); + } + + info!( + path = %path.display(), + exit_code = status.code(), + "Sandbox boot hook completed" + ); + Ok(true) +} + /// Build an inference context for local routing, if route sources are available. /// /// Route sources (in priority order): @@ -1673,6 +1815,47 @@ mod tests { static ENV_LOCK: std::sync::LazyLock> = std::sync::LazyLock::new(|| std::sync::Mutex::new(())); + /// Test-only boot hook runner that exercises the boot hook logic + /// (command detection, process execution, PID tracking, error reporting) + /// without applying sandbox policies (seccomp, landlock, privilege + /// dropping). Sandbox enforcement has its own dedicated tests. + async fn run_test_boot_hook( + path: &std::path::Path, + entrypoint_pid: Option<&AtomicU32>, + ) -> Result { + let Some((program, args)) = boot_hook_command(path) else { + return Ok(false); + }; + + let workdir = path.parent(); + let mut cmd = tokio::process::Command::new(program); + cmd.args(&args) + .stdin(std::process::Stdio::null()) + .stdout(std::process::Stdio::piped()) + .stderr(std::process::Stdio::piped()); + if let Some(dir) = workdir { + cmd.current_dir(dir); + } + + let child = cmd.spawn().into_diagnostic()?; + let pid = child.id().unwrap_or(0); + if let Some(pid_slot) = entrypoint_pid { + pid_slot.store(pid, Ordering::Release); + } + + let output = child.wait_with_output().await.into_diagnostic()?; + if let Some(pid_slot) = entrypoint_pid { + pid_slot.store(0, Ordering::Release); + } + + let exit_code = output.status.code().unwrap_or(-1); + if exit_code != 0 { + return Err(boot_hook_failed(path, exit_code)); + } + + Ok(true) + } + #[test] fn bundle_to_resolved_routes_converts_all_fields() { let bundle = openshell_core::proto::GetInferenceBundleResponse { @@ -1923,6 +2106,85 @@ routes: )); } + #[tokio::test] + async fn boot_hook_missing_is_noop() { + let dir = tempfile::tempdir().unwrap(); + let path = dir.path().join("boot.sh"); + let pid = AtomicU32::new(99); + assert!(!run_test_boot_hook(&path, Some(&pid)).await.unwrap()); + assert_eq!( + pid.load(Ordering::Acquire), + 99, + "missing hook must not mutate entrypoint pid state" + ); + } + + #[test] + fn boot_hook_uses_shell_child_process_contract() { + let dir = tempfile::tempdir().unwrap(); + let path = dir.path().join("boot.sh"); + std::fs::write(&path, "#!/bin/sh\nexit 0\n").unwrap(); + + let (program, args) = boot_hook_command(&path).expect("boot hook should be detected"); + assert_eq!(program, "/bin/sh"); + assert_eq!(args, vec![path.display().to_string()]); + } + + #[tokio::test] + async fn boot_hook_runs_and_primes_following_child_process() { + let dir = tempfile::tempdir().unwrap(); + let marker = dir.path().join("booted"); + let boot = dir.path().join("boot.sh"); + std::fs::write(&boot, "#!/bin/sh\necho booted > booted\n").unwrap(); + + let pid = AtomicU32::new(0); + assert!(run_test_boot_hook(&boot, Some(&pid)).await.unwrap()); + assert_eq!( + pid.load(Ordering::Acquire), + 0, + "boot hook should clear temporary entrypoint pid after completion" + ); + + let output = tokio::process::Command::new("/bin/sh") + .args(["-c", "test -f booted"]) + .current_dir(dir.path()) + .output() + .await + .unwrap(); + assert!( + output.status.success(), + "following child must observe boot side effects" + ); + assert!(marker.exists()); + } + + #[tokio::test] + async fn boot_hook_runs_on_every_startup_call() { + let dir = tempfile::tempdir().unwrap(); + let counter = dir.path().join("counter"); + let boot = dir.path().join("boot.sh"); + std::fs::write(&boot, "#!/bin/sh\necho start >> counter\n").unwrap(); + + assert!(run_test_boot_hook(&boot, None).await.unwrap()); + assert!(run_test_boot_hook(&boot, None).await.unwrap()); + + let contents = std::fs::read_to_string(&counter).unwrap(); + assert_eq!(contents.lines().count(), 2); + } + + #[tokio::test] + async fn boot_hook_nonzero_exit_aborts_startup() { + let dir = tempfile::tempdir().unwrap(); + let boot = dir.path().join("boot.sh"); + std::fs::write(&boot, "#!/bin/sh\nexit 42\n").unwrap(); + + let err = run_test_boot_hook(&boot, None).await.unwrap_err(); + let message = err.to_string(); + assert!(message.contains("boot hook failed")); + assert!(message.contains(&boot.display().to_string())); + assert!(message.contains("42")); + } + // ---- Policy disk discovery tests ---- #[test] diff --git a/docs/sandboxes/community-sandboxes.md b/docs/sandboxes/community-sandboxes.md index 0e3df84bd..c26d26440 100644 --- a/docs/sandboxes/community-sandboxes.md +++ b/docs/sandboxes/community-sandboxes.md @@ -59,7 +59,7 @@ When you pass `--from` with a community sandbox name, the CLI: 1. Resolves the name against the [OpenShell Community](https://github.com/NVIDIA/OpenShell-Community) repository. -2. Pulls the Dockerfile, policy, skills, and any startup scripts. +2. Pulls the Dockerfile, policy, skills, and any boot-hook scripts. 3. Builds the container image locally. 4. Creates the sandbox with the bundled configuration applied. @@ -96,7 +96,7 @@ You can also include the following optional files: - `policy.yaml` that defines the default policy applied when the sandbox launches. - `skills/` that contains agent skill definitions bundled with the sandbox. -- Startup scripts that are any scripts the Dockerfile or entrypoint invokes. +- `boot.sh` or another startup script that your Dockerfile installs at `/etc/openshell/boot.sh`. OpenShell runs that script as a supervisor-managed child process on every sandbox pod start, including restarts. To contribute, fork the repository, add your sandbox directory, and open a pull request. Refer to the repository's