NVIDIA · drew · Apr 6, 2026 · Apr 7, 2026
@@ -67,7 +67,7 @@ The server applies these transforms to every sandbox pod template (`sandbox/mod.
 3. Overrides the agent container's `command` to `/opt/openshell/bin/openshell-sandbox`.
 4. Sets `runAsUser: 0` so the supervisor has root privileges for namespace creation, proxy setup, and Landlock/seccomp.
 
-These transforms apply to every generated pod template.
+These transforms apply to every generated pod template. Because the `command` override replaces the image's Docker `ENTRYPOINT`, the `ENTRYPOINT` never runs inside a sandbox. To run image-specific startup logic on sandbox creation and on every pod restart, install a script at `/etc/openshell/boot.sh`; the supervisor runs it as a managed child process before it starts the long-lived child process.
 
 ## CLI Usage
 
@@ -97,7 +97,8 @@ openshell sandbox create --from ./my-sandbox/  # directory with Dockerfile
 The `openshell-sandbox` supervisor adapts to arbitrary environments:
 
 - **Log file fallback**: Attempts to open `/var/log/openshell.log` for append; silently falls back to stdout-only logging if the path is not writable.
-- **Command resolution**: Executes the command from CLI args, then the `OPENSHELL_SANDBOX_COMMAND` env var (set to `sleep infinity` by the server), then `/bin/bash` as a last resort.
+- **Boot hook**: If `/etc/openshell/boot.sh` exists as a regular file, runs it with `/bin/sh` as a supervisor-managed child process on every pod start. The script is passed to `/bin/sh` as an argument, so it does not need the executable bit. It runs before the long-lived child process and must exit with code 0 for the sandbox to become ready.
+- **Command resolution**: After the boot hook completes, executes the command from CLI args, then the `OPENSHELL_SANDBOX_COMMAND` env var (set to `sleep infinity` by the server), then `/bin/bash` as a last resort.
 - **Network namespace**: Requires successful namespace creation for proxy isolation; startup fails in proxy mode if required capabilities (`CAP_NET_ADMIN`, `CAP_SYS_ADMIN`) or `iproute2` are unavailable. If the `iptables` package is present, the supervisor installs OUTPUT chain rules (LOG + REJECT) inside the namespace to provide fast-fail behavior (immediate `ECONNREFUSED` instead of a 30-second timeout) and diagnostic logging when processes attempt direct connections that bypass the HTTP CONNECT proxy. If `iptables` is absent, the supervisor logs a warning and continues — core network isolation still works via routing.
 
 ## Design Decisions
@@ -111,6 +112,7 @@ The `openshell-sandbox` supervisor adapts to arbitrary environments:
 | hostPath side-load | Supervisor binary lives on the node filesystem — no init container, no emptyDir, no extra image pull. Faster pod startup. |
 | Read-only mount in agent | Supervisor binary cannot be tampered with by the workload |
 | Command override | Ensures `openshell-sandbox` is the entrypoint regardless of the image's default CMD |
+| `/etc/openshell/boot.sh` hook | Gives images a restart-safe startup hook even though Docker `ENTRYPOINT` is bypassed |
 | Clear `run_as_user/group` for custom images | Prevents startup failure when the image lacks the default `sandbox` user |
 | Non-fatal log file init | `/var/log/openshell.log` may be unwritable in arbitrary images; falls back to stdout |
 | `docker save` / `ctr import` for push | Avoids requiring a registry for local dev; images land directly in the k3s containerd store |

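The startup order described in the supervisor notes above (boot hook first, then command resolution) can be sketched in a few lines. This is a minimal illustration, not the supervisor's code: `resolve_command` and `startup_plan` are hypothetical names, and splitting the environment value on whitespace is an assumption (the docs only state that the variable is set to `sleep infinity`).

```rust
/// Hypothetical helper: pick the long-lived command in the documented order
/// (CLI args, then `OPENSHELL_SANDBOX_COMMAND`, then `/bin/bash`).
fn resolve_command(cli_args: &[String], env_command: Option<&str>) -> Vec<String> {
    if !cli_args.is_empty() {
        return cli_args.to_vec();
    }
    // Assumption: the env value is split on whitespace; the real parser may differ.
    if let Some(raw) = env_command {
        let parts: Vec<String> = raw.split_whitespace().map(String::from).collect();
        if !parts.is_empty() {
            return parts;
        }
    }
    vec!["/bin/bash".to_string()]
}

/// Hypothetical helper: list the processes the supervisor would start, in order.
/// The boot hook, when present, always precedes the resolved command.
fn startup_plan(
    hook_present: bool,
    cli_args: &[String],
    env_command: Option<&str>,
) -> Vec<Vec<String>> {
    let mut plan = Vec::new();
    if hook_present {
        plan.push(vec![
            "/bin/sh".to_string(),
            "/etc/openshell/boot.sh".to_string(),
        ]);
    }
    plan.push(resolve_command(cli_args, env_command));
    plan
}
```

Calling `startup_plan(true, &[], Some("sleep infinity"))` yields two entries: the hook invocation, followed by the command taken from the environment value. Keeping the hook outside `resolve_command` mirrors the documented behavior that the hook never replaces the long-lived command.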
@@ -65,8 +65,11 @@ flowchart TD
     L2 --> N{SSH enabled?}
     M --> N
     N -- Yes --> O[Spawn SSH server task]
-    N -- No --> P[Spawn child process]
-    O --> P
+    N -- No --> O1{`/etc/openshell/boot.sh`?}
+    O --> O1
+    O1 -- Yes --> O2[Run boot hook]
+    O1 -- No --> P[Spawn child process]
+    O2 --> P
     P --> Q[Store entrypoint PID]
     Q --> R{gRPC mode?}
     R -- Yes --> T[Spawn policy poll task]
@@ -111,16 +114,22 @@ flowchart TD
 
 8. **SSH server** (optional): If `--ssh-listen-addr` is provided, spawn an async task running `ssh::run_ssh_server()` with the policy, workdir, netns FD, proxy URL, CA paths, and provider env.
 
-9. **Child process spawning** (`ProcessHandle::spawn()`):
+9. **Boot hook** (optional): If `/etc/openshell/boot.sh` exists, run it with `/bin/sh` as a supervisor-managed child process before spawning the long-lived child process.
+   - The hook runs through the normal child spawn path, not as in-process supervisor code.
+   - The hook runs with the same privilege drop, Landlock/seccomp policy, proxy environment, provider environment, TLS trust files, and network namespace setup as the normal child process.
+   - While the hook runs, its PID is published through `entrypoint_pid` so that proxy identity binding can attribute startup traffic to it; the slot is reset to 0 when the hook exits.
+   - A non-zero exit code fails sandbox startup so Kubernetes retries the pod.
+
+10. **Child process spawning** (`ProcessHandle::spawn()`):
    - Build `tokio::process::Command` with inherited stdio and `kill_on_drop(true)`
    - Set environment variables: `OPENSHELL_SANDBOX=1`, provider credentials, proxy URLs, TLS trust store paths
    - Pre-exec closure (async-signal-safe): `setpgid` (if non-interactive) -> `setns` (enter netns) -> `drop_privileges` -> `sandbox::apply` (Landlock + seccomp)
 
-10. **Store entrypoint PID**: `entrypoint_pid.store(pid, Ordering::Release)` so the proxy can resolve TCP peer identity via `/proc`.
+11. **Store entrypoint PID**: `entrypoint_pid.store(pid, Ordering::Release)` so the proxy can resolve TCP peer identity via `/proc`.
 
-11. **Spawn policy poll task** (gRPC mode only): If `sandbox_id`, `openshell_endpoint`, and an OPA engine are all present, spawn `run_policy_poll_loop()` as a background tokio task. This task polls the gateway for policy updates and hot-reloads the OPA engine when a new version is detected. See [Policy Reload Lifecycle](#policy-reload-lifecycle) for details.
+12. **Spawn policy poll task** (gRPC mode only): If `sandbox_id`, `openshell_endpoint`, and an OPA engine are all present, spawn `run_policy_poll_loop()` as a background tokio task. This task polls the gateway for policy updates and hot-reloads the OPA engine when a new version is detected. See [Policy Reload Lifecycle](#policy-reload-lifecycle) for details.
 
-12. **Wait with timeout**: If `--timeout > 0`, wrap `handle.wait()` in `tokio::time::timeout()`. On timeout, kill the process and return exit code 124.
+13. **Wait with timeout**: If `--timeout > 0`, wrap `handle.wait()` in `tokio::time::timeout()`. On timeout, kill the process and return exit code 124.
 
 ## Policy Model
 

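The PID handling and exit-code gate described in step 9 can be sketched as below. This is a simplified, synchronous illustration under stated assumptions: `PidSlotGuard`, `gate_on_exit`, and `run_hook_with_slot` are hypothetical names, the guard-on-drop pattern is a swapped-in alternative to explicit store and clear calls, and treating a missing exit code (for example, after a signal kill on Unix) as a failure is an assumption.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical RAII guard: publishes a PID in a shared slot and resets the
/// slot to 0 when dropped, including on early returns.
struct PidSlotGuard<'a> {
    slot: &'a AtomicU32,
}

impl<'a> PidSlotGuard<'a> {
    fn publish(slot: &'a AtomicU32, pid: u32) -> Self {
        slot.store(pid, Ordering::Release);
        Self { slot }
    }
}

impl Drop for PidSlotGuard<'_> {
    fn drop(&mut self) {
        self.slot.store(0, Ordering::Release);
    }
}

/// Hypothetical gate: only exit code 0 lets startup continue.
/// Assumption: a missing exit code counts as a failure.
fn gate_on_exit(code: Option<i32>) -> Result<(), String> {
    match code {
        Some(0) => Ok(()),
        Some(n) => Err(format!("boot hook exited with code {}", n)),
        None => Err("boot hook ended without an exit code".to_string()),
    }
}

/// Simulate one hook run: the slot holds the hook PID while `run` executes
/// and is cleared before the result is inspected.
fn run_hook_with_slot(
    slot: &AtomicU32,
    pid: u32,
    run: impl FnOnce() -> Option<i32>,
) -> Result<(), String> {
    let code = {
        let _guard = PidSlotGuard::publish(slot, pid);
        run()
    };
    gate_on_exit(code)
}
```

The closure passed as `run` stands in for waiting on the hook process. A guard makes the reset hard to forget when new early-return paths are added, at the cost of hiding the store to 0 inside `Drop`.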
@@ -411,6 +411,12 @@ pub fn load_sandbox_policy(cli_path: Option<&str>) -> Result<Option<SandboxPolic
 /// supervisor probes this path before falling back to the restrictive default.
 pub const CONTAINER_POLICY_PATH: &str = "/etc/openshell/policy.yaml";
 
+/// Well-known path where a sandbox container image can ship a startup script.
+///
+/// When present, the sandbox supervisor runs this script on every pod start
+/// before launching the long-lived child process.
+pub const CONTAINER_BOOT_SCRIPT_PATH: &str = "/etc/openshell/boot.sh";
+
 /// Legacy path used before the navigator → openshell rename.
 ///
 /// Existing community sandbox images still ship their policy at this path.
@@ -890,6 +896,11 @@ network_policies:
         assert_eq!(CONTAINER_POLICY_PATH, "/etc/openshell/policy.yaml");
     }
 
+    #[test]
+    fn container_boot_script_path_is_expected() {
+        assert_eq!(CONTAINER_BOOT_SCRIPT_PATH, "/etc/openshell/boot.sh");
+    }
+
     #[test]
     fn legacy_container_policy_path_is_expected() {
         assert_eq!(LEGACY_CONTAINER_POLICY_PATH, "/etc/navigator/policy.yaml");

@@ -541,6 +541,33 @@ pub async fn run_sandbox(
         }
     }
 
+    // Run the optional boot hook before the long-lived child starts. The hook
+    // is a managed child process spawned by the supervisor; it is neither
+    // in-process supervisor logic nor the image's ENTRYPOINT. A failing hook
+    // aborts startup through the `?` below.
+    #[cfg(target_os = "linux")]
+    let _boot_hook_ran = run_boot_hook_at_path(
+        std::path::Path::new(openshell_policy::CONTAINER_BOOT_SCRIPT_PATH),
+        workdir.as_deref(),
+        &policy,
+        netns.as_ref(),
+        ca_file_paths.as_ref(),
+        &provider_env,
+        Some(entrypoint_pid.as_ref()),
+    )
+    .await?;
+
+    #[cfg(not(target_os = "linux"))]
+    let _boot_hook_ran = run_boot_hook_at_path(
+        std::path::Path::new(openshell_policy::CONTAINER_BOOT_SCRIPT_PATH),
+        workdir.as_deref(),
+        &policy,
+        ca_file_paths.as_ref(),
+        &provider_env,
+        Some(entrypoint_pid.as_ref()),
+    )
+    .await?;
+
     #[cfg(target_os = "linux")]
     let mut handle = ProcessHandle::spawn(
         program,
@@ -640,6 +667,121 @@ pub async fn run_sandbox(
     Ok(status.code())
 }
 
+fn boot_hook_command(path: &std::path::Path) -> Option<(&'static str, Vec<String>)> {
+    path.is_file()
+        .then(|| ("/bin/sh", vec![path.display().to_string()]))
+}
+
+fn boot_hook_failed(path: &std::path::Path, exit_code: i32) -> miette::Report {
+    miette::miette!(
+        "Sandbox boot hook failed: {} exited with code {}",
+        path.display(),
+        exit_code
+    )
+}
+
+#[cfg(target_os = "linux")]
+async fn run_boot_hook_at_path(
+    path: &std::path::Path,
+    workdir: Option<&str>,
+    policy: &SandboxPolicy,
+    netns: Option<&NetworkNamespace>,
+    ca_paths: Option<&(std::path::PathBuf, std::path::PathBuf)>,
+    provider_env: &std::collections::HashMap<String, String>,
+    entrypoint_pid: Option<&AtomicU32>,
+) -> Result<bool> {
+    let Some((program, args)) = boot_hook_command(path) else {
+        debug!(path = %path.display(), "No sandbox boot hook found");
+        return Ok(false);
+    };
+
+    info!(
+        path = %path.display(),
+        "Running sandbox boot hook as supervisor-managed child process"
+    );
+    let mut handle = ProcessHandle::spawn(
+        program,
+        &args,
+        workdir,
+        false,
+        policy,
+        netns,
+        ca_paths,
+        provider_env,
+    )?;
+    if let Some(pid_slot) = entrypoint_pid {
+        // Reuse the entrypoint PID slot while the boot hook runs so proxy
+        // identity binding can attribute startup traffic to this process.
+        pid_slot.store(handle.pid(), Ordering::Release);
+    }
+
+    let wait_result = handle.wait().await;
+    if let Some(pid_slot) = entrypoint_pid {
+        pid_slot.store(0, Ordering::Release); // hook has exited; clear the slot
+    }
+    let status = wait_result.into_diagnostic()?;
+
+    if status.code() != 0 {
+        return Err(boot_hook_failed(path, status.code()));
+    }
+
+    info!(
+        path = %path.display(),
+        exit_code = status.code(),
+        "Sandbox boot hook completed"
+    );
+    Ok(true)
+}
+
+#[cfg(not(target_os = "linux"))]
+async fn run_boot_hook_at_path(
+    path: &std::path::Path,
+    workdir: Option<&str>,
+    policy: &SandboxPolicy,
+    ca_paths: Option<&(std::path::PathBuf, std::path::PathBuf)>,
+    provider_env: &std::collections::HashMap<String, String>,
+    entrypoint_pid: Option<&AtomicU32>,
+) -> Result<bool> {
+    let Some((program, args)) = boot_hook_command(path) else {
+        debug!(path = %path.display(), "No sandbox boot hook found");
+        return Ok(false);
+    };
+
+    info!(
+        path = %path.display(),
+        "Running sandbox boot hook as supervisor-managed child process"
+    );
+    let mut handle = ProcessHandle::spawn(
+        program,
+        &args,
+        workdir,
+        false,
+        policy,
+        ca_paths,
+        provider_env,
+    )?;
+    if let Some(pid_slot) = entrypoint_pid {
+        pid_slot.store(handle.pid(), Ordering::Release); // publish hook PID while it runs
+    }
+
+    let wait_result = handle.wait().await;
+    if let Some(pid_slot) = entrypoint_pid {
+        pid_slot.store(0, Ordering::Release); // hook has exited; clear the slot
+    }
+    let status = wait_result.into_diagnostic()?;
+
+    if status.code() != 0 {
+        return Err(boot_hook_failed(path, status.code()));
+    }
+
+    info!(
+        path = %path.display(),
+        exit_code = status.code(),
+        "Sandbox boot hook completed"
+    );
+    Ok(true)
+}
+
 /// Build an inference context for local routing, if route sources are available.
 ///
 /// Route sources (in priority order):
@@ -1673,6 +1815,47 @@ mod tests {
     static ENV_LOCK: std::sync::LazyLock<std::sync::Mutex<()>> =
         std::sync::LazyLock::new(|| std::sync::Mutex::new(()));
 
+    /// Test-only boot hook runner that exercises the boot hook logic
+    /// (command detection, process execution, PID tracking, error reporting)
+    /// without applying sandbox policies (seccomp, landlock, privilege
+    /// dropping). Sandbox enforcement has its own dedicated tests.
+    async fn run_test_boot_hook(
+        path: &std::path::Path,
+        entrypoint_pid: Option<&AtomicU32>,
+    ) -> Result<bool> {
+        let Some((program, args)) = boot_hook_command(path) else {
+            return Ok(false);
+        };
+
+        let workdir = path.parent();
+        let mut cmd = tokio::process::Command::new(program);
+        cmd.args(&args)
+            .stdin(std::process::Stdio::null())
+            .stdout(std::process::Stdio::piped())
+            .stderr(std::process::Stdio::piped());
+        if let Some(dir) = workdir {
+            cmd.current_dir(dir);
+        }
+
+        let child = cmd.spawn().into_diagnostic()?;
+        let pid = child.id().unwrap_or(0);
+        if let Some(pid_slot) = entrypoint_pid {
+            pid_slot.store(pid, Ordering::Release);
+        }
+
+        let output = child.wait_with_output().await.into_diagnostic()?;
+        if let Some(pid_slot) = entrypoint_pid {
+            pid_slot.store(0, Ordering::Release);
+        }
+
+        let exit_code = output.status.code().unwrap_or(-1);
+        if exit_code != 0 {
+            return Err(boot_hook_failed(path, exit_code));
+        }
+
+        Ok(true)
+    }
+
     #[test]
     fn bundle_to_resolved_routes_converts_all_fields() {
         let bundle = openshell_core::proto::GetInferenceBundleResponse {
@@ -1923,6 +2106,85 @@ routes:
         ));
     }
 
+    #[tokio::test]
+    async fn boot_hook_missing_is_noop() {
+        let dir = tempfile::tempdir().unwrap();
+        let path = dir.path().join("boot.sh");
+        let pid = AtomicU32::new(99);
+        assert!(!run_test_boot_hook(&path, Some(&pid)).await.unwrap());
+        assert_eq!(
+            pid.load(Ordering::Acquire),
+            99,
+            "missing hook must not mutate entrypoint pid state"
+        );
+    }
+
+    #[test]
+    fn boot_hook_uses_shell_child_process_contract() {
+        let dir = tempfile::tempdir().unwrap();
+        let path = dir.path().join("boot.sh");
+        std::fs::write(&path, "#!/bin/sh\nexit 0\n").unwrap();
+
+        let (program, args) = boot_hook_command(&path).expect("boot hook should be detected");
+        assert_eq!(program, "/bin/sh");
+        assert_eq!(args, vec![path.display().to_string()]);
+    }
+
+    #[tokio::test]
+    async fn boot_hook_runs_and_primes_following_child_process() {
+        let dir = tempfile::tempdir().unwrap();
+        let marker = dir.path().join("booted");
+        let boot = dir.path().join("boot.sh");
+        std::fs::write(&boot, "#!/bin/sh\necho booted > booted\n").unwrap();
+
+        let pid = AtomicU32::new(0);
+        assert!(run_test_boot_hook(&boot, Some(&pid)).await.unwrap());
+        assert_eq!(
+            pid.load(Ordering::Acquire),
+            0,
+            "boot hook should clear temporary entrypoint pid after completion"
+        );
+
+        let output = tokio::process::Command::new("/bin/sh")
+            .args(["-c", "test -f booted"])
+            .current_dir(dir.path())
+            .output()
+            .await
+            .unwrap();
+        assert!(
+            output.status.success(),
+            "following child must observe boot side effects"
+        );
+        assert!(marker.exists());
+    }
+
+    #[tokio::test]
+    async fn boot_hook_runs_on_every_startup_call() {
+        let dir = tempfile::tempdir().unwrap();
+        let counter = dir.path().join("counter");
+        let boot = dir.path().join("boot.sh");
+        std::fs::write(&boot, "#!/bin/sh\necho start >> counter\n").unwrap();
+
+        assert!(run_test_boot_hook(&boot, None).await.unwrap());
+        assert!(run_test_boot_hook(&boot, None).await.unwrap());
+
+        let contents = std::fs::read_to_string(&counter).unwrap();
+        assert_eq!(contents.lines().count(), 2);
+    }
+
+    #[tokio::test]
+    async fn boot_hook_nonzero_exit_aborts_startup() {
+        let dir = tempfile::tempdir().unwrap();
+        let boot = dir.path().join("boot.sh");
+        std::fs::write(&boot, "#!/bin/sh\nexit 42\n").unwrap();
+
+        let err = run_test_boot_hook(&boot, None).await.unwrap_err();
+        let message = err.to_string();
+        assert!(message.contains("boot hook failed"));
+        assert!(message.contains(&boot.display().to_string()));
+        assert!(message.contains("42"));
+    }
+
     // ---- Policy disk discovery tests ----
 
     #[test]