Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions architecture/sandbox-custom-containers.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ The server applies these transforms to every sandbox pod template (`sandbox/mod.
3. Overrides the agent container's `command` to `/opt/openshell/bin/openshell-sandbox`.
4. Sets `runAsUser: 0` so the supervisor has root privileges for namespace creation, proxy setup, and Landlock/seccomp.

These transforms apply to every generated pod template.
These transforms apply to every generated pod template. The image's Docker `ENTRYPOINT` is not the OpenShell startup path. To run image-specific startup logic on sandbox creation and pod restart, install a script at `/etc/openshell/boot.sh`; the supervisor launches it as a managed child process before starting the long-lived child process.

## CLI Usage

Expand Down Expand Up @@ -97,7 +97,8 @@ openshell sandbox create --from ./my-sandbox/ # directory with Dockerfile
The `openshell-sandbox` supervisor adapts to arbitrary environments:

- **Log file fallback**: Attempts to open `/var/log/openshell.log` for append; silently falls back to stdout-only logging if the path is not writable.
- **Command resolution**: Executes the command from CLI args, then the `OPENSHELL_SANDBOX_COMMAND` env var (set to `sleep infinity` by the server), then `/bin/bash` as a last resort.
- **Boot hook**: Runs `/etc/openshell/boot.sh` with `/bin/sh` as a supervisor-managed child process on every pod start when that file exists. The script runs before the long-lived child process and must exit successfully for the sandbox to become ready.
- **Command resolution**: After the boot hook completes, executes the command from CLI args, then the `OPENSHELL_SANDBOX_COMMAND` env var (set to `sleep infinity` by the server), then `/bin/bash` as a last resort.
- **Network namespace**: Requires successful namespace creation for proxy isolation; startup fails in proxy mode if required capabilities (`CAP_NET_ADMIN`, `CAP_SYS_ADMIN`) or `iproute2` are unavailable. If the `iptables` package is present, the supervisor installs OUTPUT chain rules (LOG + REJECT) inside the namespace to provide fast-fail behavior (immediate `ECONNREFUSED` instead of a 30-second timeout) and diagnostic logging when processes attempt direct connections that bypass the HTTP CONNECT proxy. If `iptables` is absent, the supervisor logs a warning and continues — core network isolation still works via routing.

## Design Decisions
Expand All @@ -111,6 +112,7 @@ The `openshell-sandbox` supervisor adapts to arbitrary environments:
| hostPath side-load | Supervisor binary lives on the node filesystem — no init container, no emptyDir, no extra image pull. Faster pod startup. |
| Read-only mount in agent | Supervisor binary cannot be tampered with by the workload |
| Command override | Ensures `openshell-sandbox` is the entrypoint regardless of the image's default CMD |
| `/etc/openshell/boot.sh` hook | Gives images a restart-safe startup hook even though Docker `ENTRYPOINT` is bypassed |
| Clear `run_as_user/group` for custom images | Prevents startup failure when the image lacks the default `sandbox` user |
| Non-fatal log file init | `/var/log/openshell.log` may be unwritable in arbitrary images; falls back to stdout |
| `docker save` / `ctr import` for push | Avoids requiring a registry for local dev; images land directly in the k3s containerd store |
Expand Down
21 changes: 15 additions & 6 deletions architecture/sandbox.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,8 +65,11 @@ flowchart TD
L2 --> N{SSH enabled?}
M --> N
N -- Yes --> O[Spawn SSH server task]
N -- No --> P[Spawn child process]
O --> P
N -- No --> O1{`/etc/openshell/boot.sh`?}
O --> O1
O1 -- Yes --> O2[Run boot hook]
O1 -- No --> P[Spawn child process]
O2 --> P
P --> Q[Store entrypoint PID]
Q --> R{gRPC mode?}
R -- Yes --> T[Spawn policy poll task]
Expand Down Expand Up @@ -111,16 +114,22 @@ flowchart TD

8. **SSH server** (optional): If `--ssh-listen-addr` is provided, spawn an async task running `ssh::run_ssh_server()` with the policy, workdir, netns FD, proxy URL, CA paths, and provider env.

9. **Child process spawning** (`ProcessHandle::spawn()`):
9. **Boot hook** (optional): If `/etc/openshell/boot.sh` exists, run it with `/bin/sh` as a supervisor-managed child process before spawning the long-lived child process.
- The hook runs through the normal child spawn path, not as in-process supervisor code.
- The hook runs with the same privilege drop, Landlock/seccomp policy, proxy environment, provider environment, TLS trust files, and network namespace setup as the normal child process.
- While the hook is running, its PID is temporarily published through `entrypoint_pid` so proxy identity binding can attribute any startup traffic correctly.
- A non-zero exit code fails sandbox startup so Kubernetes retries the pod.

10. **Child process spawning** (`ProcessHandle::spawn()`):
- Build `tokio::process::Command` with inherited stdio and `kill_on_drop(true)`
- Set environment variables: `OPENSHELL_SANDBOX=1`, provider credentials, proxy URLs, TLS trust store paths
- Pre-exec closure (async-signal-safe): `setpgid` (if non-interactive) -> `setns` (enter netns) -> `drop_privileges` -> `sandbox::apply` (Landlock + seccomp)

10. **Store entrypoint PID**: `entrypoint_pid.store(pid, Ordering::Release)` so the proxy can resolve TCP peer identity via `/proc`.
11. **Store entrypoint PID**: `entrypoint_pid.store(pid, Ordering::Release)` so the proxy can resolve TCP peer identity via `/proc`.

11. **Spawn policy poll task** (gRPC mode only): If `sandbox_id`, `openshell_endpoint`, and an OPA engine are all present, spawn `run_policy_poll_loop()` as a background tokio task. This task polls the gateway for policy updates and hot-reloads the OPA engine when a new version is detected. See [Policy Reload Lifecycle](#policy-reload-lifecycle) for details.
12. **Spawn policy poll task** (gRPC mode only): If `sandbox_id`, `openshell_endpoint`, and an OPA engine are all present, spawn `run_policy_poll_loop()` as a background tokio task. This task polls the gateway for policy updates and hot-reloads the OPA engine when a new version is detected. See [Policy Reload Lifecycle](#policy-reload-lifecycle) for details.

12. **Wait with timeout**: If `--timeout > 0`, wrap `handle.wait()` in `tokio::time::timeout()`. On timeout, kill the process and return exit code 124.
13. **Wait with timeout**: If `--timeout > 0`, wrap `handle.wait()` in `tokio::time::timeout()`. On timeout, kill the process and return exit code 124.

## Policy Model

Expand Down
11 changes: 11 additions & 0 deletions crates/openshell-policy/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -411,6 +411,12 @@ pub fn load_sandbox_policy(cli_path: Option<&str>) -> Result<Option<SandboxPolic
/// supervisor probes this path before falling back to the restrictive default.
pub const CONTAINER_POLICY_PATH: &str = "/etc/openshell/policy.yaml";

/// Well-known path where a sandbox container image can ship a startup script.
///
/// When present, the sandbox supervisor runs this script on every pod start
/// before launching the long-lived child process.
pub const CONTAINER_BOOT_SCRIPT_PATH: &str = "/etc/openshell/boot.sh";

/// Legacy path used before the navigator → openshell rename.
///
/// Existing community sandbox images still ship their policy at this path.
Expand Down Expand Up @@ -890,6 +896,11 @@ network_policies:
assert_eq!(CONTAINER_POLICY_PATH, "/etc/openshell/policy.yaml");
}

#[test]
fn container_boot_script_path_is_expected() {
assert_eq!(CONTAINER_BOOT_SCRIPT_PATH, "/etc/openshell/boot.sh");
}

#[test]
fn legacy_container_policy_path_is_expected() {
assert_eq!(LEGACY_CONTAINER_POLICY_PATH, "/etc/navigator/policy.yaml");
Expand Down
262 changes: 262 additions & 0 deletions crates/openshell-sandbox/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -541,6 +541,33 @@ pub async fn run_sandbox(
}
}

// Run the optional boot hook under supervisor control before the main
// long-lived child starts. This stays a managed child process launched by
// the supervisor, not in-process supervisor logic and not image ENTRYPOINT
// behavior.
#[cfg(target_os = "linux")]
let _boot_hook_ran = run_boot_hook_at_path(
std::path::Path::new(openshell_policy::CONTAINER_BOOT_SCRIPT_PATH),
workdir.as_deref(),
&policy,
netns.as_ref(),
ca_file_paths.as_ref(),
&provider_env,
Some(entrypoint_pid.as_ref()),
)
.await?;

#[cfg(not(target_os = "linux"))]
let _boot_hook_ran = run_boot_hook_at_path(
std::path::Path::new(openshell_policy::CONTAINER_BOOT_SCRIPT_PATH),
workdir.as_deref(),
&policy,
ca_file_paths.as_ref(),
&provider_env,
Some(entrypoint_pid.as_ref()),
)
.await?;

#[cfg(target_os = "linux")]
let mut handle = ProcessHandle::spawn(
program,
Expand Down Expand Up @@ -640,6 +667,121 @@ pub async fn run_sandbox(
Ok(status.code())
}

fn boot_hook_command(path: &std::path::Path) -> Option<(&'static str, Vec<String>)> {
path.is_file()
.then(|| ("/bin/sh", vec![path.display().to_string()]))
}

fn boot_hook_failed(path: &std::path::Path, exit_code: i32) -> miette::Report {
miette::miette!(
"Sandbox boot hook failed: {} exited with code {}",
path.display(),
exit_code
)
}

#[cfg(target_os = "linux")]
async fn run_boot_hook_at_path(
path: &std::path::Path,
workdir: Option<&str>,
policy: &SandboxPolicy,
netns: Option<&NetworkNamespace>,
ca_paths: Option<&(std::path::PathBuf, std::path::PathBuf)>,
provider_env: &std::collections::HashMap<String, String>,
entrypoint_pid: Option<&AtomicU32>,
) -> Result<bool> {
let Some((program, args)) = boot_hook_command(path) else {
debug!(path = %path.display(), "No sandbox boot hook found");
return Ok(false);
};

info!(
path = %path.display(),
"Running sandbox boot hook as supervisor-managed child process"
);
let mut handle = ProcessHandle::spawn(
program,
&args,
workdir,
false,
policy,
netns,
ca_paths,
provider_env,
)?;
if let Some(pid_slot) = entrypoint_pid {
// Reuse the entrypoint PID slot while the boot hook runs so proxy
// identity binding can attribute startup traffic to this process.
pid_slot.store(handle.pid(), Ordering::Release);
}

let wait_result = handle.wait().await;
if let Some(pid_slot) = entrypoint_pid {
pid_slot.store(0, Ordering::Release);
}
let status = wait_result.into_diagnostic()?;

if status.code() != 0 {
return Err(boot_hook_failed(path, status.code()));
}

info!(
path = %path.display(),
exit_code = status.code(),
"Sandbox boot hook completed"
);
Ok(true)
}

#[cfg(not(target_os = "linux"))]
async fn run_boot_hook_at_path(
path: &std::path::Path,
workdir: Option<&str>,
policy: &SandboxPolicy,
ca_paths: Option<&(std::path::PathBuf, std::path::PathBuf)>,
provider_env: &std::collections::HashMap<String, String>,
entrypoint_pid: Option<&AtomicU32>,
) -> Result<bool> {
let Some((program, args)) = boot_hook_command(path) else {
debug!(path = %path.display(), "No sandbox boot hook found");
return Ok(false);
};

info!(
path = %path.display(),
"Running sandbox boot hook as supervisor-managed child process"
);
let mut handle = ProcessHandle::spawn(
program,
&args,
workdir,
false,
policy,
ca_paths,
provider_env,
)?;
if let Some(pid_slot) = entrypoint_pid {
pid_slot.store(handle.pid(), Ordering::Release);
}

let wait_result = handle.wait().await;
if let Some(pid_slot) = entrypoint_pid {
pid_slot.store(0, Ordering::Release);
}
let status = wait_result.into_diagnostic()?;

if status.code() != 0 {
return Err(boot_hook_failed(path, status.code()));
}

info!(
path = %path.display(),
exit_code = status.code(),
"Sandbox boot hook completed"
);
Ok(true)
}

/// Build an inference context for local routing, if route sources are available.
///
/// Route sources (in priority order):
Expand Down Expand Up @@ -1673,6 +1815,47 @@ mod tests {
static ENV_LOCK: std::sync::LazyLock<std::sync::Mutex<()>> =
std::sync::LazyLock::new(|| std::sync::Mutex::new(()));

/// Test-only boot hook runner that exercises the boot hook logic
/// (command detection, process execution, PID tracking, error reporting)
/// without applying sandbox policies (seccomp, landlock, privilege
/// dropping). Sandbox enforcement has its own dedicated tests.
async fn run_test_boot_hook(
path: &std::path::Path,
entrypoint_pid: Option<&AtomicU32>,
) -> Result<bool> {
let Some((program, args)) = boot_hook_command(path) else {
return Ok(false);
};

let workdir = path.parent();
let mut cmd = tokio::process::Command::new(program);
cmd.args(&args)
.stdin(std::process::Stdio::null())
.stdout(std::process::Stdio::piped())
.stderr(std::process::Stdio::piped());
if let Some(dir) = workdir {
cmd.current_dir(dir);
}

let child = cmd.spawn().into_diagnostic()?;
let pid = child.id().unwrap_or(0);
if let Some(pid_slot) = entrypoint_pid {
pid_slot.store(pid, Ordering::Release);
}

let output = child.wait_with_output().await.into_diagnostic()?;
if let Some(pid_slot) = entrypoint_pid {
pid_slot.store(0, Ordering::Release);
}

let exit_code = output.status.code().unwrap_or(-1);
if exit_code != 0 {
return Err(boot_hook_failed(path, exit_code));
}

Ok(true)
}

#[test]
fn bundle_to_resolved_routes_converts_all_fields() {
let bundle = openshell_core::proto::GetInferenceBundleResponse {
Expand Down Expand Up @@ -1923,6 +2106,85 @@ routes:
));
}

#[tokio::test]
async fn boot_hook_missing_is_noop() {
let dir = tempfile::tempdir().unwrap();
let path = dir.path().join("boot.sh");
let pid = AtomicU32::new(99);
assert!(!run_test_boot_hook(&path, Some(&pid)).await.unwrap());
assert_eq!(
pid.load(Ordering::Acquire),
99,
"missing hook must not mutate entrypoint pid state"
);
}

#[test]
fn boot_hook_uses_shell_child_process_contract() {
let dir = tempfile::tempdir().unwrap();
let path = dir.path().join("boot.sh");
std::fs::write(&path, "#!/bin/sh\nexit 0\n").unwrap();

let (program, args) = boot_hook_command(&path).expect("boot hook should be detected");
assert_eq!(program, "/bin/sh");
assert_eq!(args, vec![path.display().to_string()]);
}

#[tokio::test]
async fn boot_hook_runs_and_primes_following_child_process() {
let dir = tempfile::tempdir().unwrap();
let marker = dir.path().join("booted");
let boot = dir.path().join("boot.sh");
std::fs::write(&boot, "#!/bin/sh\necho booted > booted\n").unwrap();

let pid = AtomicU32::new(0);
assert!(run_test_boot_hook(&boot, Some(&pid)).await.unwrap());
assert_eq!(
pid.load(Ordering::Acquire),
0,
"boot hook should clear temporary entrypoint pid after completion"
);

let output = tokio::process::Command::new("/bin/sh")
.args(["-c", "test -f booted"])
.current_dir(dir.path())
.output()
.await
.unwrap();
assert!(
output.status.success(),
"following child must observe boot side effects"
);
assert!(marker.exists());
}

#[tokio::test]
async fn boot_hook_runs_on_every_startup_call() {
let dir = tempfile::tempdir().unwrap();
let counter = dir.path().join("counter");
let boot = dir.path().join("boot.sh");
std::fs::write(&boot, "#!/bin/sh\necho start >> counter\n").unwrap();

assert!(run_test_boot_hook(&boot, None).await.unwrap());
assert!(run_test_boot_hook(&boot, None).await.unwrap());

let contents = std::fs::read_to_string(&counter).unwrap();
assert_eq!(contents.lines().count(), 2);
}

#[tokio::test]
async fn boot_hook_nonzero_exit_aborts_startup() {
let dir = tempfile::tempdir().unwrap();
let boot = dir.path().join("boot.sh");
std::fs::write(&boot, "#!/bin/sh\nexit 42\n").unwrap();

let err = run_test_boot_hook(&boot, None).await.unwrap_err();
let message = err.to_string();
assert!(message.contains("boot hook failed"));
assert!(message.contains(&boot.display().to_string()));
assert!(message.contains("42"));
}

// ---- Policy disk discovery tests ----

#[test]
Expand Down
Loading
Loading