
feat: add Sandbox abstraction for agent code execution environments #1968

Draft
agent-of-mkmeral wants to merge 1 commit into strands-agents:main from mkmeral:feat/sandbox-abstraction

Summary

Add the Sandbox interface that decouples tool logic from where code runs. Tools that need to execute code or access a filesystem receive a Sandbox instead of managing their own execution, enabling portability across local, Docker, and cloud environments.

Related: Design Doc PR #681

What Changed

New Files (4 source + 5 test)

| File | Lines | Purpose |
| --- | --- | --- |
| src/strands/sandbox/__init__.py | 22 | Module exports |
| src/strands/sandbox/base.py | 263 | Sandbox ABC with streaming AsyncGenerator interface, ExecutionResult, convenience methods (read_file, write_file, list_files, execute_code), auto-start lifecycle |
| src/strands/sandbox/local.py | 161 | LocalSandbox: native file I/O overrides, asyncio subprocess execution |
| src/strands/sandbox/docker.py | 354 | DockerSandbox: container lifecycle, stdin piping for safe file writes, docker exec streaming |
| tests/strands/sandbox/test_base.py | 239 | 22 tests for ABC, ExecutionResult, convenience methods, lifecycle, edge cases |
| tests/strands/sandbox/test_local.py | 201 | 17 tests for real command execution, file I/O, streaming, timeouts, code execution |
| tests/strands/sandbox/test_docker.py | 314 | 18 tests for Docker lifecycle, container management, file ops (all mocked) |
| tests/strands/sandbox/test_agent_sandbox.py | 49 | 6 tests for Agent integration |
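
To illustrate how convenience methods such as read_file can sit on top of a single abstract execute(), here is a minimal sketch. It is not the code in base.py: `MiniSandbox`, `CannedSandbox`, and `SketchResult` are hypothetical stand-ins, and the `cat`-based default is an assumption about how such a fallback might look.

```python
from __future__ import annotations

import asyncio
import shlex
from abc import ABC, abstractmethod
from collections.abc import AsyncGenerator
from dataclasses import dataclass


@dataclass
class SketchResult:
    """Hypothetical stand-in for ExecutionResult."""

    exit_code: int
    stdout: str


class MiniSandbox(ABC):
    """Hypothetical base class: one abstract method, helpers built on it."""

    @abstractmethod
    def execute(self, command: str) -> AsyncGenerator[str | SketchResult, None]:
        """Yield output chunks, then a final SketchResult."""

    async def read_file(self, path: str) -> str:
        # Default helper: shell out through execute(), quoting the path.
        final = None
        async for chunk in self.execute(f"cat {shlex.quote(path)}"):
            if isinstance(chunk, SketchResult):
                final = chunk
        if final is None or final.exit_code != 0:
            raise FileNotFoundError(path)
        return final.stdout


class CannedSandbox(MiniSandbox):
    """Test double that answers commands from a dict instead of running them."""

    def __init__(self, outputs: dict[str, str]) -> None:
        self.outputs = outputs

    async def execute(self, command: str) -> AsyncGenerator[str | SketchResult, None]:
        if command in self.outputs:
            text = self.outputs[command]
            yield text
            yield SketchResult(exit_code=0, stdout=text)
        else:
            yield SketchResult(exit_code=1, stdout="")
```

A backend with direct filesystem access can override read_file with native I/O, which is the approach the table attributes to LocalSandbox.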

Modified Files (2)

| File | Change |
| --- | --- |
| src/strands/__init__.py | Added exports for Sandbox, LocalSandbox, DockerSandbox, ExecutionResult |
| src/strands/agent/agent.py | Added sandbox parameter to Agent.__init__, defaults to LocalSandbox() |

Design Decisions

✅ IN core SDK

  • Sandbox ABC — the interface contract
  • LocalSandbox — zero-dependency host execution (default)
  • DockerSandbox — containerized execution
  • ExecutionResult dataclass
  • Agent integration (sandbox parameter)

❌ NOT in core SDK (for now)

  • AgentCoreSandbox — External dependency on bedrock-agentcore package. Should live in a separate package
  • Sandbox tools (run_command, python_tool, programmatic_tool_caller) — These are tools, not core abstractions. They belong in strands-agents/tools
  • CodeActPlugin — P2 in the design doc, not yet ready

Rationale: The core SDK should provide the minimal, zero-external-dependency abstraction. Tools and cloud-specific sandbox implementations have their own dependency graphs and release cycles.

Key Design Details

Streaming Interface

async for chunk in sandbox.execute("echo hello"):
    if isinstance(chunk, str):
        print(chunk, end="")  # stream output as it arrives
    elif isinstance(chunk, ExecutionResult):
        print(f"Exit: {chunk.exit_code}")  # final result
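
For the producer side of this contract, the sketch below shows one way an async generator could yield output lines as they arrive and finish with a result object. It assumes a POSIX shell and uses only the standard library; `stream_command`, `collect`, and `SketchResult` are hypothetical names, not the PR's implementation.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncGenerator
from dataclasses import dataclass


@dataclass
class SketchResult:
    """Hypothetical stand-in for ExecutionResult."""

    exit_code: int
    stdout: str


async def stream_command(command: str) -> AsyncGenerator[str | SketchResult, None]:
    """Run a shell command, yielding each output line, then a final result."""
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into the same stream
    )
    assert proc.stdout is not None
    lines: list[str] = []
    async for raw in proc.stdout:  # StreamReader iterates line by line
        line = raw.decode(errors="replace")
        lines.append(line)
        yield line
    exit_code = await proc.wait()
    yield SketchResult(exit_code=exit_code, stdout="".join(lines))


async def collect(command: str) -> tuple[list[str], SketchResult]:
    """Consume the stream, separating streamed lines from the final result."""
    chunks: list[str] = []
    final = None
    async for chunk in stream_command(command):
        if isinstance(chunk, SketchResult):
            final = chunk
        else:
            chunks.append(chunk)
    assert final is not None
    return chunks, final
```

Yielding the result as the last item lets callers use a single loop for both live output and the exit code, at the cost of an isinstance check per chunk.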

Agent Integration

from strands import Agent
from strands.sandbox import DockerSandbox

agent = Agent(sandbox=DockerSandbox(image="python:3.12-slim"))
# Tools access via: tool_context.agent.sandbox

Security

  • Heredoc injection → randomized secrets.token_hex(8) delimiter
  • Path injection → shlex.quote() for all paths
  • Shell escaping → shlex.quote() for code arguments
  • DockerSandbox → stdin piping instead of heredoc for file writes
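
The mitigations above can be sketched as follows. This is an illustration under assumptions, not the PR's code: `build_heredoc_write` and `write_via_stdin` are hypothetical helpers, the `EOF_` prefix is invented, and the stdin variant runs a local `sh` as an analogue of piping into a container.

```python
import secrets
import shlex
import subprocess


def build_heredoc_write(path: str, content: str) -> str:
    """Build a shell command that writes content to path via a heredoc."""
    # A random delimiter means the content cannot end the heredoc early
    # by containing a predictable marker such as "EOF".
    delimiter = f"EOF_{secrets.token_hex(8)}"
    while delimiter in content:  # vanishingly unlikely, but cheap to rule out
        delimiter = f"EOF_{secrets.token_hex(8)}"
    # Quoting the delimiter disables $(...) and backtick expansion in the body.
    # The resulting file is the content followed by one trailing newline.
    return f"cat > {shlex.quote(path)} << '{delimiter}'\n{content}\n{delimiter}\n"


def write_via_stdin(path: str, content: str) -> None:
    """Write content by piping it to the process's stdin.

    The content never becomes part of the command text, so only the path
    needs quoting.
    """
    subprocess.run(
        ["sh", "-c", f"cat > {shlex.quote(path)}"],
        input=content.encode(),
        check=True,
    )
```

The stdin variant writes the bytes exactly as given, while the heredoc variant always appends a newline, which is one practical reason to prefer piping where the transport allows it.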

Tests

  • 76 new tests, all passing
  • 1686 existing tests still passing, 0 regressions

What's Left (Outside This PR)

  1. AgentCoreSandbox — needs its own package with bedrock-agentcore dependency
  2. Sandbox tools — should go to strands-agents/tools
  3. CodeActPlugin — P2 per design doc
  4. Docker integration tests — need Docker in CI

cc @mkmeral

Add the Sandbox interface that decouples tool logic from where code runs.
Tools that need to execute code or access a filesystem receive a Sandbox
instead of managing their own execution, enabling portability across
local, Docker, and cloud environments.

Core components:
- Sandbox ABC with streaming AsyncGenerator interface (base.py)
- LocalSandbox for host-process execution via asyncio subprocesses (local.py)
- DockerSandbox for containerized execution via docker exec (docker.py)
- Agent integration: sandbox parameter on Agent.__init__, defaults to LocalSandbox

Key design decisions:
- Only core abstractions in SDK; AgentCoreSandbox and sandbox tools belong
  in separate packages (external dependencies, different release cycles)
- Streaming output via AsyncGenerator[str | ExecutionResult] - yields lines
  as they arrive, ExecutionResult as the final yield
- Security: randomized heredoc delimiters, shlex.quote for all paths,
  stdin piping in DockerSandbox to prevent injection
- Auto-start lifecycle: sandbox starts on first execute() call
- Zero external dependencies for core sandbox package
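
As an illustration of the auto-start lifecycle listed above, here is a minimal sketch in which the first execute() call triggers start(). `LazySandbox` and its attributes are hypothetical, and the sketch deliberately ignores concurrent first calls, which a real implementation would need to guard.

```python
import asyncio


class LazySandbox:
    """Hypothetical sandbox that starts itself on first use."""

    def __init__(self) -> None:
        self.start_calls = 0  # counts how often start() actually ran
        self._started = False

    async def start(self) -> None:
        # A real backend would launch a container or session here.
        self.start_calls += 1
        self._started = True

    async def stop(self) -> None:
        self._started = False

    async def execute(self, command: str) -> str:
        # Auto-start: callers never have to call start() explicitly.
        if not self._started:
            await self.start()
        return f"ran: {command}"


async def demo() -> int:
    sandbox = LazySandbox()
    await sandbox.execute("echo one")
    await sandbox.execute("echo two")
    return sandbox.start_calls  # start() ran once for two executes
```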

Tests: 76 new tests (base: 22, local: 17, docker: 18, agent: 6, + shared)
All 76 sandbox tests passing, 1686 existing tests still passing.