A small library that handles mundane, frequently needed tasks in software engineering.
pip install -e .
- Docker Helper – build images, run containers, execute commands, and move files
- Evaluation
- File Helper
- Subprocess Helper
Create a new Docker helper instance.
from se_helpers.docker_helper.docker_helper import DockerHelper
container = DockerHelper()
Each instance manages exactly one active container.
Build a Docker image and store raw build logs.
container.build_container(
    context_path=".",
    dockerfile="path/to/Dockerfile",
    tag="my_image_tag",
    log_path="build.log",
)
- `context_path` – Docker build context
- `dockerfile` – Path to the Dockerfile
- `tag` – Image tag name
- `platform` (optional) – Target platform (default: linux/amd64)
- `log_path` – File path to store full JSON build logs
- Streams Docker build output
- Writes all raw build events to `log_path` (TODO: check with logfile)
- Logs human-readable output via `logging`
Run a container with an arbitrary number of bind mounts.
container.run_container(
    image="my_image_tag",
    command="sleep infinity",
    mounts={
        "/host/path": "/container/path",
    },
    mode="rw",
)
- `image` – Docker image tag
- `command` – Command executed as PID 1
- `mounts` – Mapping of host_path → container_path
- `mode` – Mount mode (`"rw"` or `"ro"`)
- The container must stay alive for `exec()` to work (`sleep infinity` is the recommended default)
- `command` overrides the image’s `CMD`
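As a rough illustration of how the `mounts` mapping and `mode` could translate into a bind-mount specification, the sketch below builds the `volumes` dictionary shape accepted by the Docker SDK for Python. Whether `DockerHelper` does this internally is an assumption, and `to_volume_spec` is a hypothetical helper that is not part of this library.

```python
from __future__ import annotations


def to_volume_spec(mounts: dict[str, str], mode: str = "rw") -> dict[str, dict[str, str]]:
    """Convert {host_path: container_path} into a Docker SDK style volumes dict."""
    if mode not in ("rw", "ro"):
        raise ValueError(f"mode must be 'rw' or 'ro', got {mode!r}")
    # Every mount shares the same mode, mirroring the single `mode` argument.
    return {
        host_path: {"bind": container_path, "mode": mode}
        for host_path, container_path in mounts.items()
    }


spec = to_volume_spec({"/host/path": "/container/path"}, mode="ro")
```

Because one `mode` applies to every entry, mixing read-only and read-write mounts in a single call would need a different argument shape.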
Execute a shell command inside the running container.
result = container.exec("ls /app/data")
Returns:
{
    "cmd": "...",
    "exit_code": int,
    "stdout": str,
    "stderr": str,
}
- Output is logged automatically
- Raises an error if no container is running
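The result shape above suggests that a failing command is reported through `exit_code` rather than through an exception (this is an assumption; only the missing-container case is documented as raising). A caller that prefers exceptions could wrap the result along these lines. `ensure_success` is a hypothetical helper, and the example dictionary is hand-written in the documented shape.

```python
def ensure_success(result: dict) -> str:
    """Return stdout if the command succeeded, otherwise raise RuntimeError."""
    if result["exit_code"] != 0:
        raise RuntimeError(
            f"command {result['cmd']!r} failed with exit code "
            f"{result['exit_code']}: {result['stderr'].strip()}"
        )
    return result["stdout"]


# Hand-written result in the documented shape, standing in for container.exec(...).
example = {"cmd": "ls /app/data", "exit_code": 0, "stdout": "a.txt\n", "stderr": ""}
listing = ensure_success(example)
```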
Copy a single file from the container to the host.
container.copy_file_from_container(
    "/app/output/result.txt",
    "./result.txt",
)
Stop or remove the active container.
container.stop_container()
- `remove=True` (default): force-remove the container
- `remove=False`: stop without removing

Always call this when finished.
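One way to make sure cleanup always happens is a small context manager around the helper. This is a sketch, not something the library ships: `running_container` is a hypothetical name, and it only relies on the `run_container()` and `stop_container()` methods described above.

```python
from contextlib import contextmanager


@contextmanager
def running_container(helper, **run_kwargs):
    """Start a container and guarantee stop_container() runs, even on errors."""
    helper.run_container(**run_kwargs)
    try:
        yield helper
    finally:
        # Runs on normal exit and when the body raises.
        helper.stop_container()
```

Usage would look like `with running_container(DockerHelper(), image="my_image_tag", command="sleep infinity") as c:` followed by calls to `c.exec(...)` in the body.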
This example demonstrates how to:
- Build a Docker image
- Run a long-lived container
- Mount a host directory
- Execute a command inside the container
- Capture command output using Python’s built-in `logging`
from pathlib import Path
import logging

from se_helpers.docker_helper.docker_helper import DockerHelper

path_log_file = Path("docker_helper.log")
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    filename=path_log_file,
    filemode="w",
    force=True,
)

container = DockerHelper()
container.build_container(
    context_path=".",
    dockerfile="dockerfile_ubuntu",
    tag="my_ubuntu_image",
    log_path="build.log",
)
container.run_container(
    image="my_ubuntu_image",
    command="sleep infinity",
    mounts={
        "/host/data": "/app/data",
    },
)
container.exec("ls /app/data")
container.stop_container()
- All Docker build output is written to `build.log`
- All container command output (stdout / stderr) is emitted via `logging`

Because logging is configured before using `DockerHelper`, output from:
container.exec("ls /app/data")
is written to:
docker_helper.log
This makes the helper especially suitable for:
- automated tests
- experiment tracking
- CI pipelines
- reproducible evaluations
Iteration Tracker is a minimal, project-agnostic library for tracking repair loop iterations in automatic code generation experiments.
- Automatic code repair loops
- Experiment: Logical grouping (e.g. dataset or benchmark name)
- Task ID: Identifier for the evaluated task or problem
- Attempt ID: Identifier for a single attempt/run on a task
- Iteration: One step of the evaluation or generation loop
from se_helpers.evaluation.iteration_tracker import IterationTracker
tracker = IterationTracker(
    experiment="code_repair",
    task_id="problem_42",
    attempt_id="attempt_0",
    max_iterations=10,
    raise_error=False,
    metadata={"model": "gpt-4"},
)

- `experiment` (str): Name of the experiment or benchmark
- `task_id` (str): Identifier of the task being solved
- `attempt_id` (str): Identifier of the current attempt
- `max_iterations` (int): Maximum number of allowed iterations
- `raise_error` (bool): Whether to raise exceptions on terminal states
- `metadata` (dict, optional): Arbitrary metadata attached to the attempt
Call step() once per iteration in your loop:
while True:
    tracker.step()
    result = run_solver_step()
    if result.success:
        tracker.success()
        break
- Each call to `step()` increments the iteration counter
- When `max_iterations` is reached:
  - The attempt is marked as `MAX_ITERATIONS`
  - A critical log message is emitted
  - `StopIteration` is raised if `raise_error=True`
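The behaviour listed above can be sketched with a small stand-in class. This is not the library's implementation: `SketchTracker` is hypothetical, and it assumes the limit triggers on the `step()` call that would exceed `max_iterations`, which the description does not pin down.

```python
import logging

logger = logging.getLogger("iteration_sketch")


class SketchTracker:
    """Stand-in that mimics the documented step() behaviour (not the real class)."""

    def __init__(self, max_iterations: int, raise_error: bool = False):
        self.max_iterations = max_iterations
        self.raise_error = raise_error
        self.iteration = 0
        self.outcome = None  # None means the attempt is still running

    def step(self) -> None:
        # Assumption: the limit triggers on the call that would exceed it.
        if self.iteration >= self.max_iterations:
            if self.outcome is None:
                self.outcome = "MAX_ITERATIONS"
                logger.critical("max_iterations=%d reached", self.max_iterations)
            if self.raise_error:
                raise StopIteration("max_iterations reached")
            return
        self.iteration += 1

    def success(self) -> None:
        if self.outcome is None:  # only if not already finished
            self.outcome = "SUCCESS"


tracker = SketchTracker(max_iterations=3, raise_error=True)
try:
    while True:
        tracker.step()
        # A real loop would run one solver step here and call success() on a fix.
except StopIteration:
    pass  # the loop ends once the iteration budget is used up
```

With `raise_error=False` nothing interrupts a `while True` loop in this sketch, so the caller would need its own exit condition.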
tracker.success()
Marks the attempt as successfully completed (only if not already finished).
tracker.abort()
Marks the attempt as aborted (only if not already finished). Once an outcome is set, the attempt is considered finished.
The tracked attempt can be written to disk after completion using a JSON Lines (.jsonl) format:
from pathlib import Path
from se_helpers.evaluation.iteration_tracker import IterationTracker, write_jsonl
tracker = IterationTracker(
    experiment="code_repair",
    task_id="problem_42",
    attempt_id="attempt_0",
    max_iterations=10,
    raise_error=False,
    metadata={"model": "gpt-4"},
)
tracker.success()
output_file = Path("results.jsonl")
write_jsonl(tracker.record, output_file)
TODO
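For readers unfamiliar with the JSON Lines format used by the tracker output, the sketch below shows the general idea: one JSON object per line, appended as attempts finish and read back line by line. `append_jsonl` and `read_jsonl` are hypothetical helpers, and the record fields are invented for illustration; the actual layout of `tracker.record` is not specified here.

```python
import json
import tempfile
from pathlib import Path


def append_jsonl(record: dict, path: Path) -> None:
    """Append one record as a single JSON line."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_jsonl(path: Path) -> list:
    """Read every non-empty line back into a list of dicts."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "results.jsonl"
    # Field names below are illustrative, not the library's record schema.
    append_jsonl({"task_id": "problem_42", "outcome": "SUCCESS"}, out)
    append_jsonl({"task_id": "problem_43", "outcome": "MAX_ITERATIONS"}, out)
    records = read_jsonl(out)
```

Appending keeps earlier attempts intact, which makes it straightforward to collect many attempts in one file.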
The Subprocess Helper provides small, explicit wrappers around `subprocess.Popen` for running external commands in a blocking or background fashion, with optional shell execution and logfile streaming. The helpers are designed to:
- Reduce boilerplate
- Make blocking vs non-blocking behavior explicit
- Clearly define output ownership and lifecycle responsibility
Runs a command synchronously and waits for completion.
- Captures stdout and stderr
- Returns execution metadata
result = run_blocking("echo", ["hello"])
print(result["stdout"])
Returns:
{
    "stdout": str,
    "stderr": str,
    "returncode": int,
    "pid": int,
}
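A blocking wrapper with this return shape could be built on `subprocess.Popen` roughly as follows. This is a sketch under assumptions, not the library's code: `run_blocking_sketch` is a hypothetical name, and text-mode output is assumed because the documented values are strings.

```python
import subprocess
import sys


def run_blocking_sketch(command: str, args: list) -> dict:
    """Run a command, wait for it, and return the documented result keys."""
    proc = subprocess.Popen(
        [command, *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,  # decode output to str
    )
    stdout, stderr = proc.communicate()  # blocks until the process exits
    return {
        "stdout": stdout,
        "stderr": stderr,
        "returncode": proc.returncode,
        "pid": proc.pid,
    }


# sys.executable is used so the example runs wherever Python itself runs.
result = run_blocking_sketch(sys.executable, ["-c", "print('hello')"])
```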
Runs a command synchronously with optional shell interpretation. Use this when shell features such as pipes or redirects are required.
result = run_blocking_shell("echo hello | tr a-z A-Z", shell=True)
Launches a subprocess asynchronously (non-blocking).
- Output is piped
- Caller is responsible for calling `wait()` or `communicate()`
proc = run_background("sleep", ["1"])
proc.wait()
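When output is piped, `communicate()` is usually the safer choice, since `wait()` can deadlock if the child fills a pipe buffer that nobody reads. The sketch below uses `subprocess.Popen` directly as a stand-in for the object returned by `run_background`; the 30-second timeout is an arbitrary illustrative value.

```python
import subprocess
import sys

# Stand-in for the Popen object a background helper would return.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('done')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)
try:
    # Drains both pipes while waiting, so the child cannot block on a full buffer.
    stdout, stderr = proc.communicate(timeout=30)
except subprocess.TimeoutExpired:
    proc.kill()
    stdout, stderr = proc.communicate()  # collect whatever was written
```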
Launches a subprocess asynchronously and streams output directly to a logfile.
- `stdout` and `stderr` are merged
- Logfile handle is attached to the process as `proc.logfile`
- Caller must close the logfile when finished
proc = run_background_with_log(
    "echo",
    ["logged output"],
    logfile_path="./out/run.log",
)
proc.wait()
proc.logfile.close()
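The ownership rule for the logfile can be illustrated with a minimal sketch of how such a helper might work. `run_background_with_log_sketch` is hypothetical and makes assumptions the text does not confirm, such as creating the parent directory and opening the file in write mode.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_background_with_log_sketch(command, args, logfile_path):
    """Start a process whose merged stdout/stderr go straight to a logfile."""
    path = Path(logfile_path)
    path.parent.mkdir(parents=True, exist_ok=True)  # assumption: create ./out if missing
    logfile = path.open("w", encoding="utf-8")
    try:
        proc = subprocess.Popen(
            [command, *args],
            stdout=logfile,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
        )
    except Exception:
        logfile.close()  # do not leak the handle if the launch fails
        raise
    proc.logfile = logfile  # caller owns this handle and must close it
    return proc


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "out" / "run.log"
    proc = run_background_with_log_sketch(
        sys.executable, ["-c", "print('logged output')"], log_path
    )
    try:
        proc.wait()
    finally:
        proc.logfile.close()  # close even if wait() is interrupted
    logged = log_path.read_text(encoding="utf-8")
```

Wrapping `wait()` in `try`/`finally` keeps the handle from leaking when the wait is interrupted.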
- All background helpers return a live `subprocess.Popen` object
- No timeouts or retries are applied automatically
- Shell execution should be used carefully with untrusted input
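When a shell command has to include a value that comes from outside the program, quoting it with the standard library's `shlex.quote` is one common precaution on POSIX shells. The file name below is an invented example of hostile input.

```python
import shlex

# A value that would run a second command if pasted into a shell string unquoted.
untrusted = "report.txt; rm -rf /"

# shlex.quote wraps the value so the shell treats it as a single argument.
command = f"cat {shlex.quote(untrusted)} | wc -l"
```

The resulting string could then be passed to a shell-based helper. `shlex.quote` targets POSIX shells only, so it does not make commands safe for `cmd.exe`.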