Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -67,3 +67,4 @@ coverage/
ovmf/

**/target/*
scripts/__pycache__
113 changes: 40 additions & 73 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,83 +16,70 @@ docs/planning/ # Numbered phase directories (00-15)

## Build & Run

## 🚨 CRITICAL: QEMU MUST RUN INSIDE DOCKER ONLY 🚨
QEMU runs natively for both x86-64 and ARM64 architectures.

**NEVER run QEMU directly on the host.** This is an absolute, inviolable requirement.
### Interactive Mode

Running QEMU directly on macOS causes **system-wide instability**:
- Hypervisor.framework resource leaks when QEMU is killed
- GPU driver destabilization affecting ALL GPU-accelerated apps
- Crashes in terminals (Ghostty), browsers (Chrome/Brave), and Electron apps (Slack)
- Memory pressure cascades from orphaned QEMU processes
Use `./run.sh` for interactive development with graphical display:

**The ONLY acceptable ways to run QEMU:**
1. `./docker/qemu/run-boot-parallel.sh N` - Run N parallel boot tests
2. `./docker/qemu/run-kthread-parallel.sh N` - Run N parallel kthread tests
3. `./docker/qemu/run-kthread-test.sh` - Run single kthread test
4. `./docker/qemu/run-dns-test.sh` - Run DNS resolution test
5. `./docker/qemu/run-keyboard-test.sh` - Run keyboard input test
6. `./docker/qemu/run-interactive.sh` - Interactive session with VNC display

**For interactive use (framebuffer + keyboard):**
```bash
./docker/qemu/run-interactive.sh
# Then connect with TigerVNC:
open '/Applications/TigerVNC Viewer 1.15.0.app'
# Enter: localhost:5900
./run.sh # ARM64 with native cocoa display (default)
./run.sh --x86 # x86_64 with VNC display
./run.sh --headless # ARM64 serial output only
./run.sh --x86 --headless # x86_64 serial output only
```

**PROHIBITED commands (will destabilize the host system):**
```bash
# ❌ NEVER DO THIS:
cargo run -p xtask -- boot-stages
cargo run -p xtask -- dns-test
cargo run --release --bin qemu-uefi
./breenix-gdb-chat/scripts/gdb_session.sh start
qemu-system-x86_64 ...
```
**Display:**
- ARM64: Native window (cocoa) - no VNC needed
- x86_64: VNC at localhost:5900 (use TigerVNC to connect)

### One-Time Docker Setup
### Test Scripts

Before running any tests, build the Docker image:
```bash
cd docker/qemu
docker build -t breenix-qemu .
```
Test scripts are located in `docker/qemu/`:

**x86-64:**
- `./docker/qemu/run-boot-parallel.sh N` - Run N parallel boot tests
- `./docker/qemu/run-kthread-parallel.sh N` - Run N parallel kthread tests
- `./docker/qemu/run-kthread-test.sh` - Run single kthread test
- `./docker/qemu/run-dns-test.sh` - Run DNS resolution test
- `./docker/qemu/run-keyboard-test.sh` - Run keyboard input test
- `./docker/qemu/run-interactive.sh` - Interactive session with VNC display

### Standard Workflow: Docker-Based Testing
**ARM64:**
- `./docker/qemu/run-aarch64-boot-test-native.sh` - Native ARM64 boot test
- `./docker/qemu/run-aarch64-boot-test-strict.sh` - Strict ARM64 boot test

For normal development, use Docker-based boot tests:
### Standard Workflow

```bash
# Build the kernel first
# Build the kernel first (x86-64)
cargo build --release --features testing,external_test_bins --bin qemu-uefi

# Run boot tests in Docker (isolated, safe)
# Run boot tests
./docker/qemu/run-boot-parallel.sh 1

# Run multiple parallel tests for stress testing
./docker/qemu/run-boot-parallel.sh 5

# Build ARM64 kernel
cargo build --release --target aarch64-breenix.json -Z build-std=core,alloc -Z build-std-features=compiler-builtins-mem -p kernel --bin kernel-aarch64

# Run ARM64 boot test
./docker/qemu/run-aarch64-boot-test-native.sh
```

For kthread-focused testing:
```bash
# Build kthread-only kernel
cargo build --release --features kthread_test_only --bin qemu-uefi

# Run kthread tests in Docker
# Run kthread tests
./docker/qemu/run-kthread-parallel.sh 1
```

**Why Docker is mandatory:**
- Complete isolation from host Hypervisor.framework
- No GPU driver interaction (uses software rendering)
- Clean container lifecycle - no orphaned processes
- No risk of destabilizing user's system
- Containers auto-cleanup with `--rm`

### Logs
Docker test output goes to `/tmp/breenix_boot_N/` or `/tmp/breenix_kthread_N/`:

Test output goes to `/tmp/breenix_boot_N/` or `/tmp/breenix_kthread_N/`:
```bash
# View kernel log from test 1
cat /tmp/breenix_boot_1/serial_kernel.txt
Expand All @@ -101,14 +88,9 @@ cat /tmp/breenix_boot_1/serial_kernel.txt
cat /tmp/breenix_boot_1/serial_user.txt
```

### GDB Debugging - DISABLED

GDB debugging requires native QEMU which is prohibited. For debugging:
1. Add strategic `log::info!()` statements
2. Run in Docker and examine serial output
3. Use QEMU's `-d` flags for CPU/interrupt tracing (inside Docker)
### GDB Debugging

If you absolutely must use GDB for a critical issue, **ask the user first** and explain why Docker-based debugging is insufficient.
GDB debugging is fully supported. See the "GDB-Only Kernel Debugging" section below for details.

## Development Workflow

Expand Down Expand Up @@ -617,30 +599,15 @@ log::debug!("clock_gettime called"); # This changes timing!
./breenix-gdb-chat/scripts/gdb_session.sh stop
```

## Docker Container Cleanup
## QEMU Process Cleanup

**Docker containers auto-cleanup with `--rm`, but if needed:**

```bash
# Kill any running breenix-qemu containers
docker kill $(docker ps -q --filter ancestor=breenix-qemu) 2>/dev/null || true

# Verify no containers running
docker ps --filter ancestor=breenix-qemu
```

### If You See Orphaned Host QEMU Processes

If you see `qemu-system-x86_64` processes running directly on the host (not in Docker), this means someone violated the Docker-only rule. Clean them up:
If you need to clean up orphaned QEMU processes:

```bash
pkill -9 qemu-system-x86_64 2>/dev/null; pgrep -l qemu || echo "All QEMU processes killed"
pkill -9 qemu-system-aarch64 2>/dev/null; pgrep -l qemu || echo "All QEMU processes killed"
```

**Then investigate how they got there** - all QEMU execution must go through Docker.

This is the agent's responsibility - do not wait for the user to ask.

## Work Tracking

We use Beads (bd) instead of Markdown for issue tracking. Run `bd quickstart` to get started.
20 changes: 0 additions & 20 deletions docker/qemu/Dockerfile.aarch64

This file was deleted.

119 changes: 119 additions & 0 deletions docker/qemu/run-aarch64-boot-test-native.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
#!/bin/bash
# Native ARM64 boot test (runs QEMU directly on host)
# Much faster than Docker version but only works on macOS ARM64
#
# The retry mechanism provides robustness for local testing against
# transient host resource contention. If retries are frequently needed,
# investigate for potential regressions.
#
# Usage: ./run-aarch64-boot-test-native.sh

set -e

MAX_RETRIES=5
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BREENIX_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"

# Find the ARM64 kernel
KERNEL="$BREENIX_ROOT/target/aarch64-breenix/release/kernel-aarch64"
if [ ! -f "$KERNEL" ]; then
echo "Error: No ARM64 kernel found. Build with:"
echo " cargo build --release --target aarch64-breenix.json -Z build-std=core,alloc -Z build-std-features=compiler-builtins-mem -p kernel --bin kernel-aarch64"
exit 1
fi

# Find ext2 disk (required for init_shell)
EXT2_DISK="$BREENIX_ROOT/target/ext2-aarch64.img"
if [ ! -f "$EXT2_DISK" ]; then
echo "Error: ext2 disk not found at $EXT2_DISK"
exit 1
fi

run_single_test() {
local OUTPUT_DIR="/tmp/breenix_aarch64_boot_native"
rm -rf "$OUTPUT_DIR"
mkdir -p "$OUTPUT_DIR"

# Run QEMU with 30s timeout
timeout 30 qemu-system-aarch64 \
-M virt -cpu cortex-a72 -m 512 \
-kernel "$KERNEL" \
-display none -no-reboot \
-device virtio-blk-device,drive=ext2 \
-drive if=none,id=ext2,format=raw,readonly=on,file="$EXT2_DISK" \
-serial file:"$OUTPUT_DIR/serial.txt" &
local QEMU_PID=$!

# Wait for kernel output (20s timeout)
local BOOT_COMPLETE=false
for i in $(seq 1 10); do
if [ -f "$OUTPUT_DIR/serial.txt" ]; then
if grep -qE "(breenix>|Welcome to Breenix|Interactive Shell)" "$OUTPUT_DIR/serial.txt" 2>/dev/null; then
BOOT_COMPLETE=true
break
fi
if grep -qiE "(KERNEL PANIC|panic!)" "$OUTPUT_DIR/serial.txt" 2>/dev/null; then
break
fi
fi
sleep 2
done

kill $QEMU_PID 2>/dev/null || true
wait $QEMU_PID 2>/dev/null || true

if $BOOT_COMPLETE; then
# Verify no excessive init_shell spawning
local SHELL_COUNT=$(grep -o "init_shell" "$OUTPUT_DIR/serial.txt" 2>/dev/null | wc -l | tr -d ' ')
SHELL_COUNT=${SHELL_COUNT:-0}
if [ "$SHELL_COUNT" -le 5 ]; then
echo "SUCCESS (${SHELL_COUNT} init_shell mentions)"
return 0
else
echo "FAIL: Too many init_shell mentions: $SHELL_COUNT"
return 1
fi
else
local LINES=$(wc -l < "$OUTPUT_DIR/serial.txt" 2>/dev/null || echo 0)
if grep -qiE "(KERNEL PANIC|panic!)" "$OUTPUT_DIR/serial.txt" 2>/dev/null; then
echo "FAIL: Kernel panic ($LINES lines)"
else
echo "FAIL: Shell not detected ($LINES lines)"
fi
return 1
fi
}

echo "========================================="
echo "ARM64 Boot Test (Native QEMU)"
echo "========================================="
echo "Kernel: $KERNEL"
echo "ext2 disk: $EXT2_DISK"
echo ""

for attempt in $(seq 1 $MAX_RETRIES); do
echo "Attempt $attempt/$MAX_RETRIES..."
if run_single_test; then
echo ""
echo "========================================="
echo "ARM64 BOOT TEST: PASSED"
echo "========================================="
exit 0
fi
if [ $attempt -lt $MAX_RETRIES ]; then
echo "Retrying..."
sleep 1
fi
done

echo ""
echo "========================================="
echo "ARM64 BOOT TEST: FAILED (after $MAX_RETRIES attempts)"
echo "========================================="
echo ""
echo "NOTE: If this test frequently requires retries or fails repeatedly,"
echo "there may be a regression. Check recent changes to boot code."
echo ""
echo "Last output:"
tail -10 /tmp/breenix_aarch64_boot_native/serial.txt 2>/dev/null || echo "(no output)"
exit 1
Loading
Loading