diff --git a/01_getting_started/01_hello_world/README.md b/01_getting_started/01_hello_world/README.md
index beb291c..7ae39af 100644
--- a/01_getting_started/01_hello_world/README.md
+++ b/01_getting_started/01_hello_world/README.md
@@ -18,17 +18,30 @@ uv run flash login
 
 Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
-### 3. Run Locally
+### 3. Run the Example
 
 ```bash
-uv run flash run
+python gpu_worker.py
 ```
 
-Server starts at **http://localhost:8888**
+The function executes on a Runpod GPU and prints the result directly:
+
+```
+Testing GPU worker with payload: {'message': 'Testing GPU worker'}
+Result: {'status': 'success', 'message': 'Testing GPU worker', 'worker_type': 'GPU', ...}
+```
+
+First run takes 30-60 seconds (provisioning). Subsequent runs take 2-3 seconds.
+
+### Alternative: HTTP API Testing
 
-### 4. Test the API
+To test via HTTP endpoints instead:
 
-Visit **http://localhost:8888/docs** for interactive API documentation. QB endpoints are auto-generated by `flash run` based on your `@Endpoint` functions.
+```bash
+uv run flash run
+```
+
+Visit **http://localhost:8888/docs** for interactive API documentation.
 
 ```bash
 curl -X POST http://localhost:8888/gpu_worker/runsync \
@@ -36,13 +49,6 @@ curl -X POST http://localhost:8888/gpu_worker/runsync \
   -d '{"message": "Hello GPU!"}'
 ```
 
-### Full CLI Documentation
-
-For complete CLI usage including deployment, environment management, and troubleshooting:
-- **[CLI Reference](../../CLI-REFERENCE.md)** - All commands and options
-- **[Getting Started Guide](../../docs/cli/getting-started.md)** - Step-by-step tutorial
-- **[Workflows](../../docs/cli/workflows.md)** - Common development patterns
-
 ## What This Demonstrates
 
 ### GPU Worker (`gpu_worker.py`)
@@ -133,14 +139,14 @@ The worker uses PyTorch to detect and report GPU information:
 
 ## Development
 
-### Test Worker Locally
+### Run the Worker
 
 ```bash
 python gpu_worker.py
 ```
 
-### Run the Application
+### HTTP API Testing (Optional)
 
 ```bash
-flash run
+uv run flash run
 ```
 
 ## Next Steps
diff --git a/01_getting_started/01_hello_world/gpu_worker.py b/01_getting_started/01_hello_world/gpu_worker.py
index d7a330f..6636dc9 100644
--- a/01_getting_started/01_hello_world/gpu_worker.py
+++ b/01_getting_started/01_hello_world/gpu_worker.py
@@ -1,6 +1,6 @@
-# gpu serverless worker -- detects available GPU hardware.
-# run with: flash run
-# test directly: python gpu_worker.py
+# GPU serverless worker -- detects available GPU hardware.
+# Run: python gpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import Endpoint, GpuType
 
diff --git a/01_getting_started/02_cpu_worker/README.md b/01_getting_started/02_cpu_worker/README.md
index 4d5fb88..8f80804 100644
--- a/01_getting_started/02_cpu_worker/README.md
+++ b/01_getting_started/02_cpu_worker/README.md
@@ -18,17 +18,30 @@ uv run flash login
 
 Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
-### 3. Run Locally
+### 3. Run the Example
 
 ```bash
-uv run flash run
+python cpu_worker.py
 ```
 
-Server starts at **http://localhost:8888**
+The function executes on a Runpod CPU worker and prints the result directly:
+
+```
+Testing CPU worker with payload: {'name': 'Testing CPU worker'}
+Result: {'status': 'success', 'message': 'Hello, Testing CPU worker!', 'worker_type': 'CPU', ...}
+```
+
+First run takes 30-60 seconds (provisioning). Subsequent runs take 2-3 seconds.
+
+### Alternative: HTTP API Testing
 
-### 4. Test the API
+To test via HTTP endpoints instead:
 
-Visit **http://localhost:8888/docs** for interactive API documentation. QB endpoints are auto-generated by `flash run` based on your `@Endpoint` functions.
+```bash
+uv run flash run
+```
+
+Visit **http://localhost:8888/docs** for interactive API documentation.
 
 ```bash
 curl -X POST http://localhost:8888/cpu_worker/runsync \
@@ -36,13 +49,6 @@ curl -X POST http://localhost:8888/cpu_worker/runsync \
   -d '{"name": "Flash User"}'
 ```
 
-### Full CLI Documentation
-
-For complete CLI usage including deployment, environment management, and troubleshooting:
-- **[CLI Reference](../../CLI-REFERENCE.md)** - All commands and options
-- **[Getting Started Guide](../../docs/cli/getting-started.md)** - Step-by-step tutorial
-- **[Workflows](../../docs/cli/workflows.md)** - Common development patterns
-
 ## What This Demonstrates
 
 ### CPU Worker (`cpu_worker.py`)
@@ -135,14 +141,14 @@ The CPU worker scales to zero when idle:
 
 ## Development
 
-### Test Worker Locally
+### Run the Worker
 
 ```bash
 python cpu_worker.py
 ```
 
-### Run the Application
+### HTTP API Testing (Optional)
 
 ```bash
-flash run
+uv run flash run
 ```
 
 ## When to Use CPU Workers
diff --git a/01_getting_started/02_cpu_worker/cpu_worker.py b/01_getting_started/02_cpu_worker/cpu_worker.py
index 0679296..94b702b 100644
--- a/01_getting_started/02_cpu_worker/cpu_worker.py
+++ b/01_getting_started/02_cpu_worker/cpu_worker.py
@@ -1,6 +1,6 @@
-# cpu serverless worker -- lightweight processing without GPU.
-# run with: flash run
-# test directly: python cpu_worker.py
+# CPU serverless worker -- lightweight processing without GPU.
+# Run: python cpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import CpuInstanceType, Endpoint
 
diff --git a/01_getting_started/03_mixed_workers/README.md b/01_getting_started/03_mixed_workers/README.md
index e85fad4..c1199dd 100644
--- a/01_getting_started/03_mixed_workers/README.md
+++ b/01_getting_started/03_mixed_workers/README.md
@@ -40,16 +40,33 @@ Response
 
 **Prerequisites**: Complete the [repository setup](../../README.md#quick-start) first (clone, `make dev`, set API key).
 
-### Run This Example
+### Test Individual Workers
+
+Run the CPU and GPU workers directly:
 
 ```bash
 cd 01_getting_started/03_mixed_workers
-flash run
+
+# Test CPU preprocessing worker
+python cpu_worker.py
+
+# Test GPU inference worker
+python gpu_worker.py
 ```
 
-### Alternative: Standalone Setup
+First run takes 30-60 seconds (provisioning). Subsequent runs take 2-3 seconds.
+
+### Run the Full Pipeline
 
-If you haven't run the repository-wide setup:
+The pipeline endpoint (`/classify`) orchestrates multiple workers via HTTP. To test it:
+
+```bash
+uv run flash run
+```
+
+Server starts at http://localhost:8888
+
+### Setup (if needed)
 
 ```bash
 # Install dependencies
@@ -58,13 +75,8 @@ uv sync
 
 # Authenticate
 uv run flash login
 # Or create .env file with RUNPOD_API_KEY=your_api_key_here
-
-# Run
-uv run flash run
 ```
 
-Server starts at http://localhost:8888
-
 ## Test the Pipeline
 
 ```bash
diff --git a/01_getting_started/03_mixed_workers/cpu_worker.py b/01_getting_started/03_mixed_workers/cpu_worker.py
index f65fd6c..d96001a 100644
--- a/01_getting_started/03_mixed_workers/cpu_worker.py
+++ b/01_getting_started/03_mixed_workers/cpu_worker.py
@@ -1,7 +1,7 @@
-# cpu workers for text preprocessing and postprocessing.
-# part of the mixed CPU/GPU pipeline example.
-# run with: flash run
-# test directly: python cpu_worker.py
+# CPU workers for text preprocessing and postprocessing.
+# Part of the mixed CPU/GPU pipeline example.
+# Run: python cpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import CpuInstanceType, Endpoint
 
diff --git a/01_getting_started/03_mixed_workers/gpu_worker.py b/01_getting_started/03_mixed_workers/gpu_worker.py
index b6ae065..7ac28a1 100644
--- a/01_getting_started/03_mixed_workers/gpu_worker.py
+++ b/01_getting_started/03_mixed_workers/gpu_worker.py
@@ -1,7 +1,7 @@
-# gpu worker for ML inference (sentiment classification).
-# part of the mixed CPU/GPU pipeline example.
-# run with: flash run
-# test directly: python gpu_worker.py
+# GPU worker for ML inference (sentiment classification).
+# Part of the mixed CPU/GPU pipeline example.
+# Run: python gpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import Endpoint, GpuGroup
 
diff --git a/01_getting_started/03_mixed_workers/pipeline.py b/01_getting_started/03_mixed_workers/pipeline.py
index 6a4615f..1088627 100644
--- a/01_getting_started/03_mixed_workers/pipeline.py
+++ b/01_getting_started/03_mixed_workers/pipeline.py
@@ -1,6 +1,13 @@
-# classification pipeline: CPU preprocess -> GPU inference -> CPU postprocess.
-# demonstrates cross-worker orchestration via a load-balanced endpoint.
-# run with: flash run
+# Classification pipeline: CPU preprocess -> GPU inference -> CPU postprocess.
+# Demonstrates cross-worker orchestration via a load-balanced endpoint.
+# Run: python pipeline.py (local testing)
+# Alternative: flash run (for HTTP route testing)
+import sys
+from pathlib import Path
+
+# Ensure sibling modules (cpu_worker, gpu_worker) are importable regardless of cwd
+sys.path.insert(0, str(Path(__file__).parent))
+
 from runpod_flash import Endpoint
 
 pipeline = Endpoint(name="01_03_classify_pipeline", cpu="cpu3c-1-2", workers=(1, 3))
diff --git a/01_getting_started/04_dependencies/README.md b/01_getting_started/04_dependencies/README.md
index cf9a4a8..e54b615 100644
--- a/01_getting_started/04_dependencies/README.md
+++ b/01_getting_started/04_dependencies/README.md
@@ -29,14 +29,16 @@ Learn how to manage Python packages and system dependencies in Flash workers.
 
 ```bash
 cd 01_getting_started/04_dependencies
-flash run
-```
 
-Server starts at http://localhost:8888
+# Run any worker directly
+python gpu_worker.py
+python cpu_worker.py
+python mixed_worker.py
+```
 
-### Alternative: Standalone Setup
+First run takes 30-60 seconds (provisioning). Subsequent runs take 2-3 seconds.
 
-If you haven't run the repository-wide setup:
+### Setup (if needed)
 
 ```bash
 # Install dependencies
@@ -45,11 +47,18 @@ uv sync
 
 # Authenticate
 uv run flash login
 # Or create .env file with RUNPOD_API_KEY=your_api_key_here
+```
 
-# Run
+### Alternative: HTTP API Testing
+
+To test via HTTP endpoints:
+
+```bash
 uv run flash run
 ```
 
+Server starts at http://localhost:8888
+
 ## GPU vs CPU Packaging
 
 GPU and CPU endpoints use different base Docker images, which affects how dependencies are resolved:
diff --git a/01_getting_started/04_dependencies/cpu_worker.py b/01_getting_started/04_dependencies/cpu_worker.py
index 64e2c96..3b542ee 100644
--- a/01_getting_started/04_dependencies/cpu_worker.py
+++ b/01_getting_started/04_dependencies/cpu_worker.py
@@ -1,6 +1,6 @@
-# cpu workers demonstrating data science and zero-dependency patterns.
-# run with: flash run
-# test directly: python cpu_worker.py
+# CPU workers demonstrating data science and zero-dependency patterns.
+# Run: python cpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import CpuInstanceType, Endpoint
 
diff --git a/01_getting_started/04_dependencies/gpu_worker.py b/01_getting_started/04_dependencies/gpu_worker.py
index 07df859..8d951a8 100644
--- a/01_getting_started/04_dependencies/gpu_worker.py
+++ b/01_getting_started/04_dependencies/gpu_worker.py
@@ -1,6 +1,6 @@
-# gpu workers demonstrating Python and system dependency management.
-# run with: flash run
-# test directly: python gpu_worker.py
+# GPU workers demonstrating Python and system dependency management.
+# Run: python gpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import Endpoint, GpuGroup
 
diff --git a/01_getting_started/04_dependencies/mixed_worker.py b/01_getting_started/04_dependencies/mixed_worker.py
index 4b15892..6736d1d 100644
--- a/01_getting_started/04_dependencies/mixed_worker.py
+++ b/01_getting_started/04_dependencies/mixed_worker.py
@@ -3,8 +3,8 @@
 # - GPU images (runpod/pytorch:*) have numpy pre-installed
 # - CPU images (python-slim) install numpy from the build artifact
 #
-# run with: flash run
-# test directly: python mixed_worker.py
+# Run: python mixed_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import CpuInstanceType, Endpoint, GpuType
 
diff --git a/02_ml_inference/01_text_to_speech/README.md b/02_ml_inference/01_text_to_speech/README.md
index 4b89a47..af2e3af 100644
--- a/02_ml_inference/01_text_to_speech/README.md
+++ b/02_ml_inference/01_text_to_speech/README.md
@@ -33,31 +33,30 @@ Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
 ### Run
 
 ```bash
-uv run flash run
+python gpu_worker.py
 ```
 
-First run provisions the endpoint (~1 min). Server starts at http://localhost:8888
+First run provisions the endpoint (~1 min) and downloads the model. The result is printed directly to your terminal.
+
+Subsequent runs take 5-10 seconds (worker is already running).
 
-### Test the Endpoint
+### Alternative: HTTP API Testing
+
+To test via HTTP endpoints:
 
-Visit http://localhost:8888/docs for interactive API documentation. QB endpoints are auto-generated by `flash run` based on your `@Endpoint` functions.
-
-**Generate speech (JSON with base64 audio):**
 ```bash
-curl -X POST http://localhost:8888/gpu_worker/runsync \
-  -H "Content-Type: application/json" \
-  -d '{"text": "Hello world!", "speaker": "Ryan", "language": "English"}'
+uv run flash run
 ```
 
-**List available voices:**
+Server starts at http://localhost:8888. Visit http://localhost:8888/docs for interactive API documentation.
+
+**Generate speech (JSON with base64 audio):**
 ```bash
 curl -X POST http://localhost:8888/gpu_worker/runsync \
   -H "Content-Type: application/json" \
-  -d '{}'
+  -d '{"text": "Hello world!", "speaker": "Ryan", "language": "English"}'
 ```
 
-Check `/docs` for the exact auto-generated endpoint paths and schemas.
-
 ## API Functions
 
 QB (queue-based) endpoints are auto-generated from `@Endpoint` functions. Visit `/docs` for the full API schema.
diff --git a/02_ml_inference/01_text_to_speech/gpu_worker.py b/02_ml_inference/01_text_to_speech/gpu_worker.py
index 6d60e01..4245efb 100644
--- a/02_ml_inference/01_text_to_speech/gpu_worker.py
+++ b/02_ml_inference/01_text_to_speech/gpu_worker.py
@@ -1,6 +1,6 @@
 # Qwen3-TTS text-to-speech GPU worker.
-# run with: flash run
-# test directly: python gpu_worker.py
+# Run: python gpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import Endpoint, GpuGroup
 
diff --git a/03_advanced_workers/05_load_balancer/README.md b/03_advanced_workers/05_load_balancer/README.md
index 2c6eadc..d9e855a 100644
--- a/03_advanced_workers/05_load_balancer/README.md
+++ b/03_advanced_workers/05_load_balancer/README.md
@@ -37,13 +37,29 @@ uv run flash login
 
 Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
-### 3. Run Locally (from repository root)
+### 3. Test Individual Workers
+
+Run each load-balanced worker directly:
+
+```bash
+# Test GPU load-balanced worker
+python gpu_lb.py
+
+# Test CPU load-balanced worker
+python cpu_lb.py
+```
+
+This tests the worker setup. Results are printed directly to your terminal.
+
+### 4. Test HTTP Routes
+
+Load-balanced endpoints expose HTTP routes. To test the full API:
 
 ```bash
 uv run flash run
 ```
 
-Visit **http://localhost:8888/docs** for interactive API documentation (unified app with all examples).
+Visit **http://localhost:8888/docs** for interactive API documentation.
 
-### 4. Test Endpoints (via unified app)
+### 5. Test Endpoints (via unified app)
 
diff --git a/03_advanced_workers/05_load_balancer/cpu_lb.py b/03_advanced_workers/05_load_balancer/cpu_lb.py
index 08a9105..8b239b3 100644
--- a/03_advanced_workers/05_load_balancer/cpu_lb.py
+++ b/03_advanced_workers/05_load_balancer/cpu_lb.py
@@ -1,6 +1,6 @@
-# cpu load-balanced endpoints with custom HTTP routes.
-# run with: flash run
-# test directly: python cpu_lb.py
+# CPU load-balanced endpoints with custom HTTP routes.
+# Run: python cpu_lb.py (test worker setup)
+# Run: flash run (test HTTP routes)
 
 from runpod_flash import Endpoint
 
 api = Endpoint(
diff --git a/03_advanced_workers/05_load_balancer/gpu_lb.py b/03_advanced_workers/05_load_balancer/gpu_lb.py
index 2637bef..d0f1218 100644
--- a/03_advanced_workers/05_load_balancer/gpu_lb.py
+++ b/03_advanced_workers/05_load_balancer/gpu_lb.py
@@ -1,6 +1,6 @@
-# gpu load-balanced endpoints with custom HTTP routes.
-# run with: flash run
-# test directly: python gpu_lb.py
+# GPU load-balanced endpoints with custom HTTP routes.
+# Run: python gpu_lb.py (test worker setup)
+# Run: flash run (test HTTP routes)
 
 from runpod_flash import Endpoint, GpuType
 
 api = Endpoint(
diff --git a/04_scaling_performance/01_autoscaling/README.md b/04_scaling_performance/01_autoscaling/README.md
index 0e02e67..5b090ca 100644
--- a/04_scaling_performance/01_autoscaling/README.md
+++ b/04_scaling_performance/01_autoscaling/README.md
@@ -6,22 +6,32 @@ Configure Flash worker autoscaling for different workload patterns. This example
 
 **Prerequisites**: Complete the [repository setup](../../README.md#quick-start) first, or run `flash login` to authenticate.
 
+### Run the Examples
+
 ```bash
 cd 04_scaling_performance/01_autoscaling
-flash run
+
+# Run GPU worker
+python gpu_worker.py
+
+# Run CPU worker
+python cpu_worker.py
 ```
 
-Server starts at http://localhost:8888 -- visit http://localhost:8888/docs for interactive API docs.
+First run takes 30-60 seconds (provisioning). Subsequent runs take 2-3 seconds.
 
-### Test Individual Strategies
+
+### Alternative: HTTP API Testing
+
+To test via HTTP endpoints:
 
 ```bash
-# Scale-to-zero GPU worker
-curl -X POST http://localhost:8888/gpu_worker/runsync \
-  -H "Content-Type: application/json" \
-  -d '{"matrix_size": 512}'
+uv run flash run
+```
 
-# Always-on GPU worker (same payload, different endpoint)
+Server starts at http://localhost:8888. Visit http://localhost:8888/docs for interactive API docs.
+
+```bash
+# Scale-to-zero GPU worker
 curl -X POST http://localhost:8888/gpu_worker/runsync \
   -H "Content-Type: application/json" \
   -d '{"matrix_size": 512}'
diff --git a/04_scaling_performance/01_autoscaling/cpu_worker.py b/04_scaling_performance/01_autoscaling/cpu_worker.py
index 6660ea3..17a6ada 100644
--- a/04_scaling_performance/01_autoscaling/cpu_worker.py
+++ b/04_scaling_performance/01_autoscaling/cpu_worker.py
@@ -1,6 +1,6 @@
-# cpu autoscaling strategies -- scale-to-zero and burst-ready.
-# run with: flash run
-# test directly: python cpu_worker.py
+# CPU autoscaling strategies -- scale-to-zero and burst-ready.
+# Run: python cpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import CpuInstanceType, Endpoint
 
diff --git a/04_scaling_performance/01_autoscaling/gpu_worker.py b/04_scaling_performance/01_autoscaling/gpu_worker.py
index 2d12fb0..8c900a0 100644
--- a/04_scaling_performance/01_autoscaling/gpu_worker.py
+++ b/04_scaling_performance/01_autoscaling/gpu_worker.py
@@ -1,6 +1,6 @@
-# gpu autoscaling strategies -- scale-to-zero, always-on, high-throughput.
-# run with: flash run
-# test directly: python gpu_worker.py
+# GPU autoscaling strategies -- scale-to-zero, always-on, high-throughput.
+# Run: python gpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 from runpod_flash import Endpoint, GpuType, ServerlessScalerType
 
diff --git a/05_data_workflows/01_network_volumes/README.md b/05_data_workflows/01_network_volumes/README.md
index bd9cf24..13473c1 100644
--- a/05_data_workflows/01_network_volumes/README.md
+++ b/05_data_workflows/01_network_volumes/README.md
@@ -22,29 +22,32 @@ uv run flash login
 
 Or create a `.env` file with `RUNPOD_API_KEY=your_api_key_here`.
 
-### 3. Run Locally
+### 3. Run the GPU Worker
+
+Generate an image by running the GPU worker directly:
 
 ```bash
-uv run flash run
+python gpu_worker.py
 ```
 
-Server starts at `http://localhost:8888`
+First run takes 60-120 seconds (provisioning + model download). The image is saved to the network volume and the result is printed to your terminal.
+
+### 4. Test the CPU Worker (HTTP API)
 
-### 4. Test the API
+The CPU worker serves images via HTTP routes. To test it:
 
-**Generate an image (GPU worker):**
 ```bash
-curl -X POST http://localhost:8888/gpu_worker/runsync \
-  -H "Content-Type: application/json" \
-  -d '{"prompt": "a sunset over mountains"}'
+uv run flash run
 ```
 
-**List generated images (CPU worker):**
+Server starts at `http://localhost:8888`
+
+**List generated images:**
 ```bash
 curl http://localhost:8888/images
 ```
 
-**Get a specific image (CPU worker):**
+**Get a specific image:**
 ```bash
 curl http://localhost:8888/images/sd_generated_20240101_120000.png
 ```
diff --git a/05_data_workflows/01_network_volumes/cpu_worker.py b/05_data_workflows/01_network_volumes/cpu_worker.py
index 5d1dad4..68d6fb3 100644
--- a/05_data_workflows/01_network_volumes/cpu_worker.py
+++ b/05_data_workflows/01_network_volumes/cpu_worker.py
@@ -1,6 +1,6 @@
-# cpu worker with network volume for listing and serving generated images.
-# run with: flash run
-# test directly: python cpu_worker.py
+# CPU worker with network volume for listing and serving generated images.
+# This is an LB endpoint with HTTP routes - use flash run to test routes.
+# Run: flash run (required for HTTP route testing)
 
 from runpod_flash import Endpoint, NetworkVolume
 
 volume = NetworkVolume(
diff --git a/05_data_workflows/01_network_volumes/gpu_worker.py b/05_data_workflows/01_network_volumes/gpu_worker.py
index fd4c7b2..d7aa18c 100644
--- a/05_data_workflows/01_network_volumes/gpu_worker.py
+++ b/05_data_workflows/01_network_volumes/gpu_worker.py
@@ -1,6 +1,6 @@
-# gpu worker with network volume for Stable Diffusion image generation.
-# run with: flash run
-# test directly: python gpu_worker.py
+# GPU worker with network volume for Stable Diffusion image generation.
+# Run: python gpu_worker.py
+# Alternative: flash run (for HTTP API testing)
 
 import logging
 
 from runpod_flash import Endpoint, GpuType, NetworkVolume
 
diff --git a/06_real_world/README.md b/06_real_world/README.md
index e640184..3cec2b4 100644
--- a/06_real_world/README.md
+++ b/06_real_world/README.md
@@ -116,9 +116,17 @@ All real-world examples include:
 
 ## Deployment Patterns
 
 ### Development
+
+Run individual workers directly:
 ```bash
 cd example_name
-flash run
+python gpu_worker.py
+python cpu_worker.py
+```
+
+Or run the full app with HTTP routes:
+```bash
+uv run flash run
 ```
 
 ### Production
diff --git a/README.md b/README.md
index c73bcdf..bd581d7 100644
--- a/README.md
+++ b/README.md
@@ -2,38 +2,6 @@
 A collection of example applications showcasing Runpod Flash - a framework for building production-ready AI applications with distributed GPU and CPU computing.
 
-## What is Flash?
-
-Flash is a Python framework that lets you run functions on Runpod's Serverless infrastructure with a single decorator. Write code locally, deploy globally—Flash handles provisioning, scaling, and routing automatically.
-
-```python
-from runpod_flash import Endpoint, GpuType
-
-@Endpoint(name="image-gen", gpu=GpuType.NVIDIA_GEFORCE_RTX_4090, dependencies=["torch", "diffusers"])
-async def generate_image(prompt: str) -> bytes:
-    # This runs on a cloud GPU, not your laptop
-    ...
-```
-
-**Key features:**
-- **`@Endpoint` decorator**: Mark any async function to run on serverless infrastructure
-- **Auto-scaling**: Scale to zero when idle, scale up under load
-- **Local development**: `flash run` starts a local server with hot reload
-- **One-command deploy**: `flash deploy` packages and ships your code
-
-## Prerequisites
-
-- **Python 3.10+**
-- **uv**: Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`
-- **Runpod account**: [Sign up here](https://runpod.io/console/signup)
-
-### Python version in deployed workers
-
-Your local Python version does not affect what runs in the cloud. `flash build` downloads wheels for the container's Python version automatically.
-
-- **GPU workers**: Python 3.12 only. The GPU base image ships multiple interpreters (3.9-3.14) for interactive pod use, but torch and CUDA libraries are installed only for 3.12.
-- **CPU workers**: Python 3.10, 3.11, or 3.12. Configurable via `PYTHON_VERSION` build arg.
-
 ## Quick Start
 
 ```bash
@@ -45,11 +13,12 @@ uv sync && uv pip install -e .
 
 # Authenticate with Runpod
 uv run flash login
 
-# Run all examples locally
-uv run flash run
+# Run an example
+cd 01_getting_started/01_hello_world
+python gpu_worker.py
 ```
 
-Open **http://localhost:8888/docs** to explore all endpoints.
+The function executes on a Runpod GPU and prints the result directly. First run takes 30-60 seconds (provisioning); subsequent runs take 2-3 seconds.
 
 > **Using pip, poetry, or conda?** See [DEVELOPMENT.md](./DEVELOPMENT.md) for alternative setups.
 
@@ -68,85 +37,30 @@ Open **http://localhost:8888/docs** to explore all endpoints.
 
 More examples coming soon in each category.
 
-## CLI Commands
-
-```bash
-flash login          # Authenticate with Runpod (opens browser)
-flash run            # Run development server (localhost:8888)
-flash build          # Build deployment package
-flash deploy --env   # Build and deploy to environment
-flash undeploy       # Delete deployed endpoint
-```
-
-See **[CLI-REFERENCE.md](./CLI-REFERENCE.md)** for complete documentation.
-
-## Key Concepts
-
-### Endpoint
-
-The `Endpoint` class configures functions for execution on Runpod's serverless infrastructure:
+## What is Flash?
 
-**Queue-based (one function = one endpoint):**
+Flash is a Python framework that lets you run functions on Runpod's Serverless infrastructure with a single decorator. Write code locally, deploy globally—Flash handles provisioning, scaling, and routing automatically.
 
 ```python
 from runpod_flash import Endpoint, GpuType
 
-@Endpoint(name="my-worker", gpu=GpuType.NVIDIA_GEFORCE_RTX_4090, workers=(0, 3), dependencies=["torch"])
-async def process(data: dict) -> dict:
-    import torch
-    # this code runs on Runpod GPUs
-    return {"result": "processed"}
-```
-
-**Load-balanced (multiple routes, shared workers):**
-
-```python
-from runpod_flash import Endpoint
-
-api = Endpoint(name="my-api", cpu="cpu3c-1-2", workers=(1, 3))
-
-@api.get("/health")
-async def health():
-    return {"status": "ok"}
-
-@api.post("/compute")
-async def compute(data: dict) -> dict:
-    return {"result": data}
-```
-
-**Client mode (connect to an existing endpoint):**
-
-```python
-from runpod_flash import Endpoint
-
-ep = Endpoint(id="ep-abc123")
-job = await ep.run({"prompt": "hello"})
-await job.wait()
-print(job.output)
+@Endpoint(name="image-gen", gpu=GpuType.NVIDIA_GEFORCE_RTX_4090, dependencies=["torch", "diffusers"])
+async def generate_image(prompt: str) -> bytes:
+    # This runs on a cloud GPU, not your laptop
+    ...
 ```
 
-### Resource Types
-
-**GPU Workers** (`gpu=`):
-| Type | Use Case |
-|------|----------|
-| `GpuType.NVIDIA_GEFORCE_RTX_4090` | RTX 4090 (24GB) |
-| `GpuType.NVIDIA_RTX_6000_ADA_GENERATION` | RTX 6000 Ada (48GB) |
-| `GpuType.NVIDIA_A100_80GB_PCIe` | A100 (80GB) |
-
-**CPU Workers** (`cpu=`):
-| Type | Specs |
-|------|-------|
-| `cpu3g-2-8` | 2 vCPU, 8GB RAM |
-| `cpu3c-4-8` | 4 vCPU, 8GB RAM (Compute) |
-| `cpu5c-4-16` | 4 vCPU, 16GB RAM (Latest) |
+**Key features:**
+- **`@Endpoint` decorator**: Mark any async function to run on serverless infrastructure
+- **Auto-scaling**: Scale to zero when idle, scale up under load
+- **Local development**: `flash run` starts a local server with hot reload
+- **One-command deploy**: `flash deploy` packages and ships your code
 
-### Auto-Scaling
+## Prerequisites
 
-Workers automatically scale based on demand:
-- `workers=(0, 3)` - Scale from 0 to 3 workers (cost-efficient)
-- `workers=(1, 5)` - Keep 1 warm, scale up to 5
-- `idle_timeout=5` - Seconds before scaling down
+- **Python 3.10-3.12**
+- **uv**: Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`
+- **Runpod account**: [Sign up here](https://runpod.io/console/signup)
 
 ## Resources