76 changes: 76 additions & 0 deletions 04_scaling_performance/02_datacenters/README.md
@@ -0,0 +1,76 @@
# 02_datacenters

Pin endpoints to specific RunPod data centers for latency, compliance, or availability reasons.

## Overview

By default, endpoints deploy across all available data centers. The `datacenter` parameter restricts placement to one or more specific DCs. CPU endpoints are limited to a subset of DCs that support CPU serverless (see `CPU_DATACENTERS`).

## Quick Start

```bash
pip install -r ../../requirements.txt
flash run
```

## What You'll Learn

- How to pin a GPU endpoint to a single datacenter
- How to deploy across multiple datacenters
- How CPU datacenter restrictions work

## Available Data Centers

| ID | Location |
|----|----------|
| `US-GA-1` | US - Georgia |
| `US-KS-1` | US - Kansas |
| `US-TX-1` | US - Texas |
| `US-OR-1` | US - Oregon |
| `CA-MTL-1` | Canada - Montreal |
| `EU-NL-1` | Europe - Netherlands |
| `EU-CZ-1` | Europe - Czech Republic |
| `EU-RO-1` | Europe - Romania |
| `EU-NO-1` | Europe - Norway |
| `EU-SE-1` | Europe - Sweden |

CPU endpoints support: `EU-RO-1`, `US-TX-1`, `EU-SE-1`.
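As a rough sketch of how this restriction behaves (the `CPU_DATACENTERS` name comes from the library, but `validate_cpu_datacenter` and the exact error are illustrative assumptions, not the real runpod_flash internals):

```python
# Hypothetical sketch of the CPU datacenter check -- the actual
# validation lives inside runpod_flash; the function name and error
# message here are illustrative.
CPU_DATACENTERS = {"EU-RO-1", "US-TX-1", "EU-SE-1"}


def validate_cpu_datacenter(dc: str) -> str:
    """Reject datacenters that do not support CPU serverless."""
    if dc not in CPU_DATACENTERS:
        raise ValueError(
            f"{dc} does not support CPU serverless; "
            f"choose one of {sorted(CPU_DATACENTERS)}"
        )
    return dc
```

Selecting a GPU-only DC such as `US-GA-1` for a CPU endpoint fails at deploy time rather than at request time, so the mistake surfaces early.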

## Examples

**Single datacenter:**

```python
@Endpoint(name="us-worker", gpu=GpuGroup.ANY, datacenter=DataCenter.US_GA_1)
async def inference(data: dict) -> dict:
...
```

**Multiple datacenters:**

```python
@Endpoint(
name="global-worker",
gpu=GpuGroup.ANY,
datacenter=[DataCenter.US_GA_1, DataCenter.EU_RO_1],
)
async def inference(data: dict) -> dict:
...
```

**No datacenter (default, all DCs):**

```python
@Endpoint(name="anywhere", gpu=GpuGroup.ANY)
async def inference(data: dict) -> dict:
...
```
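The three shapes above (a single `DataCenter`, a list, or omitted) can be summarized with a small normalization sketch. `normalize_datacenters` is hypothetical and not part of the runpod_flash API; plain strings stand in for the enum members:

```python
# Illustrative normalization of the datacenter parameter: a single
# value becomes a one-element list, a list passes through, and None
# (the default) means "deploy across all available DCs".
def normalize_datacenters(dc):
    if dc is None:
        return []  # empty list -> no restriction, all DCs eligible
    if isinstance(dc, (list, tuple)):
        return list(dc)
    return [dc]
```
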

## Project Structure

```
02_datacenters/
├── gpu_worker.py # single-DC and multi-DC GPU endpoints
├── cpu_worker.py # CPU endpoint in a supported DC
└── README.md
```
29 changes: 29 additions & 0 deletions 04_scaling_performance/02_datacenters/cpu_worker.py
@@ -0,0 +1,29 @@
# cpu worker pinned to a cpu-supported datacenter.
# cpu endpoints are only available in a subset of datacenters
# (see CPU_DATACENTERS). Selecting an unsupported DC raises an error.
# run with: flash run
from runpod_flash import Endpoint, DataCenter

api = Endpoint(
name="04_02_cpu_eu",
cpu="cpu3c-2-4",
workers=(0, 2),
datacenter=DataCenter.EU_RO_1,
)


@api.post("/process")
async def process(data: dict) -> dict:
"""CPU processing pinned to EU-RO-1."""
return {"datacenter": "EU-RO-1", "result": data}


@api.get("/health")
async def health():
return {"status": "ok"}


if __name__ == "__main__":
import asyncio

print(asyncio.run(process({"text": "hello"})))
39 changes: 39 additions & 0 deletions 04_scaling_performance/02_datacenters/gpu_worker.py
@@ -0,0 +1,39 @@
# gpu workers pinned to specific datacenters.
# run with: flash run
from runpod_flash import Endpoint, GpuGroup, DataCenter


# pin to a single datacenter
@Endpoint(
name="04_02_gpu_us",
gpu=GpuGroup.ANY,
workers=(0, 3),
datacenter=DataCenter.US_GA_1,
)
async def us_inference(payload: dict) -> dict:
"""GPU inference pinned to US-GA-1."""
return {"datacenter": "US-GA-1", "result": payload}


# deploy across multiple datacenters for broader availability
@Endpoint(
name="04_02_gpu_multi",
gpu=GpuGroup.ANY,
workers=(0, 3),
datacenter=[DataCenter.US_GA_1, DataCenter.EU_RO_1],
)
async def multi_dc_inference(payload: dict) -> dict:
"""GPU inference available in US-GA-1 and EU-RO-1."""
return {"result": payload}


if __name__ == "__main__":
import asyncio

async def test():
print("=== US datacenter ===")
print(await us_inference({"prompt": "hello"}))
print("\n=== Multi-DC ===")
print(await multi_dc_inference({"prompt": "hello"}))

asyncio.run(test())
5 changes: 4 additions & 1 deletion 05_data_workflows/01_network_volumes/cpu_worker.py
@@ -1,18 +1,21 @@
# cpu worker with network volume for listing and serving generated images.
# run with: flash run
# test directly: python cpu_worker.py
-from runpod_flash import Endpoint, NetworkVolume
+from runpod_flash import Endpoint, DataCenter, NetworkVolume

# same volume as gpu_worker.py -- must match name and datacenter
volume = NetworkVolume(
name="flash-05-volume",
size=50,
+    datacenter=DataCenter.EU_RO_1,
)

api = Endpoint(
name="05_01_cpu_worker",
cpu="cpu3c-1-2",
workers=(1, 3),
idle_timeout=120,
+    datacenter=DataCenter.EU_RO_1,
volume=volume,
)

4 changes: 3 additions & 1 deletion 05_data_workflows/01_network_volumes/gpu_worker.py
@@ -3,7 +3,7 @@
# test directly: python gpu_worker.py
import logging

-from runpod_flash import Endpoint, GpuType, NetworkVolume
+from runpod_flash import Endpoint, GpuType, DataCenter, NetworkVolume

logger = logging.getLogger(__name__)

@@ -12,6 +12,7 @@
volume = NetworkVolume(
name="flash-05-volume",
size=50,
+    datacenter=DataCenter.EU_RO_1,
)


@@ -20,6 +21,7 @@
gpu=GpuType.NVIDIA_GEFORCE_RTX_5090,
workers=(0, 3),
idle_timeout=300,
+    datacenter=DataCenter.EU_RO_1,
volume=volume,
env={"HF_HUB_CACHE": MODEL_PATH, "MODEL_PATH": MODEL_PATH},
dependencies=["torch", "diffusers", "transformers", "accelerate"],