A production-ready forecasting service that provides time series predictions using state-of-the-art models including TiRex and Amazon Chronos. This service is designed for deployment as a serverless endpoint on RunPod.
This project implements a forecasting API that supports multiple time series forecasting models:
- TiRex: An xLSTM-based forecasting model from NX-AI
- Chronos: Amazon's pretrained time series forecasting models
The service is optimized for serverless deployment and provides both point forecasts and uncertainty quantification through quantile predictions.
- Multi-Model Support: Choose between TiRex and Chronos models
- Quantile Forecasting: Get uncertainty estimates with 10th, 50th, and 90th percentiles
- Serverless Ready: Optimized for RunPod serverless deployment
- GPU Accelerated: Leverages CUDA for fast inference
- CPU Fallback: Automatic CPU detection and fallback when CUDA unavailable
- Production Ready: Error handling, logging, and input validation
- Auto-Configuration: Automatic CPU/CUDA detection with environment variable overrides
The fastest way to get started is using the pre-built Docker image from Docker Hub:
```bash
# Pull the latest image
docker pull egargale/forecasting:latest

# Run with GPU support (if CUDA is available)
docker run --gpus all -p 8000:8000 egargale/forecasting:latest

# Run with CPU only
docker run -p 8000:8000 egargale/forecasting:latest

# Run with custom environment variables
docker run -e USE_CPU=true -p 8000:8000 egargale/forecasting:latest

# Test the service
curl -X POST http://localhost:8000/runsync \
  -H "Content-Type: application/json" \
  -d '{"input": {"model": "tirex", "context": [1,2,3,4,5,6,7,8,9,10], "prediction_length": 5}}'
```
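The same request can be issued from a script with only the Python standard library. A minimal client sketch, assuming the container above is listening on `localhost:8000` (the `build_payload`/`forecast` helper names are illustrative, not part of the service):

```python
import json
import urllib.request

def build_payload(context, prediction_length=5, model="tirex"):
    """Build the request body in the shape the service expects."""
    return {"input": {"model": model,
                      "context": list(context),
                      "prediction_length": prediction_length}}

def forecast(context, prediction_length=5, model="tirex",
             url="http://localhost:8000/runsync"):
    """POST a forecast request and return the parsed JSON response."""
    body = json.dumps(build_payload(context, prediction_length, model)).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# With a running container:
# result = forecast([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# print(result["forecast"])
```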
1. Install Dependencies

   ```bash
   pip install -r requirements.txt
   ```

2. Test CPU/CUDA Configuration

   ```bash
   # Check automatic CPU/CUDA detection
   python test_cpu_config.py
   ```

3. Run Local Tests

   ```bash
   # Test with TiRex model
   python -c "import json; f=open('test_input.json'); print(json.load(f))"

   # Test with Chronos model
   python -c "import json; f=open('test_input_chronos.json'); print(json.load(f))"
   ```

4. Start Serverless Worker

   ```bash
   python rp_handler.py
   ```
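The worker follows the standard RunPod shape: a handler function receives the job dict and returns the result, and `runpod.serverless.start` wires it up. A simplified sketch of what `rp_handler.py` does (the real handler also validates input and runs actual model inference; `run_model` below is a placeholder):

```python
def run_model(model_name, context, horizon):
    # Placeholder for the actual TiRex/Chronos inference call.
    last = float(context[-1])
    point = [last] * horizon
    quantiles = [[last * 0.9, last, last * 1.1] for _ in range(horizon)]
    return point, quantiles

def handler(job):
    """RunPod entry point: receive a job dict, return the forecast."""
    inp = job["input"]
    model_name = inp.get("model", "tirex")
    forecast, quantiles = run_model(
        model_name, inp["context"], inp["prediction_length"])
    return {"model": model_name, "forecast": forecast, "quantiles": quantiles}

if __name__ == "__main__":
    import runpod  # RunPod serverless SDK
    runpod.serverless.start({"handler": handler})
```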
The service accepts POST requests with the following JSON structure:
```json
{
  "input": {
    "model": "tirex",
    "context": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "prediction_length": 5
  }
}
```

Parameters:

- `model`: Choose between `"tirex"` or `"chronos"` (default: `"tirex"`)
- `context`: Array of historical time series values
- `prediction_length`: Number of future points to predict
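Requests that violate these constraints should be rejected before inference. A sketch of such validation, assuming the parameter rules above (the function name and error messages are illustrative, not the service's actual responses):

```python
def validate_input(inp):
    """Return (model, context, prediction_length) or raise ValueError."""
    model = inp.get("model", "tirex")
    if model not in ("tirex", "chronos"):
        raise ValueError(f"unknown model: {model!r}")
    context = inp.get("context")
    if (not isinstance(context, list) or not context
            or not all(isinstance(x, (int, float)) for x in context)):
        raise ValueError("context must be a non-empty list of numbers")
    horizon = inp.get("prediction_length")
    if not isinstance(horizon, int) or horizon < 1:
        raise ValueError("prediction_length must be a positive integer")
    return model, context, horizon
```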
Response Format:
```json
{
  "model": "tirex",
  "forecast": [11.2, 12.3, 13.1, 14.5, 15.2],
  "quantiles": [[10.1, 11.2, 12.3], [11.5, 12.3, 13.8], ...]
}
```

The service automatically detects CPU/CUDA availability and configures models accordingly:

- CUDA Available: Uses GPU acceleration with `torch_dtype=torch.bfloat16`
- CUDA Unavailable: Falls back to CPU with `torch_dtype=torch.float32`
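The selection logic can be expressed as a small pure function. A sketch of the pattern, where `cuda_available` would come from `torch.cuda.is_available()` in the real handler (the function name is illustrative):

```python
import os

def pick_device(cuda_available, env=os.environ):
    """Mirror the CPU/CUDA selection: (device, dtype) as strings.

    Honors the USE_CPU override described in the environment
    variable table below.
    """
    force_cpu = str(env.get("USE_CPU", "")).lower() in ("true", "1", "yes")
    if cuda_available and not force_cpu:
        return "cuda", "bfloat16"
    return "cpu", "float32"
```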
Override automatic detection with these environment variables:
| Variable | Values | Description |
|---|---|---|
| `USE_CPU` | `true`, `1`, `yes` | Force CPU mode regardless of CUDA availability |
| `TIREX_NO_CUDA` | `1` | Disable TiRex CUDA kernels (set automatically in CPU mode) |
```bash
# Force CPU mode
export USE_CPU=true
python rp_handler.py

# Use CUDA (default when available)
unset USE_CPU
python rp_handler.py
```
1. Build Docker Image

   ```bash
   docker build -t forecasting-service .
   ```

2. Deploy to RunPod

   - Upload your Docker image to a container registry
   - Create a new serverless endpoint on RunPod
   - Configure GPU requirements (minimum: 1x A100 or T4)

3. Environment Variables

   - No additional environment variables required
   - Models are downloaded automatically on startup
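Since weights are downloaded at startup, a common serverless pattern is to load each model at most once per worker process and reuse it across invocations. A sketch of that caching pattern (`load_model` is a stand-in for the actual TiRex/Chronos loaders, not the service's API):

```python
_MODEL_CACHE = {}

def load_model(name):
    # Stand-in for the real loader that downloads pretrained weights.
    return f"<{name} model>"

def get_model(name):
    """Load each model at most once per worker process, then reuse it."""
    if name not in _MODEL_CACHE:
        _MODEL_CACHE[name] = load_model(name)
    return _MODEL_CACHE[name]
```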
Test your handler locally before deployment:
```bash
# Install runpod package
pip install runpod

# Test the handler
python rp_handler.py
```

TiRex

- Type: xLSTM-based forecasting model
- Strengths: Fast inference, good for short to medium-term forecasts
- Model Size: ~100M parameters
- Use Case: General-purpose forecasting with uncertainty
Chronos

- Type: Pretrained time series language model
- Strengths: Zero-shot forecasting, handles multiple frequencies
- Model Size: ~50M parameters
- Use Case: Cross-domain forecasting without retraining
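To compare the two models on the same series, send identical requests that differ only in the `model` field. A small helper sketch (the function name is illustrative; the payload shape matches the API described above):

```python
def payloads_for_both(context, prediction_length):
    """Build identical requests for TiRex and Chronos over the same series."""
    return {
        name: {"input": {"model": name,
                         "context": list(context),
                         "prediction_length": prediction_length}}
        for name in ("tirex", "chronos")
    }
```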
```
forecasting/
├── main.py                  # Simple entry point for local testing
├── rp_handler.py            # RunPod serverless handler
├── test_cpu_config.py       # CPU/CUDA configuration test script
├── requirements.txt         # Python dependencies
├── pyproject.toml           # Project configuration
├── test_input.json          # Sample TiRex input
├── test_input_chronos.json  # Sample Chronos input
├── test_input_tirex.json    # Sample TiRex input (alias)
├── chronos-forecasting/     # Chronos model submodule
└── tirex/                   # TiRex model submodule
```
1. Clone with Submodules

   ```bash
   git clone --recurse-submodules <repository-url>
   cd forecasting
   ```

2. Install in Development Mode

   ```bash
   pip install -e .
   ```

3. Update Submodules

   ```bash
   git submodule update
   ```
This project incorporates third-party models and software:
Chronos

- License: Apache License 2.0
- Source: https://github.com/amazon-science/chronos-forecasting
- Copyright: Amazon.com, Inc. or its affiliates

TiRex

- License: Apache License 2.0
- Source: https://github.com/NX-AI/tirex
- Copyright: NX-AI GmbH
This software is provided "as is" without warranty of any kind, express or implied. The authors and copyright holders are not liable for any claims, damages, or other liability arising from the use of this software.