VideoAnnotator is a modern video analysis toolkit that uses AI models for comprehensive behavioral annotation. This guide covers installation using our modern uv-based workflow for fast, reliable dependency management.
- Python 3.12+ (required)
- Git for cloning repositories
- uv package manager (fast, modern Python dependency management)
- CUDA Toolkit 12.4+ (recommended for GPU acceleration)
- NVIDIA GPU with CUDA support (GTX 1060 6GB+ or better recommended)
Minimum requirements:
- CPU: Intel Core i5 (8th gen) or AMD Ryzen 5 (2000 series) or better
- RAM: 16GB
- Storage: 20GB free space
- GPU: NVIDIA GTX 1060 6GB (for CUDA acceleration)
- OS: Windows 10/11, Ubuntu 20.04+, or macOS 11+
Recommended:
- CPU: Intel Core i7/i9 (11th gen+) or AMD Ryzen 7/9 (5000 series+)
- RAM: 32GB or more
- Storage: 50GB+ SSD
- GPU: NVIDIA RTX 3060 12GB or better
- OS: Windows 11, Ubuntu 22.04+, or macOS 12+
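To sanity-check a machine against these requirements, a minimal standard-library sketch can report the basics (Python version, OS, free disk). GPU and RAM checks need extra tooling such as `nvidia-smi` or `psutil`, so they are omitted here:

```python
import platform
import shutil
import sys

def check_requirements(min_free_gb: float = 20.0) -> dict:
    """Report basic facts relevant to the minimum requirements above."""
    free_bytes = shutil.disk_usage(".").free
    return {
        "python_ok": sys.version_info >= (3, 12),   # Python 3.12+ is required
        "os": platform.system(),                    # Windows / Linux / Darwin
        "disk_ok": free_bytes >= min_free_gb * 10**9,  # 20GB free space minimum
    }

print(check_requirements())
```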
These tips address the most common macOS installation issues.
- Architecture: Apple Silicon (M1/M2/M3) is supported in CPU mode. CUDA GPU acceleration is not available on macOS.
- Prerequisites (Homebrew):

  ```bash
  # Install Homebrew if missing: https://brew.sh
  brew install libomp ffmpeg node
  ```

  - libomp: fixes OpenMP errors and segfaults in some audio models.
  - ffmpeg: required by whisper/librosa/audio pipelines.
  - node: needed if you use the companion web viewer.
- uv in PATH:

  ```bash
  # If 'uv' is not found after install
  export PATH="$HOME/.local/bin:$PATH"
  ```
- Shell config permissions: If `~/.zshrc` or `~/.config` are owned by root (from earlier sudo usage), fix ownership before writing PATH updates:

  ```bash
  sudo chown -R "$USER" ~/.zshrc ~/.config
  chmod u+w ~/.zshrc
  # If .zshrc is missing, write PATH to ~/.zprofile instead for login shells
  ```
- Hugging Face token (for speaker diarization): set `HF_AUTH_TOKEN` in a `.env` file or your shell. See the Environment Setup guide.
- Viewer start command: inside the viewer repo, use `npm run dev` (not `npm start`).
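If you would rather not depend on a dotenv library, parsing a `.env` file by hand is straightforward. A minimal sketch; the token value below is a placeholder for illustration only:

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Minimal .env parser: KEY=VALUE lines; '#' comments and blanks ignored."""
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

# Placeholder token for illustration -- use your real token from huggingface.co
secrets = parse_dotenv("# VideoAnnotator secrets\nHF_AUTH_TOKEN=hf_example_token\n")
print(secrets["HF_AUTH_TOKEN"])  # -> hf_example_token
```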
```bash
# Install uv (fast, modern Python package manager)
# Linux/Mac:
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

```bash
# Clone the repository
git clone https://github.com/your-org/VideoAnnotator.git
cd VideoAnnotator

# Install all dependencies with uv (fast and reliable)
uv sync

# Install development dependencies
uv sync --extra dev

# Initialize the local SQLite database (creates tables + admin API key)
uv run videoannotator setup-db --admin-email you@example.com --admin-username you
```

The `setup-db` command is idempotent. Re-run it after pulling new schema changes, or use `--force` when you want to drop and recreate tables. Pass `--skip-admin` if you prefer to manage API keys yourself later with `videoannotator generate-token`.
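To confirm `setup-db` actually created the schema, you can list the tables in the SQLite file directly. A sketch; the database filename here is an assumption, so check your configuration for the real path:

```python
import sqlite3
from pathlib import Path

# Hypothetical path: check your config for the actual database location.
db_path = Path("videoannotator.db")

def list_tables(path: Path) -> list[str]:
    """Return table names from a SQLite file; empty list if the file is absent."""
    if not path.exists():
        return []
    with sqlite3.connect(path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    return [r[0] for r in rows]

print(list_tables(db_path))  # [] until setup-db has been run
```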
```bash
# Note: This repo pins Torch sources via `pyproject.toml` to the CUDA 12.4 wheel index.
# In most cases `uv sync` is sufficient.
# If you need to force a reinstall of CUDA wheels in your local environment:
uv pip install --upgrade \
  "torch==2.8.*+cu124" "torchvision==0.21.*+cu124" "torchaudio==2.8.*+cu124" \
  --index-url https://download.pytorch.org/whl/cu124
```

Some dependencies, like dlib, require system-level installation:

```bash
# If using conda for native dependencies:
conda install -n videoannotator cmake dlib -c conda-forge
# Otherwise, install cmake system-wide from cmake.org
```

OpenFace 3.0 requires manual compilation. Follow these steps:
```bash
# Install additional dependencies
conda install -c conda-forge boost
conda install -c conda-forge openblas

# Clone OpenFace 3.0
git clone https://github.com/CMU-MultiComp-Lab/OpenFace-3.0.git
cd OpenFace-3.0

# Build OpenFace
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=Release ..
cmake --build . --config Release

# Install Python bindings
cd ../python
python setup.py install
```

On Ubuntu/Debian, the equivalent with system packages:

```bash
# Install system dependencies (Ubuntu/Debian)
sudo apt-get update
sudo apt-get install build-essential cmake pkg-config
sudo apt-get install libopenblas-dev liblapack-dev
sudo apt-get install libboost-all-dev

# Clone and build OpenFace 3.0
git clone https://github.com/CMU-MultiComp-Lab/OpenFace-3.0.git
cd OpenFace-3.0
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=Release ..
make -j4

# Install Python bindings
cd ../python
python setup.py install
```

Verify that the core stack imports cleanly:

```bash
# Test the installation
uv run python -c "
import torch
print(f'PyTorch version: {torch.__version__}')
print(f'CUDA available: {torch.cuda.is_available()}')
if torch.cuda.is_available():
    print(f'CUDA version: {torch.version.cuda}')
    print(f'GPU device: {torch.cuda.get_device_name(0)}')

# Test YOLO11 installation
from ultralytics import YOLO
print('YOLO11 available!')

# Test other key components
import cv2, transformers, scenedetect
print('All core dependencies installed!')
"
```
```bash
# Test the API server
uv run videoannotator server --host 0.0.0.0 --port 18011
# Should start the server on http://localhost:18011
```

Once installed, use these commands for development:
```bash
# Start API server
uv run videoannotator server --host 0.0.0.0 --port 18011

# Run linting and formatting
uv run ruff check .
uv run ruff format .

# Run type checking
uv run mypy src

# Run tests
uv run pytest

# CLI help
uv run videoannotator --help

# API server (compatibility wrapper)
uv run python api_server.py
```

```bash
# Build and run CPU version
docker build -f Dockerfile.cpu -t videoannotator:cpu .
docker run --rm -v $(pwd)/data:/app/data videoannotator:cpu

# Build and run GPU version
docker build -f Dockerfile.gpu -t videoannotator:gpu .
docker run --gpus all --rm -v $(pwd)/data:/app/data videoannotator:gpu
```

Open the project in VS Code and use "Reopen in Container" for a complete GPU-enabled development environment.
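With the server running, you can poll it from Python. A minimal sketch, assuming a `/health` endpoint; the actual route may differ, so check the API docs:

```python
from urllib.error import URLError
from urllib.request import urlopen

def server_alive(base_url: str = "http://localhost:18011", path: str = "/health") -> bool:
    """Return True if the API server answers 200 at base_url + path."""
    try:
        with urlopen(base_url + path, timeout=2) as resp:
            return resp.status == 200
    except (URLError, OSError, ValueError):
        return False

print(server_alive())  # False until the server is started
```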
- uv command not found:

  ```bash
  # Restart your terminal or run:
  export PATH="$HOME/.local/bin:$PATH"
  ```

- Permission errors:

  ```bash
  # On Windows, run PowerShell as Administrator
  # On Linux/Mac, ensure you have write permissions to ~/.local/
  ```

- Slow dependency resolution: uv is typically very fast, but large ML dependencies can take time on first install.
- CUDA not found: Run `nvidia-smi` to check GPU detection. Update drivers if needed.
- PyTorch doesn't detect CUDA:

  ```bash
  # Verify CUDA-enabled PyTorch installation
  uv run python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
  ```

- CUDA version mismatch: Ensure the CUDA Toolkit matches the PyTorch CUDA version:
  - Check CUDA Toolkit: `nvcc --version`
  - Check PyTorch CUDA: `uv run python -c "import torch; print(torch.version.cuda)"`
- dlib/cmake errors: Install cmake system-wide or use conda for native dependencies.
- OpenFace compilation: Requires manual setup; see the OpenFace section above.
- Build tool errors: Ensure you have Visual Studio Build Tools (Windows) or GCC (Linux/Mac).
VideoAnnotator v1.2.0 uses:
- uv - Fast, reliable Python package management
- Ruff - Unified linting and formatting (replaces Black, isort, flake8)
- FastAPI - Modern API framework
- Hatchling/setuptools - Modern build backend
- Docker - CPU and GPU containerization
- Python 3.12+ - Latest Python with performance improvements
| Category | Tools | Purpose | CUDA Support |
|---|---|---|---|
| Detection | YOLO11 | Person/object detection | ✅ |
| Tracking | ByteTrack | Multi-object tracking | ✅ |
| Scene | PySceneDetect, OpenCLIP | Scene segmentation & classification | ✅ |
| Face | OpenFace 3.0, LAION Face | Face analysis & emotion | ✅ |
| Audio | Whisper, pyannote.audio | Speech & audio processing | ✅ |
| API | FastAPI, uvicorn | REST API server | N/A |
After installation:
- See `docs/usage/GETTING_STARTED.md` for usage examples
- Check `docs/development/` for development workflows
- Review `configs/` for configuration options
- Use `uv run videoannotator --help` to test the CLI
- Use `uv run videoannotator server --host 0.0.0.0 --port 18011` to start the server
- GPU acceleration: Install CUDA-enabled PyTorch for up to 10x speedup
- Batch processing: Process multiple videos for efficiency
- Memory management: Use appropriate model sizes for your GPU
- Container deployment: Use Docker for consistent environments
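The batch-processing tip can be sketched with a thread pool; `process_video` below is a hypothetical stand-in for whatever per-video call you use (CLI invocation or API job submission), and the worker count should be tuned to your GPU memory:

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def process_video(path: Path) -> str:
    """Hypothetical placeholder: replace with a real VideoAnnotator call."""
    return f"annotated:{path.name}"

videos = [Path(f"clip_{i}.mp4") for i in range(4)]

# A small worker pool keeps the GPU fed without oversubscribing memory.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(process_video, videos))

print(results)
```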