lin-bot23/vbench-api

VBench API

AI video quality evaluation backend service powered by VBench.

Architecture

User (Web UI / CLI)
    ↓ HTTPS POST (upload video)
Cloudflare Tunnel / LAN
    ↓
VBench API (FastAPI) ← Mac Mini (MPS)
    ↓
Evaluation Engine (VBench dimensions)
    ↓
JSON Results

Quick Start

# 1. Install dependencies
pip install -r requirements.txt

# 2. (Optional) Pre-download models
bash scripts/download_models.sh

# 3. Start the server
python3 -m uvicorn app.main:app --host 0.0.0.0 --port 8999

# 4. (Optional) Expose via Cloudflare Tunnel
cloudflared tunnel run vbench-api

Model Weights

Model weights are downloaded on demand. The first time a dimension is evaluated, the service resolves the weights it needs in this order, downloading only if no local copy is found:

  1. models/ (project directory — run scripts/download_models.sh to populate)
  2. ~/.cache/vbench/ (user cache directory)
  3. Auto-download to ~/.cache/vbench/ from the original source URL
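The local part of this lookup order can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: `find_local_weights` and its `search_dirs` parameter are hypothetical names, and the download step is left to the caller.

```python
from pathlib import Path
from typing import Optional, Sequence

# Locations taken from the lookup order above.
PROJECT_MODELS = Path("models")
USER_CACHE = Path.home() / ".cache" / "vbench"


def find_local_weights(relative_path: str,
                       search_dirs: Optional[Sequence[Path]] = None) -> Optional[Path]:
    """Return the first existing copy of a weight file, or None if absent.

    Hypothetical helper: checks the project directory first, then the
    user cache, mirroring steps 1 and 2 of the documented order.
    """
    dirs = search_dirs if search_dirs is not None else [PROJECT_MODELS, USER_CACHE]
    for base in dirs:
        candidate = base / relative_path
        if candidate.is_file():
            return candidate
    # Step 3 (auto-download into the user cache) would happen here in a
    # real service; it is deliberately not sketched.
    return None
```

A call such as `find_local_weights("raft_model/raft-things.pth")` returns a path when either local copy exists and `None` when a download would be needed.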

Required models (~199MB total):

  • aesthetic_model/sa_0_4_vit_l_14_linear.pth — Aesthetic quality scoring
  • amt_model/amt-s.pth — Frame interpolation (motion smoothness)
  • raft_model/raft-things.pth — Optical flow (motion/dynamic degree)
  • pyiqa_model/musiq_spaq_ckpt-358bb6af.pth — Image quality scoring

API

POST /api/v1/evaluate

Evaluate one or more videos.

Request (multipart/form-data):

Field       Type     Required  Description
videos      File[]   Yes       Video files (mp4/gif), multiple allowed
prompts     string   No        JSON array string, one prompt per video
dimensions  string   No        Dimensions to evaluate, comma-separated. All by default
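As a rough client-side sketch, the two optional text fields could be encoded like this before sending the form. `build_form_fields` is a hypothetical helper, not part of the project. It only shows the encoding the table describes: a JSON array string for `prompts` and a comma-separated string for `dimensions`.

```python
import json
from typing import Optional, Sequence


def build_form_fields(prompts: Optional[Sequence[str]] = None,
                      dimensions: Optional[Sequence[str]] = None) -> dict:
    """Encode the optional text fields of the multipart request.

    Hypothetical helper; omitted arguments produce no field, so the
    service-side defaults apply.
    """
    fields = {}
    if prompts is not None:
        fields["prompts"] = json.dumps(list(prompts))  # JSON array string
    if dimensions is not None:
        fields["dimensions"] = ",".join(dimensions)    # comma-separated
    return fields


fields = build_form_fields(
    prompts=["a cat walking on grass"],
    dimensions=["subject_consistency", "motion_smoothness"],
)
```

With an HTTP client such as `requests`, these fields would typically be passed as `data=fields`, with the video files supplied through `files=` under the field name `videos`. The host and port depend on how the server was started.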

Supported dimensions:

  • subject_consistency — Subject consistency (DINO)
  • background_consistency — Background consistency (CLIP)
  • temporal_flickering — Temporal flickering (pixel MAE)
  • motion_smoothness — Motion smoothness (AMT)
  • dynamic_degree — Dynamic degree (RAFT optical flow)
  • aesthetic_quality — Aesthetic quality (LAION Aesthetic)
  • imaging_quality — Imaging quality (MUSIQ)
  • object_class — Object class detection (CLIP zero-shot)
  • multiple_objects — Multiple objects presence (CLIP)
  • color — Color matching (CLIP)
  • spatial_relationship — Spatial relationship (CLIP)
  • scene — Scene matching (CLIP)
  • temporal_style — Temporal style similarity (CLIP)
  • overall_consistency — Overall consistency (CLIP)
  • human_action — Human action matching (CLIP)
  • appearance_style — Appearance style (CLIP)
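One way a `dimensions` string might be checked against this list is sketched below. The README does not say whether the service trims whitespace or rejects unknown names, so both behaviours are assumptions here, and `parse_dimensions` is a hypothetical name.

```python
from typing import List, Optional

# Names copied from the supported-dimensions list above.
SUPPORTED_DIMENSIONS = (
    "subject_consistency", "background_consistency", "temporal_flickering",
    "motion_smoothness", "dynamic_degree", "aesthetic_quality",
    "imaging_quality", "object_class", "multiple_objects", "color",
    "spatial_relationship", "scene", "temporal_style",
    "overall_consistency", "human_action", "appearance_style",
)


def parse_dimensions(raw: Optional[str]) -> List[str]:
    """Turn a comma-separated string into a validated list of dimensions.

    An empty or missing value selects every dimension, matching the
    documented default. Rejecting unknown names is an assumption.
    """
    if raw is None or not raw.strip():
        return list(SUPPORTED_DIMENSIONS)
    requested = [name.strip() for name in raw.split(",") if name.strip()]
    unknown = [name for name in requested if name not in SUPPORTED_DIMENSIONS]
    if unknown:
        raise ValueError(f"unsupported dimensions: {', '.join(unknown)}")
    return requested
```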

Response (JSON):

{
  "task_id": "uuid",
  "status": "completed",
  "results": {
    "subject_consistency": 0.92,
    "motion_smoothness": 0.85
  },
  "per_video": [
    {
      "filename": "demo.mp4",
      "scores": {
        "subject_consistency": 0.91,
        "motion_smoothness": 0.86
      }
    }
  ]
}
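For illustration, a client could index the `per_video` entries by filename as sketched below. `scores_by_file` is a hypothetical helper, and the payload is the sample response above, not real output.

```python
import json

# The sample response shown above, reproduced as a string.
SAMPLE = """
{
  "task_id": "uuid",
  "status": "completed",
  "results": {"subject_consistency": 0.92, "motion_smoothness": 0.85},
  "per_video": [
    {
      "filename": "demo.mp4",
      "scores": {"subject_consistency": 0.91, "motion_smoothness": 0.86}
    }
  ]
}
"""


def scores_by_file(payload: dict) -> dict:
    """Map each uploaded filename to its per-dimension scores."""
    return {item["filename"]: item["scores"]
            for item in payload.get("per_video", [])}


payload = json.loads(SAMPLE)
table = scores_by_file(payload)
```

Here `table["demo.mp4"]["motion_smoothness"]` is `0.86`, the per-video value, while `payload["results"]` holds the top-level scores.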

GET /api/v1/status

Returns service status and cached model info.

Project Structure

vbench-api/
├── app/
│   ├── main.py                    # FastAPI entry point
│   ├── routers/
│   │   └── evaluate.py            # Evaluation API routes
│   ├── services/
│   │   └── vbench_service.py      # VBench wrapper
│   └── models/
│       └── schemas.py             # Pydantic models
├── models/                        # Model weights (run download script)
├── cache/                         # Temporary upload files
├── scripts/
│   └── download_models.sh         # Pre-download all model weights
├── requirements.txt
└── README.md
