Retreivr Community Cache

This repository is a transport index dataset for Retreivr.

It stores mappings from canonical MusicBrainz recording MBIDs to known-good transport identifiers.

Scope

Canonical mapping model:

recording_mbid -> transport sources

Examples of transport identifiers:

YouTube video IDs
SoundCloud track IDs (future)
Other supported transport IDs (future)

MusicBrainz remains the authoritative source of metadata. This repository does not replicate MusicBrainz entity metadata.

Data Layout

Current dataset namespace:

youtube/recording/<prefix>/<recording_mbid>.json
youtube/video/<prefix>/<video_id>.json (generated reverse index)

Where:

prefix is the first two characters of recording_mbid
filename stem equals recording_mbid
reverse-index prefix is the first two characters of video_id
reverse-index filename stem equals video_id
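The sharding rules above can be sketched as a pair of path helpers. This is a minimal illustration, not the repository's own tooling: the function names are hypothetical, the identifiers in the usage note are placeholders, and the sketch assumes the prefix is taken from the identifier as-is, with no case normalization.

```python
from pathlib import PurePosixPath


def recording_path(recording_mbid: str) -> PurePosixPath:
    """Build the forward-index path for a recording MBID (hypothetical helper)."""
    prefix = recording_mbid[:2]  # first two characters of recording_mbid
    return PurePosixPath("youtube", "recording", prefix, f"{recording_mbid}.json")


def video_path(video_id: str) -> PurePosixPath:
    """Build the reverse-index path for a YouTube video ID (hypothetical helper)."""
    prefix = video_id[:2]  # first two characters of video_id, case kept as-is (assumption)
    return PurePosixPath("youtube", "video", prefix, f"{video_id}.json")
```

With the placeholder MBID 12345678-1234-4234-8234-123456789abc, recording_path returns youtube/recording/12/12345678-1234-4234-8234-123456789abc.json.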

Record Model

Each record contains:

recording_mbid
sources[] with transport candidate identifiers and minimal validation fields
schema_version

See schema/schema.json for the strict record contract.
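As a rough illustration of the record shape, the sketch below checks only the three top-level fields listed above. It is not a substitute for schema/schema.json: the helper name is hypothetical, the identifiers are placeholders, and both the shape of each sources[] entry and the schema_version value are assumptions.

```python
REQUIRED_TOP_LEVEL_KEYS = {"recording_mbid", "sources", "schema_version"}


def missing_top_level_keys(record: dict) -> set:
    """Return the documented top-level keys that are absent from a record."""
    return REQUIRED_TOP_LEVEL_KEYS - set(record)


# Placeholder record for illustration only; not real data.
example_record = {
    "recording_mbid": "12345678-1234-4234-8234-123456789abc",
    "sources": [{"video_id": "abcdefghijk"}],  # hypothetical source entry shape
    "schema_version": 1,  # assumed value; the real one is defined by the schema
}
```

An empty set from missing_top_level_keys means all three documented fields are present; the strict contract remains whatever schema/schema.json defines.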

Reverse index records contain minimal lookup metadata:

video_id
recording_mbid
confidence
verified_at

Reverse index files are generated by promotion tooling and must not be edited manually.
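To make the relationship between the two indexes concrete, here is a minimal sketch of how reverse entries could be derived from forward records. It is not the promotion tooling itself. The function name is hypothetical, and it assumes each sources[] entry carries video_id, confidence, and verified_at fields, which this README does not confirm.

```python
def build_reverse_entries(records: list) -> dict:
    """Map each video_id to a minimal reverse-index entry (illustrative only)."""
    entries = {}
    for record in records:
        for source in record["sources"]:
            video_id = source["video_id"]  # assumed field name inside sources[]
            entries[video_id] = {
                "video_id": video_id,
                "recording_mbid": record["recording_mbid"],
                "confidence": source.get("confidence"),    # assumed field
                "verified_at": source.get("verified_at"),  # assumed field
            }
    return entries
```

Because the reverse files are derived data, regenerating them from the forward records is what keeps the two indexes in agreement, which is why manual edits are disallowed.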

Non-Goals

This repository must not contain:

scraped metadata dumps
platform search result dumps
thumbnails
ranking heuristics
MusicBrainz entity metadata copies
media files or download URLs

CI Guarantees

Validation in .github/workflows/validate.yml enforces:

JSON parse validity for dataset files
JSON Schema compliance
shard-path and filename/MBID consistency
no duplicate MBIDs within a namespace
stats integrity via scripts/generate_stats.py --check
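Three of the checks listed above (parse validity, shard-path and filename/MBID consistency, and duplicate detection) can be sketched as follows. This is an illustration of the kind of check involved, not the contents of validate.yml or of any script in the repository; the function names and error messages are hypothetical.

```python
import json
from pathlib import PurePosixPath


def check_recording_file(rel_path: str, text: str) -> list:
    """Return a list of problems found in one forward-index file."""
    path = PurePosixPath(rel_path)
    try:
        record = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"{rel_path}: invalid JSON ({exc.msg})"]
    if not isinstance(record, dict):
        return [f"{rel_path}: top-level value is not an object"]

    errors = []
    # The filename stem must equal the record's recording_mbid.
    if record.get("recording_mbid") != path.stem:
        errors.append(f"{rel_path}: filename does not match recording_mbid")
    # The shard directory must be the first two characters of the stem.
    if path.parent.name != path.stem[:2]:
        errors.append(f"{rel_path}: shard directory does not match prefix")
    return errors


def find_duplicate_mbids(rel_paths: list) -> list:
    """Return MBIDs (filename stems) that appear more than once, sorted."""
    seen, duplicates = set(), set()
    for rel_path in rel_paths:
        stem = PurePosixPath(rel_path).stem
        if stem in seen:
            duplicates.add(stem)
        seen.add(stem)
    return sorted(duplicates)
```

A file that sits in the wrong shard directory yields one error from check_recording_file; JSON Schema compliance and the stats check are outside this sketch.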

Purpose

The dataset accelerates transport resolution for Retreivr clients while keeping output deterministic, lightweight, and Git-native.
