Rewrite audio DSP plan around fingerprint-first architecture #3
abossard wants to merge 7 commits into
Replace the prior plan with a three-library pipeline: Essentia (offline features), Olaf (acoustic fingerprinting), aubio (live fallback), backed by a single SQLite database for tracks, features, fingerprints, and profiles.

Key changes:
- Add live track recognition via Olaf so cached features replay in sync with whatever the DJ is playing, without loopback.
- AudioFeatures becomes the single view consumed by scripts, widgets, and MCP tools, populated either by LiveAudioAnalyzer or CachedAudioAnalyzer.
- VCAudioTrigger is rewritten as the audio control center with a library browser, recognition badge, drop/build/key indicators, and the existing envelope/AGC/trigger/spectral panels.
- Drop all backwards-compatibility paths: ledfx_compat.js, audio_common.js, legacy per-bar triggers, BeatTracker, and old AudioParams DSP fields are scheduled for deletion in M7.
- Sequence the work fingerprint-first: M1 proves Olaf can lock onto EDM through DJ EQ and pitch shift before any further engine work.
- Accept AGPL-3.0 for the combined binary when Essentia is linked; provide a -Daudio_essentia=OFF build flag for downstream redistributors.

Adds milestones M0-M8, updated decisions DD1-DD18, SQLite schema, AudioFeatures struct, and an FMA/Jamendo CC test corpus plan.
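To make the "single SQLite database" idea concrete, here is a minimal sketch using Python's standard `sqlite3` module. Every table and column name below is a hypothetical placeholder chosen for illustration, not the schema this PR adds, and `open_library` and `features_at` are invented helper names. It only shows how cached per-track features could be looked up by playback position.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the schema defined in this PR.
SCHEMA = """
CREATE TABLE tracks (
    id         INTEGER PRIMARY KEY,
    path       TEXT NOT NULL UNIQUE,
    duration_s REAL NOT NULL
);
CREATE TABLE features (
    track_id INTEGER NOT NULL REFERENCES tracks(id) ON DELETE CASCADE,
    time_s   REAL NOT NULL,
    rms      REAL NOT NULL,
    is_beat  INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (track_id, time_s)
);
CREATE TABLE fingerprints (
    hash     INTEGER NOT NULL,
    track_id INTEGER NOT NULL REFERENCES tracks(id) ON DELETE CASCADE,
    time_s   REAL NOT NULL
);
CREATE INDEX fingerprints_by_hash ON fingerprints(hash);
CREATE TABLE profiles (
    name          TEXT PRIMARY KEY,
    settings_json TEXT NOT NULL
);
"""


def open_library(path=":memory:"):
    """Open (or create) the library database and install the sketch schema."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def features_at(conn, track_id, position_s):
    """Return the latest cached feature row at or before position_s, or None."""
    return conn.execute(
        "SELECT time_s, rms, is_beat FROM features "
        "WHERE track_id = ? AND time_s <= ? "
        "ORDER BY time_s DESC LIMIT 1",
        (track_id, position_s),
    ).fetchone()
```

Keeping everything in one file means a recognized track id can be joined directly against its cached features; whether fingerprints live in the same file or in the fingerprinting library's own store is a design decision this sketch does not settle.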
Adapt the architecture after rubberducking with research and after the direction "live features first, best possible, low latency":
- Live AudioAnalyzer is M1, shippable on its own through M5. Cached features (M6), Olaf identification (M7), chromagram tracking (M8), and Tier-1 DJ protocols (M9) extend the same AudioFeatures view incrementally.
- Olaf is no longer used for continuous lock. It runs one-shot on a rolling ~5 s buffer to identify the track and seed initial position. This avoids Olaf's known brittleness past ~3% time-stretch since identification needs only one good match, not continuous lock.
- New PositionSource abstraction with three tiers: DJ-software protocols (OS2L beat counter + cached beat grid, Pro DJ Link, StagelinQ), chromagram cross-correlation against cached chroma with a small speed search, and aubio + internal clock fallback. Highest-priority confident-and-fresh tier wins; per-source latency offsets calibrated against onsets.
- SQLite schema adds a `chroma` table holding 12-bin chroma at ~10 Hz per track for the chromagram tracker.
- Live latency target codified: <10 ms input-to-onset, <1 ms shared analyzer budget, <0.5 ms per AudioChannel; no heap allocation per frame; lock-free SPSC ring for snapshots.
- AudioIdentifier is an interface with a Panako backend ready as a build option for environments where DJ pitch-bend during the ID window matters.
- VCAudioTrigger ships its live panels in M4; library browser, recognition badge, drop/build/key indicators, and position-source picker are added incrementally in M6-M9 in the same chrome.
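The "highest-priority confident-and-fresh tier wins" rule for PositionSource can be sketched as below. This is an illustration only: the `PositionEstimate` fields, the `MIN_CONFIDENCE` and `MAX_AGE_S` thresholds, and the sign convention of the latency offset are all assumptions made for the example, not values or names taken from the plan.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed thresholds for illustration; the plan does not state concrete values.
MIN_CONFIDENCE = 0.6
MAX_AGE_S = 0.5


@dataclass(frozen=True)
class PositionEstimate:
    source: str                    # hypothetical label, e.g. "os2l" or "chroma"
    tier: int                      # 1 = DJ protocol, 2 = chromagram, 3 = aubio + clock
    position_s: float              # estimated playback position in the track
    confidence: float              # 0.0 .. 1.0
    age_s: float                   # seconds since this source last updated
    latency_offset_s: float = 0.0  # per-source calibration (sign is an assumption)


def pick_position(estimates: Sequence[PositionEstimate]) -> Optional[PositionEstimate]:
    """Pick the lowest-numbered tier that is both confident and fresh."""
    usable = [
        e for e in estimates
        if e.confidence >= MIN_CONFIDENCE and e.age_s <= MAX_AGE_S
    ]
    if not usable:
        return None
    # Lower tier number means higher priority; break ties by confidence.
    return min(usable, key=lambda e: (e.tier, -e.confidence))


def corrected_position(estimate: PositionEstimate) -> float:
    """Apply the source's calibrated latency offset to its raw position."""
    return estimate.position_s + estimate.latency_offset_s
```

With this rule a stale tier-1 estimate (for example, a DJ protocol that stopped sending updates) is skipped in favor of a fresh tier-2 chromagram estimate, and the caller falls back to `None` when no source qualifies.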
@copilot please finish the live implementation
Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/e4afee83-41dd-431b-ac2f-5dc5cc839725 Co-authored-by: abossard <86611+abossard@users.noreply.github.com>
Implemented the first live audio path: AudioCapture now publishes a fixed AudioFeatures snapshot with live RMS/peak/crest, 32-band normalized spectrum, perceptual bands, spectral shape/flux/onset, and beat/BPM state via LiveAudioAnalyzer. Addressed in 03bc4ae.
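For readers unfamiliar with the level features named above, this is a minimal sketch of how RMS, peak, and crest factor relate for one frame of samples. It is not the LiveAudioAnalyzer code from this PR; `level_features` is a hypothetical helper, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math


def level_features(frame):
    """Return (rms, peak, crest) for one frame of float samples.

    Crest factor is peak / rms; it is reported as 0.0 for silence
    to avoid dividing by zero.
    """
    if not frame:
        return 0.0, 0.0, 0.0
    peak = max(abs(s) for s in frame)
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    crest = peak / rms if rms > 0.0 else 0.0
    return rms, peak, crest
```

A square wave has a crest factor of 1 and a pure sine has a crest factor of about 1.414, so a high crest value indicates transient-heavy material relative to its average level.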
Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/f5a26a26-1240-49f7-a73b-4b9c0d29c0ce Co-authored-by: abossard <86611+abossard@users.noreply.github.com>
@codex[agent] finish the implementation with tests and actual mo3 files
Co-authored-by: abossard <86611+abossard@users.noreply.github.com>
Added |