Add WAV/MP3 input with automatic 48 kHz resampling and stereo upmix #15
…ols and cover example Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>
Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>
Updated the cover script to remove LLM step and clarify audio processing.
Updated the cover JSON to reflect new attributes and changes.
resync with master
@copilot review the following report. Remember, we should convert stereo->stereo and mono->stereo and NOT touch the existing acestep-cpp files, which work when invoked manually. Known issues in
Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>
Fixed in 7ea3602.
No existing files (
resync with master
The `--src-audio` (cover mode) and `neural-codec --encode` paths only accepted WAV at exactly 48 kHz. This adds transparent WAV + MP3 support at any sample rate, auto-resampled to 48 kHz and always delivered as stereo (exactly what the VAE encoder requires), with no `ffmpeg` pre-conversion needed.

New: `src/audio.h`

Single header providing `read_audio(path, T_audio, n_channels)`:

- `.mp3` → dr_mp3, anything else → dr_wav
- Resampler (`audio_resample_linear`) is channel-agnostic; only runs when `sr ≠ 48000`
- Output is always stereo `[T × 2]`: mono input is upmixed (L = R), N-channel input uses the first two channels; `*n_channels` is always 2 on success
- Uses `free()` to release dr_libs buffers (both dr_wav and dr_mp3 use the system allocator)
- Returns a `malloc`'d buffer; caller frees

New: `thirdparty/`

- `dr_wav.h` v0.14.5: WAV decode (public domain / MIT-0, mackron/dr_libs)
- `dr_mp3.h` v0.7.3: MP3 decode via minimp3 (public domain / MIT-0)

Zero new link-time dependencies: both are single-header, included once per translation unit via `#define DR_*_IMPLEMENTATION` inside `audio.h`.

Tool changes

- `neural-codec.cpp`: encode path switches `read_wav()` → `read_audio()`; help text updated
- `dit-vae.cpp`: `--src-audio` switches to `read_audio()`; help text updated
- `CMakeLists.txt`: `thirdparty/` added as `SYSTEM` include in the shared `link_ggml_backends` macro (vendor warnings suppressed)
- No existing files (`vae-enc.h` etc.) were modified

Example
New example `examples/cover.sh` + `examples/cover.json`: demonstrates cover-mode generation from a WAV or MP3 reference track, with inline usage notes.
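For reviewers who want a concrete picture of the conversion described for `read_audio()`, the following is a minimal sketch of the three steps (extension-based decoder choice, linear resampling, stereo upmix). It is not the code from this PR: the helper names `has_mp3_extension`, `resample_linear_sketch`, and `to_stereo_sketch` are hypothetical, the case-insensitive extension match is an assumption, and the dr_wav/dr_mp3 decode step is left out so the block stays self-contained.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: true when the path ends in ".mp3" (case-insensitive,
// which is an assumption). Anything else would go to the WAV decoder.
static bool has_mp3_extension(const std::string& path) {
    std::size_t dot = path.rfind('.');
    if (dot == std::string::npos) return false;
    std::string ext = path.substr(dot);
    for (char& c : ext) c = (char)std::tolower((unsigned char)c);
    return ext == ".mp3";
}

// Hypothetical channel-agnostic linear resampler over interleaved frames
// [n_in x ch]. Returns a malloc'd buffer of [*n_out x ch]; caller frees.
static float* resample_linear_sketch(const float* in, int n_in, int ch,
                                     int sr_in, int sr_out, int* n_out) {
    assert(in != nullptr && n_out != nullptr);
    if (n_in <= 0 || ch <= 0 || sr_in <= 0 || sr_out <= 0) return nullptr;
    int n = (int)((long long)n_in * sr_out / sr_in);
    if (n < 1) n = 1;
    float* out = (float*)std::malloc(sizeof(float) * (std::size_t)n * ch);
    if (!out) return nullptr;
    const double step = (double)sr_in / (double)sr_out;
    for (int i = 0; i < n; i++) {
        double pos = i * step;                     // position in input frames
        int i0 = (int)pos;
        if (i0 > n_in - 1) i0 = n_in - 1;
        int i1 = (i0 + 1 < n_in) ? i0 + 1 : i0;    // clamp at the last frame
        float frac = (float)(pos - i0);
        for (int c = 0; c < ch; c++) {
            float a = in[i0 * ch + c];
            float b = in[i1 * ch + c];
            out[i * ch + c] = a + (b - a) * frac;  // interpolate per channel
        }
    }
    *n_out = n;
    return out;
}

// Hypothetical stereo conversion: mono is duplicated (L = R); inputs with
// two or more channels keep only the first two. Returns malloc'd [n x 2].
static float* to_stereo_sketch(const float* in, int n, int ch) {
    assert(in != nullptr);
    if (n <= 0 || ch <= 0) return nullptr;
    float* out = (float*)std::malloc(sizeof(float) * (std::size_t)n * 2);
    if (!out) return nullptr;
    for (int i = 0; i < n; i++) {
        out[2 * i]     = in[i * ch];
        out[2 * i + 1] = (ch == 1) ? in[i * ch] : in[i * ch + 1];
    }
    return out;
}
```

In this sketch a caller would decode to interleaved float frames, call `resample_linear_sketch` only when the source rate differs from 48000, then call `to_stereo_sketch`, releasing each intermediate buffer with `free()`.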