This repository contains exploratory research code for voice anonymization and related audio processing tasks. It is organized for iterative experimentation rather than production-style packaging.
- Start from reproducible scripts to prepare data and train models.
- Use notebooks for analysis/inspection after data and checkpoints are generated.
- Expect structure and files to evolve as research questions change.
- Prefer reproducible checkpoints and notes over strict software release workflows.
    .
    |-- anonymization_pipeline/   # Core anonymization pipeline components
    |-- notebooks/                # Analysis and experiment inspection
    |-- scripts/                  # Data prep and training entrypoints
    |-- source_separation/        # Source separation experiments and related code
    |-- voice_blurring/           # Voice blurring methods and prototypes
    `-- requirements.txt          # Minimal Python dependencies
- Create and activate a Python environment.
- Install dependencies:

      pip install -r requirements.txt

Use scripts/prepare_sonyc_vox_mixes.py to generate SONYC + VoxCeleb mixtures.
- Default inputs:
  - SONYC annotations: datasets/sonyc-v1-dataset/annotations.csv
  - SONYC audio root: datasets/sonyc-v1-dataset/
  - Vox root: datasets/voxceleb1-audio-wav-files-for-india-celebrity/
  - Vox metadata: datasets/voxceleb1-audio-wav-files-for-india-celebrity/vox1_meta.csv
- Output is written to the path passed with --out-dir, plus a manifest CSV.
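Conceptually, each mixture pairs a VoxCeleb voice clip with SONYC background audio at a target signal-to-noise ratio. The sketch below illustrates the usual power-ratio definition of SNR mixing; the function name and sample values are illustrative and the script's actual implementation may differ.

```python
import math

def mix_at_snr(voice, background, snr_db):
    """Scale background so the voice/background power ratio matches snr_db, then sum.

    voice and background are equal-length lists of float samples.
    """
    p_voice = sum(s * s for s in voice) / len(voice)
    p_background = sum(s * s for s in background) / len(background)
    # Power the background must have for the requested SNR (in dB).
    target_p_background = p_voice / (10 ** (snr_db / 10))
    scale = math.sqrt(target_p_background / p_background)
    return [v + scale * b for v, b in zip(voice, background)]

# At 0 dB the scaled background carries the same power as the voice.
mixed = mix_at_snr([1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0], snr_db=0)
```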
Examples:

    python scripts/prepare_sonyc_vox_mixes.py --mode train --out-dir data/mixes_train
    python scripts/prepare_sonyc_vox_mixes.py --mode eval --out-dir data/mixes_eval --eval-snr high
    python scripts/prepare_sonyc_vox_mixes.py --mode eval --out-dir data/mixes_eval --use-training-vox

Train with scripts/train_unet.py. The script expects a CSV manifest with:
- mix_wav: path to a mixture WAV
- voice_wav: path to the corresponding target voice WAV
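For illustration, a manifest in this shape can be read with the standard csv module; the file paths below are hypothetical, not files shipped with the repo.

```python
import csv
import io

# A hypothetical two-column manifest in the format described above.
manifest_text = (
    "mix_wav,voice_wav\n"
    "data/mixes/mix_0000.wav,data/mixes/voice_0000.wav\n"
    "data/mixes/mix_0001.wav,data/mixes/voice_0001.wav\n"
)

# Each row yields one (mixture, target voice) training pair.
pairs = [
    (row["mix_wav"], row["voice_wav"])
    for row in csv.DictReader(io.StringIO(manifest_text))
]
```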
Example training command:

    python scripts/train_unet.py --manifest data/pairs.csv --checkpoint-dir checkpoints/run1

Useful options:
    python scripts/train_unet.py \
        --manifest data/pairs.csv \
        --checkpoint-dir checkpoints/run1 \
        --epochs 20 \
        --batch-size 8 \
        --lr 1e-3 \
        --device auto \
        --num-workers 4 \
        --seed 0

Output:

- A model checkpoint at checkpoints/<run>/unet_voice_sep.pt
- Voice anonymization references are listed in pdf-material/fonts.txt.
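When several runs have been trained, the newest checkpoint can be located by globbing the checkpoints/<run>/unet_voice_sep.pt layout described above. The helper below is a hypothetical utility, not part of the repo's scripts.

```python
from pathlib import Path

def latest_checkpoint(ckpt_root: str = "checkpoints"):
    """Return the most recently modified checkpoints/<run>/unet_voice_sep.pt, or None."""
    candidates = sorted(
        Path(ckpt_root).glob("*/unet_voice_sep.pt"),
        key=lambda p: p.stat().st_mtime,
    )
    return candidates[-1] if candidates else None
```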