ForyoClassifierCNN

A music classification project for atarayo, yorushika, yoasobi and zutomayo, based on PyTorch.

Overview

This project uses PyTorch to implement a CNN model that classifies songs by atarayo, yorushika, yoasobi and zutomayo. It is well suited to beginners in computer audition and music information retrieval who ardently love Japanese music (such as me) and want to learn and practice.

Features

  • Audio data preprocessing and standardization

  • MFCC feature extraction

  • CNN-based deep learning model

  • Complete training, validation, and test process

  • Model evaluation and visualization

  • Single file prediction demo
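As an illustration of the preprocessing and standardization step, the sketch below z-score normalizes an MFCC matrix per coefficient with numpy. This is a minimal example under assumed conventions (the function name `standardize_mfcc` and the epsilon value are not from the project's code):

```python
import numpy as np

def standardize_mfcc(mfcc):
    """Z-score normalize an MFCC matrix of shape (n_mfcc, n_frames),
    per coefficient (i.e. per row), so each coefficient has roughly
    zero mean and unit variance across frames."""
    mean = mfcc.mean(axis=1, keepdims=True)
    std = mfcc.std(axis=1, keepdims=True)
    return (mfcc - mean) / (std + 1e-8)  # epsilon avoids division by zero

if __name__ == "__main__":
    fake = np.random.randn(13, 100) * 5 + 3  # 13 coefficients, 100 frames
    norm = standardize_mfcc(fake)
    print(norm.mean(), norm.std())  # roughly 0 and 1
```

Normalizing per coefficient (rather than over the whole matrix) keeps loud and quiet coefficients on a comparable scale before they reach the CNN.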

Project structure

ForyoClassifierCNN/
├── README.md                 # Project description document
├── requirements.txt          # Python dependencies
├── config.yaml               # Configuration file
├── data/                     # Data directory
│   ├── raw/                  # Raw audio files (stored by band name)
│   ├── processed/            # Processed feature files
│   └── splits/               # Training/validation/test set split
├── src/
│   ├── __init__.py
│   ├── data_preparation.py   # Data preparation and preprocessing
│   ├── feature_extraction.py # Audio feature extraction (MFCC, etc.)
│   ├── dataset.py            # PyTorch Dataset class
│   ├── model.py              # CNN model definition
│   ├── train.py              # Training script
│   └── evaluate.py           # Evaluation script
├── main.py                   # Demo script (predict single audio file)
└── notebooks/
    └── exploration.ipynb     # Data exploration and visualization Jupyter notebook

Quick Start

Install dependencies

pip install -r requirements.txt

Or use Astral uv:

uv sync

If you use uv to manage your Python environment, all scripts can be launched with uv run /path/to/script.py.

GPU Support (Optional)

If you have an NVIDIA GPU and want to use it for faster training, you have several options:

Automatic Installation (Recommended):

The wrapper scripts (uv_run_with_gpu.bat / uv_run_with_gpu.sh) will automatically install CUDA-enabled PyTorch when needed. Just use them to run your training script - no manual installation required!

Manual Installation: If you prefer to install CUDA PyTorch manually:

uv pip uninstall torch torchaudio
uv pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu124 --no-deps

⚠️ Important: By default, uv run will install the CPU version of PyTorch from the lock file. See the Training Device Selection Guide below for detailed information on choosing the right execution method.

System Requirements:

  • NVIDIA GPU with CUDA support

  • NVIDIA drivers installed (check with nvidia-smi)

  • CUDA Toolkit 12.4+ (or compatible version)

Download release model

The released model is available on the project's Releases page; download it and put it in the models/ directory.

Initialize data json

python src/data_preparation.py

Prediction

python main.py --audio path/to/audio.wav

Training

Prepare data

Put the audio files of the four bands into the data/raw/ directory, and create subfolders by band name:

data/raw/
├── atarayo/
│   ├── song1.mp3
│   ├── song2.flac
│   └── ...
├── yorushika/
│   ├── song1.wav
│   └── ...
├── yoasobi/
│   ├── song1.m4a
│   └── ...
└── zutomayo/
    ├── song1.ogg
    └── ...
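Before preprocessing, it can help to sanity-check that the layout above is in place. The helper below (a standalone sketch; `count_tracks` is not part of the project's code) counts audio files per band folder using only the standard library:

```python
from pathlib import Path

BANDS = ["atarayo", "yorushika", "yoasobi", "zutomayo"]
AUDIO_EXTS = {".mp3", ".flac", ".wav", ".m4a", ".ogg"}

def count_tracks(raw_dir):
    """Count audio files in each band subfolder of the raw data directory.
    Missing subfolders are reported as 0 rather than raising."""
    raw = Path(raw_dir)
    counts = {}
    for band in BANDS:
        folder = raw / band
        if folder.is_dir():
            counts[band] = sum(
                1 for f in folder.iterdir() if f.suffix.lower() in AUDIO_EXTS
            )
        else:
            counts[band] = 0
    return counts

if __name__ == "__main__":
    for band, n in count_tracks("data/raw").items():
        print(f"{band}: {n} file(s)")
```

A band with a count of 0 usually means a misspelled folder name or files with an unexpected extension.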

Data preprocessing and feature extraction

python src/data_preparation.py

Training Device Selection Guide

Before training, you need to choose the appropriate execution method based on whether you want to use GPU acceleration:

Quick Decision Table

| Scenario | Command | Device Used | Notes |
|---|---|---|---|
| Have GPU, want GPU acceleration | uv_run_with_gpu.bat src/train.py (Windows) / ./uv_run_with_gpu.sh src/train.py (Linux/Mac) | GPU | Automatically installs CUDA PyTorch if needed |
| Have GPU, already installed CUDA PyTorch | .venv\Scripts\python.exe src/train.py (Windows) / .venv/bin/python src/train.py (Linux/Mac) | GPU | Fastest option if CUDA PyTorch is already installed |
| No GPU or want CPU training | uv run python src/train.py or python src/train.py | CPU | Uses CPU version from lock file or system Python |

Detailed Explanation

⚠️ Important: uv run and GPU Support

By default, uv run will install the CPU version of PyTorch from the uv.lock file, which means:

  • uv run python src/train.py: will use CPU (even if you have a GPU)

  • scripts/uv_run_with_gpu.bat src/train.py: will use GPU (automatically handles CUDA PyTorch installation)

  • .venv\Scripts\python.exe src/train.py: will use GPU (if CUDA PyTorch is already installed)

Why does this happen?

The uv.lock file locks the CPU version of PyTorch (torch==2.9.1+cpu). When you run uv run, uv synchronizes the environment according to the lock file, which reinstalls the CPU version even if you previously installed the CUDA version.

Solutions:

  1. Use the wrapper script (Recommended for GPU users):

    • Automatically checks and installs CUDA PyTorch if needed

    • Works seamlessly with uv workflow

    • No manual intervention required

  2. Use Python directly (After installing CUDA PyTorch):

    • Faster startup (no dependency checking)

    • Requires manual CUDA PyTorch installation first

    • Bypasses uv run synchronization

  3. Use uv run (For CPU training only):

    • Simplest command

    • Always uses CPU version

    • Suitable if you don't have a GPU or prefer CPU training

Verify Your Setup

To check which device will be used for training:

# Windows
.venv\Scripts\python.exe -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('Device:', 'GPU' if torch.cuda.is_available() else 'CPU')"

# Linux/Mac
.venv/bin/python -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('Device:', 'GPU' if torch.cuda.is_available() else 'CPU')"

The training script (src/train.py) will automatically detect and use the available device. You'll see a message like:

Using device: cuda

or

Using device: cpu

Train model

Option 1: Using wrapper script (Recommended for GPU users)

# Windows - automatically installs CUDA PyTorch if needed
scripts/uv_run_with_gpu.bat src/train.py
# Linux/Mac - automatically installs CUDA PyTorch if needed
chmod +x scripts/uv_run_with_gpu.sh
scripts/uv_run_with_gpu.sh src/train.py

Best for: Users with GPU who want automatic CUDA PyTorch management

Option 2: Using Python directly (After installing CUDA PyTorch)

# Windows
.venv\Scripts\python.exe src/train.py
# Linux/Mac
.venv/bin/python src/train.py

Best for: Users who have already installed CUDA PyTorch and want fastest startup

Option 3: Using uv run (CPU only)

uv run python src/train.py

⚠️ Note: This will use CPU version. See Training Device Selection Guide above for details.

Option 4: Using standard Python

python src/train.py

Best for: Users with system-wide Python installation

The trained model will be saved in the models/ directory. The training script will automatically detect and use GPU if available.

Evaluate model

python src/evaluate.py
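At its core, evaluation for a four-class problem comes down to comparing predicted labels against true labels. The sketch below computes accuracy and a confusion matrix with plain numpy; it is a minimal illustration, not the project's actual evaluate.py (which the Tech stack section suggests uses scikit-learn metrics):

```python
import numpy as np

CLASSES = ["atarayo", "yorushika", "yoasobi", "zutomayo"]

def confusion_matrix(y_true, y_pred, n_classes=4):
    """Build a confusion matrix: rows = true class, columns = predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

if __name__ == "__main__":
    y_true = [0, 1, 2, 3, 0]  # class indices into CLASSES
    y_pred = [0, 1, 2, 2, 0]  # one zutomayo track misclassified as yoasobi
    print(confusion_matrix(y_true, y_pred))
    print("accuracy:", accuracy(y_true, y_pred))  # 0.8
```

Off-diagonal entries of the matrix show which bands the model confuses with each other, which is often more informative than accuracy alone.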

Configuration

All hyperparameters and path configurations are in the config.yaml file, which can be modified as needed.
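For orientation, a config file for a project like this typically groups paths, feature settings, and training hyperparameters into sections. The keys and values below are purely hypothetical, shown only to illustrate the shape of such a file; check the actual config.yaml for the real names:

```yaml
# Hypothetical layout -- the real key names live in config.yaml
data:
  raw_dir: data/raw
  processed_dir: data/processed
features:
  n_mfcc: 13
  sample_rate: 22050
training:
  batch_size: 32
  learning_rate: 0.001
  epochs: 50
paths:
  model_dir: models/
```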

Tech stack

  • PyTorch - Deep learning framework

  • torchaudio - Audio processing

  • librosa - Audio feature extraction

  • numpy, pandas - Data processing

  • matplotlib, seaborn - Visualization

  • scikit-learn - Evaluation metrics

Learning points

  • Audio preprocessing and feature engineering

  • CNN modeling for time series data

  • Music information retrieval basics

  • PyTorch training process practice

License

The project is licensed under the MIT License. See the LICENSE file for details.
