🧠 Speaking Disorder Detection Using Convolutional Neural Networks (CNN)
A deep learning solution to identify speaking disorders, specifically dysarthria, using audio feature extraction and 1D CNNs.
This project detects dysarthric speech using Mel-Frequency Cepstral Coefficients (MFCC) and other spectral audio features. The model classifies .wav files as either normal speech (label 0) or dysarthric speech (label 1).
├── All Dys Wav/ # 🧑‍⚕️ Dysarthric audio samples (label: 1)
├── All Non Dys Wav/ # 🙂 Normal speech samples (label: 0)
├── Testing Dataset/ # 🧪 Additional unseen test samples
├── images/ # 🖼️ Contains model diagram and screenshots
│ └── img1.png
├── Dysarthria_Detection.ipynb # 🧠 Main training and evaluation notebook
├── README.md # 📘 You are here
Speech files are processed using librosa to extract:
- MFCC (Mel-Frequency Cepstral Coefficients)
- Chroma Features
- Spectral Contrast
- Tonnetz
- Zero-Crossing Rate
| Layer Type | Description |
|---|---|
| Conv1D | Extracts patterns from MFCCs |
| MaxPooling1D | Downsamples feature maps |
| Flatten | Flattens output for dense layers |
| Dense + Dropout | Fully connected layers, with dropout to reduce overfitting |
| Softmax | Final classification output layer |
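The layer stack in the table might look like this in Keras. Treat it as a sketch under stated assumptions: the filter count, kernel size, dense width, dropout rate, optimizer, and the 66-value input length are illustrative guesses, and `build_model` is a hypothetical name. Only the layer types and their order are taken from the table.

```python
from tensorflow.keras import layers, models


def build_model(n_features=66, n_classes=2):
    """Build a small 1D CNN following the layer order in the table.

    All hyperparameters here are assumptions, not the notebook's values.
    """
    model = models.Sequential([
        layers.Input(shape=(n_features, 1)),              # one channel per feature value
        layers.Conv1D(64, kernel_size=3, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),                              # regularization against overfitting
        layers.Dense(n_classes, activation="softmax"),    # class probabilities
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",           # integer labels 0 / 1
        metrics=["accuracy"],
    )
    return model
```

A feature matrix of shape `(n_samples, n_features)` needs a trailing channel axis, for example `X[..., None]`, before it is passed to `model.fit`.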
Input: dys_001.wav
Predicted Class: 1 (Dysarthric Speech)
Confidence: 93.4%
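Output in this format can be derived from the softmax probabilities by taking the most probable class and reporting its probability as the confidence. The helper below is a hypothetical illustration: `format_prediction` and the label text for class 0 are assumptions, not names from the notebook.

```python
import numpy as np

# Label text for class 0 is an assumed wording; class 1 matches the sample output.
LABELS = {0: "Normal Speech", 1: "Dysarthric Speech"}


def format_prediction(filename, probs):
    """Turn a softmax probability vector into the three-line report."""
    probs = np.asarray(probs, dtype=float)
    cls = int(np.argmax(probs))       # index of the most probable class
    confidence = float(probs[cls])    # its probability
    return (
        f"Input: {filename}\n"
        f"Predicted Class: {cls} ({LABELS[cls]})\n"
        f"Confidence: {confidence:.1%}"
    )


print(format_prediction("dys_001.wav", [0.066, 0.934]))
```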
1️⃣ Clone the repository
git clone https://github.com/Saim-Nadeem/Speaking-Disorder-Detection-Using-Convolutional-Neural-Networks-CNN-.git
cd Speaking-Disorder-Detection-Using-Convolutional-Neural-Networks-CNN-
2️⃣ Install dependencies
pip install -r requirements.txt
3️⃣ Launch Jupyter Notebook
jupyter notebook Dysarthria_Detection.ipynb
4️⃣ Run all cells
- Place your test .wav files in the Testing Dataset/ folder
- Watch the model predict normal or disordered speech
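Before running predictions, it can help to confirm which files the notebook will see. A minimal sketch, assuming the test clips sit directly inside `Testing Dataset/`; `list_test_wavs` is a hypothetical helper, not part of the project.

```python
from pathlib import Path


def list_test_wavs(folder="Testing Dataset"):
    """Return the .wav files directly inside `folder`, sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        return []  # folder missing: nothing to predict on
    # Match the extension case-insensitively so .WAV files are included.
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    )


for wav_path in list_test_wavs():
    print(wav_path.name)
```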
- numpy
- pandas
- librosa
- scikit-learn
- tensorflow
- absl-py
- keras
Saim Nadeem
🔗 GitHub: Saim-Nadeem
