
SignSense: Hand Gesture Recognition System

SignSense is a machine learning-based hand gesture recognition system that detects and classifies hand gestures (wave, stop, thumbs_up) using MediaPipe for hand landmark detection and a Random Forest classifier for robust prediction. It supports both dataset-based training (from Kaggle) and real-time inference via webcam, making it ideal for human-computer interaction, accessibility tools, and interactive AI applications.


📑 Table of Contents

  • 🚀 Project Overview
  • 🎯 Features
  • 🧰 Technologies Used
  • ⚙️ Installation
  • 📦 Dataset
  • 🧪 Usage
  • 🧭 Code Structure
  • 🎯 Improving Model Accuracy
  • 🛠️ Troubleshooting
  • 🤝 Contributing
  • 📄 License
  • 🙏 Acknowledgments

🚀 Project Overview

SignSense uses computer vision and machine learning to identify hand gestures from static images and live video. It:

  • Extracts 3D hand landmarks with MediaPipe
  • Converts them into feature vectors
  • Trains a Random Forest classifier
  • Performs real-time gesture detection with live visual feedback
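The landmark-to-feature step above can be sketched as follows. This is a minimal illustration, not the project's exact code: `extract_features` is a hypothetical helper name, and the dummy data stands in for MediaPipe's 21-point hand model output.

```python
import numpy as np

def extract_features(landmarks):
    """Flatten 21 (x, y, z) hand landmarks into a 63-dim feature vector."""
    return np.array(landmarks, dtype=np.float32).flatten()

# Dummy landmarks standing in for MediaPipe output (21 points x 3 coords).
dummy = [(0.1 * i, 0.2 * i, 0.0) for i in range(21)]
features = extract_features(dummy)
print(features.shape)  # (63,)
```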

The system resolves common issues like mislabelled gesture classes and low model accuracy by applying:

  • Class filtering
  • Data augmentation
  • Feature scaling

It runs seamlessly on:

  • Local machines (real-time webcam inference)
  • Cloud environments like Kaggle (dataset-based training)

🎯 Features

  • 🖐 Gesture Recognition: Detects wave, stop, thumbs_up
  • 🎥 Real-Time Inference: Predicts gestures live from webcam
  • 📁 Dynamic Dataset Loader: Auto-maps folder names like 0, 1, 2 or wave, stop, thumbs_up
  • 🌀 Data Augmentation: Random rotations, flips, brightness adjustments
  • ⚖️ Feature Scaling: Uses StandardScaler to normalize landmarks
  • 📈 Live Confidence Plot: Real-time matplotlib graph during webcam predictions
  • 💡 Error Logging: Handles image load errors and missing landmarks
  • 🌐 Kaggle Compatible: Fully runnable in Kaggle notebooks (no webcam needed)

🧰 Technologies Used

  • Python 3.6+
  • OpenCV – for webcam & image handling
  • MediaPipe – for hand landmark extraction
  • Scikit-learn – for Random Forest classifier & scaling
  • NumPy – for feature manipulation
  • Matplotlib – for live prediction plotting
  • KaggleHub – for dataset download
  • Tqdm – for progress bars

βš™οΈ Installation

✅ Prerequisites

  • Python 3.6+
  • Webcam (optional for real-time)
  • Internet access (for Kaggle dataset)
  • kaggle.json API token if using KaggleHub

🔧 Step-by-Step

git clone https://github.com/ibtesaamaslam/SignSense.git
cd SignSense
pip install opencv-python mediapipe numpy scikit-learn matplotlib kagglehub tqdm

🔑 Set Up Kaggle API (Optional)

  1. Get your kaggle.json from Kaggle Account Settings
  2. Place it in:
     • Linux/Mac: ~/.kaggle/kaggle.json
     • Windows: C:\Users\<Username>\.kaggle\kaggle.json
  3. Set permissions:

chmod 600 ~/.kaggle/kaggle.json

📦 Dataset

gesture-recognition-dataset/
├── train/
│   ├── 0/ or wave/
│   ├── 1/ or stop/
│   └── 2/ or thumbs_up/
└── val/
    └── ...
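The dual folder-naming scheme can be handled with a simple lookup table. The sketch below is illustrative; GESTURE_MAPPING mirrors the config name mentioned in Troubleshooting, but the exact structure in the script may differ.

```python
# Map both numeric and named folders to a canonical gesture label.
GESTURE_MAPPING = {
    "0": "wave", "wave": "wave",
    "1": "stop", "stop": "stop",
    "2": "thumbs_up", "thumbs_up": "thumbs_up",
}

def label_for_folder(folder_name):
    """Return the gesture label for a dataset folder, or None to skip it."""
    return GESTURE_MAPPING.get(folder_name)

print(label_for_folder("0"))          # wave
print(label_for_folder("thumbs_up"))  # thumbs_up
print(label_for_folder("junk"))       # None
```

Returning None for unknown folders is what lets the loader filter non-gesture classes without raising.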

πŸ” Dataset Processing

  • Filters non-gesture classes
  • Extracts 63 features per image (21 landmarks × 3 coords)
  • Max 500 images per class
  • Augments each with flips, rotations, brightness tweaks
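A minimal sketch of the augmentation step, using NumPy only (the project applies similar flips and brightness tweaks; the `augment` helper and its parameters here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(image):
    """Return a randomly flipped, brightness-adjusted copy of an image array."""
    out = image.copy()
    if rng.random() < 0.5:
        out = np.fliplr(out)                 # horizontal flip
    factor = rng.uniform(0.7, 1.3)           # brightness tweak
    return np.clip(out.astype(np.float32) * factor, 0, 255).astype(np.uint8)

img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = [augment(img) for _ in range(3)]  # e.g. AUGMENTATION_FACTOR = 3
print(len(augmented), augmented[0].shape)
```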

🧪 Usage

🔧 Cell 1: Install Libraries

pip install opencv-python mediapipe numpy scikit-learn matplotlib kagglehub tqdm

πŸ” Cell 2: Inspect Dataset

import os, glob, kagglehub

KAGGLE_DATASET = "abhishek14398/gesture-recognition-dataset"
dataset_path = kagglehub.dataset_download(KAGGLE_DATASET)

for root, dirs, files in os.walk(dataset_path):
    print(f"{root} → {dirs}, Files: {len(files)}")

image_paths = glob.glob(os.path.join(dataset_path, "**/*.*"), recursive=True)
print(f"Found {len(image_paths)} images.")

🧠 Cell 3: Train & Run

  • Trains on dataset
  • Opens webcam
  • Live prediction + matplotlib plot

python gesture_recognition.py

🧭 Code Structure

gesture_recognition.py
├── Config
├── Feature Extraction
├── Data Augmentation
├── Dataset Loader
├── Training + Inference
└── Live Visualization

🎯 Improving Model Accuracy

  • ✅ Feature scaling (StandardScaler)
  • ✅ 100-tree Random Forest
  • ✅ Augmentation ×3 (flip, rotate, brightness)
  • ✅ Min detection confidence: 0.3
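The scaler-plus-forest combination above can be sketched on synthetic data (a minimal illustration; the real script trains on the 63-dim MediaPipe landmark vectors, not random features):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 63))      # stand-in for 63-dim landmark features
y = rng.integers(0, 3, size=300)    # three gesture classes

# StandardScaler normalizes features; 100 trees matches the setup above.
model = make_pipeline(StandardScaler(),
                      RandomForestClassifier(n_estimators=100, random_state=0))
model.fit(X, y)
proba = model.predict_proba(X[:1])  # per-class confidence for one sample
print(proba.shape)  # (1, 3)
```

The per-class probabilities from predict_proba are what drive the live confidence plot during webcam inference.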

📈 Try:

  • More augmentations
  • MLPClassifier
  • K-Fold validation
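For example, K-fold validation could look like this (a sketch using scikit-learn's cross_val_score on hypothetical synthetic data, not the project's actual evaluation code):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 63))    # stand-in for landmark feature vectors
y = rng.integers(0, 3, size=300)  # three gesture classes

# 5-fold cross-validation gives a less optimistic accuracy estimate
# than scoring on the training set itself.
scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0), X, y, cv=5)
print(scores.shape, scores.mean())
```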

🛠️ Troubleshooting

Dataset Not Found?

Check dataset_path and folder mappings (GESTURE_MAPPING).

Low Accuracy?

Try:

  • Verify that landmarks are actually being detected:

import cv2, mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=True)
img = cv2.imread("image.jpg")
result = hands.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
print("Landmarks:", bool(result.multi_hand_landmarks))
  • Increase AUGMENTATION_FACTOR
  • Use a different classifier

Webcam Not Working?

import cv2
cap = cv2.VideoCapture(0)
print("Webcam opened:", cap.isOpened())

🤝 Contributing

  1. Fork the repo
  2. Create a feature branch
  3. Commit & push
  4. Open a pull request

Please follow PEP8 and document your changes!


📄 License

MIT License. See the LICENSE file.


πŸ™ Acknowledgments
