Bridging the Gap: Real-time Sign Language to Speech Translation with Neural Intelligence.
SignBridge AI v2.0 is a high-performance, real-time sign language translation system designed to convert hand gestures and body language into natural spoken English. Built for the modern web, v2.0 features a completely redesigned "Neural Link" interface, asynchronous processing for low-latency feedback, and deep integration with Google's Gemini generative AI for grammatically correct translations.
- Premium v2.0 UI/UX: A high-contrast, modern dark interface built with Next.js 16 and Tailwind CSS 4. Features glassmorphism, ambient glow effects, and a mobile-first responsive design.
- Asynchronous Neural Link: Refactored backend using `asyncio` and `threading` to decouple heavy ML processing from WebSocket I/O, enabling true 60 FPS real-time feedback even on older hardware.
- Full-Body Intelligence: Optimized MediaPipe Holistic integration for face, body, and hand landmark extraction with mirrored canvas alignment.
- Integrated Lexicon: Built-in library of supported ISL gestures with predictive hints for users.
- Archived Sessions: Persistent session history with AI confidence tracking and timestamped translation logs.
- Real-time Audio: Low-latency neural Text-to-Speech (TTS) integration provides immediate verbal feedback.
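The decoupling described under "Asynchronous Neural Link" can be illustrated with a minimal sketch. It is not the project's actual backend: `run_inference`, `heartbeat`, and `process_frames` are hypothetical names, and `time.sleep` stands in for the real MediaPipe and LSTM work. The point is only that a blocking call handed to `asyncio.to_thread` leaves the event loop free to keep servicing I/O.

```python
import asyncio
import time


def run_inference(frame_id: int) -> dict:
    """Hypothetical stand-in for a blocking ML call on one frame."""
    time.sleep(0.05)  # simulate compute-heavy work
    return {"frame": frame_id, "label": "HELLO", "confidence": 0.9}


async def heartbeat(ticks: list, stop: asyncio.Event) -> None:
    """Hypothetical stand-in for WebSocket I/O that must stay responsive."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.005)


async def process_frames(n_frames: int = 3) -> tuple:
    ticks = []
    stop = asyncio.Event()
    io_task = asyncio.create_task(heartbeat(ticks, stop))
    results = []
    for frame_id in range(n_frames):
        # The blocking call runs in a worker thread, so the loop keeps ticking.
        results.append(await asyncio.to_thread(run_inference, frame_id))
    stop.set()
    await io_task
    return results, len(ticks)


if __name__ == "__main__":
    results, tick_count = asyncio.run(process_frames())
    print(results[-1], "I/O ticks during inference:", tick_count)
```

If `run_inference` were called directly inside the coroutine instead, the heartbeat would record almost no ticks, because the event loop would be stalled for the duration of each call.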
┌─────────────────────────────────┐ WebSocket (JSON + Binary) ┌─────────────────────────────────┐
│ Next.js 16 (Redesigned) │ <───────────────────────────────────────────> │ FastAPI (Async) │
├─────────────────────────────────┤ ├─────────────────────────────────┤
│ │ 1. Camera Frames (640x360) │ │
│ ┌─────────────────────────┐ │ ────────────────────────────────────────────> │ ┌─────────────────────────┐ │
│ │ Neural Link App │ │ │ │ MediaPipe (Lite) │ │
│ └───────────┬─────────────┘ │ │ └────────────┬────────────┘ │
│ │ │ 2. Landmarks & Confidence │ │ │
│ ┌───────────▼─────────────┐ │ <──────────────────────────────────────────── │ ┌────────────▼────────────┐ │
│ │ Mirrored Overlay │ │ │ │ Async LSTM Worker │ │
│ └───────────┬─────────────┘ │ │ └────────────┬────────────┘ │
│ │ │ 3. Translated Audio (MP3) │ │ │
│ ┌───────────▼─────────────┐ │ <──────────────────────────────────────────── │ ┌────────────▼────────────┐ │
│ │ Translation Panel │ │ │ │ Gemini Service │ │
│ └─────────────────────────┘ │ │ └─────────────────────────┘ │
│ │ │ │
└─────────────────────────────────┘ └─────────────────────────────────┘
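The diagram shows three kinds of traffic on a single WebSocket: binary camera frames, JSON landmark and confidence updates, and binary MP3 audio. The sketch below shows one way such messages could be encoded and routed. The field names (`type`, `landmarks`, `label`, `confidence`) and both helper functions are assumptions made for illustration, not the project's actual wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LandmarkMessage:
    """Hypothetical shape of message 2 (landmarks and confidence)."""
    type: str
    landmarks: list  # flat [x0, y0, x1, y1, ...], assumed normalized to 0..1
    label: str
    confidence: float


def encode_landmarks(landmarks, label: str, confidence: float) -> str:
    """Serialize a landmark update as a JSON text frame."""
    msg = LandmarkMessage("landmarks", list(landmarks), label, round(confidence, 3))
    return json.dumps(asdict(msg))


def classify_payload(payload) -> str:
    """Route an incoming WebSocket payload by its wire type."""
    if isinstance(payload, (bytes, bytearray)):
        # Binary frames carry camera images (client to server)
        # or MP3 audio (server to client).
        return "binary"
    data = json.loads(payload)
    return data.get("type", "unknown")
```

Keeping images and audio as binary frames avoids the size overhead of base64-encoding them inside JSON, while the small landmark updates stay human-readable.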
- Navigate to the backend directory: `cd backend`
- Activate your virtual environment: `.\venv\Scripts\activate` (Windows) or `source venv/bin/activate` (Linux/Mac)
- Install dependencies: `pip install -r requirements.txt`
- Configure `.env.local` with your `GEMINI_API_KEY`.
- Start the engine: `python main.py`
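For the `.env.local` step, the following sketch shows how a backend might read `GEMINI_API_KEY` at startup and fail fast when it is missing. It uses only the standard library and assumes plain `KEY=VALUE` lines; `load_env_file` and `require_gemini_key` are hypothetical helpers, and the real `main.py` may load its configuration differently.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env.local") -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_gemini_key(path: str = ".env.local") -> str:
    """Prefer the process environment, then the file; raise if neither has it."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError(f"GEMINI_API_KEY is not set; add it to {path}")
    return key
```

A matching `.env.local` would contain a single line of the form `GEMINI_API_KEY=your-key-here`.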
- Navigate to the frontend directory: `cd web-frontend`
- Install dependencies: `npm install`
- Run the pro interface: `npm run dev`
SignBridge v2.0 is specifically engineered for legacy hardware (e.g., Intel i5-4440) and low-bandwidth environments:
- Threaded Inference: Uses `asyncio.to_thread` to prevent compute-heavy models from blocking the network stack.
- Lite Model Complexity: Defaults to MediaPipe `Complexity 0` for maximum FPS without sacrificing accuracy.
- Smart Frame Synchronization: Frontend only sends new frames once the previous one is processed, eliminating buffer bloat and queue latency.
- Targeted Resizing: Internal frame resizing to 320x180 (16:9) maintains high-speed throughput for neural analysis.
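The "Smart Frame Synchronization" rule amounts to a stop-and-wait scheme: at most one frame is in flight at any time. The sketch below simulates it with two `asyncio` queues standing in for the WebSocket. All names are hypothetical, and the real frontend is a Next.js client rather than Python, so treat this as a model of the behaviour, not the implementation.

```python
import asyncio


async def server(frames: asyncio.Queue, acks: asyncio.Queue, max_pending: list) -> None:
    """Process one frame at a time and acknowledge each one."""
    while True:
        frame_id = await frames.get()
        if frame_id is None:  # sentinel: client has finished
            break
        # Frames in flight = the one being processed + any still queued.
        max_pending[0] = max(max_pending[0], frames.qsize() + 1)
        await asyncio.sleep(0.01)  # simulate landmark extraction
        await acks.put(frame_id)


async def client(frames: asyncio.Queue, acks: asyncio.Queue, n_frames: int) -> list:
    """Send a frame, then wait for its acknowledgement before sending the next."""
    acked = []
    for frame_id in range(n_frames):
        await frames.put(frame_id)
        acked.append(await acks.get())
    await frames.put(None)
    return acked


async def run_demo(n_frames: int = 5) -> tuple:
    frames, acks = asyncio.Queue(), asyncio.Queue()
    max_pending = [0]
    server_task = asyncio.create_task(server(frames, acks, max_pending))
    acked = await client(frames, acks, n_frames)
    await server_task
    return acked, max_pending[0]


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

Because the client never sends ahead of an acknowledgement, the queue depth stays at one regardless of how slow the server is. With a live camera, frames captured while waiting would simply be skipped, so the effective frame rate adapts to the server's processing speed.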
- Neural Link Integrity: WebSocket connections are restricted to authorized local interfaces by default.
- Environment Protection: All sensitive API keys are isolated in `.env.local` and never committed to source control.
- Stateless Processing: Video data remains in transient memory and is discarded immediately after landmark extraction.
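As an illustration of restricting connections to local interfaces, the sketch below checks whether a client address is a loopback address before a WebSocket would be accepted. It assumes "authorized local interfaces" means loopback only, which the project may define more broadly; `is_local_client` is a hypothetical helper, not part of the codebase.

```python
import ipaddress


def is_local_client(host: str) -> bool:
    """Return True only for loopback clients (assumed policy)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not a valid IP literal (for example, a hostname): reject.
        return False
```

In a FastAPI handler, a check like this would typically run against the client's address before accepting the connection. Binding the server to `127.0.0.1` instead of `0.0.0.0` gives a similar guarantee at the socket level.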
Licensed under the Apache License 2.0.