YogaFix

YogaFix is a real-time yoga pose detection and feedback system built with Python, OpenCV, and MediaPipe. A FastAPI backend captures webcam frames on the server, runs pose detection on each frame, and streams feedback to clients over WebSocket connections.

Important: This version relies on server-side webcam access (using cv2.VideoCapture(0)). It must be deployed on hardware with an attached webcam (e.g., a local machine or a dedicated server/VPS with USB passthrough). Cloud platforms like Render or similar PaaS do not provide direct hardware access.

Table of Contents

Overview
Features
Tech Stack
Directory Structure
How It Works
API Endpoints
Running Locally
Performance & Concurrency Considerations
Demo Videos
Future Enhancements
License

Overview

This module provides real-time feedback for yoga poses by:

Capturing webcam video directly on the server.
Processing each frame with a pose detection algorithm.
Comparing the user's pose to predefined “ideal” poses.
Returning annotated frames and detailed feedback over WebSockets.

The system is designed for scenarios where server-side processing is viable (e.g., dedicated hardware) and offers low-latency feedback for enhanced user interaction.

Features

Real-Time Processing: Captures and processes frames from a physical webcam attached to the server.
Multiple Pose Detection: Supports a variety of yoga poses, including:
- T Pose
- Tree Pose
- Warrior 3 Pose
- Bridge Pose
- Cat Pose
- Cobra Pose
- Crescent Lunge Pose
- Downward Facing Dog Pose
- Leg-Up-The-Wall Pose
- Mountain Pose
- Padmasana (Lotus Pose)
- Pigeon Pose
- Seated Forward Bend
- Standing Forward Bend
- Triangle Pose
- Warrior Pose
Detailed Feedback: Computes similarity scores based on joint angles and generates corrective feedback.
WebSocket Communication: Uses FastAPI’s asynchronous WebSocket support for real-time bi-directional communication.
CORS Enabled: Easily integrates with separate front-end applications.
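
The joint-angle scoring mentioned under Detailed Feedback can be illustrated with a minimal sketch. This is not the project's actual scoring code: the function names, the 90-degree `tolerance` cutoff, and the joint names are assumptions chosen for illustration, and landmarks are assumed to be 2D `(x, y)` pairs.

```python
import math

def joint_angle(a, b, c):
    """Return the angle at point b (in degrees) formed by the segments b-a and b-c."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang)
    # Fold reflex angles back into the 0..180 range.
    return 360.0 - ang if ang > 180.0 else ang

def angle_similarity(measured, ideal, tolerance=90.0):
    """Map the angular error to a 0..1 score; `tolerance` is an assumed cutoff."""
    return max(0.0, 1.0 - abs(measured - ideal) / tolerance)

def pose_similarity(measured_angles, ideal_angles):
    """Average the per-joint scores over every joint listed in the ideal pose."""
    per_joint = {
        joint: angle_similarity(measured_angles[joint], ideal)
        for joint, ideal in ideal_angles.items()
    }
    overall = sum(per_joint.values()) / len(per_joint)
    return overall, per_joint

if __name__ == "__main__":
    # Hypothetical ideal: both elbows straight (180 degrees).
    ideal = {"left_elbow": 180.0, "right_elbow": 180.0}
    measured = {"left_elbow": 180.0, "right_elbow": 135.0}
    print(pose_similarity(measured, ideal))
```

Per-joint scores of this kind can also drive textual corrections, for example by flagging the joint with the lowest score.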

Tech Stack

Programming Language: Python 3.x
Backend Framework: FastAPI
WebSocket Server: Uvicorn (ASGI server)
Computer Vision: OpenCV
Pose Estimation: MediaPipe
Data Processing: NumPy
Asynchronous Programming: asyncio

Directory Structure

YogaModule/
├─ api/
│  └─ main.py               # FastAPI application with server-side webcam processing
├─ logic/
│  ├─ __init__.py
│  ├─ T_pose.py           # T Pose detection logic
│  ├─ traingle_pose.py    # Triangle Pose detection logic
│  ├─ Tree_pose.py        # Tree Pose detection logic
│  ├─ Crescent_lunge_pose.py  # Crescent Lunge detection logic
│  ├─ warrior_pose.py     # Warrior Pose detection logic
│  └─ mountain_pose.py    # Mountain Pose detection logic
├─ tests/
│  └─ index.htm           # Client-side code to test the API
└─ README.md              # This README file

web-app/app.py: Contains the FastAPI backend which captures frames from a server-side webcam, processes them, and sends back annotated frames and feedback.
logic/: Contains the pose checker classes that perform frame processing, angle calculations, and generate feedback.

How It Works

Webcam Capture:
The API opens a connection to a physical webcam using cv2.VideoCapture(0).
Frame Processing:
- Each frame is read, flipped for a mirror view, and passed to a selected pose checker.
- The pose checker uses MediaPipe to extract landmarks and compute joint angles.
- A similarity score is calculated by comparing the user's joint angles with those of the ideal pose.
- Annotated frames are generated by drawing landmarks using MediaPipe’s drawing utilities.
WebSocket Communication:
- The processed frame (encoded as a JPEG and then base64) and the feedback (similarity score, joint details, textual corrections) are sent back to the client via a WebSocket connection.
- A connection manager handles multiple clients and processing tasks concurrently.
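
One way a connection manager could track a processing task per client is sketched below. It is an assumption about the general shape, not the code in this repository: `ConnectionManager`, its methods, and `fake_frame_loop` are hypothetical names, and the frame loop is a stand-in for the real capture, process, and send cycle.

```python
import asyncio

class ConnectionManager:
    """Tracks one background processing task per connected client."""

    def __init__(self):
        self._tasks = {}  # client_id -> asyncio.Task

    def start(self, client_id, coro):
        """Start `coro` for a client, cancelling any task it already has.

        Must be called from inside a running event loop.
        """
        self.stop(client_id)
        self._tasks[client_id] = asyncio.create_task(coro)

    def stop(self, client_id):
        """Cancel and forget the client's task, if one exists."""
        task = self._tasks.pop(client_id, None)
        if task is not None:
            task.cancel()

    def active_clients(self):
        """Return the ids of clients that currently have a task."""
        return sorted(self._tasks)

async def fake_frame_loop(client_id, sent):
    """Stand-in for the capture/process/send loop of one client."""
    while True:
        sent.append(client_id)
        await asyncio.sleep(0.01)

async def demo():
    manager = ConnectionManager()
    sent = []
    manager.start("alice", fake_frame_loop("alice", sent))
    manager.start("bob", fake_frame_loop("bob", sent))
    await asyncio.sleep(0.05)  # let both loops run concurrently for a moment
    active = manager.active_clients()
    manager.stop("alice")
    manager.stop("bob")
    return active, sent

if __name__ == "__main__":
    print(asyncio.run(demo())[0])
```

Keeping the task handle per client makes the stop command cheap to implement: cancelling the task ends that client's loop without affecting the others.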

API Endpoints

WebSocket Endpoint

URL: /ws/{client_id}
Method: WebSocket
Description:
When a client connects and sends a JSON message containing a "pose_type", the API starts processing frames from the server-side webcam. It continuously sends back a JSON response containing:
- frame: Base64-encoded annotated JPEG image.
- feedback: An object with:
  - similarity: A float value representing overall pose similarity.
  - feedback_text: A textual description of the feedback.
  - joint_similarities: Detailed feedback per joint (if applicable).
Stop Command:
Clients can send {"command": "stop"} to disconnect and stop processing.
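
The message shapes described above can be sketched with the standard library alone. The field names come from this section; the helper names and the `"tree"` value used for `pose_type` are assumptions, since the accepted pose identifiers are not listed here.

```python
import base64
import json

def build_response(jpeg_bytes, similarity, feedback_text, joint_similarities=None):
    """Serialize one annotated frame plus its feedback as a JSON string."""
    payload = {
        "frame": base64.b64encode(jpeg_bytes).decode("ascii"),
        "feedback": {
            "similarity": float(similarity),
            "feedback_text": feedback_text,
        },
    }
    # Per-joint details are optional ("if applicable").
    if joint_similarities is not None:
        payload["feedback"]["joint_similarities"] = joint_similarities
    return json.dumps(payload)

def parse_client_message(raw):
    """Classify a client message as ('start', pose_type), ('stop', None) or ('unknown', None)."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return ("unknown", None)
    if not isinstance(msg, dict):
        return ("unknown", None)
    if msg.get("command") == "stop":
        return ("stop", None)
    if "pose_type" in msg:
        return ("start", msg["pose_type"])
    return ("unknown", None)

if __name__ == "__main__":
    print(parse_client_message('{"pose_type": "tree"}'))  # "tree" is a placeholder value
    print(parse_client_message('{"command": "stop"}'))
```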

Health Check Endpoint

URL: /health
Method: GET
Description:
Returns a JSON response indicating the server status.
```
{
  "status": "healthy"
}
```

Running Locally

Clone the Repository:

git clone https://github.com/yourusername/YogaModule.git
cd YogaModule

Set Up a Virtual Environment (Optional but Recommended):

python -m venv venv
source venv/bin/activate  # For Linux/Mac
# or venv\Scripts\activate   # For Windows

Install Dependencies:
```
pip install fastapi uvicorn opencv-python mediapipe numpy
```
If a requirements.txt is available, run:
pip install -r requirements.txt
Run the FastAPI Server:
```
cd web-app
uvicorn app:app --reload --host 0.0.0.0 --port 8000
```
The server will start at http://localhost:8000.
Connect a WebSocket Client: Use a WebSocket client or a browser-based front-end to connect to ws://localhost:8000/ws/{client_id} and send JSON messages as described.
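
A minimal command-line client might look like the sketch below. It assumes the third-party `websockets` package is installed (`pip install websockets`); the client id `demo-client`, the `"tree"` pose value, and the helper names are placeholders rather than values confirmed by this project.

```python
import asyncio
import base64
import json

STOP_MESSAGE = json.dumps({"command": "stop"})

def start_message(pose_type):
    """Build the JSON message that asks the server to start processing."""
    return json.dumps({"pose_type": pose_type})

def decode_response(raw):
    """Return (jpeg_bytes, feedback_dict) from one server message."""
    msg = json.loads(raw)
    return base64.b64decode(msg["frame"]), msg["feedback"]

async def run_client(uri, pose_type, max_frames=10):
    """Request a pose, print feedback for a few frames, then send the stop command."""
    import websockets  # third-party; imported lazily so the helpers work without it

    async with websockets.connect(uri) as ws:
        await ws.send(start_message(pose_type))
        for _ in range(max_frames):
            _jpeg, feedback = decode_response(await ws.recv())
            print(feedback["similarity"], feedback["feedback_text"])
        await ws.send(STOP_MESSAGE)

if __name__ == "__main__":
    asyncio.run(run_client("ws://localhost:8000/ws/demo-client", "tree"))
```

The decoded JPEG bytes can be written to a file or shown in an image viewer to inspect the annotated frame.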

Note: Ensure that the machine running the server has a webcam attached. If cv2.VideoCapture(0) fails, verify the webcam index or hardware permissions.
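
To check which webcam index works, a small probe along these lines may help. The function name and the `open_capture` parameter are hypothetical; by default the probe uses `cv2.VideoCapture`, and the parameter exists only so the logic can be exercised without a camera.

```python
def find_working_camera(max_index=4, open_capture=None):
    """Return the first index in range(max_index) that opens and yields a frame, else None."""
    if open_capture is None:
        import cv2  # imported lazily so the probe logic can be tested without OpenCV
        open_capture = cv2.VideoCapture
    for index in range(max_index):
        cap = open_capture(index)
        try:
            if cap.isOpened():
                ok, _frame = cap.read()
                if ok:
                    return index
        finally:
            # Always release the device so the server can open it afterwards.
            cap.release()
    return None

if __name__ == "__main__":
    print("working camera index:", find_working_camera())
```

If the probe returns `None`, the cause is more likely a permissions or driver issue than a wrong index.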

Performance & Concurrency Considerations

CPU-Intensive Processing:
Frame processing (especially with OpenCV and MediaPipe) is CPU-bound. For multiple concurrent connections, consider:
- Offloading heavy computations to separate worker threads or processes.
- Horizontal scaling (running multiple instances) if using dedicated hardware.
Vertical Scaling:
Because frames are captured on the server, clients never upload video, which removes the upstream transmission delay and the client-side base64 encoding step. All processing still competes for the server's CPU, so vertical scaling (more CPU cores and RAM) becomes crucial when many clients connect concurrently.
Hardware Constraints:
This approach requires a physical webcam. In cloud environments, server-side webcam access is typically not available, so this setup is best suited for dedicated hardware or on-premise servers.
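
Offloading blocking work to a worker thread, as suggested above, can be sketched with `asyncio.to_thread` (Python 3.9+). `process_frame_blocking` is a hypothetical stand-in that sleeps instead of running pose detection. Threads help most when the heavy work happens in native code that releases the GIL, which is often the case for OpenCV calls; pure-Python computation would need a process pool instead.

```python
import asyncio
import time

def process_frame_blocking(frame_id):
    """Stand-in for a blocking pose-detection call on one frame."""
    time.sleep(0.05)
    return {"frame_id": frame_id, "similarity": 0.0}

async def heartbeat(ticks):
    """Record timestamps to show that the event loop keeps running."""
    for _ in range(5):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)

async def main():
    ticks = []
    # The blocking call runs in a worker thread while heartbeat() runs on the loop.
    result, _ = await asyncio.gather(
        asyncio.to_thread(process_frame_blocking, 1),
        heartbeat(ticks),
    )
    return result, len(ticks)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling the blocking function directly inside a coroutine would stall every WebSocket connection served by the same event loop until it returned.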

Demo Videos

A demo of the project is available in this playlist.

Future Enhancements

GPU Acceleration:
Integrate CUDA/TensorRT to speed up pose estimation on GPUs.
Asynchronous Processing:
Use thread pools or asynchronous libraries to better handle CPU-bound tasks without blocking the event loop.
Client-Side Integration:
Develop a web or mobile front-end that dynamically connects via WebSockets for real-time feedback.
Support for Multiple Cameras:
Extend the module to support multiple simultaneous camera inputs or multiple users.

License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
