🧠 DocuMind AI - Intelligent Document Processing

A full-stack AI-powered document reading and analysis SaaS application built with FastAPI, Next.js, and Firebase.

✨ Features

  • 📄 Document Processing: Upload PDF, JPG, PNG files for AI analysis
  • 🔍 OCR Extraction: Advanced text extraction using PaddleOCR
  • 🤖 AI Summarization: GPT-4o-mini powered document summaries
  • 💬 Interactive Chat: Ask questions about your documents
  • 📊 Real-time Processing: Live progress tracking with beautiful dashboard
  • ⚡ Live Updates: Real-time status updates and progress monitoring
  • 🎯 Step-by-step Tracking: Visual progress through each processing stage
  • ☁️ Cloud Storage: Automatic Firebase Storage integration
  • 🔐 Authentication: Firebase Auth with Google Sign-In
  • 📱 Responsive Design: Modern SaaS UI with TailwindCSS
  • 🚀 Production Ready: Deployment configurations for all platforms

🏗️ Architecture

DocuMind AI/
├── backend/                # FastAPI Backend
│   ├── main.py             # Main application
│   ├── models/             # AI models (OCR, LLM, CV)
│   ├── utils/              # Firebase storage utilities
│   ├── requirements.txt    # Python dependencies
│   ├── Dockerfile          # Container configuration
│   └── Procfile            # Platform deployment
├── frontend/               # Next.js Frontend
│   ├── app/                # App Router pages
│   ├── components/         # React components
│   ├── contexts/           # Authentication context
│   ├── utils/              # Utilities and configs
│   ├── package.json        # Node.js dependencies
│   └── vercel.json         # Vercel deployment
└── README.md               # This file

🚀 Quick Start

Prerequisites

  • Python 3.11+ for backend
  • Node.js 18+ for frontend
  • Firebase Project for authentication and storage
  • OpenAI API Key for AI processing

1. Clone Repository

git clone <your-repo-url>
cd DocuMind

2. Backend Setup

cd backend

# Install dependencies
pip install -r requirements.txt

# Copy environment file
cp env_example.txt .env

# Edit .env with your API keys
# OPENAI_API_KEY=your_key
# FIREBASE_SERVICE_ACCOUNT_JSON={"type":"service_account",...}

# Run backend
uvicorn main:app --host 0.0.0.0 --port 8000 --reload

3. Frontend Setup

cd frontend

# Install dependencies
npm install

# Copy environment file
cp env.example .env.local

# Edit .env.local with Firebase config
# NEXT_PUBLIC_FIREBASE_API_KEY=your_key
# NEXT_PUBLIC_BACKEND_API_URL=http://localhost:8000

# Run frontend
npm run dev

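4. Access Application

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:8000
  • API Docs: http://localhost:8000/docs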

🔥 Firebase Setup

1. Create Firebase Project

  1. Go to Firebase Console
  2. Create new project or select existing
  3. Enable Authentication (Email/Password + Google)
  4. Enable Storage (test rules are fine for local development; restrict them before production)

2. Get Configuration

  1. Project Settings → General → Your Apps
  2. Add Web App
  3. Copy configuration to frontend .env.local

3. Service Account (Backend)

  1. Project Settings → Service Accounts
  2. Generate new private key
  3. Copy JSON content to backend .env

🚀 Deployment

Backend Deployment

Render (Recommended)

  1. Connect GitHub repository
  2. Set build command: pip install -r requirements.txt
  3. Set start command: uvicorn main:app --host 0.0.0.0 --port $PORT
  4. Add environment variables
  5. Deploy automatically

Railway

  1. Import GitHub repository
  2. Configure as web service
  3. Set environment variables
  4. Deploy with automatic builds

Docker

# Build image (run from backend/, where the Dockerfile lives)
docker build -t documind-backend .

# Run container
docker run -p 8000:8000 --env-file .env documind-backend

Frontend Deployment

Vercel (Recommended)

  1. Import GitHub repository
  2. Add environment variables
  3. Deploy automatically on push
  4. Custom domain configuration

Other Platforms

  • Netlify: Similar to Vercel
  • AWS Amplify: Full-stack solution
  • Docker: Containerized deployment

📚 API Documentation

Endpoints

  • POST /process-document - Upload and process documents
  • POST /ask-question - Ask questions about extracted text
  • GET /health - System health check

Example Usage

# Process document
curl -X POST "http://localhost:8000/process-document" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@document.pdf"

# Ask question
curl -X POST "http://localhost:8000/ask-question" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the main topic?", "extracted_text": "..."}'
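The same question request can be issued from Python. The sketch below uses only the standard library and mirrors the JSON body from the curl example; the helper names are illustrative, and the response is decoded as generic JSON because its schema is not documented here.

```python
import json
import urllib.request


def build_ask_request(base_url: str, question: str, extracted_text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /ask-question."""
    payload = json.dumps({"question": question, "extracted_text": extracted_text})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ask-question",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_question(base_url: str, question: str, extracted_text: str) -> dict:
    """Send the request and decode the JSON body (response fields are an assumption)."""
    request = build_ask_request(base_url, question, extracted_text)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running locally, `ask_question("http://localhost:8000", "What is the main topic?", text)` should return whatever JSON the endpoint produces.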

🛠️ Development

Backend Development

cd backend

# Install development dependencies
pip install -r requirements.txt

# Run with auto-reload
uvicorn main:app --reload --host 0.0.0.0 --port 8000

# Run tests (when implemented)
pytest

# Code formatting
black .
isort .

Frontend Development

cd frontend

# Install dependencies
npm install

# Run development server
npm run dev

# Build for production
npm run build

# Run linting
npm run lint

Docker Development

# Backend
cd backend
docker-compose up --build

# Frontend
cd frontend
docker build -t documind-frontend .
docker run -p 3000:3000 documind-frontend

🔧 Configuration

Environment Variables

Backend (.env)

OPENAI_API_KEY=your_openai_api_key
FIREBASE_SERVICE_ACCOUNT_JSON={"type":"service_account",...}
FIREBASE_STORAGE_BUCKET=your_project_id.appspot.com
HOST=0.0.0.0
PORT=8000
ALLOWED_ORIGINS=http://localhost:3000,https://your-domain.vercel.app
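As a rough illustration of how these backend variables might be consumed, the sketch below parses them into a typed settings object. It is not taken from the project's `main.py`: the `BackendSettings` class and `load_settings` function are hypothetical, and the defaults for `HOST` and `PORT` simply reuse the values shown above.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class BackendSettings:
    openai_api_key: str
    firebase_credentials: dict
    storage_bucket: str
    host: str
    port: int
    allowed_origins: list


def load_settings(env: Mapping[str, str] = os.environ) -> BackendSettings:
    """Parse the backend variables; raises KeyError if a required one is missing."""
    # ALLOWED_ORIGINS is a comma-separated list, so split it and drop empty entries.
    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    return BackendSettings(
        openai_api_key=env["OPENAI_API_KEY"],
        # The service account is stored as a JSON string, so decode it here.
        firebase_credentials=json.loads(env["FIREBASE_SERVICE_ACCOUNT_JSON"]),
        storage_bucket=env["FIREBASE_STORAGE_BUCKET"],
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        allowed_origins=origins,
    )
```

Failing fast on a missing key or malformed service-account JSON at startup tends to be easier to debug than a failure on the first upload.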

Frontend (.env.local)

NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your_project_id.firebaseapp.com
NEXT_PUBLIC_FIREBASE_PROJECT_ID=your_project_id
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project_id.appspot.com
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=your_messaging_sender_id
NEXT_PUBLIC_FIREBASE_APP_ID=your_app_id
NEXT_PUBLIC_BACKEND_API_URL=http://localhost:8000

📱 Features

Document Processing

  • Supported Formats: PDF, JPG, JPEG, PNG
  • AI Analysis: Automatic text extraction and summarization
  • File Storage: Cloud storage with Firebase
  • Real-time Progress: Live processing dashboard with step-by-step tracking
  • Live Updates: Polling-based status updates that approximate real-time feedback
  • Error Handling: Comprehensive error tracking and user feedback
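The polling behind the live updates can be pictured roughly as follows. This is a sketch under assumptions: the status payload shape and the `completed`/`failed` status names are hypothetical, and `fetch_status` stands in for whatever call the frontend or a client makes to read progress.

```python
import time
from typing import Callable

# Hypothetical terminal status names; the real backend may use different ones.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 1.0,
    timeout_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status repeatedly until it reports a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("document processing did not finish in time")
        sleep(interval_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.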

AI Capabilities

  • OCR: Advanced text recognition with PaddleOCR
  • Summarization: GPT-4o-mini powered document summaries
  • Q&A: Interactive chat about document content
  • Document Detection: YOLOv8 for image preprocessing

User Experience

  • Authentication: Secure login with Firebase Auth
  • Responsive Design: Mobile-first SaaS interface
  • Real-time Updates: Live chat and processing feedback
  • Processing Dashboard: Beautiful real-time progress tracking
  • Loading Animations: Professional loading states and transitions
  • Modern UI: Glass-morphism design with TailwindCSS

🚀 Performance

Backend Optimization

  • Async Processing: Non-blocking document processing
  • Model Caching: Efficient AI model loading
  • Memory Management: Automatic cleanup of temporary files
  • Error Handling: Graceful fallbacks and retries
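Two of the points above, model caching and temporary-file cleanup, can be sketched with the standard library alone. The code below is illustrative rather than the project's implementation: `get_model` returns a placeholder dict instead of real OCR weights, and `process_upload` only measures the file size as a stand-in for processing.

```python
import os
import tempfile
from functools import lru_cache


@lru_cache(maxsize=None)
def get_model(name: str) -> dict:
    """Load a model once per process; later calls return the cached object."""
    # Placeholder for an expensive load; a dict stands in for the model.
    return {"name": name, "loaded": True}


def process_upload(data: bytes, suffix: str = ".pdf") -> int:
    """Write the upload to a temporary file, 'process' it, and always delete the file."""
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as handle:
        handle.write(data)
        path = handle.name
    try:
        get_model("ocr")  # reuses the cached model after the first call
        return os.path.getsize(path)  # stand-in for real processing
    finally:
        os.remove(path)  # cleanup runs even if processing raises
```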

Frontend Optimization

  • Code Splitting: Automatic route-based code splitting
  • Image Optimization: Next.js built-in image optimization
  • Bundle Analysis: Webpack bundle analyzer
  • Performance Monitoring: Core Web Vitals tracking

🔒 Security

Authentication & Authorization

  • Firebase Auth: Industry-standard authentication
  • JWT Tokens: Secure session management
  • Role-based Access: User permission management
  • Secure Storage: Environment variable protection

API Security

  • CORS Protection: Configurable cross-origin policies
  • Input Validation: Pydantic model validation
  • File Upload Security: Type and size validation
  • Rate Limiting: API abuse prevention
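Type and size validation for uploads might look like the sketch below. The allowed extensions come from the supported formats listed in this README; the 10 MB cap and the `validate_upload` name are assumptions made for illustration.

```python
from pathlib import PurePath

# Extensions match the supported formats; the size cap is a hypothetical value.
ALLOWED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type or size is not acceptable."""
    extension = PurePath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or 'none'}")
    if size_bytes <= 0:
        raise ValueError("file is empty")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the upload size limit")
```

An extension check alone does not prove what a file contains, so a production service would likely also inspect the file's leading bytes.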

Data Protection

  • Encrypted Storage: Firebase Storage encryption
  • Secure Communication: HTTPS enforcement
  • Data Privacy: GDPR compliance considerations
  • Audit Logging: Access and modification tracking

📊 Monitoring & Logging

Health Checks

  • System Status: Model availability monitoring
  • Performance Metrics: Response time tracking
  • Error Rates: Failure rate monitoring
  • Resource Usage: Memory and CPU monitoring

Logging

  • Structured Logging: JSON format logs
  • Log Levels: Configurable logging verbosity
  • Error Tracking: Detailed error information
  • Performance Logs: Request/response timing
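Structured JSON logging with a configurable level can be approximated with Python's built-in `logging` module. In this sketch the logger name `documind` and the `duration_ms` field are hypothetical; the project's actual log schema may differ.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical extra field carrying request timing, if the caller supplied one.
        duration = getattr(record, "duration_ms", None)
        if duration is not None:
            entry["duration_ms"] = duration
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def configure_logging(level: str = "INFO") -> logging.Logger:
    """Attach the JSON formatter to a dedicated logger (the name is illustrative)."""
    logger = logging.getLogger("documind")
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(level)
    return logger
```

A call such as `configure_logging("DEBUG").info("processed", extra={"duration_ms": 42})` would then emit one JSON line per event.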

🤝 Contributing

Development Workflow

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/amazing-feature
  3. Commit changes: git commit -m 'Add amazing feature'
  4. Push to branch: git push origin feature/amazing-feature
  5. Open Pull Request

Code Standards

  • Python: Black, isort, flake8
  • TypeScript: ESLint, Prettier
  • CSS: TailwindCSS best practices
  • Testing: Unit and integration tests

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

Documentation

  • API Docs: /docs endpoint when backend is running
  • Component Library: Frontend components documentation
  • Deployment Guides: Platform-specific deployment instructions

Community

  • Issues: GitHub Issues for bug reports
  • Discussions: GitHub Discussions for questions
  • Wiki: Project wiki for detailed guides

Contact

🙏 Acknowledgments

  • OpenAI for GPT-4o-mini API
  • PaddlePaddle for OCR capabilities
  • Ultralytics for YOLOv8 models
  • Firebase for authentication and storage
  • Vercel for frontend hosting
  • Render/Railway for backend hosting

Built with ❤️ for intelligent document processing
