A full-stack AI-powered document reading and analysis SaaS application built with FastAPI, Next.js, and Firebase.
- Document Processing: Upload PDF, JPG, PNG files for AI analysis
- OCR Extraction: Advanced text extraction using PaddleOCR
- AI Summarization: GPT-4o-mini powered document summaries
- Interactive Chat: Ask questions about your documents
- Real-time Processing: Live progress tracking with beautiful dashboard
- Live Updates: Real-time status updates and progress monitoring
- Step-by-step Tracking: Visual progress through each processing stage
- Cloud Storage: Automatic Firebase Storage integration
- Authentication: Firebase Auth with Google Sign-In
- Responsive Design: Modern SaaS UI with TailwindCSS
- Production Ready: Deployment configurations for all platforms
DocuMind AI/
├── backend/               # FastAPI Backend
│   ├── main.py            # Main application
│   ├── models/            # AI models (OCR, LLM, CV)
│   ├── utils/             # Firebase storage utilities
│   ├── requirements.txt   # Python dependencies
│   ├── Dockerfile         # Container configuration
│   └── Procfile           # Platform deployment
├── frontend/              # Next.js Frontend
│   ├── app/               # App Router pages
│   ├── components/        # React components
│   ├── contexts/          # Authentication context
│   ├── utils/             # Utilities and configs
│   ├── package.json       # Node.js dependencies
│   └── vercel.json        # Vercel deployment
└── README.md              # This file
- Python 3.11+ for backend
- Node.js 18+ for frontend
- Firebase Project for authentication and storage
- OpenAI API Key for AI processing
git clone <your-repo-url>
cd DocuMind
cd backend
# Install dependencies
pip install -r requirements.txt
# Copy environment file
cp env_example.txt .env
# Edit .env with your API keys
# OPENAI_API_KEY=your_key
# FIREBASE_SERVICE_ACCOUNT_JSON={"type":"service_account",...}
# Run backend
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
cd frontend
# Install dependencies
npm install
# Copy environment file
cp env.example .env.local
# Edit .env.local with Firebase config
# NEXT_PUBLIC_FIREBASE_API_KEY=your_key
# NEXT_PUBLIC_BACKEND_API_URL=http://localhost:8000
# Run frontend
npm run dev
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Go to Firebase Console
- Create new project or select existing
- Enable Authentication (Email/Password + Google)
- Enable Storage with test rules
- Project Settings → General → Your Apps
- Add Web App
- Copy configuration to frontend .env.local
- Project Settings → Service Accounts
- Generate new private key
- Copy JSON content to backend .env
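Before starting the backend, it can help to confirm that the service account JSON pasted into .env parses and has the expected shape. The following is a minimal sketch, not the project's actual startup code: the helper names are hypothetical, and checking only these four fields is an assumption made for illustration.

```python
import json
import os

# Fields found in a Firebase service account key file. Checking only
# these four is an assumption made for this sketch.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_service_account(raw: str) -> dict:
    """Parse a FIREBASE_SERVICE_ACCOUNT_JSON value and check its basic shape."""
    try:
        info = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"FIREBASE_SERVICE_ACCOUNT_JSON is not valid JSON: {exc}") from exc
    if not isinstance(info, dict):
        raise ValueError("FIREBASE_SERVICE_ACCOUNT_JSON must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if not info.get(field)]
    if missing:
        raise ValueError(f"service account JSON is missing: {', '.join(missing)}")
    if info["type"] != "service_account":
        raise ValueError("'type' must be 'service_account'")
    return info


def load_service_account_from_env() -> dict:
    """Read the value from the environment (hypothetical helper)."""
    raw = os.environ.get("FIREBASE_SERVICE_ACCOUNT_JSON", "")
    if not raw:
        raise ValueError("FIREBASE_SERVICE_ACCOUNT_JSON is not set")
    return load_service_account(raw)
```

Calling load_service_account_from_env() once at startup would surface a malformed key as a clear error instead of a failure on the first storage call.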
- Connect GitHub repository
- Set build command: pip install -r requirements.txt
- Set start command: uvicorn main:app --host 0.0.0.0 --port $PORT
- Add environment variables
- Deploy automatically
- Import GitHub repository
- Configure as web service
- Set environment variables
- Deploy with automatic builds
# Build image
docker build -t documind-backend .
# Run container
docker run -p 8000:8000 --env-file .env documind-backend
- Import GitHub repository
- Add environment variables
- Deploy automatically on push
- Custom domain configuration
- Netlify: Similar to Vercel
- AWS Amplify: Full-stack solution
- Docker: Containerized deployment
- POST /process-document - Upload and process documents
- POST /ask-question - Ask questions about extracted text
- GET /health - System health check
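The /ask-question endpoint listed above can also be called from Python. This is a rough sketch using only the standard library; the payload fields (question, extracted_text) come from this README, while the function names and the assumption that the response body is JSON are illustrative and should be checked against the /docs page.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local backend address from the setup steps


def build_ask_request(question: str, extracted_text: str,
                      base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /ask-question without sending it."""
    payload = json.dumps(
        {"question": question, "extracted_text": extracted_text}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask-question",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_question(question: str, extracted_text: str, timeout: float = 30.0):
    """Send the request and decode the body.

    Assumption: the backend replies with JSON; the exact response
    fields are not documented here.
    """
    request = build_ask_request(question, extracted_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without a running backend.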
# Process document
curl -X POST "http://localhost:8000/process-document" \
-H "Content-Type: multipart/form-data" \
-F "file=@document.pdf"
# Ask question
curl -X POST "http://localhost:8000/ask-question" \
-H "Content-Type: application/json" \
-d '{"question": "What is the main topic?", "extracted_text": "..."}'
cd backend
# Install development dependencies
pip install -r requirements.txt
# Run with auto-reload
uvicorn main:app --reload --host 0.0.0.0 --port 8000
# Run tests (when implemented)
pytest
# Code formatting
black .
isort .
cd frontend
# Install dependencies
npm install
# Run development server
npm run dev
# Build for production
npm run build
# Run linting
npm run lint
# Backend
cd backend
docker-compose up --build
# Frontend
cd frontend
docker build -t documind-frontend .
docker run -p 3000:3000 documind-frontend
OPENAI_API_KEY=your_openai_api_key
FIREBASE_SERVICE_ACCOUNT_JSON={"type":"service_account",...}
FIREBASE_STORAGE_BUCKET=your_project_id.appspot.com
HOST=0.0.0.0
PORT=8000
ALLOWED_ORIGINS=http://localhost:3000,https://your-domain.vercel.app
NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your_project_id.firebaseapp.com
NEXT_PUBLIC_FIREBASE_PROJECT_ID=your_project_id
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project_id.appspot.com
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=your_messaging_sender_id
NEXT_PUBLIC_FIREBASE_APP_ID=your_app_id
NEXT_PUBLIC_BACKEND_API_URL=http://localhost:8000
- Supported Formats: PDF, JPG, JPEG, PNG
- AI Analysis: Automatic text extraction and summarization
- File Storage: Cloud storage with Firebase
- Real-time Progress: Live processing dashboard with step-by-step tracking
- Live Updates: Polling-based status updates that approximate a real-time feed
- Error Handling: Comprehensive error tracking and user feedback
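The polling approach behind the live updates can be sketched as a loop that repeatedly fetches a status until processing reaches a terminal state. This is only an illustration: the status dictionary shape ("state", "step") and the function names are hypothetical, and this README does not document a status endpoint.

```python
import time
from typing import Callable, Dict, Optional


def poll_until_done(
    fetch_status: Callable[[], Dict],
    interval: float = 1.0,
    timeout: float = 120.0,
    on_update: Optional[Callable[[Dict], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_status repeatedly until it reports a terminal state.

    Assumption: each status is a dict with a "state" key whose terminal
    values are "completed" or "failed".
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if on_update is not None:
            on_update(status)  # e.g. refresh a progress bar
        if status.get("state") in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("processing did not finish before the timeout")
        sleep(interval)
```

Passing the fetch and sleep functions in as arguments keeps the loop independent of any HTTP client and makes it testable with canned statuses.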
- OCR: Advanced text recognition with PaddleOCR
- Summarization: GPT-4o-mini powered document summaries
- Q&A: Interactive chat about document content
- Document Detection: YOLOv8 for image preprocessing
- Authentication: Secure login with Firebase Auth
- Responsive Design: Mobile-first SaaS interface
- Real-time Updates: Live chat and processing feedback
- Processing Dashboard: Beautiful real-time progress tracking
- Loading Animations: Professional loading states and transitions
- Modern UI: Glass-morphism design with TailwindCSS
- Async Processing: Non-blocking document processing
- Model Caching: Efficient AI model loading
- Memory Management: Automatic cleanup of temporary files
- Error Handling: Graceful fallbacks and retries
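One common way to get "graceful fallbacks and retries" is exponential backoff with an optional fallback once all attempts fail. The sketch below shows that pattern in general terms; it is an assumption about how such handling could look, not a description of the backend's actual code, and the retry helper and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    fallback: Optional[Callable[[Exception], T]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation up to `attempts` times, doubling the delay between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:  # a real service would catch narrower errors
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    assert last_error is not None
    if fallback is not None:
        return fallback(last_error)  # degrade gracefully instead of raising
    raise last_error
```

A fallback might, for example, return the raw OCR text when a summarization call keeps failing.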
- Code Splitting: Automatic route-based code splitting
- Image Optimization: Next.js built-in image optimization
- Bundle Analysis: Webpack bundle analyzer
- Performance Monitoring: Core Web Vitals tracking
- Firebase Auth: Industry-standard authentication
- JWT Tokens: Secure session management
- Role-based Access: User permission management
- Secure Storage: Environment variable protection
- CORS Protection: Configurable cross-origin policies
- Input Validation: Pydantic model validation
- File Upload Security: Type and size validation
- Rate Limiting: API abuse prevention
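As an illustration of the type and size validation mentioned above, the following sketch checks an upload's extension, size, and leading bytes. The allowed formats match the supported formats in this README; the 10 MB limit and the validate_upload helper are assumptions made for the example.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed limit; not stated in this README

# Leading bytes ("magic numbers") of each supported format.
SIGNATURES = {
    ".pdf": b"%PDF",
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
}


def validate_upload(filename: str, content: bytes) -> str:
    """Return the normalized extension, or raise ValueError if the file is rejected."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    if not content:
        raise ValueError("file is empty")
    if len(content) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the size limit")
    if not content.startswith(SIGNATURES[extension]):
        raise ValueError("file content does not match its extension")
    return extension
```

Checking the leading bytes as well as the extension catches files that were simply renamed to a supported type.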
- Encrypted Storage: Firebase Storage encryption
- Secure Communication: HTTPS enforcement
- Data Privacy: GDPR compliance considerations
- Audit Logging: Access and modification tracking
- System Status: Model availability monitoring
- Performance Metrics: Response time tracking
- Error Rates: Failure rate monitoring
- Resource Usage: Memory and CPU monitoring
- Structured Logging: JSON format logs
- Log Levels: Configurable logging verbosity
- Error Tracking: Detailed error information
- Performance Logs: Request/response timing
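Structured JSON logs with a configurable level can be produced with Python's standard logging module. This is a minimal sketch under stated assumptions: the JsonFormatter class, the get_logger helper, and the duration_ms field are hypothetical names chosen for the example, not the backend's actual logging setup.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional request timing, supplied via `extra={"duration_ms": ...}`.
        duration = getattr(record, "duration_ms", None)
        if duration is not None:
            entry["duration_ms"] = duration
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def get_logger(name: str = "documind", level: str = "INFO") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr at the given level."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```

A request handler could then log timing with get_logger().info("processed document", extra={"duration_ms": 12.5}).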
- Fork the repository
- Create feature branch: git checkout -b feature/amazing-feature
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open Pull Request
- Python: Black, isort, flake8
- TypeScript: ESLint, Prettier
- CSS: TailwindCSS best practices
- Testing: Unit and integration tests
This project is licensed under the MIT License - see the LICENSE file for details.
- API Docs: /docs endpoint when backend is running
- Component Library: Frontend components documentation
- Deployment Guides: Platform-specific deployment instructions
- Issues: GitHub Issues for bug reports
- Discussions: GitHub Discussions for questions
- Wiki: Project wiki for detailed guides
- Email: [your-email@domain.com]
- GitHub: [your-github-username]
- Website: [your-website.com]
- OpenAI for GPT-4o-mini API
- PaddlePaddle for OCR capabilities
- Ultralytics for YOLOv8 models
- Firebase for authentication and storage
- Vercel for frontend hosting
- Render/Railway for backend hosting
Built with ❤️ for intelligent document processing