A production-ready API service that analyzes sentiment in Reddit content using state-of-the-art NLP models with comprehensive MLOps infrastructure.
- **Real-time Sentiment Analysis**
  - Single text prediction
  - Batch prediction support
  - Subreddit analysis
  - User comment analysis
  - URL-based analysis
  - Cross-subreddit trend analysis
- **Advanced ML Pipeline**
  - BERT-based sentiment classifier
  - Custom feature engineering
  - MLflow experiment tracking
  - Model versioning and registry
  - Automated model evaluation
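To illustrate what the custom feature engineering step might look like, here is a minimal sketch; the feature names and choices below are illustrative assumptions, not the project's actual pipeline:

```python
import re


def extract_features(text: str) -> dict:
    """Toy hand-crafted features for a piece of Reddit text.

    These features are illustrative assumptions, not the repository's
    actual feature set.
    """
    words = text.split()
    return {
        "char_count": len(text),
        "word_count": len(words),
        "exclamation_count": text.count("!"),
        "uppercase_ratio": sum(c.isupper() for c in text) / max(len(text), 1),
        "url_count": len(re.findall(r"https?://\S+", text)),
    }


features = extract_features("This keyboard is AMAZING!! https://example.com")
```

Features like these are typically concatenated with the BERT embedding before the final classification layer.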
- **Production-Ready Infrastructure**
  - FastAPI service
  - Docker containerization
  - Prometheus metrics
  - Grafana dashboards
  - Comprehensive testing
  - Extensive monitoring
- Python 3.12
- Docker & Docker Compose
- Reddit API credentials
1. Clone the repository:

   ```bash
   git clone https://github.com/pedroscortes/sentiment-analysis-api.git
   cd sentiment-analysis-api
   ```

2. Create and activate a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # Linux/Mac
   # or
   .\venv\Scripts\activate   # Windows
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Set up environment variables:

   ```bash
   cp .env.example .env
   # Edit .env with your Reddit API credentials and other settings
   ```

To start the full stack with Docker Compose:

```bash
docker-compose up -d
```

The API will be available at http://localhost:8000.
- Start the API service:

  ```bash
  python -m src.api.main
  ```

- Visit http://localhost:8000/docs for the Swagger UI documentation
- Metrics dashboard: http://localhost:3000 (Grafana)
- Prometheus metrics: http://localhost:9090
Run the test suite:

```bash
pytest tests/ -v --cov=src --cov-report=term-missing
```

- `POST /api/v1/analyze/text` - Analyze single text
- `POST /api/v1/analyze/batch` - Batch text analysis
- `POST /api/v1/analyze/subreddit` - Analyze subreddit
- `POST /api/v1/analyze/user` - Analyze user comments
- `POST /api/v1/analyze/url` - Analyze URL content
- `GET /api/v1/analyze/trends` - Get sentiment trends
- `GET /health` - Service health check
- `GET /metrics` - Prometheus metrics
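A client call to the single-text endpoint might look like the sketch below. The payload and response fields are assumptions; consult the Swagger UI at `/docs` for the real schema.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical request body for POST /api/v1/analyze/text; the field
# name "text" is an assumption -- check http://localhost:8000/docs
# for the actual schema.
payload = {"text": "I love this new keyboard!"}


def analyze_text(base_url: str = "http://localhost:8000") -> dict:
    """Send one text to the analyze endpoint and return the parsed JSON."""
    req = Request(
        f"{base_url}/api/v1/analyze/text",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:  # requires the API to be running locally
        return json.loads(resp.read())


if __name__ == "__main__":
    print(analyze_text())
```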
```
sentiment-analysis-api/
├── src/
│   ├── api/          # FastAPI service
│   ├── data/         # Data processing
│   ├── models/       # ML models
│   └── monitoring/   # Metrics & monitoring
├── tests/            # Test suites
├── docker/           # Docker configurations
├── notebooks/        # Development notebooks
└── mlruns/           # MLflow experiments
```
- Request metrics
  - Request count by endpoint
  - Latency distribution
  - Error rates
- Model metrics
  - Sentiment distribution
  - Confidence scores
  - Prediction timings
- System metrics
  - Memory usage
  - CPU utilization
  - Model load times
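Conceptually, the request metrics reduce to counters and latency samples keyed by endpoint. The service presumably exposes these via the Prometheus client library; the stdlib-only stand-in below is just an illustration of the idea:

```python
import time
from collections import Counter, defaultdict

# Minimal in-process metrics store -- a stand-in for Prometheus
# counters and histograms, for illustration only.
request_count: Counter = Counter()
error_count: Counter = Counter()
latencies: defaultdict = defaultdict(list)


def timed(endpoint: str, handler, *args, **kwargs):
    """Run a handler while recording request count, latency, and errors."""
    start = time.perf_counter()
    try:
        return handler(*args, **kwargs)
    except Exception:
        error_count[endpoint] += 1
        raise
    finally:
        request_count[endpoint] += 1
        latencies[endpoint].append(time.perf_counter() - start)


result = timed(
    "/api/v1/analyze/text",
    lambda text: {"sentiment": "positive"},  # hypothetical handler
    "great!",
)
```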
- **API Framework:** FastAPI
- **ML Framework:** PyTorch, Transformers
- **Data Processing:** NLTK
- **Monitoring:** Prometheus, Grafana
- **Experiment Tracking:** MLflow
- **Testing:** pytest
- **Container:** Docker
- **Documentation:** FastAPI Swagger UI
Feel free to open a PR or send me any suggestions!