Ecommerce-SQL-Churn-Prediction

A full-stack SQL and machine learning project that simulates a real-world e-commerce data pipeline. From relational database design and advanced SQL analytics to customer churn prediction using Python, this project showcases end-to-end data-driven decision making.

Project Overview

This project analyzes synthetic e-commerce transactional data and uses SQL and Python to answer key business questions. It also builds a churn prediction model based on customer behavior and purchasing patterns.

Database Schema

The database is modeled using standard relational principles and contains five normalized tables:

Customers: Customer demographics and region
Products: Product metadata including category and price
Orders: High-level order details (status, date)
Order_Items: Line-level purchase details (quantity, price_each)
Payments: Payment amounts and methods

SQL table creation and data loading are defined in sql/create_all.sql.

Project Structure

E-Commerce-Sales-Analytics-Dashboard/
├── data/
│   ├── churn_features.csv         # Engineered features for ML
│   ├── churn_predictions.csv      # Model predictions
│   ├── customers.csv
│   ├── order_items.csv
│   ├── orders.csv
│   ├── payments.csv
│   └── products.csv
├── ml/
│   └── churn_prediction.py        # Python churn model
├── sql/
│   ├── create_all.sql             # Schema + table definitions
│   └── analytical_queries.sql     # Business-focused SQL insights
├── README.md

SQL Analysis Highlights

All analytical queries are found in sql/analytical_queries.sql. Key questions answered:

💰 Total revenue by product category
👑 Top customers by total spend
📈 Monthly revenue trends
🌍 Average order value by customer region
🔁 Repeat customer detection

These queries utilize complex JOINs, GROUP BY, subqueries, and aggregations to surface business-critical insights.

Customer Churn Prediction (ML)

Built using RandomForestClassifier, the churn model predicts customer dropout based on:

days_since_last_order
total_orders
total_spent

ML Workflow (`ml/churn_prediction.py`)

Load SQL-exported features (churn_features.csv)
Define churn: days_since_last_order > 90
Split into train/test, train a model, evaluate with confusion matrix
Predict churn probability and save results to churn_predictions.csv
Visualize feature importances and churn risk

Results Sample

Confusion Matrix:

Perfect model performance on test set
Model achieved perfect performance on the test set, suggesting strong signal in the features — though this may also indicate potential overfitting due to data simplicity or class separation.

Predicted
     0   1
  ----------
0 | 36 | 0
1 |  0 | 51

Top Features:

days_since_last_order = strongest signal
total_orders = valuable customer indicator

🛠️ Tools & Technologies

SQL: MySQL Workbench
Python: pandas, scikit-learn, matplotlib, seaborn
Version Control: Git & GitHub
Development: VS Code

Future Improvements

Deploy model as a Flask API or Streamlit dashboard
Add clustering to segment users further
Incorporate product return rates or promo codes
Replace synthetic data with anonymized real-world set

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Ecommerce-SQL-Churn-Prediction

Project Overview

Database Schema

Project Structure

SQL Analysis Highlights

Customer Churn Prediction (ML)

ML Workflow (`ml/churn_prediction.py`)

Results Sample

🛠️ Tools & Technologies

Future Improvements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
data		data
ml		ml
sql		sql
.gitattributes		.gitattributes
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

Ecommerce-SQL-Churn-Prediction

Project Overview

Database Schema

Project Structure

SQL Analysis Highlights

Customer Churn Prediction (ML)

ML Workflow (ml/churn_prediction.py)

Results Sample

🛠️ Tools & Technologies

Future Improvements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

ML Workflow (`ml/churn_prediction.py`)

Packages