Online shoppers often rely on ratings, reviews, and discounts to make decisions — but how reliable are these signals? In this project, I analyse an Amazon products dataset to explore:
- How review volume affects rating reliability
- Whether heavy discounts imply poor product quality
- The relationship between price, discount, and customer satisfaction
- Category-level pricing and rating behaviour
The project emphasises business-driven EDA, not prediction. Its objectives are to:
- Perform structured exploratory data analysis using Python
- Identify counterintuitive patterns in customer behaviour
- Visualise relationships between price, discount, rating, and review volume
- Translate insights into business-relevant conclusions
Dataset columns:
- product_name
- category
- actual_price
- discounted_price
- discount_percentage
- rating
- rating_count
Tools:
- Python
- Pandas (data manipulation & aggregation)
- Matplotlib & Seaborn (visualisation)
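Before any aggregation, the numeric columns listed above need to be real numbers. The sketch below shows one way to do that with Pandas. It assumes, without confirmation from the dataset itself, that prices, percentages, and counts may arrive as strings such as "₹1,099", "64%", or "24,269". The helper name `coerce_numeric_columns` and the sample rows are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Columns from the dataset description that should hold numbers.
NUMERIC_COLUMNS = [
    "actual_price",
    "discounted_price",
    "discount_percentage",
    "rating",
    "rating_count",
]


def coerce_numeric_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the numeric columns parsed as floats.

    Assumption: values may be strings carrying currency symbols,
    thousands separators, or a percent sign. Anything that still
    cannot be parsed becomes NaN instead of raising an error.
    """
    out = df.copy()
    for col in NUMERIC_COLUMNS:
        # Keep only digits and the decimal point, then parse.
        cleaned = out[col].astype(str).str.replace(r"[^0-9.]", "", regex=True)
        out[col] = pd.to_numeric(cleaned, errors="coerce")
    return out


# Hypothetical rows, invented purely to exercise the helper.
sample = pd.DataFrame(
    {
        "product_name": ["Cable A", "Cable B"],
        "category": ["Electronics", "Electronics"],
        "actual_price": ["₹1,099", "₹349"],
        "discounted_price": ["₹399", "₹199"],
        "discount_percentage": ["64%", "43%"],
        "rating": ["4.2", "4.0"],
        "rating_count": ["24,269", "43,994"],
    }
)
clean = coerce_numeric_columns(sample)
```

If the spreadsheet already stores these columns as numbers, the helper leaves the values unchanged apart from casting them to floats.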
🔹 1. Rating Reliability vs Review Volume
🔹 2. Ratings of Highly Discounted Products (>50%)
🔹 3. Price, Rating & Discount Interaction
🔹 4. Category-Level Insights
🔹 5. Review Behaviour Analysis
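The first two analyses can be sketched roughly as follows. This is a minimal illustration and not the exact code from the notebooks: the function names, the review-volume bucket edges, and the demo rows are all assumptions chosen for the example. Only the 50% discount threshold comes from the section title above.

```python
import pandas as pd


def rating_spread_by_volume(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise ratings within review-volume buckets.

    A large standard deviation in the low-volume buckets suggests
    that ratings backed by few reviews are less reliable.
    The bucket edges are an assumption, not taken from the project.
    """
    bins = [0, 100, 1_000, 10_000, float("inf")]
    labels = ["0-100", "101-1,000", "1,001-10,000", "10,001+"]
    bucket = pd.cut(
        df["rating_count"], bins=bins, labels=labels, include_lowest=True
    )
    # observed=False keeps empty buckets in the result.
    return df.groupby(bucket, observed=False)["rating"].agg(
        ["count", "mean", "std"]
    )


def compare_heavy_discounts(df: pd.DataFrame, threshold: float = 50) -> pd.DataFrame:
    """Compare ratings of products discounted above threshold vs the rest."""
    is_heavy = df["discount_percentage"] > threshold
    group = is_heavy.map({True: "heavy", False: "other"})
    return df.groupby(group)["rating"].agg(["count", "mean", "median"])


# Hypothetical rows, invented purely to exercise the two helpers.
demo = pd.DataFrame(
    {
        "rating": [4.8, 3.0, 4.1, 4.3, 4.2, 4.2],
        "rating_count": [5, 12, 500, 800, 20_000, 50_000],
        "discount_percentage": [70, 60, 20, 55, 10, 30],
    }
)

spread = rating_spread_by_volume(demo)
discounts = compare_heavy_discounts(demo)
```

Reporting `count` next to `mean` is deliberate: an average over a handful of products or reviews should be read with more caution than one backed by thousands.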
📁 Project Structure
Amazon/
│
├── data/
│   └── amazon_data.xlsx
│
├── notebooks/
│   └── Amazon.ipynb
│
├── visuals/
│   └── Amazon Visuals.ipynb
│
├── README.md
📊 EDA is crucial to assess data reliability, not just averages