This project demonstrates a Student Performance Prediction and Analysis system built with Python in a Google Colab environment. It applies data analysis and machine learning, specifically linear regression, to predict a student's final score from study-related metrics such as study hours, attendance, previous exam results, and assignments completed.

The project walks through the following steps:
- Data Loading and Exploration – Load and inspect the dataset.
- Data Cleaning and Preprocessing – Handle missing values, duplicates, and data types.
- Visualization – Visualize relationships between metrics using charts and plots.
- Model Building – Use Linear Regression to predict final scores.
- Evaluation – Measure model performance using MAE, MSE, and R² metrics.
- Prediction Tool – Predict student performance interactively by entering input metrics.

Tools and libraries used:

- **Python**
- **Google Colab**
- Pandas for data handling
- NumPy for numerical operations
- Matplotlib & Seaborn for data visualization
- Scikit-learn (sklearn) for model training and evaluation
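
The cleaning, model-building, and evaluation steps listed above can be sketched with these libraries roughly as follows. This is a minimal sketch, not the project's actual notebook: the column names (`study_hours`, `attendance`, `previous_score`, `assignments_completed`, `final_score`) are assumptions, the data is synthetic so the example runs without the real dataset, and dropping rows with missing values is only one of several reasonable cleaning choices.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical column names; the real dataset may use different ones.
FEATURES = ["study_hours", "attendance", "previous_score", "assignments_completed"]
TARGET = "final_score"


def make_synthetic_data(n_rows=200, seed=0):
    """Generate placeholder data so the sketch runs without the real CSV."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "study_hours": rng.uniform(0, 20, n_rows),
        "attendance": rng.uniform(50, 100, n_rows),
        "previous_score": rng.uniform(30, 100, n_rows),
        "assignments_completed": rng.integers(0, 11, n_rows),
    })
    # Arbitrary linear relationship plus noise, for illustration only.
    df[TARGET] = (
        1.5 * df["study_hours"]
        + 0.2 * df["attendance"]
        + 0.4 * df["previous_score"]
        + 1.0 * df["assignments_completed"]
        + rng.normal(0, 3, n_rows)
    )
    return df


def clean(df):
    """Drop duplicates, coerce columns to numeric, and drop rows with missing values."""
    df = df.drop_duplicates()
    df = df[FEATURES + [TARGET]].apply(pd.to_numeric, errors="coerce")
    return df.dropna().reset_index(drop=True)


def train_and_evaluate(df, seed=42):
    """Fit a linear regression and compute MAE, MSE, and R² on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[FEATURES], df[TARGET], test_size=0.2, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)
    metrics = {
        "mae": mean_absolute_error(y_test, y_pred),
        "mse": mean_squared_error(y_test, y_pred),
        "r2": r2_score(y_test, y_pred),
    }
    return model, metrics


# Replace make_synthetic_data() with pd.read_csv(...) to use a real dataset.
df = clean(make_synthetic_data())
model, metrics = train_and_evaluate(df)
print(metrics)
```

Evaluating on a held-out test split rather than the training rows gives a more honest estimate of how the model would score on unseen students.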

The project includes the following visualizations:

- Study Hours vs Final Score – Shows the relationship between time studied and performance.
- Distribution of Final Scores – Displays how final scores are spread across students.
- Pairplot of All Metrics – Illustrates pairwise relationships among all input variables.
You can input your own values for study hours, attendance, and other metrics to get a predicted final score. Example usage:

    Enter number of study hours: 10
    Enter attendance percentage: 85
    Enter previous exam score: 70
    Enter number of assignments done: 8
    Predicted Final Score: 75.3
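
An interactive tool like the one shown above might be structured as follows. This is a hedged sketch rather than the project's actual code: the function names `predict_final_score` and `prompt_and_predict`, the column names, and the tiny made-up training set are all assumptions for illustration, so the predicted value will differ from the 75.3 in the example.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical feature names; match these to the real dataset's columns.
FEATURES = ["study_hours", "attendance", "previous_score", "assignments_completed"]


def predict_final_score(model, study_hours, attendance, previous_score,
                        assignments_completed):
    """Return the model's predicted final score for one student."""
    row = pd.DataFrame(
        [[study_hours, attendance, previous_score, assignments_completed]],
        columns=FEATURES,
    )
    return float(model.predict(row)[0])


def prompt_and_predict(model, input_fn=input):
    """Ask for each metric in turn, then print and return the prediction."""
    study_hours = float(input_fn("Enter number of study hours: "))
    attendance = float(input_fn("Enter attendance percentage: "))
    previous_score = float(input_fn("Enter previous exam score: "))
    assignments = float(input_fn("Enter number of assignments done: "))
    score = predict_final_score(
        model, study_hours, attendance, previous_score, assignments
    )
    print(f"Predicted Final Score: {score:.1f}")
    return score


# Tiny made-up training set so the sketch runs on its own.
train = pd.DataFrame(
    [
        [2, 60, 50, 3],
        [5, 70, 55, 5],
        [8, 90, 65, 6],
        [12, 80, 75, 9],
        [15, 95, 85, 10],
        [4, 65, 70, 2],
        [10, 75, 60, 8],
    ],
    columns=FEATURES,
)
# Targets follow an arbitrary exact linear rule, chosen only for illustration.
train["final_score"] = (
    2.0 * train["study_hours"]
    + 0.3 * train["attendance"]
    + 0.4 * train["previous_score"]
    + 1.0 * train["assignments_completed"]
)
model = LinearRegression().fit(train[FEATURES], train["final_score"])
```

Calling `prompt_and_predict(model)` in a notebook cell starts the prompts. Passing the model and the input function as parameters keeps the prediction logic testable without keyboard input.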