Predicting passenger survival on the Titanic using machine learning techniques. This project aims to analyze various passenger features to build a predictive model.
- About the Project 💻
- Project Workflow 📚
- Built With 🖥️
- Getting Started 🚀
- Usage 📋
- Contributing 🤝
- License 📄
- Acknowledgements 🙏
- Contact ☎️
The sinking of the RMS Titanic is a well-known historical tragedy. This project leverages machine learning to predict whether a passenger survived based on features like age, sex, class, and family relations. It's a classic introductory project for anyone diving into data science and predictive modeling.
Key aspects of this project:
- Data cleaning and preprocessing.
- Exploratory Data Analysis (EDA) to understand the dataset.
- Feature engineering to improve model performance.
- Training and evaluating various machine learning models.
- Data Collection and Overview:
  - Gathering the Titanic dataset and understanding its structure.
- Data Preprocessing and Cleaning:
  - Handling missing values, outliers, and converting categorical data.
- Exploratory Data Analysis (EDA):
  - Visualizing data, identifying patterns, and understanding feature relationships.
- Feature Engineering:
  - Creating new features (e.g., family size, title extraction) to enhance model accuracy.
- Model Selection and Training:
  - Splitting data into training and testing sets.
  - Training models like Logistic Regression, Random Forests, and Gradient Boosting.
- Model Evaluation and Performance Metrics:
  - Evaluating models using metrics like accuracy, precision, recall, and F1-score.
  - Hyperparameter tuning for optimal performance.
- Conclusion and Results:
  - Summarizing model performance and identifying important features.
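The workflow above can be sketched end to end in a few lines. This is a minimal illustration, not the notebook's actual code: the data is randomly generated (it is not the real Titanic dataset), the column names (`Name`, `Sex`, `Pclass`, `Age`, `SibSp`, `Parch`, `Survived`) assume the common Kaggle layout, and the hyperparameter grid is an arbitrary example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the dataset (NOT real Titanic records).
# Column names are an assumption based on the usual Kaggle layout.
rng = np.random.default_rng(0)
n = 300
sex = rng.choice(["male", "female"], size=n)
title = np.where(sex == "male", "Mr", rng.choice(["Mrs", "Miss"], size=n))
df = pd.DataFrame({
    "Name": [f"Doe, {t}. Passenger {i}" for i, t in enumerate(title)],
    "Sex": sex,
    "Pclass": rng.integers(1, 4, size=n),
    "Age": rng.uniform(1, 70, size=n),
    "SibSp": rng.integers(0, 3, size=n),
    "Parch": rng.integers(0, 3, size=n),
})
df.loc[rng.choice(n, size=40, replace=False), "Age"] = np.nan  # simulate gaps
# Made-up outcome loosely tied to sex and class so the models have some signal.
p_survive = 0.2 + 0.5 * (df["Sex"] == "female") + 0.1 * (df["Pclass"] == 1)
df["Survived"] = (rng.random(n) < p_survive).astype(int)

# Preprocessing: fill missing ages with the median.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Feature engineering: family size and the title embedded in the name.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1
df["Title"] = df["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()

# Convert categorical columns to indicator columns.
X = pd.get_dummies(
    df[["Pclass", "Sex", "Age", "FamilySize", "Title"]],
    columns=["Sex", "Title"], drop_first=True, dtype=int,
)
y = df["Survived"]

# Split, then train and score each model on the held-out set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

# Hyperparameter tuning: a small, arbitrary grid over tree depth.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"max_depth": [3, 5, None]}, cv=3, scoring="f1",
)
search.fit(X_train, y_train)

print(scores)
print(search.best_params_)
```

With the real dataset, the synthetic block would be replaced by loading the CSV, and the scores would reflect actual passenger data rather than the made-up signal used here.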
- Python
- Pandas
- NumPy
- Scikit-learn
- Matplotlib
- Seaborn
Follow these steps to set up the project locally.
- Python 3.x installed.
- Pip package manager.
- Clone the repository:
    git clone https://github.com/YourUsername/titanic-survival-prediction.git
- Navigate to the project directory:
    cd titanic-survival-prediction
- Create a virtual environment (recommended):
    python3 -m venv venv
    source venv/bin/activate   # On macOS and Linux
    venv\Scripts\activate      # On Windows
- Install the required packages:
    pip install -r requirements.txt
- Run the Jupyter notebook:
    jupyter notebook titanic_notebook.ipynb
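
If the notebook fails on an import, a quick check of the active environment can help narrow it down. The sketch below looks up the import names of the libraries listed under Built With. Whether `requirements.txt` pins exactly these packages is an assumption, and `missing_packages` is a hypothetical helper, not part of the repository.

```python
from importlib.util import find_spec

# Import names for the libraries listed under "Built With".
# Assumption: requirements.txt installs these same packages.
REQUIRED = ["pandas", "numpy", "sklearn", "matplotlib", "seaborn"]


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print(f"Missing: {', '.join(missing)}")
    else:
        print("All packages found.")
```

Run it with the virtual environment activated; any name it reports can usually be fixed by re-running the install step.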