Mathematics-Machine-Learning

This repository contains an individual final project for the course Mathematics in Machine Learning, part of the MSc in Data Science and Engineering at Politecnico di Torino.

The project develops an end-to-end machine learning pipeline for a binary classification task based on the HCV (Hepatitis C Virus) diagnostic dataset from the UCI Machine Learning Repository, which contains blood test measurements used for medical classification.

The work focuses not only on predictive performance but also on the mathematical structure and assumptions underlying each stage of the modeling process: data preprocessing, model formulation, and a comparative evaluation of different classification approaches.

Project Structure

1. Jupyter Notebook

  • End-to-end implementation of the workflow:
    • Data exploration and preprocessing
    • Model training
    • Performance evaluation
    • Comparative analysis across multiple models
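The workflow steps listed above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the notebook's actual code: the README does not name the models used, so logistic regression trained by batch gradient descent is an assumption chosen for illustration, the two-feature toy data is made up (it is not the HCV dataset), and every function name here is hypothetical.

```python
import math

def fit_scaler(rows):
    """Return per-column mean and population standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def transform(rows, means, stds):
    """Z-score rows using statistics estimated on the training set only."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Minimize the mean log-loss with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(X, w, b, threshold=0.5):
    """Label a row 1 when its predicted probability reaches the threshold."""
    return [int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold)
            for xi in X]

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1, returning 0.0 where undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up toy data (NOT the HCV dataset): two features, binary label.
X_train = [[1.0, 20.0], [1.5, 22.0], [2.0, 19.0], [1.2, 21.0],
           [6.0, 40.0], [6.5, 42.0], [7.0, 38.0], [5.8, 41.0]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_test, y_test = [[1.3, 20.5], [6.2, 39.0]], [0, 1]

# Preprocessing: fit the scaler on training data, then apply it to both splits.
means, stds = fit_scaler(X_train)
w, b = fit_logistic(transform(X_train, means, stds), y_train)

# Evaluation on the held-out rows.
test_pred = predict(transform(X_test, means, stds), w, b)
print("predictions:", test_pred)
print("precision, recall, F1:", precision_recall_f1(y_test, test_pred))
```

Fitting the scaler on the training split alone keeps information from the held-out rows out of the preprocessing step. A comparison across several models would reuse `precision_recall_f1` on each model's predictions over the same test split.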

2. Theoretical Report

  • In-depth treatment of the mathematical foundations:
    • Model assumptions
    • Key derivations
    • Theoretical justification of methods
    • Discussion of limitations and applicability

Key Skills & Techniques

  • Statistical modeling and inference
  • Optimization methods for machine learning
  • Model evaluation and performance metrics
  • Assumption validation and model diagnostics
  • Comparative analysis of machine learning models
  • Data preprocessing and feature engineering (Python, Jupyter)