SUBSCRIPTION PROBABILITY MODEL

Uses machine learning to develop models capable of predicting customer subscription from the current dataset.

The dataset is composed of 21 variables: 20 independent (X) and 1 dependent (y).

  • Analyses the spread of the current dataset
  • Calculates the correlation between each input variable and the output variable
  • Identifies the 5 x(i) variables most correlated with y
  • Model 1 - Linear regression model using all 20 inputs
  • Model 2 - Linear regression model using the 5 inputs most correlated with 'y'
  • Splits the dataset into training and test sets with a ratio of 9:1
  • Evaluates the accuracy of the machine learning models
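
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: it runs on synthetic data, the column names x1..x20 and y are hypothetical (the real column names in the CSV are not listed here), and the R² score from scikit-learn is assumed as the accuracy measure.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset: 20 inputs and 1 output.
# Column names x1..x20 and y are hypothetical placeholders.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(500, 20)),
                 columns=[f"x{i}" for i in range(1, 21)])
y = pd.Series(
    2.0 * X["x1"] - 1.5 * X["x2"] + X["x3"] + X["x4"] + X["x5"]
    + rng.normal(scale=0.1, size=500),
    name="y",
)

# Correlation of every input with the output, ranked by absolute value.
corr = X.corrwith(y)
top5 = corr.abs().sort_values(ascending=False).head(5).index.tolist()

# 9:1 split into training and test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

# Model 1: linear regression on all 20 inputs.
model_all = LinearRegression().fit(X_train, y_train)

# Model 2: linear regression on the 5 most correlated inputs only.
model_top5 = LinearRegression().fit(X_train[top5], y_train)

# Evaluate both models on the held-out test set (R^2, assumed metric).
r2_all = model_all.score(X_test, y_test)
r2_top5 = model_top5.score(X_test[top5], y_test)

print("Top 5 inputs:", top5)
print("R^2, 20 inputs:", round(r2_all, 3))
print("R^2, top 5 inputs:", round(r2_top5, 3))
```

Comparing the two scores shows how much predictive power is kept when the model is restricted to the five most correlated inputs.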

PREREQUISITES:

  1. The program is written in Python 3. Make sure a Python 3 interpreter is installed.
  2. Make sure the following libraries are installed:
     • pandas
     • numpy
     • matplotlib
     • scipy
     • sklearn (installed as the scikit-learn package)
  3. Place the following file in the same directory as main.py:
     • data set.csv
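
The prerequisites can be checked before running the script. The snippet below is an optional sketch and not part of the project; the helper names missing_modules and missing_files are hypothetical, and it assumes it is run from the directory containing main.py.

```python
import importlib.util
from pathlib import Path

# Import names of the libraries listed above, and the required data file.
REQUIRED_MODULES = ["pandas", "numpy", "matplotlib", "scipy", "sklearn"]
REQUIRED_FILES = ["data set.csv"]


def missing_modules(names):
    """Return the module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_files(names, directory="."):
    """Return the file names that do not exist in the given directory."""
    return [name for name in names if not (Path(directory) / name).is_file()]


if __name__ == "__main__":
    modules = missing_modules(REQUIRED_MODULES)
    files = missing_files(REQUIRED_FILES)
    if modules:
        print("Missing libraries:", ", ".join(modules))
    if files:
        print("Missing files:", ", ".join(files))
    if not modules and not files:
        print("All prerequisites found.")
```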

INSTRUCTIONS:

  1. Open a terminal.
  2. Set the current directory to the folder containing the main.py file.
  3. Run the script by typing python3 main.py
  4. The plots generated by the code are saved in the same directory as the main.py file.

Author

  • Jerin Philips Rajan