This repository contains a PyTorch implementation of a neural network for credit card fraud detection.
Accuracy: 0.9994
Precision: 0.8144
Recall: 0.8061
F1-score: 0.8103
The notebook walks through building and training a binary classification model on a dataset of anonymized credit card transactions. The goal is to identify fraudulent transactions.
The model is trained on a dataset that includes transaction details such as time, amount, and 28 anonymized features (V1-V28), along with a 'Class' label indicating whether the transaction is fraudulent (1) or not (0).
The model is a simple feedforward neural network with two hidden layers and dropout for regularization.
- Input Layer: Matches the number of features in the dataset.
- Hidden Layer 1: 64 neurons, followed by ReLU activation and dropout.
- Hidden Layer 2: 32 neurons, followed by ReLU activation and dropout.
- Output Layer: 1 neuron with no activation, so it outputs a raw logit (a sigmoid is applied during evaluation to obtain a probability).
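The architecture above could be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the dropout rate (`dropout_p=0.3`) is an assumed value, since it is not stated here, and the input size of 30 assumes that Time, Amount, and V1-V28 are all used as features.

```python
import torch
import torch.nn as nn


class FraudDetectionModel(nn.Module):
    """Feedforward binary classifier: input -> 64 -> 32 -> 1 logit."""

    def __init__(self, input_dim: int, dropout_p: float = 0.3):
        # dropout_p is an assumed value; the README does not state the rate.
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 64),
            nn.ReLU(),
            nn.Dropout(dropout_p),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Dropout(dropout_p),
            nn.Linear(32, 1),  # raw logit, no sigmoid inside the model
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# Assumed input size: Time + Amount + V1..V28 = 30 features.
model = FraudDetectionModel(input_dim=30)
model.eval()  # disable dropout for inference
with torch.no_grad():
    logits = model(torch.randn(4, 30))  # batch of 4 random rows
    probs = torch.sigmoid(logits)       # convert logits to probabilities
```

Keeping the sigmoid out of `forward` lets the same output feed a logit-based loss during training and a sigmoid at evaluation time.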
The following libraries are required to run the notebook:
- torch
- torch.nn
- torch.nn.functional
- torch.optim
- numpy
- pandas
- sklearn.model_selection
- sklearn.preprocessing
- sklearn.metrics
- Load the data: Load the creditcard.csv file into a pandas DataFrame.
- Preprocessing:
  - Separate features (X) and labels (y).
  - Convert data to PyTorch tensors.
  - Handle any potential NaN values.
  - Split the data into training and testing sets.
  - Scale the features using StandardScaler.
- Model Definition: The FraudDetectionModel class defines the neural network architecture.
- Training:
  - Instantiate the model, loss function (BCEWithLogitsLoss with pos_weight to handle class imbalance), and optimizer (Adam).
  - Create DataLoader instances for the training and testing datasets.
  - Train the model using the train_model method.
- Evaluation: Evaluate the trained model on the test set using the evaluate_model method, which calculates accuracy, precision, recall, and F1-score. A threshold is applied to the model's output to classify predictions.
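The preprocessing and training steps above might look roughly like the sketch below. It uses synthetic data in place of creditcard.csv so that it runs on its own. The fraud rate, batch size, learning rate, epoch count, dropout rate, the `np.nan_to_num` NaN handling, the negatives-to-positives ratio for `pos_weight`, and the body of `train_model` are all assumptions made for illustration, not the notebook's actual settings.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Synthetic stand-in for creditcard.csv: 30 features, roughly 2% positives.
X = rng.normal(size=(2000, 30)).astype(np.float32)
y = (rng.random(2000) < 0.02).astype(np.float32)
X[y == 1] += 1.5  # shift the positive class so there is a signal to learn

X = np.nan_to_num(X)  # one simple way to handle NaN values (assumed)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the scaler on the training split only, then reuse it on the test split.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train).astype(np.float32)
X_test = scaler.transform(X_test).astype(np.float32)

train_ds = TensorDataset(
    torch.from_numpy(X_train), torch.from_numpy(y_train).unsqueeze(1)
)
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)

# A common choice (assumed here): pos_weight = #negatives / #positives,
# which up-weights the loss on the rare fraud class.
n_pos = float(y_train.sum())
n_neg = float(len(y_train) - n_pos)
pos_weight = torch.tensor([n_neg / n_pos])
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Compact stand-in for FraudDetectionModel (64 -> 32 -> 1 logit).
model = nn.Sequential(
    nn.Linear(30, 64), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(64, 32), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(32, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)


def train_model(model, loader, criterion, optimizer, epochs=5):
    """Hypothetical training loop; returns the mean loss per epoch."""
    history = []
    for _ in range(epochs):
        model.train()
        total = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()
            total += loss.item() * xb.size(0)
        history.append(total / len(loader.dataset))
    return history


losses = train_model(model, train_loader, criterion, optimizer)
```

Fitting StandardScaler on the training split alone avoids leaking test-set statistics into training, and stratifying the split keeps the rare positive class represented on both sides.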
- creditcard.csv: The dataset containing the transaction data.
- This notebook: Contains the Python code for the model implementation, training, and evaluation.
The evaluation metrics (Accuracy, Precision, Recall, F1-score) on the test set are printed after the model is evaluated.
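As a rough illustration of how thresholded predictions turn into these four metrics, here is a small self-contained sketch. `evaluate_logits` is a hypothetical helper rather than the notebook's `evaluate_model`, the 0.5 threshold is an assumed default, and the toy labels and logits are invented for the example.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


def evaluate_logits(logits, y_true, threshold=0.5):
    """Hypothetical helper: sigmoid -> threshold -> classification metrics."""
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))  # sigmoid
    y_pred = (probs >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
    }


# Toy example: 8 legitimate transactions followed by 2 frauds.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
logits = np.array([-4.0, -3.0, -5.0, -2.0, -6.0, -3.5, -4.5, 1.0, 2.0, -1.0])
metrics = evaluate_logits(logits, y_true)
# One false positive and one missed fraud give:
# accuracy 0.8, precision 0.5, recall 0.5, f1 0.5
```

On imbalanced data, accuracy is dominated by the majority class, so precision, recall, and F1 are usually the more informative numbers to watch.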