Recently I’ve been playing around with Keras for fitting neural networks. Whenever I’m learning a new tool I find it’s easier to start by applying it to something familiar. This time, however, I decided to learn Keras on an unfamiliar dataset.
The dataset I chose was the Credit Card Fraud Detection dataset hosted on Kaggle, submitted by Andrea Dal Pozzolo, who collected and analysed the data during a research collaboration of Worldline and the Machine Learning Group of Université Libre de Bruxelles (ULB) on big data mining and fraud detection.
The dataset contains credit card transactions made by European cardholders over a two-day collection period in September 2013. There are a total of 284,807 transactions, of which 492 (0.172%) are known to be fraudulent.
As predictors, the dataset contains numerical variables that are the result of a principal components analysis (PCA) transformation. This transformation was applied by the original authors to maintain confidentiality of sensitive information. Additionally, the dataset contains Time and Amount, which were not transformed by PCA. The Time variable contains the seconds elapsed between each transaction and the first transaction in the dataset. The Amount variable is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. The Class variable is the response variable and indicates whether the transaction was fraudulent.
The data is distributed in a very clean format and there are no missing values. Since the features have been transformed by PCA, I did not spend much time on exploratory data analysis. Instead, I focused on trying out different approaches to modelling the data.
My final model consists of a multi-layer perceptron (MLP) neural network trained with stratified k-fold cross-validation (GitHub repo). I first standardized the PCA variables to each have mean 0 and standard deviation 1. Next, knowing the data was imbalanced, I computed the class weights. Then I set up the data for 5-fold stratified cross-validation. The number of folds was somewhat arbitrary, but I wanted to be sure there would be enough examples of fraudulent transactions in each fold. Cross-validation also provided an opportunity to monitor the validation loss for early stopping. Finally, I averaged the predicted probabilities from each of the resulting models.
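The class weighting, stratified splitting, and probability averaging steps can be sketched in plain Python. This is only an illustration, not the code from the repo: the function names are hypothetical, the weights use the common "balanced" heuristic of n_samples / (n_classes * class_count), which may differ from what was actually used, and the fold assignment is a simple round-robin with no shuffling. The class counts come from the dataset description above.

```python
from collections import Counter

N_TOTAL = 284_807   # transactions in the dataset
N_FRAUD = 492       # known fraudulent transactions
N_FOLDS = 5

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (assumed heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}

def stratified_fold_ids(labels, n_folds):
    """Deal the samples of each class round-robin across the folds,
    so every fold keeps roughly the same class proportions."""
    seen = Counter()
    fold_ids = []
    for cls in labels:
        fold_ids.append(seen[cls] % n_folds)
        seen[cls] += 1
    return fold_ids

def average_probabilities(per_model_probs):
    """Element-wise mean of the predicted probabilities of several models."""
    return [sum(p) / len(p) for p in zip(*per_model_probs)]

# Synthetic label vector with the same class counts as the dataset.
labels = [1] * N_FRAUD + [0] * (N_TOTAL - N_FRAUD)

weights = balanced_class_weights(labels)
fold_ids = stratified_fold_ids(labels, N_FOLDS)
fraud_per_fold = Counter(f for f, y in zip(fold_ids, labels) if y == 1)

print("class weights:", weights)
print("fraud cases per fold:", dict(fraud_per_fold))
```

With five folds, each fold ends up holding just under 100 fraud cases. In a real pipeline, scikit-learn's `StratifiedKFold` and `compute_class_weight` cover the same ground, and the resulting weight dictionary can be passed to the `class_weight` argument of a Keras model's `fit` method.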
The model achieves an overall F1 score of 0.99, with 99% sensitivity (recall) and 14% precision for the positive class. That is, the model correctly identifies 99% of the fraud cases (true positives), but only 14% of the transactions predicted as fraudulent were actually fraudulent.
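A little arithmetic shows what these figures imply. The sketch below computes the F1 score of the fraud class alone from the reported precision and recall, then backs out rough counts of flagged transactions. It assumes the reported rates apply to all 492 known fraud cases, which is not confirmed here, so the counts are a back-of-envelope estimate rather than results from the model. The function names are hypothetical.

```python
# Figures reported for the positive (fraud) class.
PRECISION = 0.14
RECALL = 0.99       # sensitivity
N_FRAUD = 492       # known fraud cases in the dataset

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def implied_counts(precision, recall, n_positive):
    """Back out approximate counts from the two rates (an estimate only)."""
    true_pos = recall * n_positive      # fraud cases caught
    flagged = true_pos / precision      # all transactions predicted as fraud
    false_pos = flagged - true_pos      # legitimate transactions flagged
    return true_pos, flagged, false_pos

fraud_f1 = f1_score(PRECISION, RECALL)
true_pos, flagged, false_pos = implied_counts(PRECISION, RECALL, N_FRAUD)

print(f"fraud-class F1: {fraud_f1:.2f}")
print(f"flagged: {flagged:.0f}, of which false alarms: {false_pos:.0f}")
```

The fraud-class F1 comes out at roughly 0.25, which suggests the overall 0.99 figure is an average dominated by the much larger non-fraud class. Under the stated assumption, catching nearly all of the fraud cases would mean flagging a few thousand transactions in total.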
This model demonstrates utility in flagging potentially fraudulent transactions but produces many false positives. This type of model could be useful as part of an ensemble.