Machine Learning Project 3 - Credit Card Fraud Detection
![Image](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFgMl9kLQfgJko1Ezy-c9cUshkAolWfnSdEnwbL5pjMZa_rEqlKT4LcVCPkCzx9JVUr_NOv5B7Y3fb0PwJiBUyluPIebNSARD1NCRQTtT4j10Gz2BhvF7s9_aZl11SputB1-ni1YrsQ28/s640/credit-card-fraud-detection-1-638.jpg)
Aim: The goal of this project is to automatically identify fraudulent credit card transactions using Machine Learning. This is a binary classification problem. My approach is explained below.

Workflow:

1. Check the distribution of the classes in the response variable to see whether the dataset is balanced or imbalanced.
2. Create a baseline model (LogisticRegression) and check the recall for the fraudulent transaction class:
   - If recall is low, address the class imbalance using:
     - Under-sampling
     - Over-sampling
   - If recall is high, move forward.
3. Model Selection: train other models and select the best one.
4. Feature Selection: apply feature selection techniques to select the best features.
5. Final model

Model Evaluation methods:

1. Recall
2. Precision
3. Area under the curve (AUC)

1. Class Distribution: This is an imbalanced dataset. There are a total of 284,807 transactions.
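To make the first workflow steps concrete, here is a minimal sketch in plain Python of the class-distribution check, the recall and precision metrics, and random under-sampling. It is illustrative only: the toy labels, the helper names (`class_distribution`, `recall_precision`, `random_undersample`) and the always-predict-"not fraud" baseline are assumptions made for the example, not the project's actual data or code, and the fraud class is assumed to be the minority class labelled `1`.

```python
import random
from collections import Counter


def class_distribution(labels):
    """Return {class: (count, fraction of all rows)}."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: (n, n / total) for cls, n in counts.items()}


def recall_precision(y_true, y_pred, positive=1):
    """Recall and precision for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


def random_undersample(rows, labels, positive=1, seed=0):
    """Keep every positive row and an equally sized random sample of the rest.

    Assumes the positive class is the minority class.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == positive]
    neg = [i for i, y in enumerate(labels) if y != positive]
    keep = pos + rng.sample(neg, min(len(pos), len(neg)))
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (hypothetical): 990 normal transactions, 10 fraudulent ones.
labels = [0] * 990 + [1] * 10
rows = list(range(len(labels)))  # stand-in for feature rows

dist = class_distribution(labels)

# A naive baseline that never predicts fraud: high accuracy, zero recall.
y_pred = [0] * len(labels)
recall, precision = recall_precision(labels, y_pred)

# Balance the classes by under-sampling the majority class.
bal_rows, bal_labels = random_undersample(rows, labels)
bal_dist = class_distribution(bal_labels)

print(dist, recall, precision, bal_dist)
```

The naive baseline shows why recall on the fraud class, rather than accuracy, is the metric to watch on imbalanced data: it is right on 99% of the toy rows yet catches no fraud at all.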