Credit card fraud is currently one of the most frequently occurring problems, driven by the rise in online transactions and e-commerce platforms. It generally happens when a card is stolen and used for unauthorized purposes, or when a fraudster uses the card's information for personal gain. Credit card fraud detection systems were introduced to detect such fraudulent activity. This project focuses on machine learning algorithms, specifically the Random Forest algorithm and the AdaBoost algorithm. The two algorithms are evaluated on accuracy, precision, recall, and F1-score, and the ROC curve is plotted based on the confusion matrix. The Random Forest and AdaBoost algorithms are then compared, and the one with the higher accuracy, precision, recall, and F1-score is considered the better algorithm for detecting fraud.
Prior research has proposed many methods and techniques for credit card fraud detection, with particular interest in neural networks, data mining, and distributed data mining. Many other techniques are also used to detect such fraud. From our literature survey of these methods, we conclude that machine learning alone offers many approaches to detecting credit card fraud.
In 2019, Sahayasakila V, D.Kavya Monisha, Aishwarya, and Sikhakolli Venkatavisalakshiswshai Yasaswi described two important algorithmic techniques [8]: the Whale Optimization Algorithm (WOA) and SMOTE (Synthetic Minority Oversampling Technique). They aimed mainly to improve convergence speed and to solve the data imbalance problem. The class imbalance problem is overcome using SMOTE together with WOA: the transactions synthesized by SMOTE are re-sampled to check data accuracy and are then optimized using WOA. The approach also improves the convergence speed, reliability, and efficiency of the system.
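To make the oversampling idea concrete, here is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the implementation from [8]: the function name smote_sketch, the parameter k, and the sample points are all hypothetical, and the WOA optimization step is not shown. Each synthetic point is placed on the line segment between a minority (fraud) sample and one of its nearest minority neighbours.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by interpolation.

    Assumes `minority` holds at least two distinct tuples of numbers.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature fraud samples, for illustration only.
fraud = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(fraud, n_new=6)
print(len(fraud) + len(new_points), "fraud samples after oversampling")
```

Because every synthetic point lies between two real fraud samples, the minority class grows without simply duplicating existing rows.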
In 2018, Navanushu Khare and Saad Yunus Sait described their work [5] on decision trees, random forest, SVM, and logistic regression, applied to a highly skewed dataset. Performance was evaluated on accuracy, sensitivity, specificity, and precision. The reported accuracy is 97.7% for Logistic Regression, 95.5% for Decision Trees, 98.6% for Random Forest, and 97.5% for the SVM classifier. They concluded that Random Forest has the highest accuracy of the four algorithms and is the best for detecting fraud. They also concluded that the SVM algorithm suffers from the data imbalance problem and does not give better results for detecting credit card fraud.
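The evaluation measures named above can all be derived from the four cells of a confusion matrix. The sketch below shows the standard formulas; the function name and the transaction counts are hypothetical and are not taken from [5]. It also illustrates why accuracy alone can mislead on a highly skewed dataset.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion-matrix counts (fraud = positive class)."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts: 10,000 transactions, of which only 20 are fraud.
metrics = classification_metrics(tp=12, fp=3, tn=9977, fn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the accuracy is above 99% even though 8 of the 20 fraud cases are missed, which is why precision, recall, and F1-score are reported alongside accuracy.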
Steps for Random Forest Algorithm
1. Take the Kaggle credit card fraud training dataset and randomly select samples from it.
2. Using each randomly drawn sample, build a decision tree that classifies the transactions as fraud or non-fraud.
3. Each decision tree is formed by splitting nodes: the feature with the highest information gain becomes the root node, and the splits separate the fraud and non-fraud cases.
4. A majority vote is then taken over the decision trees. An output of 0 indicates a non-fraud case, and an output of 1 indicates a fraud case.
5. Finally, compute the accuracy, precision, recall, and F1-score for both the fraud and non-fraud cases.
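The sampling and voting steps above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the project's implementation: each "tree" is reduced to a hypothetical one-split stump on a single feature, and the rows are made-up (amount, label) pairs rather than the Kaggle data.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]

def train_stump(sample):
    """Pick the threshold that misclassifies the fewest rows in the sample.

    The stump predicts fraud (1) when the feature value exceeds the threshold.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x > threshold) != bool(label) for x, label in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def forest_predict(thresholds, x):
    """Majority vote over all stumps: 0 = non-fraud, 1 = fraud."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

rng = random.Random(42)
# Hypothetical (transaction amount, label) rows.
rows = [(5.0, 0), (8.0, 0), (12.0, 0), (15.0, 0), (90.0, 1), (120.0, 1)]
forest = [train_stump(bootstrap_sample(rows, rng)) for _ in range(25)]
print(forest_predict(forest, 3.0), forest_predict(forest, 200.0))
```

An odd number of stumps is used so that a two-class vote cannot end in a tie.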
Random Forest Algorithm
Algorithm: Random Forest
To generate c classifiers:
For i = 1 to c do
    Randomly sample the training data D with replacement to produce Di
    Create a root node N containing Di and call BuildTree(N)
End for
Majority vote

BuildTree(N):
    Randomly select x% of all the possible splitting features in N
    Select the feature F that has the highest information gain for further splitting:
        $\mathrm{Gain}(T, X) = \mathrm{Entropy}(T) - \mathrm{Entropy}(T, X)$
    To calculate the entropy we use
        $\mathrm{Entropy}(T) = -\sum_{i=1}^{c} p_i \log_2 p_i$
    where $p_i$ is the fraction of samples in T that belong to class i (here, fraud and non-fraud).
    Create f child nodes N1, ..., Nf
    For i = 1 to f do
        Set the contents of Ni to Di (here, the samples in N that fall into branch i)
        Call BuildTree(Ni)
    End for
End
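As a worked illustration of the split criterion, the sketch below computes entropy and information gain for two candidate splits. It assumes the usual definition in which Entropy(T, X) is the size-weighted average of the entropies of the branches produced by splitting on X; the source does not spell this out, and the function names and labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(labels, branches):
    """Entropy of the parent minus the size-weighted entropy of its branches."""
    n = len(labels)
    weighted = sum(len(b) / n * entropy(b) for b in branches)
    return entropy(labels) - weighted

# Hypothetical node with four non-fraud (0) and four fraud (1) transactions.
parent = [0, 0, 0, 0, 1, 1, 1, 1]
pure_split = [[0, 0, 0, 0], [1, 1, 1, 1]]     # separates the classes perfectly
useless_split = [[0, 0, 1, 1], [0, 0, 1, 1]]  # leaves both branches mixed

print(information_gain(parent, pure_split))     # 1.0
print(information_gain(parent, useless_split))  # 0.0
```

The feature producing the first split would be chosen, since it removes all uncertainty about the class, while the second split tells us nothing.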
Steps for AdaBoost Algorithm
1. Take the Kaggle credit card fraud training dataset and randomly select samples from it.
2. Using the randomly drawn samples, build decision trees sequentially to classify the fraud and non-fraud cases.
3. Each decision tree is formed by splitting nodes: the feature with the highest information gain becomes the root node, and the splits separate the fraud and non-fraud cases.
4. Calculate the error rate and the performance of the tree, and update the weights of the fraud and non-fraud transactions that were incorrectly classified.
5. A majority vote is then performed. An output of 0 indicates a non-fraud case.
6. An output of 1 indicates a fraud case.
7. Finally, compute the accuracy, precision, recall, and F1-score for both the fraud and non-fraud cases.
AdaBoost Algorithm
Algorithm: AdaBoost
Input: dataset of n samples
Initialize weights: w1(i) = 1/n for each sample i
Create a decision tree
Select the one that has the lowest entropy
If any samples are incorrectly classified
    Calculate Total Error (TE) = sum of the weights of the incorrectly classified samples
    Calculate Performance = $\frac{1}{2} \ln\left(\frac{1 - TE}{TE}\right)$
    For each sample
        Incorrectly classified, increase the weight:
            new weight = old weight * $e^{\mathrm{Performance}}$
        Correctly classified, decrease the weight:
            new weight = old weight * $e^{-\mathrm{Performance}}$
        Normalize the weight of each sample:
            normalized weight = updated weight / sum of updated weights
    End for
End if
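One round of the weight update can be traced numerically. The sketch below assumes the standard AdaBoost formulas, since the formulas in the original listing are cut off: performance = 0.5 * ln((1 - TE) / TE), with weights multiplied by e^performance when a sample is misclassified and by e^(-performance) when it is classified correctly. The function name and the five-sample example are hypothetical.

```python
from math import exp, log

def adaboost_round(weights, is_correct):
    """Apply one AdaBoost weight update and return (performance, new weights).

    Assumes the total error is strictly between 0 and 1.
    """
    total_error = sum(w for w, ok in zip(weights, is_correct) if not ok)
    performance = 0.5 * log((1 - total_error) / total_error)
    # Shrink the weights of correct samples, grow those of incorrect ones.
    updated = [w * exp(-performance if ok else performance)
               for w, ok in zip(weights, is_correct)]
    norm = sum(updated)
    return performance, [w / norm for w in updated]

# Five samples with equal starting weights; the tree gets the last one wrong.
weights = [1 / 5] * 5
is_correct = [True, True, True, True, False]
performance, new_weights = adaboost_round(weights, is_correct)
print(round(performance, 4), [round(w, 3) for w in new_weights])
```

After normalization the single misclassified sample carries half of the total weight, so the next tree in the sequence is pushed to classify it correctly.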
CONCLUSION
Although many fraud detection techniques exist, no single algorithm can be said to detect fraud completely. From our analysis, the accuracy is the same for the Random Forest and AdaBoost algorithms. On precision, recall, and F1-score, however, Random Forest scores higher than AdaBoost. Hence we conclude that the Random Forest algorithm works better than the AdaBoost algorithm for detecting credit card fraud.
FUTURE SCOPE
The above analysis shows that many machine learning algorithms are used to detect fraud, but the results are not yet satisfactory. We would therefore like to implement deep learning algorithms to detect credit card fraud more accurately.