Advanced Machine Learning Models for Detecting and Classifying Financial Fraud in Big Data-Driven

Authors

  • Bhavana Kamarthapu, Fairleigh Dickinson University
  • Ajay Babu Kakani, Wright State University
  • Sri Krishna Kireeti Nandiraju, University of Illinois at Springfield
  • Sandeep Kumar Chundru, University of Central Missouri
  • Srikanth Reddy Vangala, University of Bridgeport
  • Ram Mohan Polam, University of Illinois at Springfield

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V2I3P105

Keywords:

Financial Fraud Detection, Credit Card Fraud, Big Data, Machine Learning, K-Nearest Neighbor (KNN)

Abstract

Identifying credit card fraud is a major challenge for the banking sector, especially as online transactions increase. This study presents a machine learning (ML)-based approach to credit card fraud detection using the Kaggle Credit Card Fraud Detection dataset, which contains de-identified transaction records from European cardholders. Of its 284,807 transactions, only 492 (about 0.17%) are fraudulent, a severe class imbalance, so data balancing techniques were applied to improve model training. Preprocessing included label encoding to convert categorical values and standardization to normalize feature scales. A K-Nearest Neighbors (KNN) classifier was built that uses Euclidean distance to find the k nearest neighbors of a transaction and assigns it the majority class among them. The model was assessed with K-fold cross-validation and performance measures including accuracy, precision, recall, F1-score, and ROC-AUC. According to the experimental results, the KNN model outperformed benchmark models such as MLP and Naïve Bayes, reaching 98.56% accuracy and an AUC of 96.07. These findings support the use of KNN for building reliable and accurate fraud detection systems in cybersecurity applications.
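
The steps named in the abstract (standardization, Euclidean-distance KNN with majority voting, and precision/recall/F1 scoring) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the value k=3, and the tiny two-feature dataset (amount, hour) are all hypothetical and chosen only to make the example run with the Python standard library. Data balancing and K-fold cross-validation are omitted.

```python
import math
from collections import Counter


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = []
    for col, mean in zip(columns, means):
        variance = sum((v - mean) ** 2 for v in col) / len(col)
        stds.append(math.sqrt(variance) or 1.0)  # avoid dividing by zero
    scaled = [apply_scaling(row, means, stds) for row in rows]
    return scaled, means, stds


def apply_scaling(row, means, stds):
    """Apply previously fitted means and standard deviations to one row."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority class among the k training points nearest to query."""
    distances = sorted(
        (math.dist(point, query), label) for point, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: [amount, hour of day]; label 1 = fraud, 0 = legitimate.
train_x = [[10, 1], [12, 2], [11, 1], [13, 3], [900, 23], [950, 22]]
train_y = [0, 0, 0, 0, 1, 1]
queries = [[14, 2], [920, 23]]

scaled_train, means, stds = standardize(train_x)
predictions = [
    knn_predict(scaled_train, train_y, apply_scaling(q, means, stds), k=3)
    for q in queries
]
print(predictions)
```

With fraud making up roughly 0.17% of the dataset described above, accuracy alone says little about how well fraud is caught, which is one reason to report precision, recall, and F1 for the fraud class alongside it.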

Published

2021-10-30

Section

Articles

How to Cite

Kamarthapu B, Kakani AB, Nandiraju SKK, Chundru SK, Vangala SR, Polam RM. Advanced Machine Learning Models for Detecting and Classifying Financial Fraud in Big Data-Driven. IJAIDSML [Internet]. 2021 Oct. 30 [cited 2025 Sep. 15];2(3):39-46. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/175