An Advanced Machine Learning Models Design for Fraud Identification in Healthcare Insurance

Authors

  • Sriram Pabbineedi, University of Central Missouri
  • Mitra Penmetsa, University of Illinois at Springfield
  • Jayakeshav Reddy Bhumireddy, University of Houston
  • Rajiv Chalasani, Sacred Heart University
  • Mukund Sai Vikram Tyagadurgam, University of Illinois at Springfield
  • Venkataswamy Naidu Gangineni, University of Madras, Chennai

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V2I1P104

Keywords:

Healthcare, Insurance, Fraud Detection, Insurance fraud analytics, Artificial Intelligence, Insurance Claims, Machine Learning, Insurance fraud detection dataset

Abstract

Healthcare fraud threatens the interests of both healthcare facilities and the patients they serve. It typically causes enormous financial damage and undermines the trustworthiness of healthcare systems. This study proposes a machine learning framework to counter healthcare fraud. Systematic analysis of healthcare data reveals patterns and anomalies that can be used to identify fraudulent behavior with greater precision and speed. Based on the XGBoost algorithm, this work describes a machine learning approach to irregularity detection and health insurance premium estimation. The XGBoost model was evaluated using standard performance measures: R², MAE, RMSE, and MAPE. The model's R² of 89.47% was coupled with an MAE of 1. In comparison, XGBoost showed better predictive performance than Random Forest and Genetic Support Vector Machines (GSVMs). Learning curves and prediction error plots further supported the model's ability to generalize. These results indicate that XGBoost is a reliable approach for insurance fraud detection and pricing within structured healthcare settings.
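The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of R², MAE, RMSE, and MAPE; it is not the authors' evaluation code. The function name `regression_metrics` and the sample premium values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, RMSE and MAPE (in percent) for paired sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        # R^2 is undefined when every target is identical (ss_tot == 0).
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # MAPE is undefined when any target value is zero.
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Hypothetical premium values, invented purely for illustration.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

In practice the same quantities would usually come from a library such as scikit-learn, with `y_pred` produced by the fitted regressor on held-out data. The pure-Python version is shown only to make the definitions explicit.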

Published

2021-03-30

Section

Articles

How to Cite

Pabbineedi S, Penmetsa M, Bhumireddy JR, Chalasani R, Tyagadurgam MSV, Gangineni VN. An Advanced Machine Learning Models Design for Fraud Identification in Healthcare Insurance. IJAIDSML [Internet]. 2021 Mar. 30 [cited 2025 Sep. 15];2(1):26-34. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/177