Using ML Models to Detect Unusual Database Activity or Performance Degradation

Authors

  • Nagireddy Karri, Senior IT Administrator Database, Sherwin-Williams, USA.
  • Sandeep Kumar Jangam, Lead Consultant, Infosys Limited, USA.
  • Partha Sarathi Reddy Pedda Muntala, Software Developer at Cisco Systems, Inc, USA.

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V3I3P111

Keywords:

Machine Learning, Database Monitoring, Anomaly Detection, Performance Degradation, Predictive Analytics, Autoencoders, Random Forest, Support Vector Machines

Abstract

Monitoring database performance and activity has become a major concern as data volumes grow and database systems become more complex. Conventional monitoring typically relies on static thresholds and manual inspection of metrics, which are poorly suited to detecting subtle anomalies or predicting how far performance will degrade. This paper examines the potential of machine learning (ML) models to identify abnormal database activity and performance problems. The ML algorithms are trained on historical database metrics and transaction logs to recognize patterns that indicate suspicious activity. The paper compares several supervised and unsupervised learning approaches, including Random Forest, Support Vector Machines (SVM), K-Means clustering, and Autoencoders, in terms of accuracy, precision, recall, and time efficiency. The experimental results indicate that the ML-driven models are more effective than conventional monitoring frameworks at detecting deviations such as query delays, resource bottlenecks, and unexpected access patterns. The paper further proposes a hybrid framework that integrates predictive performance analytics with anomaly detection to enable preemptive action. The results show that applying ML to database monitoring can increase the detection rate and reduce false positives, which in turn supports better resource management and improved system reliability. The work adds to the growing body of research on intelligent database management and offers insight into the practical implementation of ML-based monitoring systems.
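
As a rough illustration of the contrast the abstract draws between threshold-based monitoring and ML-based detection, the sketch below fits K-Means (one of the methods named above) to synthetic "normal" metrics and flags points that lie far from every cluster centre. It is not the authors' implementation: the two features (query latency and CPU utilisation), the two workload modes, the static limits, the 1.5x distance margin, and all function names are assumptions made for this example, and the data is generated, not measured.

```python
import math
import random

def farthest_point_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, iters=10):
    """Plain Lloyd's algorithm over a list of equal-length tuples."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids

def anomaly_score(point, centroids):
    """Distance to the nearest centroid; larger means more unusual."""
    return min(math.dist(point, c) for c in centroids)

def precision_recall(predicted, actual):
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

rng = random.Random(42)

def sample_mode(center, n):
    """Synthetic (latency_ms, cpu_percent) samples around a workload mode."""
    lat, cpu = center
    return [(lat + rng.uniform(-3, 3), cpu + rng.uniform(-5, 5)) for _ in range(n)]

# Hypothetical workload modes: light transactional load and heavy batch load.
OLTP, BATCH = (20.0, 40.0), (60.0, 80.0)

train = sample_mode(OLTP, 100) + sample_mode(BATCH, 100)
centroids = kmeans(train, k=2)
# Assumed rule: flag anything more than 1.5x the largest training distance.
threshold = 1.5 * max(anomaly_score(p, centroids) for p in train)

normal_eval = sample_mode(OLTP, 50) + sample_mode(BATCH, 50)
# Each metric is individually within a normal range, but the combination
# (batch-level latency at OLTP-level CPU, or the reverse) is unusual.
anomalies = [(60.0, 40.0), (61.0, 42.0), (58.0, 38.0), (20.0, 80.0), (22.0, 78.0)]
eval_points = normal_eval + anomalies
labels = [False] * len(normal_eval) + [True] * len(anomalies)

ml_flags = [anomaly_score(p, centroids) > threshold for p in eval_points]
# Assumed static per-metric limits standing in for conventional monitoring.
static_flags = [lat > 70.0 or cpu > 90.0 for lat, cpu in eval_points]

ml_precision, ml_recall = precision_recall(ml_flags, labels)
static_precision, static_recall = precision_recall(static_flags, labels)
print(f"k-means detector:  precision={ml_precision:.2f} recall={ml_recall:.2f}")
print(f"static thresholds: precision={static_precision:.2f} recall={static_recall:.2f}")
```

The injected anomalies are chosen so that no single metric crosses its static limit, which is the kind of subtle deviation the abstract says threshold-based monitoring tends to miss. A real deployment would need feature scaling, many more metrics, and labelled incidents to evaluate against.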

Published

2022-10-30

Section

Articles

How to Cite

1. Karri N, Jangam SK, Pedda Muntala PSR. Using ML Models to Detect Unusual Database Activity or Performance Degradation. IJAIDSML [Internet]. 2022 Oct. 30 [cited 2025 Oct. 30];3(3):102-10. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/284