False Positive & False Negative Mitigation in ML-Based Threat Detection

Authors

  • John Komarthi, San Jose, CA

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V7I2P111

Keywords:

False Positive, Machine Learning, Class Imbalance, Cost-Sensitive Learning, Uncertainty, Intrusion Detection, Fraud Detection, False Negative, Anomaly Detection, Model Calibration, Explainability, SOC Operations, Phishing, Monitoring

Abstract

Machine learning has become an important part of cybersecurity for detecting malware, fraud, intrusions, phishing, and many other threats. However, these systems also face challenges with false alarms (false positives) and missed detections (false negatives), which can significantly impact the effectiveness of Security Operations Centers (SOCs). This paper presents a detailed analysis of the challenges associated with false positives and false negatives in ML-based threat detection, exploring their root causes and possible mitigation strategies. Data-quality issues are examined, such as label noise, rare-event class imbalance, and evolving attack patterns. Model-level strategies are then surveyed, including probability calibration, cost-sensitive learning, anomaly detection methods, threshold tuning, uncertainty estimation, and adversarial robustness. Operational best practices are also discussed, including selecting appropriate evaluation metrics, creating feedback loops with analysts, monitoring models in production, integrating with incident response processes, and incorporating human oversight. These practices support the robust deployment of machine learning systems. Real-world examples from industry, such as Intrusion Detection Systems (IDS), SOC workflows, email phishing detection, and fraud detection, illuminate the trade-offs involved and the lessons learned from various implementations. Furthermore, the paper addresses limitations, ethical and regulatory concerns, and the potential ways in which attackers might exploit false-positive and false-negative mitigation measures. Each mitigation method is outlined alongside its trade-offs in complexity, data requirements, and typical impact on false-positive and false-negative rates.
Recommendations are offered, such as adopting multi-layered detection systems, fostering continuous learning, and integrating explainable AI (XAI) approaches, along with future research directions. This report aims to serve as a guide for security stakeholders dealing with ML-driven false positives and false negatives.
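To make one of the surveyed strategies concrete: cost-sensitive threshold tuning picks the alert threshold that minimizes the expected operational cost when a missed detection (false negative) is costlier than a false alarm (false positive). The sketch below is illustrative only and is not drawn from the paper itself; the function name, the toy scores, and the cost values are assumptions, and a production system would estimate costs from SOC triage data.

```python
# Hedged sketch of cost-sensitive threshold tuning (not the paper's code).
# The scores, labels, and cost ratios below are illustrative assumptions.

def best_threshold(scores, labels, cost_fp=1.0, cost_fn=10.0):
    """Sweep candidate thresholds and return the one minimizing
    expected cost = cost_fp * FP + cost_fn * FN."""
    candidates = sorted(set(scores)) + [1.01]  # 1.01 means "alert on nothing"
    best_cost, best_t = float("inf"), None
    for t in candidates:
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:
            best_cost, best_t = cost, t
    return best_t

# Toy detector scores: benign events (label 0) and true threats (label 1).
scores = [0.2, 0.3, 0.4, 0.5, 0.6, 0.9]
labels = [1,   0,   0,   0,   0,   1]

print(best_threshold(scores, labels))            # → 0.2 (FN 10x costlier: alert broadly)
print(best_threshold(scores, labels, cost_fn=1))  # → 0.9 (symmetric costs: alert rarely)
```

Note how the same model yields very different operating points: weighting false negatives heavily pushes the threshold down and accepts more analyst workload, while symmetric costs push it up and accept missed threats, which is precisely the trade-off the paper examines.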


Published

2026-04-18

Section

Articles

How to Cite

Komarthi J. False Positive & False Negative Mitigation in ML-Based Threat Detection. IJAIDSML [Internet]. 2026 Apr. 18 [cited 2026 May 3];7(2):70-7. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/554