Predictive Failure Detection in Healthcare Integration Middleware Using Hybrid Ensemble Time-Series Machine Learning
DOI: https://doi.org/10.63282/3050-9262.IJAIDSML-V7I2P105
Keywords: Predictive Analytics, Failure Detection, Healthcare Middleware, Anomaly Detection, LSTM, Time-Series Analysis, Integration Engine, Operational Intelligence
Abstract
Healthcare integration engines process millions of clinical messages daily, yet operational failures, including queue saturation, memory exhaustion, thread starvation, and connection pool depletion, are detected only after they disrupt clinical workflows. This paper presents a Predictive Failure Detection System (PFDS) that applies time-series machine learning to integration engine telemetry for proactive failure identification. Three model architectures are evaluated: Long Short-Term Memory (LSTM) networks, Isolation Forest, and a hybrid ensemble that combines both through a gradient-boosted meta-classifier. An evaluation on 180 days of simulated enterprise telemetry (200+ channels, 500 messages/second, 847 injected failure events) shows that the hybrid ensemble achieves an F1-score of 0.91, a median predictive lead time of 22 minutes, and a false positive rate of 4.2%. Detection rates reach 93% for queue saturation and thread starvation and 87–88% for memory exhaustion and connection pool depletion; the longest observed lead time is 47 minutes. Aggregate detection of gradual-onset failures (F1–F4) reaches 90.4%. PFDS enables a shift from reactive incident response to proactive failure prevention in healthcare middleware.
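The three headline metrics in the abstract can be made concrete with a short sketch. It assumes the standard definitions of F1-score and false positive rate, and takes lead time to be the gap between an alert and the failure it predicted; the abstract does not spell out its exact definitions, so the function names and toy inputs below are hypothetical and are not the paper's data.

```python
from statistics import median


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of healthy (non-failure) windows that raised an alert."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def median_lead_time(detections: list[tuple[float, float]]) -> float:
    """Median of (failure time - alert time) over detected failures.

    Each tuple is (alert_minute, failure_minute); this pairing is an
    assumption made for illustration.
    """
    return median(failure - alert for alert, failure in detections)


# Toy inputs chosen for easy arithmetic, not taken from the paper.
toy_f1 = f1_score(tp=8, fp=2, fn=2)
toy_fpr = false_positive_rate(fp=2, tn=48)
toy_lead = median_lead_time([(0, 10), (5, 25), (10, 40)])
```

Reporting the median rather than the mean keeps a single long lead time, such as the 47-minute maximum mentioned above, from dominating the summary.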










