Multi-Cloud Resource Stability Forecasting Using Temporal Fusion Transformers

Authors

  • Parameswara Reddy Nangi, Independent Researcher, USA
  • Chaithanya Kumar Reddy Nala Obannagari, Independent Researcher, USA
  • Sailaja Settipi, Independent Researcher, USA

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V3I3P113

Keywords:

Resource Stability Forecasting, Temporal Fusion Transformer, SLA Management, Proactive Resource Management

Abstract

The growing adoption of multi-cloud strategies enables enterprises to improve resilience, flexibility, and cost efficiency by leveraging services from multiple cloud providers such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). However, managing resource stability across heterogeneous cloud environments remains a significant challenge. Resource utilization patterns vary dynamically due to workload fluctuations, provider-specific autoscaling mechanisms, and infrastructure differences, often leading to Service Level Agreement (SLA) violations and unexpected operational costs. Existing cloud monitoring and management solutions are largely reactive, relying on threshold-based alerts that respond only after performance degradation has occurred. This paper presents a predictive multi-cloud resource stability forecasting framework based on Temporal Fusion Transformers (TFTs). The proposed approach integrates heterogeneous telemetry data including CPU utilization, memory usage, network latency, I/O performance, and cost metrics from multiple cloud providers into a unified multivariate time-series modeling pipeline. TFTs are employed to capture both short-term volatility and long-term temporal dependencies while supporting interpretable attention mechanisms and variable selection networks. The model generates multi-horizon forecasts with associated confidence intervals, enabling probabilistic estimation of SLA violation risk. By coupling predictive forecasts with an SLA-aware decision engine, the framework supports proactive resource management actions such as predictive autoscaling, cross-cloud workload reallocation, and cost-aware optimization. Experimental evaluation on representative multi-cloud telemetry datasets from 2022 demonstrates improved forecasting accuracy, reduced SLA violations, and enhanced cost efficiency compared to traditional statistical and recurrent deep learning baselines. 
The results highlight the effectiveness of transformer-based time-series models in enabling proactive, reliable, and economically efficient resource management in complex multi-cloud environments.
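
To make the step from multi-horizon forecasts with confidence intervals to an SLA violation risk more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the forecaster emits three quantiles (10%, 50%, 90%) per horizon step, estimates the probability of exceeding an SLA threshold by linear interpolation between those quantiles, and flags steps whose risk exceeds a limit. The function names, the quantile levels, the 80% CPU threshold, the 0.3 risk limit, and the forecast values are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

# Assumed quantile levels of the forecast (hypothetical, not taken from the paper).
QUANTILES = [0.1, 0.5, 0.9]


def violation_probability(quantile_values, threshold, quantiles=QUANTILES):
    """Estimate P(metric > threshold) from quantile forecasts.

    The forecast CDF is approximated by linear interpolation between the
    given quantiles. quantile_values must be non-decreasing and aligned
    with `quantiles`. Outside the covered range the estimate is clamped
    to the outermost quantile level.
    """
    if threshold <= quantile_values[0]:
        return 1.0 - quantiles[0]
    if threshold >= quantile_values[-1]:
        return 1.0 - quantiles[-1]
    # First index whose forecast value is >= threshold.
    i = bisect_left(quantile_values, threshold)
    lo_v, hi_v = quantile_values[i - 1], quantile_values[i]
    lo_q, hi_q = quantiles[i - 1], quantiles[i]
    cdf = lo_q + (hi_q - lo_q) * (threshold - lo_v) / (hi_v - lo_v)
    return 1.0 - cdf


def plan_actions(forecast, sla_threshold, risk_limit=0.3):
    """Return (horizon_step, risk) pairs whose estimated risk exceeds risk_limit."""
    flagged = []
    for step, qvals in enumerate(forecast, start=1):
        risk = violation_probability(qvals, sla_threshold)
        if risk > risk_limit:
            flagged.append((step, round(risk, 3)))
    return flagged


# Made-up CPU utilization forecast (%): one row per horizon step,
# columns are the 10%, 50%, and 90% quantiles.
forecast = [
    [55.0, 65.0, 75.0],
    [60.0, 75.0, 90.0],
    [70.0, 85.0, 95.0],
]

flagged = plan_actions(forecast, sla_threshold=80.0)
print(flagged)  # [(2, 0.367), (3, 0.633)]
```

In this sketch, steps 2 and 3 would be candidates for a proactive action such as predictive autoscaling or cross-cloud reallocation, while step 1 stays below the risk limit. A real decision engine would presumably also weigh cost metrics, which this example leaves out.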

Published

2022-10-30

Section

Articles

How to Cite

1. Nangi PR, Reddy Nala Obannagari CK, Settipi S. Multi-Cloud Resource Stability Forecasting Using Temporal Fusion Transformers. IJAIDSML [Internet]. 2022 Oct. 30 [cited 2026 Mar. 9];3(3):123-35. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/347