Carbon-Aware Dynamic Batching for Deep Learning Inference: Optimizing the Energy-Latency Trade-off in High-Frequency Transaction Monitoring

Authors

  • Anvesh Katipelly, Senior Software Engineer, PayPal, Texas, USA
  • Sumith Thalary, Sr. DevOps Engineer, Kubota Tractor Corp, Dallas, TX, USA

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V6I3P121

Keywords:

Dynamic Batching, Deep Learning Inference, Energy Efficiency, High-Frequency Transactions, Green AI

Abstract

The growing demand for real-time analytics in high-frequency transaction monitoring has driven rapid growth in deep learning inference workloads, with significant energy consumption and environmental impact. This paper introduces a carbon-aware dynamic batching system that balances energy efficiency against latency in latency-sensitive inference. The proposed method combines real-time carbon intensity signals, workload characteristics, and system performance metrics in a single decision-making pipeline. A feedback-based control mechanism dynamically adjusts batch sizes according to transaction arrival rates, queue conditions, and service-level agreement (SLA) constraints so that latency requirements are met. At the same time, the framework aligns inference execution with low-carbon energy intervals to reduce the carbon footprint. Experimental results indicate that the proposed system reduces energy usage by up to 35% and carbon emissions by up to 40% relative to conventional static batching, while keeping p95 latency within strict operational limits. The framework also scales and adapts across different workload conditions, including bursty and high-throughput traffic. By balancing performance and sustainability goals, the work advances green AI practices and offers a practical approach to deploying environmentally friendly deep learning systems for real-time financial monitoring.
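The control loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name BatchPolicy, the linear latency model, the carbon thresholds, and every default value are assumptions chosen only for illustration. The sketch picks the largest batch whose estimated latency fits a budget, and lets that budget grow toward the full SLA as grid carbon intensity rises, so that larger (more energy-efficient) batches are used when electricity is most carbon-intensive.

```python
from dataclasses import dataclass


@dataclass
class BatchPolicy:
    """Hypothetical policy; every default below is an illustrative assumption."""
    sla_ms: float = 50.0        # assumed p95 latency budget
    min_batch: int = 1
    max_batch: int = 64
    fixed_ms: float = 4.0       # assumed per-batch launch overhead
    per_item_ms: float = 0.5    # assumed marginal compute cost per request
    low_carbon: float = 200.0   # gCO2/kWh at or below which the grid counts as clean
    high_carbon: float = 500.0  # gCO2/kWh at or above which the grid counts as dirty
    base_fraction: float = 0.5  # share of the SLA budget used on a clean grid

    def latency_ms(self, batch: int, arrival_per_s: float, queue_len: int) -> float:
        """Estimate batch fill time plus service time (time already spent queued is ignored)."""
        missing = max(batch - queue_len, 0)
        if missing == 0:
            fill_ms = 0.0
        elif arrival_per_s > 0:
            fill_ms = 1000.0 * missing / arrival_per_s
        else:
            fill_ms = float("inf")  # the batch would never fill
        return fill_ms + self.fixed_ms + self.per_item_ms * batch

    def choose(self, arrival_per_s: float, queue_len: int, carbon_g_per_kwh: float) -> int:
        """Return the largest batch size whose estimated latency fits the carbon-scaled budget."""
        span = self.high_carbon - self.low_carbon
        # 0.0 on a clean grid, 1.0 on a dirty grid, linear in between.
        weight = min(max((carbon_g_per_kwh - self.low_carbon) / span, 0.0), 1.0)
        # The budget never exceeds the SLA, so the latency constraint stays binding.
        budget_ms = self.sla_ms * (self.base_fraction + (1.0 - self.base_fraction) * weight)
        for batch in range(self.max_batch, self.min_batch - 1, -1):
            if self.latency_ms(batch, arrival_per_s, queue_len) <= budget_ms:
                return batch
        return self.min_batch  # nothing fits: fall back to the smallest batch


if __name__ == "__main__":
    policy = BatchPolicy()
    for carbon in (100.0, 350.0, 600.0):
        print(carbon, policy.choose(arrival_per_s=2000.0, queue_len=0, carbon_g_per_kwh=carbon))
```

In a feedback setting, choose would be called again on every control tick with freshly measured arrival rate, queue length, and carbon intensity. A production system would also replace the linear latency model with measured p95 latency per batch size.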

Published

2025-09-07

Issue

Vol. 6 No. 3 (2025)

Section

Articles

How to Cite

1. Katipelly A, Thalary S. Carbon-Aware Dynamic Batching for Deep Learning Inference: Optimizing the Energy-Latency Trade-off in High-Frequency Transaction Monitoring. IJAIDSML [Internet]. 2025 Sep. 7 [cited 2026 Apr. 30];6(3):160-9. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/513