Carbon-Aware Dynamic Batching for Deep Learning Inference: Optimizing the Energy-Latency Trade-off in High-Frequency Transaction Monitoring
DOI:
https://doi.org/10.63282/3050-9262.IJAIDSML-V6I3P121
Keywords:
Dynamic Batching, Deep Learning Inference, Energy Efficiency, High-Frequency Transactions, Green AI
Abstract
The growing demand for real-time analytics in high-frequency transaction monitoring has driven rapid growth in deep learning inference workloads, with significant energy consumption and environmental impact. This paper introduces a carbon-aware dynamic batching system that balances energy efficiency against latency for latency-sensitive inference. The proposed method combines real-time carbon intensity signals, workload characteristics, and system performance metrics in a single decision-making pipeline. A feedback-based controller dynamically adjusts batch sizes according to transaction arrival rates, queue conditions, and service-level agreement (SLA) constraints so that latency requirements are met. At the same time, the framework aligns inference execution with low-carbon energy intervals to reduce the carbon footprint. Experimental results indicate that the proposed system reduces energy usage by up to 35% and carbon emissions by up to 40% compared with conventional static batching, while keeping p95 latency within strict operational limits. The framework also scales and adapts across workload conditions, including bursty and high-throughput traffic. By balancing performance and sustainability goals, the work advances green AI practice and offers a practical approach to deploying environmentally friendly deep learning systems for real-time financial monitoring.
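To make the feedback idea concrete, the following is a minimal sketch of a controller that adjusts batch size from measured p95 latency, queue length, and grid carbon intensity. It is not the paper's implementation: the class name `BatchController`, the halve-on-violation and additive-growth rule, the 80% headroom band, the 50 ms SLA, and the 400 gCO2/kWh threshold are all illustrative assumptions. It also assumes that larger batches lower energy per request, so it grows batches faster when the grid is carbon-intensive.

```python
from dataclasses import dataclass


@dataclass
class BatchController:
    """Hypothetical feedback controller; every constant here is an assumption."""
    sla_p95_ms: float = 50.0     # assumed p95 latency budget
    min_batch: int = 1
    max_batch: int = 64
    batch: int = 8               # current target batch size
    carbon_high: float = 400.0   # assumed gCO2/kWh threshold for a "dirty" grid

    def update(self, p95_ms: float, queue_len: int, carbon: float) -> int:
        """Return the batch size to use next, given the latest measurements."""
        if p95_ms > self.sla_p95_ms:
            # SLA violated: back off quickly by halving the target.
            self.batch = max(self.min_batch, self.batch // 2)
        elif p95_ms < 0.8 * self.sla_p95_ms:
            # Latency headroom: grow the target, faster when carbon is high.
            step = 4 if carbon >= self.carbon_high else 1
            self.batch = min(self.max_batch, self.batch + step)
        # Otherwise hold the target. Never wait for more requests than are queued.
        return max(self.min_batch, min(self.batch, queue_len))


controller = BatchController()
next_size = controller.update(p95_ms=60.0, queue_len=100, carbon=300.0)
print(next_size)
```

The latency check comes first so that carbon intensity can only influence batch growth when the SLA has headroom; a production controller would likely add smoothing and deferral of non-urgent work to low-carbon intervals, which this sketch omits.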