Reinforcement Learning for Intelligent Batching in Production Pipelines

Authors

  • Kiran Kumar Pappula, Independent Researcher, USA

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V4I4P109

Keywords:

Reinforcement Learning, Intelligent Batching, Production Pipelines, Conservative Q-Learning, Service-Level Agreements

Abstract

Efficient batching is a key element of industrial production lines, directly affecting throughput, resource utilization, and product quality. Traditional static and rule-based batching strategies struggle with the complexity and variability of dynamic manufacturing environments. This paper proposes a reinforcement learning (RL) method for selecting batching decisions in industrial settings. Modelling the batching process as a Markov Decision Process (MDP) makes it possible to train an RL agent whose policy balances competing goals such as minimizing cycle time, reducing defects, and meeting service-level agreements (SLAs). A simulation environment based on the press hardening process produces scaled data whose variety resembles operational requirements. Policies are learned with Conservative Q-Learning (CQL), an offline algorithm that learns safely from historical interaction data and requires no real-time access to the system. Experimental evidence indicates that the RL-based policies achieve a reward advantage of up to 13 percent over both the static and dynamic baselines, reduce defective parts, and use resources more efficiently. The study also describes a modular system architecture that integrates RL policies into production pipelines for training, inference, and control. The work outlines a direction for data-driven intelligent systems in production and points to future research on real-time adaptation, multi-agent settings, and integration with smart factories powered by the Internet of Things.
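
As an illustration of the offline learning step mentioned in the abstract, the following is a minimal sketch of a conservative Q-learning loss for a small, discrete batching problem. It is not the paper's implementation. The tabular Q-function, the state and action encodings (for example, queue-length buckets and batch-size choices), and the values of `gamma` and `alpha` are all assumptions made for the example. The loss combines a standard temporal-difference error with a penalty, log-sum-exp of Q over actions minus Q of the logged action, that discourages high values for actions absent from the historical data.

```python
import numpy as np

def cql_loss(q, states, actions, rewards, next_states, dones,
             gamma=0.99, alpha=1.0):
    """Conservative Q-learning style loss on a batch of logged transitions.

    q           : float array [n_states, n_actions] (hypothetical tabular Q)
    states      : int array [B], logged state indices
    actions     : int array [B], logged action indices (e.g. batch-size choice)
    rewards     : float array [B]
    next_states : int array [B]
    dones       : float array [B], 1.0 where the episode ended
    """
    q_sa = q[states, actions]

    # Bellman target built only from logged transitions (no environment access).
    target = rewards + gamma * (1.0 - dones) * q[next_states].max(axis=1)
    td_error = 0.5 * np.mean((q_sa - target) ** 2)

    # Conservative penalty: logsumexp_a Q(s, a) - Q(s, a_logged),
    # computed with the usual max-shift for numerical stability.
    rows = q[states]
    shift = rows.max(axis=1, keepdims=True)
    logsumexp = shift[:, 0] + np.log(np.exp(rows - shift).sum(axis=1))
    penalty = np.mean(logsumexp - q_sa)

    return td_error + alpha * penalty

# Usage on a single made-up transition: 2 states, 3 batch-size actions.
q_table = np.zeros((2, 3))
loss = cql_loss(
    q_table,
    states=np.array([0]),
    actions=np.array([1]),
    rewards=np.array([1.0]),
    next_states=np.array([1]),
    dones=np.array([1.0]),
)
print(loss)
```

The penalty term is never negative, because the log-sum-exp over actions is at least as large as any single Q-value. A larger `alpha` therefore makes the learned values more pessimistic about batching actions the logged policy rarely took.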

Published

2023-12-30

Section

Articles

How to Cite

1. Pappula KK. Reinforcement Learning for Intelligent Batching in Production Pipelines. IJAIDSML [Internet]. 2023 Dec. 30 [cited 2025 Oct. 6];4(4):76-8. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/253