Attention-Based Driver Behavior Monitoring System Using Multi-Modal Deep Learning

Authors

  • Omprakash Gurrapu, Independent Researcher, USA
  • Avinash Chandra, Independent Researcher, USA
  • Pruthvi Kaluvala, Independent Researcher, USA
  • Sunny Solmen Edelli Padmarao, Independent Researcher, USA

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V7I1P146

Keywords:

Driver Behavior Monitoring, Multi-Modal Deep Learning, CNN, BiLSTM, Driver Distraction Detection, Drowsiness Detection, Advanced Driver Assistance Systems (ADAS)

Abstract

Road traffic accidents caused by driver distraction, drowsiness, and inattention are a significant global concern. To address this problem, this paper proposes an attention-based driver behavior monitoring system that uses multi-modal deep learning to classify driver states in real time. The framework combines visual features extracted from in-cabin camera images with behavioral features for more robust detection. Convolutional neural networks (CNNs) extract spatial features, while bidirectional long short-term memory (BiLSTM) networks model temporal behavior. An adaptive attention mechanism weights the features to make the fused representation more robust. The system is evaluated with standard metrics. In the experiments, the proposed attention-based framework achieved an accuracy of 96.8%, outperforming conventional CNN and CNN+LSTM models. The attention-based feature weighting also reduces false alarms and increases sensitivity to safety-critical conditions such as drowsiness and distraction. The architecture is computationally efficient enough for deployment in Advanced Driver Assistance Systems (ADAS). These results support combining spatial-temporal modeling with adaptive multi-modal fusion for intelligent driver behavior monitoring in smart transportation systems.
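
The abstract does not specify how the adaptive attention weighting is computed. As a rough illustration only, the sketch below fuses two per-modality feature vectors with scaled dot-product attention in plain Python. The function names (softmax, attention_fuse), the query vector, and all numeric values are assumptions made for this example and are not taken from the paper; in a trained system the feature vectors would come from the CNN/BiLSTM branches and the query would be learned.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(
    modality_features: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Fuse per-modality feature vectors with scaled dot-product attention.

    Each modality is scored by the dot product between its feature vector
    and `query` (a hypothetical stand-in for a learned vector), scaled by
    sqrt(dim). The softmax of the scores gives the fusion weights, and the
    fused vector is the weighted sum of the modality vectors.
    """
    dim = len(query)
    scores = [
        sum(f * q for f, q in zip(features, query)) / math.sqrt(dim)
        for features in modality_features
    ]
    weights = softmax(scores)
    fused = [
        sum(w * features[i] for w, features in zip(weights, modality_features))
        for i in range(dim)
    ]
    return fused, weights


# Toy inputs (made-up numbers, not results from the paper).
visual = [0.9, 0.1, 0.4, 0.7]      # assumed output of a visual CNN + BiLSTM branch
behavioral = [0.2, 0.8, 0.3, 0.1]  # assumed encoding of behavioral signals
query = [1.0, 0.0, 0.5, 1.0]       # assumed stand-in for a learned attention vector

fused, weights = attention_fuse([visual, behavioral], query)
print("modality weights:", [round(w, 3) for w in weights])
print("fused features:  ", [round(x, 3) for x in fused])
```

Because the weights come from a softmax, they are positive and sum to one, so the fused vector is a convex combination of the modality vectors. A modality whose features align poorly with the query for a given input is down-weighted rather than discarded.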

Published

2026-03-05

Section

Articles

How to Cite

Gurrapu O, Chandra A, Kaluvala P, Edelli Padmarao SS. Attention-Based Driver Behavior Monitoring System Using Multi-Modal Deep Learning. IJAIDSML [Internet]. 2026 Mar. 5 [cited 2026 Mar. 11];7(1):271-8. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/472