Abstract
Real-time assessment of drivers’ cognitive states is critical for improving road safety, especially in freight transport, where long-haul truck drivers routinely face prolonged fatigue and diverse traffic interactions. Existing methods for cognitive state prediction rely predominantly on subjective surveys or unimodal physiological data, which are invasive, difficult to scale, and unable to capture the dynamic interplay between driver behavior, vehicle dynamics, and environmental context. To address this gap, this paper proposes a multimodal attention neural network (MMANN) framework that integrates three asynchronous data streams: vehicle kinematics, driver facial states captured via low-frequency imaging, and driving environment videos. The model uses interpretable attention mechanisms to fuse the modalities and classify cognitive states as low-activity (distraction or fatigue), normal-activity, or high-activity (stress or aggressive driving). Trained on a 180-day naturalistic driving dataset, MMANN achieves a recognition accuracy of 82.4%, an improvement of 8.6% over single-modal baselines. This research advances adaptive cognitive models tailored to the operational patterns and environmental constraints of truck driving, enabling real-time safety interventions such as adaptive warning strategies and prioritized hazard alerts.
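To make the attention-based fusion step concrete, the sketch below shows one way three modality embeddings (vehicle kinematics, facial state, environment video) could be fused with interpretable attention weights and mapped to the three cognitive-state classes. This is a minimal illustration under assumptions: the `AttentionFusionClassifier` module, layer sizes, and input dimensions are hypothetical and not taken from the authors' implementation.

```python
# Minimal sketch (not the authors' released code): attention-weighted fusion of
# three modality embeddings into a 3-class cognitive-state classifier.
# All layer sizes and names are illustrative assumptions.
import torch
import torch.nn as nn


class AttentionFusionClassifier(nn.Module):
    def __init__(self, d_kin: int, d_face: int, d_env: int,
                 d_model: int = 128, num_classes: int = 3):
        super().__init__()
        # Project each modality (kinematics, facial state, environment video)
        # into a shared embedding space.
        self.proj = nn.ModuleList([
            nn.Linear(d_kin, d_model),
            nn.Linear(d_face, d_model),
            nn.Linear(d_env, d_model),
        ])
        # One scalar attention score per modality; softmax over modalities
        # yields interpretable fusion weights.
        self.score = nn.Linear(d_model, 1)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, kin, face, env):
        # Each input: (batch, feature_dim) summary of its stream over a time window.
        tokens = torch.stack(
            [torch.tanh(p(x)) for p, x in zip(self.proj, (kin, face, env))], dim=1
        )  # (batch, 3, d_model)
        weights = torch.softmax(self.score(tokens).squeeze(-1), dim=1)  # (batch, 3)
        fused = (weights.unsqueeze(-1) * tokens).sum(dim=1)             # (batch, d_model)
        logits = self.classifier(fused)  # low-, normal-, high-activity
        return logits, weights  # weights expose per-modality contribution

if __name__ == "__main__":
    model = AttentionFusionClassifier(d_kin=16, d_face=64, d_env=256)
    logits, w = model(torch.randn(4, 16), torch.randn(4, 64), torch.randn(4, 256))
    print(logits.shape, w.shape)  # torch.Size([4, 3]) torch.Size([4, 3])
```

Returning the attention weights alongside the logits is one simple way to expose which modality dominated a given prediction, in the spirit of the interpretable fusion described in the abstract.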
