Abstract
Fatigue is a significant contributor to accidents in the high-risk oil and gas industry. This study developed and evaluated models for forecasting fatigue manifestation in offshore workers using the Psychomotor Vigilance Task (PVT). Seventy offshore workers participated in a four-week study, providing data on sleep, physiological, subjective, and performance measures. Ridge, Random Forest, Support Vector, and Long Short-Term Memory (LSTM) regression models were trained to predict PVT reaction times on generalized and personalized datasets, with data normalized either per participant or across the population. Results indicate that personalized Support Vector Regression models outperform the other models in predicting short-term fatigue. Age and perceived exertion emerged as crucial predictors of fatigue. The findings underscore the potential of personalized fatigue forecasting for enhancing safety in the oil and gas industry.
The oil and gas industry is a high-risk industry with a fatality rate almost seven times that of all U.S. workers: an average annual fatality rate of 25.1 versus 3.7 per 100,000 (Hagan-Haynes et al., 2022). Fatigue, defined as the physiological and/or psychophysiological response to prolonged physical activity, mental exertion, and/or sleep deprivation (Kang et al., 2024), has contributed to some of the deadliest and costliest disasters in oil and gas operations, such as the Texas City Refinery explosion (BP America [Texas City] Refinery Explosion | CSB, 2005) and the Exxon Valdez oil spill (US EPA, 1991). Managing fatigue requires proactive fatigue forecasting; however, many fatigue forecasting datasets have been collected in lab-controlled settings, which may not reflect the data quality, contextual variables, and fatigue recovery patterns found in the field. Fatigue assessments in oil and gas have been dominated by subjective questionnaires, which are easier to administer and estimate offshore workers' fatigue levels across days and shifts (Kang et al., 2021). Other studies have combined sleep quality and performance metrics (Soares & De Almondes, 2017), physiological and subjective metrics (Mehta et al., 2017), and subjective, sleep, performance, and physiological measures (Kang et al., 2024). These studies focused on inference rather than on forecasting fatigue states, which would provide a pathway for fatigue mitigation. The Psychomotor Vigilance Task (PVT) is a standard tool for assessing fatigue-related changes. It has proven robust in capturing these changes and has retained its ecological validity (Basner et al., 2011; Basner & Dinges, 2011). Given this validity, this study was designed to explore forecasting PVT reaction times to determine fatigue manifestation.
In forecasting fatigue, the study explores the differences between generalized and individualized models, as well as between normalization per participant and normalization per population. The impact of the forecast horizon on model accuracy is also explored.
Seventy offshore workers participated in this four-week experiment. Participants were observed and completed subjective (Karolinska Sleepiness Scale [KSS], Borg Ratings of Perceived Exertion [RPE], and Mental Fatigue [MF, 1 (low) to 5 (high)]), sleep (actigraphy sleep time and efficiency), physiological (pre- and post-shift HR, RMSSD, LF/HF), and performance (PVT reaction time) assessments. Data obtained from the first two weeks were used for this analysis. Models were trained and tested with an 80/20 train-test split using generalized data with participant normalization, individualized data, and generalized data with population normalization. PVT reaction times were forecasted using age, shift type, subjective, sleep, and pre/post-shift physiological measures as predictors with Ridge (RR), Random Forest (RF), Support Vector (SVR), and Long Short-Term Memory (LSTM) regressions. A three-step approach predicted y(t) to y(t + 2Δt) using x(t - 2Δt) to x(t), where Δt = 24 hr. Shapley Additive Explanations (SHAP) values, feature importances, and a forecast horizon test were generated for the best model.
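The three-step windowing and the two normalization schemes described above can be sketched as follows. This is a minimal illustration and not the study's code: the function names `make_windows` and `zscore` are hypothetical, z-scoring is an assumed form of normalization (the abstract does not name the method), and the data are assumed to form one row per day for each participant.

```python
import numpy as np

def make_windows(x, y, n_in=3, n_out=3):
    """Build (input, target) pairs from one participant's daily series.

    x has shape (days, features) and y has shape (days,).
    Each input stacks x at days [t-2, t-1, t]; each target holds y at
    days [t, t+1, t+2], where one step corresponds to 24 hr.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inputs, targets = [], []
    for t in range(n_in - 1, len(y) - n_out + 1):
        inputs.append(x[t - n_in + 1 : t + 1].ravel())
        targets.append(y[t : t + n_out])
    return np.array(inputs), np.array(targets)

def zscore(values, groups=None):
    """Z-score each feature column (assumed normalization method).

    With groups=None the statistics come from the whole population;
    otherwise each participant is normalized with their own mean and SD.
    Columns with zero variance would need separate handling.
    """
    values = np.asarray(values, dtype=float)
    if groups is None:
        return (values - values.mean(axis=0)) / values.std(axis=0)
    groups = np.asarray(groups)
    out = np.empty_like(values)
    for g in np.unique(groups):
        mask = groups == g
        out[mask] = (values[mask] - values[mask].mean(axis=0)) / values[mask].std(axis=0)
    return out
```

Under these assumptions, a 14-day series for one participant yields 10 windows, each pairing three days of predictors with the PVT reaction times for the current day and the following two days.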
At t + 2Δt, the SVR trained on individualized data had the lowest mean absolute error (MAE), 74.06 ± 6.39 ms, followed by the RF model with 77.84 ± 7.29 ms. Among generalized models with individually normalized physiological data, the LSTM model had the lowest MAE, 130.94 ± 12.4 ms, while among generalized models with population-normalized physiological data, the SVR model had the lowest MAE, 105.81 ± 10.07 ms. SHAP analysis of the individualized SVR model gave age and RPE SHAP values of +2.88 and +0.09, indicating that these features raised the prediction above the model baseline. Shift type lowered the baseline prediction, with a SHAP value of −0.07. In the RF feature importance, subjective measures and shift type were the least important features, while the LF/HF ratio, sleep time, post-shift RMSSD, sleep efficiency, and post-shift heart rate were the most important. To explore how the MAE changes with the forecasting horizon, an input of [t-2Δt, t-Δt, t] was maintained and the mean PVT reaction time at t + nΔt was examined, where n = {0, 1, 2}. The MAE of the individualized SVR and RF models decreased as the output horizon increased: the SVR MAE went from 85.24 ± 9.64 ms to 74.37 ± 8.36 ms, and the RF MAE went from 85.01 ± 10.56 ms to 76.99 ± 9.2 ms.
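The horizon comparison above amounts to computing one MAE per forecast step. A minimal sketch, assuming predictions and targets are arranged with one column per horizon n = 0, 1, 2; the function name `mae_by_horizon` and the reaction times below are hypothetical and are not study data.

```python
import numpy as np

def mae_by_horizon(y_true, y_pred):
    """Return the mean absolute error for each forecast step.

    Both arrays have shape (samples, horizons); column n holds the
    PVT reaction time (ms) at t + n * 24 hr.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_true - y_pred).mean(axis=0)

# Hypothetical reaction times in ms, for illustration only.
observed = [[300, 310, 320], [280, 290, 300]]
forecast = [[310, 300, 320], [270, 310, 300]]
print(mae_by_horizon(observed, forecast))  # MAE at n = 0, 1, 2
```

Comparing the resulting entries across columns shows whether the error grows or shrinks as the horizon lengthens.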
These findings suggest that personalized models forecast reaction times more accurately than generalized models. LSTM models require a large amount of training data, which is why the results did not converge with the individualized model. In addition, normalizing per participant versus per population did not produce any significant MAE differences. The SVR and RF models both showed that participant age was an important forecasting feature, while the RF feature importance showed that sleep metrics and HRV features were salient. Both the RF and SVR models displayed a decrease in MAE as the forecast horizon increased, which contradicted the initial hypothesis that MAE would increase with the forecasting window. These fatigue forecasting results provide an opportunity to forecast farther into the future so that fatigue mitigation techniques can be applied in the industry.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
