Abstract

Objective

Fatigue is a critical indicator in modern health management, and efficient, accurate methods for predicting fatigue levels using wearable devices have garnered increasing attention. Although recent advancements have enabled non-invasive cortisol measurement via wearable sensors, it remains unclear how effectively cortisol, in combination with other physiological biomarkers, predicts fatigue. Therefore, this study aimed to evaluate the effectiveness of a multimodal machine learning model that integrates cortisol levels and heart rate variability (HRV) for fatigue prediction.

Methods

Data from 336 participants who completed the Fatigue Severity Scale (FSS) were analyzed. The missing-data mechanism for cortisol was examined, and multivariate imputation by chained equations (MICE) was applied. A TabNet deep-learning model was used to classify low versus high fatigue levels based on HRV and cortisol data.

Results

The model using only HRV variables achieved a test AUC of 0.774, whereas the model incorporating both HRV and cortisol levels achieved 0.741, indicating a minimal overall performance difference. Feature importance analysis revealed that, in the cortisol-included model, predictions relied on a limited set of features. When feature selection was applied to this model, a reduced set of three variables (age, cortisol, and log-transformed very low frequency power) achieved comparable predictive performance (AUC = 0.759).

Conclusion

This study demonstrated that a fatigue prediction model based on cortisol and HRV can maintain significant predictive power with a reduced number of variables. These findings suggest the potential for practical implementation in wearable devices, enabling accurate fatigue monitoring while minimizing sensor count and computational burden.

Keywords

Heart rate variability, cortisol biosensing, fatigue severity prediction, wearable health monitoring, multimodal machine learning

Introduction

Fatigue is a prevalent concern in modern society that significantly affects both physiological and psychological well-being.^1,2 Chronic fatigue, distinct from transient tiredness, is associated with a persistent reduction in quality of life and productivity and is frequently comorbid with conditions, such as depression, sleep disorders, and immune dysfunction.^3–8 These associations underscore the critical need for early detection and effective management of fatigue. Although self-report instruments, such as the Fatigue Severity Scale (FSS), are widely used due to their practicality, their reliance on subjective input introduces limitations in reliability and precision.^7,9,10 Consequently, there has been a growing interest in developing objective and quantifiable methods for fatigue assessment based on physiological data.¹¹

Heart rate variability (HRV), an established indicator of autonomic nervous system activity, has emerged as a valuable physiological marker in fatigue research.¹² Specific HRV indices, such as very low frequency (VLF) and low-to-high frequency ratio (LF/HF), have been linked to chronic fatigue symptoms and autonomic dysregulation,^8,10,13 which are hallmarks of fatigue pathology rather than general stress responses. However, the predictive utility of HRV is limited due to its sensitivity to individual physiological differences and environmental factors.^7,10,12 Furthermore, employing HRV with basic statistical or binary classification approaches may not adequately capture the complexity of fatigue dynamics.

Although HRV primarily reflects cumulative autonomic activity rather than immediate physiological responses, it may provide a novel diagnostic perspective for fatigue, distinct from conventional hospital-based static HRV assessments, particularly when continuously measured and analyzed as time-series data using wearable technologies. Accordingly, the main objective of this study is to evaluate fatigue, not general stress, using these physiological markers. However, two major challenges hinder the implementation of HRV analyses using wearable devices. First, real-world feasibility is often limited by hardware constraints and the lack of robust algorithms capable of handling signal variability. Second, although blood-based biomarkers, such as complete blood count (CBC) and thyroid function tests, are commonly used to assess fatigue in clinical settings, they are impractical for continuous monitoring in nonclinical environments.¹⁴ Among these biomarkers, cortisol—a key hormone regulated by the hypothalamic-pituitary-adrenal (HPA) axis—has gained attention for its central role in stress physiology and its increasing accessibility through non-invasive measurement techniques. Recent advances in sensor technology have enabled cortisol detection in sweat or saliva via wearable devices, thereby presenting new opportunities for integrated, real-time fatigue monitoring.¹² For example, platforms like MyWear have successfully integrated multimodal wearable sensors with machine learning (ML) algorithms for accurate real-time stress and HRV monitoring across diverse environments.⁶

Researchers have proposed the integration of HRV with additional physiological markers—particularly endocrine biomarkers, such as cortisol—as a strategy to overcome these limitations. While HRV reflects autonomic regulation, cortisol captures the endocrine dimensions of the stress response, making these biomarkers complementary.¹² The convergence of wearable technology and non-invasive biosensing allows for the simultaneous monitoring of both HRV and cortisol in everyday environments, facilitating comprehensive fatigue assessment.¹³

ML techniques offer substantial advantages for modeling complex nonlinear physiological signals, such as HRV.¹⁵ ML models, particularly when leveraging multimodal data, can flexibly capture intricate feature interactions and account for interindividual variability, thereby overcoming the limitations of traditional statistical approaches.¹⁶ The use of ML models is particularly beneficial for integrating heterogeneous data sources, such as physiological signals and laboratory biomarkers, to achieve more accurate and personalized fatigue predictions.¹⁷

Wearable health technologies have revolutionized the collection of continuous physiological data, including HRV, activity levels, and stress markers, thereby enabling remote patient monitoring (RPM) and wellness applications.¹⁸ Although wearable-derived HRV has demonstrated validated efficacy in stress prediction and fatigue monitoring,¹⁹ studies integrating HRV and cortisol for fatigue prediction are still limited. Furthermore, current research employing HRV-cortisol integrated models faces methodological constraints, including small sample sizes and limited application of analytical approaches such as ML techniques.^6–8,20

Considering these limitations, the present study aimed to develop a predictive model for fatigue by integrating HRV and blood cortisol data collected in a clinical setting. Using machine-learning techniques optimized for multimodal physiological inputs, we assessed whether a compact set of features could reliably classify fatigue severity, thereby supporting the potential for practical fatigue monitoring via wearable technologies.

Methods

Participants

We analyzed data from a total of 336 patients (191 males and 145 females) who underwent both self-reported assessments and HRV measurements during routine health examinations at the Health Promotion Center of Seoul National University Bundang Hospital in Korea. The study population of this ambispective cohort study consisted of two cohorts: a retrospective cohort of 236 patients identified from routine health examination records, and a prospective cohort of 100 patients who were newly recruited with written informed consent. This study was approved by the Institutional Review Board (IRB) of Seoul National University Bundang Hospital (IRB No. B-2302-810-301).

Assessments

Data collection followed a standardized protocol across all participants. The FSS was administered within 7 days prior to the clinical examination to capture recent fatigue status. On the examination day, venous blood samples were collected before 10:00 a.m. to control for cortisol's circadian variation, followed by anthropometric measurements and HRV assessment. HRV data were acquired using 3-minute short-term recordings with the Medicore SA-3000P system (Medicore Co., Ltd., Korea) according to the manufacturer's validated protocol.

In this study, features were extracted based on demographic variables, including age and sex, as well as HRV indices related to autonomic nervous system activity and stress. Indicators related to stress and autonomic nervous system function included autonomic nervous system activity, autonomic balance, stress resistance, stress index, and cardiac stability. A total of 26 HRV features were automatically extracted by the measurement device and categorized as follows: heart rate-based indicators included average heart rate, standard deviation of normal-to-normal intervals (SDNN), root mean square of successive differences (RMSSD), approximate entropy (ApEn), sympathetic reactivity difference (SRD), and total sympathetic reactivity difference (TSRD). These variables were used to quantify the temporal stability and variability of the heart rhythm. Frequency-domain features and their logarithmic transformations included total power (TP), VLF, LF, HF, normalized low frequency (LF norm), normalized high frequency (HF norm), LF/HF, power spectrum index (PSI), and their logarithmic values TP(ln), VLF(ln), LF(ln), and HF(ln).
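To make a few of the listed indices concrete, the sketch below computes SDNN, RMSSD, normalized LF and HF, the LF/HF ratio, and log-transformed band powers from their conventional definitions. The device's internal algorithms are not described here, so the function names, the sample values, the use of the sample standard deviation, and the normalization by LF + HF are all assumptions. The normalization is at least consistent with Table 1, where the normalized LF and HF means sum to 100%.

```python
import math
import statistics


def sdnn(nn_ms):
    """Standard deviation of normal-to-normal intervals (ms)."""
    return statistics.stdev(nn_ms)


def rmssd(nn_ms):
    """Root mean square of successive NN-interval differences (ms)."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def frequency_summary(lf, hf):
    """Normalized LF and HF (%), LF/HF ratio, and ln-transformed powers."""
    total = lf + hf  # assumed normalization base
    return {
        "lf_norm": 100.0 * lf / total,
        "hf_norm": 100.0 * hf / total,
        "lf_hf": lf / hf,
        "lf_ln": math.log(lf),
        "hf_ln": math.log(hf),
    }


# Hypothetical NN intervals (ms) and band powers (ms^2), for illustration only.
nn = [812, 790, 805, 830, 798, 815]
print(round(sdnn(nn), 2), round(rmssd(nn), 2))
print(frequency_summary(lf=300.0, hf=200.0))
```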

The FSS is a widely used nine-item self-report questionnaire designed to evaluate the severity and impact of fatigue on daily functioning over the past week. Each item was rated on a 7-point Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree), and the final score was calculated as the average of the nine items. Higher scores indicated more severe fatigue. The FSS has demonstrated excellent internal consistency (Cronbach's α = 0.93) and test–retest reliability, and it has been validated across various clinical populations. In this study, the FSS was used to quantify subjective fatigue levels among the participants due to its ease of administration and robust psychometric properties.²¹
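The scoring rule described above (the mean of nine items, each rated from 1 to 7) can be sketched as follows. The function name and the example responses are hypothetical.

```python
def fss_score(items):
    """Mean of the nine FSS items, each rated 1-7."""
    if len(items) != 9:
        raise ValueError("FSS requires exactly nine item ratings")
    if any(not 1 <= x <= 7 for x in items):
        raise ValueError("each item must be between 1 and 7")
    return sum(items) / 9


# Hypothetical responses from one participant.
responses = [5, 4, 6, 3, 5, 4, 4, 5, 6]
print(round(fss_score(responses), 2))
```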

Data processing

Participants were classified into two groups based on the presence or absence of cortisol values: the missing group (n = 108) and the non-missing group (n = 228). We then compared the distribution of relevant variables between these groups to evaluate statistical differences. An FSS score ≥ 4.0 is the most widely accepted cut-off, indicating clinically significant fatigue. The FSS scores were binarized using a predefined cut-off point (0 = low fatigue, 1 = high fatigue), and the association between cortisol missingness and fatigue classification was assessed using the chi-square test. The results showed a statistically significant relationship between the two variables (p < 0.001), indicating a nonrandom pattern of missingness.
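A minimal sketch of the two steps described above, binarizing the FSS at 4.0 and testing the association between cortisol missingness and fatigue class, is given below. The cross-tabulated counts are invented for illustration because they are not reported in the text, and the Pearson statistic is computed without a continuity correction, which is an assumption about the analysis.

```python
import math


def binarize_fss(score, cutoff=4.0):
    """Return 1 for high fatigue (score >= cutoff), else 0."""
    return 1 if score >= cutoff else 0


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Returns (statistic, p_value); with 1 degree of freedom the
    p-value equals erfc(sqrt(statistic / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row = [a + b, c + d]
    col = [a + c, b + d]
    stat = 0.0
    for i, r in enumerate(table):
        for j, observed in enumerate(r):
            expected = row[i] * col[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical counts: rows = cortisol missing / non-missing,
# columns = low fatigue / high fatigue. Not the study's data.
stat, p = chi_square_2x2([[30, 70], [60, 40]])
print(round(stat, 2), p < 0.001)
```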

Missing data handling

Among the 336 participants, cortisol data were missing for 108 individuals. We first examined the underlying mechanism of missingness to minimize the impact of missing data on model performance. The normality of continuous variables was assessed using the Shapiro–Wilk test and Q–Q plots; due to violations of normality assumptions, non-parametric Mann–Whitney U tests were applied. Significant differences (p < 0.05) were observed between the cortisol-missing and non-missing groups in several demographic and physiological variables; specific details are provided in the Supplementary Material. Little's MCAR test and the score-based test by Wang et al.²² indicated a Missing Not At Random (MNAR) mechanism. Therefore, the missing cortisol values were imputed using multivariate imputation by chained equations (MICE). A total of 336 samples with completed imputations were used to train the TabNet model, which predicted binary fatigue status based on 25 variables, including demographic, autonomic nervous system, stress-related, and heart rate-based indicators (Supplementary Table S1, Supplementary Figure S1).
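MICE cycles through the incomplete variables and regresses each one on the others. If cortisol is the only incomplete variable, each cycle reduces to a single conditional model. The sketch below shows that one building block in simplified form: a least-squares regression on a single predictor, with a residual drawn from the observed cases added to each prediction. It is a deliberately reduced stand-in, not the imputation pipeline used in the study, and the function names and data are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def impute_by_regression(predictor, target, seed=0):
    """Fill None entries of `target` from a regression on `predictor`.

    A residual drawn from the observed cases is added to each prediction
    so that imputed values keep some of the observed scatter.
    """
    rng = random.Random(seed)
    obs = [(x, y) for x, y in zip(predictor, target) if y is not None]
    intercept, slope = fit_line([x for x, _ in obs], [y for _, y in obs])
    residuals = [y - (intercept + slope * x) for x, y in obs]
    return [
        y if y is not None else intercept + slope * x + rng.choice(residuals)
        for x, y in zip(predictor, target)
    ]


# Hypothetical ages and cortisol values (ug/dL); None marks a missing value.
age = [34, 41, 47, 52, 58, 63]
cortisol = [15.1, None, 13.4, 12.9, None, 11.8]
filled = impute_by_regression(age, cortisol)
print(filled)
```

A full MICE run would repeat such conditional draws over all covariates and typically produce several imputed datasets.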

Statistical analysis

Statistical analyses were conducted to compare baseline characteristics between groups. The normality of continuous variables was examined using the Shapiro–Wilk test and Q–Q plots. Because normality assumptions were violated, the Mann–Whitney U test was used for continuous variables and the chi-square test for categorical variables. A p-value < 0.05 was considered statistically significant. All statistical analyses were performed using Python (version 3.10).
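For readers unfamiliar with the rank-based comparison named above, the following sketch computes the Mann–Whitney U statistic with a two-sided normal-approximation p-value. It is illustrative only: the variance omits the tie correction and no continuity correction is applied, so it will not exactly match a statistics package, and the sample values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Mann-Whitney U for sample x, with a normal-approximation p-value.

    Ties receive average ranks; the variance omits the tie correction
    and no continuity correction is applied, so the p-value is approximate.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical average heart rates (bpm) for two small groups.
low_fatigue = [62, 65, 66, 68, 70, 71]
high_fatigue = [67, 69, 72, 73, 75, 78]
print(mann_whitney_u(low_fatigue, high_fatigue))
```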

Reporting guideline

This study adhered to the TRIPOD-AI (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis—Artificial Intelligence) reporting guideline.²³ A completed TRIPOD-AI checklist is provided in the Supplementary Material.

Machine learning training

In this study, we developed a TabNet-based binary classification model to predict fatigue status, defined by a binarized FSS, using a comprehensive set of features, including demographic variables, stress and autonomic nervous system indicators, heart rate-based metrics, frequency-domain features and their logarithmic transformations, and cortisol levels.

Model training was conducted using stratified five-fold cross-validation, with validation and test AUC values serving as the main evaluation metrics. Depending on the experimental settings, the imputed cortisol variable was either included or excluded, resulting in three experimental conditions: (1) Model without Cortisol: a model trained using only 25 HRV-based physiological features, excluding cortisol. (2) Model with Cortisol: a model including the imputed cortisol variable obtained through MICE. (3) Feature Selection Model: a model with a reduced set of input variables selected to minimize model complexity while maintaining predictive performance. The interpretation of the models was facilitated by employing SHapley Additive Explanations (SHAP) values and permutation importance analyses to assess the contribution of each feature to model predictions quantitatively.
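Stratified five-fold cross-validation keeps the ratio of low- to high-fatigue cases roughly constant in every fold. A minimal sketch of such a split follows. The helper name and the round-robin assignment are assumptions rather than the study's actual code, and the class sizes are those reported in the Results.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, preserving class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal shuffled indices round-robin so each fold gets an equal share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Class sizes reported in the Results: 180 low-fatigue, 156 high-fatigue.
labels = [0] * 180 + [1] * 156
folds = stratified_folds(labels)
for fold in folds:
    high = sum(labels[i] for i in fold)
    print(len(fold), high)
```

Each fold serves once as the held-out set while the remaining four are used for training.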

Results

Demographic features

Among the 336 participants, 180 were classified as having low fatigue (FSS < 4.0), while 156 were classified as having high fatigue (FSS ≥ 4.0) based on the binarized FSS. The baseline demographic and physiological characteristics of the participants, including age, sex, body mass index, antihypertensive and antidiabetic medication use, and autonomic and HRV features, were compared between the two fatigue groups and are summarized in Table 1.

Table 1.

Summary of participant characteristics: low vs. high fatigue.

Variable	Low fatigue (FSS < 4.0)	High fatigue (FSS ≥ 4.0)	p-value
Age (Years)	51.13 ± 11.32	45.29 ± 12.03	<0.05
Sex
Male	103 (57.2%)	88 (56.4%)	0.969
Female	77 (42.8%)	68 (43.6%)	0.969
Body mass index
Low body weight	8 (4.5%)	7 (4.5%)	0.965
Normal weight	114 (63.3%)	96 (61.9%)
Obesity	58 (32.2%)	52 (33.5%)
Antihypertensive medication use
No	145 (80.6%)	134 (85.9%)	0.248
Yes	35 (19.4%)	22 (14.1%)	0.248
Antidiabetic medication use
No	165 (91.7%)	148 (94.9%)	0.345
Yes	15 (8.3%)	8 (5.1%)	0.345
Cortisol (µg/dL)	13.62 ± 5.61	12.94 ± 6.04	nan
HRV features
Autonomic nervous system activity	92.87 ± 16.50	91.66 ± 15.78	0.374
Autonomic balance	64.75 ± 43.91	61.43 ± 40.48	0.503
Stress resistance	95.42 ± 15.26	94.04 ± 14.14	0.222
Stress index	97.87 ± 17.05	99.47 ± 14.94	0.208
Average heart rate (bpm)	67.69 ± 10.42	69.80 ± 10.33	<0.05
Cardiac stability	100.77 ± 21.62	97.27 ± 21.03	0.209
Sdnn (ms)	34.73 ± 16.43	35.37 ± 16.02	0.718
Psi	80.20 ± 92.82	69.57 ± 61.69	0.873
Tp (ms²)	982.66 ± 1249.51	1007.40 ± 998.81	0.300
Vlf (ms²)	405.93 ± 493.40	438.26 ± 499.80	0.153
Lf (ms²)	263.65 ± 496.65	308.01 ± 521.52	0.260
Hf (ms²)	313.08 ± 602.14	261.13 ± 301.18	0.981
LfNorm (%)	47.86 ± 23.74	50.96 ± 23.05	0.219
HfNorm (%)	52.14 ± 23.74	49.04 ± 23.05	0.219
Lf/Hf (ratio)	1.82 ± 2.96	1.92 ± 2.89	0.219
Rmssd (ms)	29.08 ± 20.97	29.16 ± 18.20	0.747
Apen (unitless)	0.93 ± 0.13	0.95 ± 0.11	0.137
Srd	0.97 ± 0.12	0.98 ± 0.12	0.559
Tsrd	126.38 ± 49.91	125.78 ± 46.82	0.912
Tp (ln, ms²)	6.40 ± 0.99	6.52 ± 0.90	0.300
Vlf (ln, ms²)	5.42 ± 1.13	5.59 ± 1.02	0.153
Lf (ln, ms²)	4.82 ± 1.17	4.99 ± 1.16	0.259
Hf (ln, ms²)	4.93 ± 1.31	4.94 ± 1.22	0.982

Table 1 summarizes the demographic and physiological characteristics of participants stratified by fatigue status (low vs. high fatigue). A total of 180 participants were classified as low fatigue and 156 as high fatigue based on the binarized FSS score. Among the variables examined, age and average heart rate showed statistically significant differences between the two groups (p < 0.05), with the low fatigue group being older and having a lower average heart rate. No significant differences were observed in cortisol level, sex distribution, or other autonomic, stress-related, and frequency-domain indicators.

Abbreviations used in feature names: Sdnn (ms) = standard deviation of normal-to-normal intervals; Psi = power spectrum index; Tp (ms²) = total power; Vlf (ms²) = very low frequency; Lf (ms²) = low frequency; Hf (ms²) = high frequency; Lf/Hf (ratio) = ratio of low to high frequency power; LfNorm (%) = normalized low frequency; HfNorm (%) = normalized high frequency; Rmssd (ms) = root mean square of successive differences; Apen (unitless) = approximate entropy; Srd = sympathetic reactivity difference; Tsrd = total sympathetic reactivity difference; Tp (ln, ms²) = log-transformed total power; Vlf (ln, ms²), Lf (ln, ms²), Hf (ln, ms²) = log-transformed frequency domain features.

Model performance

Model performance was compared based on the validation and test AUC values (Table 2). The Model without Cortisol achieved an AUC of 0.774, whereas the Model with Cortisol, which included the imputed cortisol variable, yielded an AUC of 0.741. The feature selection model, which utilized only three variables—age, cortisol, and VLF (ln)—demonstrated an AUC of 0.76 while maintaining competitive performance with a minimal feature set. The ROC curves for each model condition are shown in Figure 1 (see also Supplementary Table S2, Figure S2).

Figure 1.

Receiver operating characteristic (ROC) curves for each model condition. (A) The model including the imputed cortisol variable (Model with Cortisol) yielded an AUC of 0.74. (B) The model excluding the cortisol variable (Model without Cortisol) achieved an AUC of 0.77. (C) The Feature Selection Model, which used only age, cortisol, and VLF (ln), demonstrated an AUC of 0.76. All models showed moderate discriminative performance, with minimal differences across experimental conditions.
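The AUC can be read as the probability that a randomly chosen high-fatigue participant receives a higher predicted score than a randomly chosen low-fatigue participant. The sketch below computes it directly from that definition. The labels and scores are hypothetical and unrelated to the study data.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as one half.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = high fatigue) and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.20, 0.45, 0.40, 0.80, 0.30, 0.70]
auc = roc_auc(y_true, y_score)
print(round(auc, 3))
```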

Table 2.

AUC comparison across models.

Model	Validation AUC	Test AUC	# Features
Model with Cortisol	0.727	0.741	26
Model without Cortisol	0.743	0.774	25
Feature Selection Model	0.752	0.760	3

Validation and test AUCs across three model configurations. Despite using only three input variables, the feature selection model achieved a performance comparable to the full-feature models.
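As a quick arithmetic check on Table 2, the difference between test and validation AUC can be computed for each configuration. The values are copied from the table and the variable names are illustrative.

```python
# Validation and test AUCs as reported in Table 2.
table2 = {
    "Model with Cortisol": (0.727, 0.741),
    "Model without Cortisol": (0.743, 0.774),
    "Feature Selection Model": (0.752, 0.760),
}

gaps = {name: round(test - val, 3) for name, (val, test) in table2.items()}
for name, gap in gaps.items():
    print(f"{name}: test - validation = {gap:+.3f}")
```

Among these three configurations, the gap is largest for the Model without Cortisol (0.031) and smallest for the Feature Selection Model (0.008).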

Feature importance

SHAP analysis revealed that, in the Model with Cortisol, the impact of top-ranked features was clearly distinguishable, whereas lower-ranked features exhibited distributions tightly clustered around zero (Figure 2). The wide spread of SHAP values among the top-ranked features, as opposed to the tight clustering of lower-ranked features, indicates that a few key variables, particularly cortisol, play a dominant role in prediction.

Figure 2.

SHAP summary plots for feature importance in different model settings. (A) In the model including the imputed cortisol variable, a small number of top features, including cortisol and age, exhibited wide SHAP value dispersion, indicating a strong influence on model predictions. Lower-ranked features showed values tightly clustered around zero. (B) In the Model without Cortisol, the overall distribution of SHAP values was more uniform, and no single feature showed dominant predictive power. These findings support the role of cortisol as a key contributor and justify its inclusion in the final feature set.
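The Methods also mention permutation importance, which measures how much a performance metric drops when the values of one feature are shuffled across participants. A generic sketch is shown below. The toy model, data, and accuracy metric are placeholders and do not represent the TabNet model or the study's features.

```python
import random


def permutation_importance(predict, X, y, metric, n_repeats=10, seed=0):
    """Mean drop in `metric` when each column of X is shuffled in turn."""
    rng = random.Random(seed)
    baseline = metric(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - metric(y, [predict(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return importances


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def toy_model(row):
    # Placeholder classifier that uses only the first column.
    return int(row[0] > 0.5)


X = [[0.1, 5.0], [0.9, 1.0], [0.2, 3.0], [0.8, 4.0], [0.3, 2.0], [0.7, 6.0]]
y = [0, 1, 0, 1, 0, 1]
importances = permutation_importance(toy_model, X, y, accuracy)
print(importances)
```

Because the toy model ignores the second column, shuffling it leaves accuracy unchanged and its importance is zero.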

Forward feature selection further confirmed that high predictive performance could be maintained with a minimal set of variables, supporting the importance of cortisol as an informative feature. Based on this analysis, a final lightweight model was constructed using only three features: age, cortisol level, and VLF (ln) (Supplementary Figure S3).
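Forward feature selection adds one feature at a time, keeping whichever candidate most improves a validation metric, and stops when nothing improves or a size limit is reached. The sketch below illustrates the greedy loop with a toy scoring function. The per-feature gains are made up, and an actual run would retrain the classifier and return its cross-validated AUC at each step.

```python
def forward_select(candidates, score, max_features=3):
    """Greedy forward selection.

    `score` maps a list of feature names to a validation metric
    (higher is better). Stops when no candidate improves the score.
    """
    selected, best = [], float("-inf")
    while len(selected) < max_features:
        trials = {f: score(selected + [f]) for f in candidates if f not in selected}
        if not trials:
            break
        feature, value = max(trials.items(), key=lambda kv: kv[1])
        if value <= best:
            break
        selected.append(feature)
        best = value
    return selected, best


# Made-up gains for illustration; these are not results from the study.
gains = {"age": 0.10, "cortisol": 0.06, "vlf_ln": 0.02, "sdnn": 0.0}


def toy_score(features):
    return 0.5 + sum(gains[f] for f in features)


selected, best = forward_select(list(gains), toy_score)
print(selected, round(best, 2))
```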

Discussion

In this study, we developed and evaluated ML models for fatigue prediction by integrating HRV and cortisol data collected from a cohort of 336 participants. Participants were stratified into low- and high-fatigue groups according to the FSS,⁹ and subsequent feature comparisons revealed statistically significant differences in age and heart rate. In contrast, no significant group differences in cortisol levels were observed. Notably, a feature selection model incorporating age, cortisol, and VLF (ln) achieved a robust balance of predictive performance (AUC = 0.76), closely approximating the full model (AUC = 0.77) while utilizing only three features. SHAP analysis²⁴ further substantiated the pivotal role of cortisol in prediction, even in the absence of group-level statistical differences.

Beyond the overall predictive performance, the analysis yielded important insights into model stability and generalizability. The feature selection model that excluded cortisol exhibited the largest discrepancy between the validation and test AUCs, suggesting potential overfitting. In contrast, including cortisol resulted in more consistent AUCs across datasets, indicating superior generalization.¹² Moreover, the model incorporating cortisol achieved the most balanced outcomes across precision, recall, and F1 scores. In the absence of cortisol, the precision remained high, but the recall was substantially lower, leading to a greater number of missed positive cases. This finding underscores the practical importance of cortisol in fatigue detection, particularly in applications where false negatives may lead to overlooked fatigue-related risks.¹⁴
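The trade-off described above, high precision with low recall, can be illustrated with the standard definitions of precision, recall, and F1. The predictions in this sketch are hypothetical and chosen only to reproduce that pattern.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, and F1 for the positive (high-fatigue) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical predictions: every positive call is correct,
# but three of four high-fatigue cases are missed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]
precision, recall, f1 = classification_metrics(y_true, y_pred)
print(precision, recall, round(f1, 2))
```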

Table 3 summarizes previous HRV-based studies, outlining their clinical associations.

Table 3.

Summary of HRV-based studies investigating clinical symptom associations.

Paper title	Authors (Ref)	Sample size	Measurement duration	HRV variables	Key findings
Stress and Heart Rate Variability: A Meta-Analysis and Review of the Literature	Kim HG et al. (Ref 12)	37 studies (meta-analysis, humans)	Varied (short- and long-term HRV across included studies)	HF, LF, LF/HF, SDNN, RMSSD	Stress generally associated with ↓ HF, ↑ LF/HF; HRV is a reliable indicator of stress.
The very low-frequency band of heart rate variability represents the slow recovery component after a mental stress task	Usui H and Nishida Y (Ref 13)	19 healthy young men (mean age = 26.5)	10-minute rest, 20-minute Stroop task, 120-minute recovery with 15-minute intervals	VLF, HF, LF/HF	HF and LF/HF returned quickly; VLF showed slow recovery lasting 2 hours.
Prenatal stress assessment using heart rate variability and salivary cortisol: A machine learning-based approach	Cao R, Rahmani AM, Lindsay KL (Ref 17)	29 pregnant women (final analyzable; initial n = 33)	5-minute HRV and salivary cortisol windows across visits	Time-domain (SDNN, RMSSD), Frequency-domain, HR + salivary cortisol	ML model combining HRV + cortisol achieved 92.3% accuracy; cortisol improved prediction vs. HRV only.
Photoplethysmography-based HRV analysis and machine learning for real-time stress quantification in mental health applications	Tsai YY et al. (Ref 15)	25 healthy adults (mean age 24.8)	30-second PPG windows during stress task	Mean RR, SDNN, HF, LF, LF/HF, SampEn, fractal measures	PPG-derived HRV features with ML correlated strongly with subjective stress.

HRV: heart rate variability; HF: high-frequency power; LF: low-frequency power; LF/HF: ratio of low- to high-frequency power; SDNN: standard deviation of normal-to-normal intervals; RMSSD: root mean square of successive differences; VLF: very low-frequency power; HR: heart rate; SampEn: sample entropy; PPG: photoplethysmography.

Although models utilizing only age as a predictor yielded reasonably good results, age is a static characteristic that cannot capture real-time physiological responses. In contrast, cortisol levels reflect acute stress-related changes and offer greater sensitivity to short-term physiological variations, which are particularly pertinent for real-time fatigue alert systems.²⁵ Furthermore, the cortisol-based model achieved near-optimal performance with only three features, rendering it well-suited for lightweight real-time applications in wearable environments.^7,20,26

Despite its moderate predictive accuracy and relatively limited dataset, the present study represents a meaningful conceptual advance in the non-invasive prediction of fatigue using biomarkers compatible with wearable technologies.¹¹ The significance of this research extends beyond current performance metrics, as it demonstrates the feasibility of predicting fatigue using a minimal set of input features through machine-learning models.²⁷

Another strength of this study is its rigorous approach to handling missing data. In real-world RPM environments, data loss is an inevitable challenge owing to device limitations, sensor errors, and connectivity issues.²⁸ The modeling approach maintained robust performance despite incomplete datasets through the use of MICE,²⁹ thereby demonstrating the practical viability of fatigue prediction under imperfect data conditions, a scenario frequently encountered in wearable-based health monitoring systems.³⁰

This study has several limitations. First, the temporal alignment of assessments differed across measures. The FSS captures subjective fatigue experienced over the preceding 7 days, while cortisol and HRV measurements reflect physiological status at a single time point on the examination day. Although FSS administration occurred within 7 days prior to biomarker collection, this temporal discordance may have introduced measurement error and potentially attenuated the observed associations between subjective fatigue and physiological markers. The cross-sectional nature of our analysis precludes causal inference, and the temporal misalignment further limits interpretability of the observed relationships. Second, the moderate sample size (n = 336) and single-center design may limit generalizability to broader populations. Third, we utilized only morning cortisol measurements, which may not capture the full dynamics of HPA axis dysregulation associated with chronic fatigue. Future studies should prioritize synchronized data collection protocols, larger multi-site cohorts, and multiple cortisol sampling timepoints to address these limitations.

A parallel can be drawn with the recent advancements in continuous glucose monitoring (CGM). Metwally et al. demonstrated that CGM data, when combined with ML, could accurately classify metabolic subphenotypes of type 2 diabetes—including muscle insulin resistance and β-cell dysfunction—with AUCs up to 0.95, using at-home oral glucose tolerance tests.³¹ This approach has transformed complex hospital-based metabolic assessments into scalable real-world monitoring tools.

Analogously, integrating HRV and cortisol data for fatigue prediction in this study may facilitate more precise identification and stratification of fatigue syndromes, such as chronic fatigue syndrome, which currently lacks definitive diagnostic criteria and reliable biomarkers.² As future research incorporates larger datasets and more stable time-series data, the predictive performance of such algorithms is expected to improve.³² This study contributes to the development of scalable, personalized fatigue monitoring systems utilizing RPM technologies.³³

Conclusion

This study demonstrated that fatigue levels can be effectively predicted using a minimal set of physiological features, specifically age, HRV-derived VLF (ln), and blood cortisol, through ML techniques. Despite the moderate sample size, the inherent limitations of cross-sectional data, and the restricted assessment of comorbid illnesses, the model achieved stable performance, suggesting the feasibility of developing lightweight, scalable fatigue monitoring systems for real-world applications. The integration of HRV and cortisol offers a promising foundation for personalized fatigue assessment tools, particularly in wearable or remote health monitoring environments. Further studies using larger longitudinal datasets are warranted to validate and enhance the clinical utility of these predictive models.

Supplemental Material

Supplemental material for Machine learning-based fatigue classification using heart rate variability and cortisol: A multimodal approach to wearable health monitoring by Joung Eun Kim, Na Hyeon Kim, Soo Kyung Choi, Ji-Yoon Lee, Keehyuck Lee and Jong Soo Han in DIGITAL HEALTH:

sj-docx-1-dhj-10.1177_20552076251395570

sj-png-2-dhj-10.1177_20552076251395570

sj-png-3-dhj-10.1177_20552076251395570

sj-png-4-dhj-10.1177_20552076251395570

sj-pdf-5-dhj-10.1177_20552076251395570

Footnotes

Acknowledgments

The authors would like to thank all contributors who participated in the study. No additional acknowledgments to declare.

ORCID iDs

Joung Eun Kim

Na Hyeon Kim

Soo Kyung Choi

Ji-Yoon Lee

Keehyuck Lee

Jong Soo Han

Ethical considerations

This study was approved by the Institutional Review Board (IRB) of Seoul National University Bundang Hospital (IRB No. B-2302-810-301).

Consent to participate

This study was approved by the Institutional Review Board (IRB) of Seoul National University Bundang Hospital (IRB No. B-2302-810-301). For the retrospective cohort (n = 236), the requirement for written informed consent was waived. For the prospective cohort (n = 100), written informed consent was obtained from all participants prior to enrollment.

Consent for publication

Written consent for publication of anonymized data was obtained from all participants in the prospective cohort. For the retrospective cohort, consent for publication was waived by the IRB.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Industrial Technology Innovation Program (Project Number: 20020423), funded by the Ministry of Trade, Industry and Energy (MOTIE), Republic of Korea [Grant ID: 501100003052].

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

The raw data will be made available by the corresponding authors upon request.

AI tool disclosure

The authors declared that no AI-assisted technologies were used in the writing, editing, data analysis, or figure generation of this manuscript.

Supplemental Material

All supplemental material mentioned in the text is available in the online version of the journal.
