
Abstract

Purpose:

The impact of COVID-19 on the population’s mental health has been reported worldwide. Predicting healthcare workers’ mental health and life stress is needed to proactively plan for future emergencies.

Design:

Statistics Canada has surveyed Canadian healthcare workers and those working in healthcare settings to gauge their perceived mental health and perceived life stress.

Setting:

A cross-sectional survey of healthcare workers in Canada.

Subjects:

A sample of 18,139 healthcare worker respondents.

Analysis:

Eight algorithms were evaluated: Logistic Regression, Random Forest (RF), Naive Bayes (NB), K-Nearest Neighbours (KNN), Adaptive Boosting (AdaBoost), Multi-layer Perceptron (MLP), XGBoost, and LightBoost. AUC scores, accuracy, and precision were measured for all models.

Results:

XGBoost provided the highest-performing model for predicting perceived mental health (AUC = 82.07%), and Random Forest performed best for predicting perceived life stress (AUC = 77.74%). Perceived health, age group of participants, and perceived mental health compared to before the pandemic were the 3 most important features for predicting perceived mental health. Perceived mental health compared to before the pandemic was the most important predictor of perceived life stress.

Conclusion:

Our models are highly predictive of healthcare workers’ perceived mental health and life stress. Implementing scalable, inexpensive virtual mental health solutions to address mental health challenges in the workplace could mitigate the impact of workplace conditions on healthcare workers’ mental health.

Keywords

machine learning, mental health, stress, virtual care, healthcare

Introduction

COVID-19 had a high impact on the population’s mental health,¹,² including depression, anxiety, and life stress.³ The impact has been observed in different communities, including healthcare workers and those in healthcare settings.⁴ In Canada, healthcare professionals reported higher anxiety symptoms,⁵ and quarantine was associated with higher odds of mental distress.⁶ High life stress and worsening mental health among healthcare workers⁷ could be debilitating conditions.

An inquiry into the mental health and life stress of healthcare workers is needed to understand and proactively plan for mental health interventions (eg, virtual care or tele-mental health) for future pandemics or emergencies. An analysis of the mental health experiences of healthcare workers and people working in healthcare settings will allow evidence-based programming for tailored mental health interventions.

While machine learning (ML) has been used to predict mental health, to our knowledge, this is the first study that uses ML on a Statistics Canada dataset measuring the impacts of COVID-19 on healthcare workers and people working in healthcare settings⁸ to predict perceived life stress and mental health among healthcare workers. While self-perceived mental health may not necessarily reflect actual mental health conditions measured objectively using validated instruments, studies have indicated that individuals who self-report mental health issues tend to utilize mental health services more frequently,⁹⁻¹⁵ hence the usefulness of predicting perceived mental health.

Methods

Dataset

The analysis is based on the crowdsourced questionnaire (18,139 respondents) conducted from November 24, 2020, to December 13, 2020, by Statistics Canada: Impacts of COVID-19 on Healthcare Workers.⁸ The participants in this survey were healthcare workers and those working in healthcare settings from the 10 provinces and 3 territories of Canada.

Feature Selection

There were 94 variables, including age, gender, indigenous identity, population group, immigration status, province of residence, perceived health, perceived mental health compared to before the pandemic, and questions related to access to personal protective equipment (PPE) and experience at work.

The outcome (ie, target) variables were perceived mental health and perceived life stress. Both target variables were measured on a Likert-type scale. Perceived mental health was coded 0 for “Poor,” 1 for “Fair,” 2 for “Good,” 3 for “Very Good,” and 4 for “Excellent,” while perceived life stress was coded 1 for “Not at all stressful,” 2 for “Not very stressful,” 3 for “A bit stressful,” 4 for “Quite a bit stressful,” and 5 for “Extremely stressful” (see Appendix A for the literal questions asked for major features and the outcomes).

Pre-Processing and Algorithm Selection

Missing values in the dataset were imputed using KNN Imputer, and one-hot encoding was used for the feature “province or region of residence.” We investigated the following algorithms: Logistic Regression, Random Forest, Naive Bayes, KNN, XGBoost, AdaBoost, Multilayer Perceptron (MLP), and Light Boost. The performance measurements were the receiver operating characteristic (ROC) Area Under the Curve (AUC) score, Accuracy, Precision, and F1 score. We performed hyperparameter tuning using a Randomized Search with 5-fold cross-validation for all the above algorithms. The complete list of hyperparameters used during hyperparameter tuning is provided in Table 1.

Table 1.

Algorithms and Their Tuned Hyperparameters.

Algorithm	Hyperparameters
LR	{'solver': ['newton-cg', 'lbfgs', 'liblinear'], 'penalty': ['l1', 'l2', 'elasticnet'], 'C': [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1, 10, 100]}
Random Forest	{'n_estimators': [100,200,300], 'max_features': ['sqrt', 'log2', None], 'max_depth' : [5,10,15,20, 30, 40, 50], 'criterion' :['gini', 'entropy']}
NB	{'alpha': np.arange(0.0,1,0.05)}
KNN	{'K' : [1, 3, 5, 7, 21]}
XGBoost	{'n_estimators':[100,200,300],'max_depth':[3,4,5,6,7,8,9],'learning_rate':np.logspace(-3,0,10),'subsample': [0.5, 0.6, 0.7, 0.8, 0.9],'colsample_bytree': [0.5, 0.6, 0.7, 0.8, 0.9]}
AdaBoost	{"n_estimators": np.arange(5,100,5),'learning_rate': [0.0001, 0.001, 0.01, 0.1, 1.0]}
LightGBM	{'num_leaves': [31, 63, 127],'max_depth': [-1, 5, 10],'learning_rate': [0.1, 0.01, 0.001],'n_estimators': [100, 200, 300],}
MLP	{'hidden_layer_sizes': [(50,50,50), (50,100,50), (100,100)], 'activation': ['tanh', 'relu'], 'solver': ['sgd', 'adam'], 'alpha': [0.0001, 0.05], 'learning_rate': ['constant', 'adaptive']}
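As an illustration of how one of the search spaces in Table 1 could be passed to a randomized search with 5-fold cross-validation, the following is a minimal sketch rather than the authors' code. It uses scikit-learn's `RandomizedSearchCV` with the Random Forest grid from Table 1. The synthetic data, the number of sampled settings (`n_iter`), and the tuning metric (`scoring`) are assumptions, since the text does not report them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic 3-class data standing in for the survey features (assumption).
X, y = make_classification(
    n_samples=200, n_features=12, n_informative=6,
    n_classes=3, random_state=0,
)

# Random Forest search space as listed in Table 1.
rf_space = {
    "n_estimators": [100, 200, 300],
    "max_features": ["sqrt", "log2", None],
    "max_depth": [5, 10, 15, 20, 30, 40, 50],
    "criterion": ["gini", "entropy"],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=rf_space,
    n_iter=4,               # assumption: number of sampled settings is not reported
    cv=5,                   # 5-fold cross-validation, as stated in the text
    scoring="roc_auc_ovr",  # assumption: tuning metric is not reported
    random_state=0,
)
search.fit(X, y)

best_params = search.best_params_
best_cv_auc = search.best_score_
print(best_params, round(best_cv_auc, 4))
```

The same pattern would apply to the other rows of Table 1 by swapping the estimator and its dictionary.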

To enhance the predictability, the target variables were transformed by merging classes. For perceived mental health, classes 0 and 1 (ie, poor to fair), as well as classes 3 and 4 (ie, very good to excellent), were merged, while class 2 (ie, good) was kept as is. The OneVsRest (ovr) strategy was used to calculate the models’ performance in the multi-class scenario. For perceived life stress, classes 1, 2, and 3 were merged to represent “low stress,” and classes 4 and 5 were merged to represent “high stress” (see Appendix B for a link to our Python code).
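The class merging described above can be sketched as a simple lookup. This is an illustrative sketch, not the code linked in Appendix B; the integer labels assigned to the merged classes and the names `MENTAL_HEALTH_MERGE`, `LIFE_STRESS_MERGE`, and `merge_classes` are assumptions chosen for the example.

```python
# Original survey codes mapped to merged classes, following the text.
# The merged label values (0, 1, 2) are assumptions for illustration.
MENTAL_HEALTH_MERGE = {
    0: 0, 1: 0,  # poor, fair -> "poor to fair"
    2: 1,        # good stays its own class
    3: 2, 4: 2,  # very good, excellent -> "very good to excellent"
}
LIFE_STRESS_MERGE = {
    1: 0, 2: 0, 3: 0,  # not at all, not very, a bit -> "low stress"
    4: 1, 5: 1,        # quite a bit, extremely -> "high stress"
}


def merge_classes(codes, mapping):
    """Return the merged label for each original Likert code."""
    return [mapping[code] for code in codes]


merged_mh = merge_classes([0, 1, 2, 3, 4], MENTAL_HEALTH_MERGE)
merged_ls = merge_classes([1, 2, 3, 4, 5], LIFE_STRESS_MERGE)
print(merged_mh)
print(merged_ls)
```

After merging, perceived mental health is a 3-class problem (hence the one-vs-rest evaluation) and perceived life stress is a binary problem.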

Implementation Procedure

Implementation was done using Python (3.10.12) and the Scikit-learn library (1.2.2). The CSV file was read into a data frame, and imputation was performed using KNNImputer. The resulting dataset was split into training and test sets using Stratified Shuffle Split, with 80% for training and 20% for testing.
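The procedure can be sketched as follows, under stated assumptions. The survey file is replaced by a small synthetic CSV held in memory, all column names are hypothetical, and `n_neighbors=5` is scikit-learn's default rather than a value reported in the text. The one-hot encoding of the province column follows the pre-processing described earlier.

```python
import io

import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.model_selection import StratifiedShuffleSplit

# Build a tiny synthetic CSV standing in for the survey file (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
frame = pd.DataFrame({
    "age_group": rng.integers(1, 5, n).astype(float),
    "perceived_health": rng.integers(0, 5, n).astype(float),
    "province": rng.choice(["ON", "QC", "BC"], n),
    "target": rng.integers(0, 3, n),
})
frame.loc[rng.choice(n, 20, replace=False), "perceived_health"] = np.nan
csv_text = frame.to_csv(index=False)

# Read the CSV into a data frame and separate the target.
df = pd.read_csv(io.StringIO(csv_text))
y = df.pop("target")

# One-hot encode the province column, then impute missing values with KNNImputer.
X = pd.get_dummies(df, columns=["province"], dtype=float)
X_imputed = KNNImputer(n_neighbors=5).fit_transform(X)

# Stratified 80/20 split into training and test sets.
splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X_imputed, y))
X_train, X_test = X_imputed[train_idx], X_imputed[test_idx]
y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]
print(X_train.shape, X_test.shape)
```

The sketch imputes before splitting because that is the order given in the text; fitting the imputer on the training portion only is a common alternative.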

Results

Population Characteristics

In the sample, 28.49% of people were less than 35 years of age, 27.47% were 35 to 44 years of age, 24.65% were 45 to 54 years of age, and 19.02% were 55 and older; 0.37% did not answer.

The sample had 8.11% of people with poor perceived mental health, 24.81% with fair perceived mental health, 32.38% with good perceived mental health, 25.26% with very good perceived mental health, and 9.44% with excellent perceived mental health. As for perceived life stress, 0.92% of people reported a life that was not at all stressful, 6.8% not very stressful, 38.38% a bit stressful, 41.66% quite a bit stressful, and 12.23% extremely stressful. The distribution across professions is shown in Table 2.

Table 2.

Distribution of Respondents Across Professions.

Profession	N (%)
Physician	572 (3.15%)
Nurse	5361 (29.56%)
Personal support worker or care aide	654 (3.61%)
Emergency medical personnel	269 (1.48%)
Allied health professional	6813 (37.56%)
Laboratory worker	1550 (8.55%)
Pharmacist	281 (1.55%)
Dental professional	1507 (8.31%)
Other (students, support services, etc.)	1108 (6.11%)
Not stated	24 (0.13%)
Total	18139 (100%)

Perceived Mental Health

We used the AUC score to compare the models, as both the true positive and false positive rates are essential for health situations (ie, mental health). For perceived mental health, the best AUC score was obtained with the XGBoost model (82.07%), closely followed by LightBoost (81.77%) and Random Forest (80.80%). The KNN model had the lowest AUC score (66.36%). The complete performance measurements for all models are presented in Table 3.

Table 3.

Models’ Performance Measurements for Perceived Mental Health After Merging Classes.

Algorithm	AUC (%)	Accuracy (%)	Precision (%)	F1-score (%)
LR	81.10	64.18	63.71	63.92
RF	80.80	65.15	64.75	64.84
NB	66.85	50.01	47.43	47.58
KNN	66.36	49.31	51.30	49.78
XG Boost	82.07	64.30	64.29	64.29
MLP	81.18	61.93	61.88	61.79
Light Boost	81.77	65.28	65.66	65.46
AdaBoost	79.85	63.91	64.79	64.25
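The four measurements in Table 3 can be computed for a 3-class target along the following lines. This is a sketch on hypothetical labels and probabilities, not the study's data. The one-vs-rest AUC uses scikit-learn's `roc_auc_score` with `multi_class="ovr"`; the `weighted` averaging for precision and F1 is an assumption, as the text does not state which averaging was used.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, roc_auc_score

# Hypothetical true labels and predicted class probabilities (rows sum to 1).
y_true = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 2])
y_proba = np.array([
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
    [0.1, 0.7, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.1, 0.8],
])
y_pred = y_proba.argmax(axis=1)  # predicted class = most probable class

# One-vs-rest AUC, macro-averaged over the three classes (scikit-learn default).
auc = roc_auc_score(y_true, y_proba, multi_class="ovr")
acc = accuracy_score(y_true, y_pred)
# Assumption: weighted averaging across classes for precision and F1.
prec = precision_score(y_true, y_pred, average="weighted", zero_division=0)
f1 = f1_score(y_true, y_pred, average="weighted", zero_division=0)

print(f"AUC={auc:.4f} accuracy={acc:.4f} precision={prec:.4f} F1={f1:.4f}")
```

AUC is computed from the predicted probabilities, while accuracy, precision, and F1 depend on the hard class predictions, which is why a model can rank first on AUC without ranking first on accuracy.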

We further performed feature importance analysis for perceived mental health using XGBoost. “Perceived health” (score 100%), “age group of participants” (score 97.21%), and “perceived mental health compared to before the pandemic” (score 85.48%) were the 3 most important features for the prediction of perceived mental health. All other features scored below 63%.

Perceived Life Stress

For perceived life stress, the best AUC scores were obtained from the Random Forest model (77.74%), closely followed by the MLP model (77.56%). Results from all the models after merging classes are given in Table 4.

Table 4.

Models’ Performance Measurements for Perceived Life Stress After Merging Classes.

Algorithm	AUC (%)	Accuracy (%)	Precision (%)	F1-score (%)
LR	75.21	70.11	70.29	69.81
RF	77.74	70.61	70.53	70.53
NB	67.03	61.84	61.96	61.87
KNN	65.32	60.19	60.39	60.22
XG Boost	73.13	68.18	68.13	68.11
MLP	77.56	70.36	70.27	70.26
Light Boost	75.07	69.42	69.39	69.32
AdaBoost	75.36	69.15	69.14	69.01

Using the Random Forest model, we performed a feature importance analysis for perceived life stress. “Perceived mental health compared to before the pandemic” (score 100%) was the most important feature for predicting perceived life stress. All other features scored below 10%.
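A feature importance ranking of this kind can be sketched with scikit-learn's `RandomForestClassifier.feature_importances_`. The example below assumes that the reported scores are scaled relative to the top feature (so that the top feature scores 100%); the text does not state the scaling, and the synthetic data and feature names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary data standing in for the survey features (assumption).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=3,
    n_redundant=0, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances; rescaled so the top feature scores 100%.
raw = model.feature_importances_
relative = 100.0 * raw / raw.max()

ranking = sorted(zip(feature_names, relative), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.2f}%")
```

Under this scaling, a result such as "all other features scored below 10%" means that every remaining feature has less than one tenth of the top feature's importance.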

Predictions Using the Most Important Features Only

Using the features with feature importance >70%, we ran the XGBoost and Random Forest models to predict perceived mental health and perceived life stress, respectively. For predicting perceived mental health using XGBoost, we obtained an AUC score of 81.96% and an accuracy of 64.23% using only the 3 most important features cited above. For predicting perceived life stress using the Random Forest model, we obtained an AUC score of 73.24% and an accuracy of 68.61% using only the most important feature cited above.
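The reduced-feature experiment can be sketched as follows for a binary target such as merged perceived life stress. It is a minimal illustration on synthetic data, not the authors' code: the relative scaling of importances, the use of `train_test_split` with stratification, and the fixed Random Forest settings are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data standing in for the survey features (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=3,
    n_redundant=0, random_state=1,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1,
)

# Fit on all features and scale importances relative to the top feature.
full = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_train, y_train)
relative = 100.0 * full.feature_importances_ / full.feature_importances_.max()

# Keep only the features whose relative importance exceeds 70%.
keep = np.flatnonzero(relative > 70.0)
reduced = RandomForestClassifier(n_estimators=200, random_state=1).fit(
    X_train[:, keep], y_train
)

# Compare test AUC for the full and reduced models.
auc_full = roc_auc_score(y_test, full.predict_proba(X_test)[:, 1])
auc_reduced = roc_auc_score(y_test, reduced.predict_proba(X_test[:, keep])[:, 1])
print(f"kept {len(keep)} of {X.shape[1]} features")
print(f"AUC full={auc_full:.4f} reduced={auc_reduced:.4f}")
```

The top feature always passes the threshold because its relative importance is 100%, so `keep` is never empty.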

Discussion

Models of Choice

This study aimed to develop predictive models for perceived mental health and perceived life stress. A test with an AUC value between 80 and 90% is considered excellent, while a test with more than 90% AUC is considered outstanding.¹⁶ In predicting perceived mental health, XGBoost achieved the highest AUC score (82.07%) and the third-highest accuracy (64.30%), close to the highest (65.28%) and second-highest (65.15%) accuracy scores. Light Boost and Random Forest achieved similar accuracy, precision, and F1 scores, but XGBoost outperformed both in terms of AUC score. XGBoost could be the model of choice for predicting perceived mental health.

For predicting perceived life stress, Random Forest could be the best model, as it achieved the highest AUC (77.74%), accuracy (70.61%), precision (70.53%), and F1 scores (70.53%).

Model Implementability

Notably, using only the top 3 features as predictors of perceived mental health, we obtained an AUC score of 81.96% for the XGBoost model, very close to the AUC score (82.07%) obtained with all 94 features. For predicting perceived life stress using only the topmost feature, we obtained an AUC score of 73.24% for the Random Forest model, 4.50 percentage points below the AUC score (77.74%) obtained with all 94 features. Hence, the reduced models offer a way to predict perceived mental health and perceived life stress with a few features without substantially compromising the models’ performance (ie, AUC score, accuracy). Using only the features with feature importance >70% yields predictions nearly as good as those made with all 94 features. When fewer features are used, prediction processing time is reduced, making a model more readily implementable in real-life systems.

Policy Implications

It can be observed that perceived health, age group of participants, and perceived mental health compared to before the pandemic are the 3 most important determining factors for perceived mental health. This is in line with prior research; it has been found that age is one of the most significant predictors of mental health.¹⁷ On the other hand, perceived mental health compared to before the pandemic is our study’s most important predictor for perceived life stress; we could not find literature that allows us to compare our findings.

Perceived mental health compared to before the pandemic was a common feature for predicting perceived mental health as well as perceived stress; this finding has direct workplace policy implications. A continuous assessment of the perceived mental health of healthcare workers becomes advisable since it is a major predictor of stress and mental health well-being. This is confirmed by a previous study that found that perception of mental health compared to before physical distancing, as well as negatively perceived life stress and perceived mental health, were all high predictors of anxiety in the general population in Canada.¹⁸

Monitoring is only the first step; addressing perceived stress and negatively perceived mental health is the second important step. An important implication is implementing programs to support mental health and well-being for healthcare workers. Implementing such programs is paramount, and while face-to-face programs can be expensive, eHealth applications addressing mental health are not as expensive. Virtual care can be a great resource.¹⁹ In particular, online mindfulness has been shown to be effective in addressing depression, anxiety, and stress in various populations,²⁰⁻²⁴ including in workplace interventions²⁵ and among healthcare workers.²⁶ These online mindfulness applications do not have to isolate healthcare workers, as they can embed an optional virtual community component²⁷ where participants can feel a sense of community. Policymakers can deploy scalable virtual mindfulness tools to address the mental health effects,¹⁸,²⁸ especially those affecting the young,²⁹ while considering equity implications.³⁰

Study Limitations

While the dataset is large (18,139 respondents) and based on a national survey, one limitation of this study is that the sample is not representative of Canadian healthcare workers and those working in healthcare settings. The healthcare worker respondents were physicians, nurses, personal support workers or care aides, emergency medical personnel, allied health professionals, laboratory workers, pharmacists, dental professionals, and others (students, support services, etc.). Still, the sample does not reflect their proportions in the workforce. This might affect the performance of our model.

As with most machine learning models, our model’s performance was not confirmed on another “external” data set. This step is important if such a model is to be implemented in a workplace to predict healthcare workers’ mental health and well-being.

Study Contributions

This study showcases the potential of machine learning in health research, provides critical insights for targeted intervention strategies, and suggests practical, policy-oriented applications to support the mental well-being of healthcare professionals.

Evidence-Based Insights for Tailored Mental Health Interventions

This study relies on an innovative use of machine learning; it is the first to apply ML techniques to the Statistics Canada dataset concerning COVID-19’s impact on healthcare workers for predicting mental health outcomes. This approach can set a precedent for future research in the domain, demonstrating the potential of ML in public health research. By analyzing healthcare workers' mental health and life stress experiences during the pandemic, the study provides evidence-based insights that could inform the development of tailored mental health interventions.

Policy Implications and Intervention Strategies

The findings are crucial for planning and implementing effective support systems for healthcare professionals, particularly in anticipation of future pandemics or emergencies. Also, the findings have direct implications for workplace policies and mental health support programs for healthcare workers. By identifying the most significant predictors of mental health and stress, policymakers and healthcare administrators can design more effective programs, such as virtual care and online mindfulness interventions, to support healthcare workers’ mental health.

Highlighting the Importance of Continuous Mental Health Assessment

The study underscores the importance of continuous assessment of healthcare workers’ perceived mental health as a major predictor of stress and overall well-being. This can inform ongoing support and intervention strategies, emphasizing the need for proactive rather than reactive mental health support services. The novelty lies in the ability to predict perceived mental health status from a few respondent characteristics (eg, perceived health, age group, and changes in perceived mental health since the pandemic) without the use of an instrument.

Advocating for the Use of eHealth Tools

Finally, our study advocates for the implementation of eHealth solutions (eg, virtual mindfulness programs) as cost-effective and scalable options for addressing mental health issues among healthcare workers. This recommendation aligns with a broader trend toward digital health solutions and could significantly impact how mental health support is provided in the healthcare sector.

Conclusion

Based on a national survey, we have explained and discussed models to predict perceived mental health and perceived life stress for healthcare workers in Canada. XGBoost and Random Forest were the best models for predicting perceived mental health and perceived life stress, respectively; the models’ performance remained stable when using only the features with importance above 70%. Perceived mental health compared to before the pandemic was the most important predictor of perceived life stress. It is important to implement scalable virtual mental health solutions to address workplace challenges for healthcare workers.

Footnotes

Appendix A

Appendix B

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work was supported by a Mitacs Globalink Research Internship award.

ORCID iD

Christo El Morr

References

1. Cui, Weng. COVID-19 impact on mental health. BMC Med Res Methodol. 2022;22(1):15. doi:10.1186/s12874-021-01411-w

2. Ghio, Acosta, Fisman, Noymer, Stilianakis, Assche SB. Population health and COVID-19 in Canada: a demographic comparative perspective. Can Stud Popul. 2021;48(2-3):131-137. doi:10.1007/s42650-021-00057-9

3. Styra, Hawryluck, Mc Geer, et al. Surviving SARS and living through COVID-19: healthcare worker mental health outcomes and insights for coping. PLoS One. 2021;16(11):e0258893. doi:10.1371/journal.pone.0258893

4. Tardif, Gupta, McNeely, Feeney. Impact of the COVID-19 pandemic on the health workforce in Canada. Healthc Q. 2022;25(1):17-20. doi:10.12927/hcq.2022.26812

5. Turna, Patterson, Goldman Bergmann, et al. Mental health during the first wave of COVID-19 in Canada, the USA, Brazil and Italy. Int J Psychiatry Clin Pract. 2022;26(2):148-156. doi:10.1080/13651501.2021.1956544

6. Daly, Slemon, Richardson, et al. Associations between periods of COVID-19 quarantine and mental health in Canada. Psychiatry Res. 2021;295:113631. doi:10.1016/j.psychres.2020.113631

7. Gupta, Jha, Bansal, et al. COVID 19-related burnout among healthcare workers in India and ECG based predictive machine learning model: insights from the BRUCEE- Li study. Indian Heart J. 2021;73(6):674-681. doi:10.1016/j.ihj.2021.10.002

8. Statistics Canada. Impacts of COVID-19 on Health Care Workers: Infection Prevention and Control (ICHCWIPC). Statistics Canada. Updated February 1, 2021. Accessed February 7, 2024. https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&SDDS=5340

9. Albizu-Garcia, Alegría, Freeman, Vera. Gender and health services use for a mental health problem. Soc Sci Med. 2001;53(7):865-878. doi:10.1016/s0277-9536(00)00380-4

10. Bhavsar, McGuire, MacCabe, Oliver, Fusar-Poli. A systematic review and meta-analysis of mental health service use in people who report psychotic experiences. Early Interv Psychiatry. 2018;12(3):275-285. doi:10.1111/eip.12464

11. Maria, Abigail, Xuesong, Simone, Paul. Trends in objectively measured and perceived mental health and use of mental health services: a population-based study in Ontario, 2002–2014. Can Med Assoc J. 2020;192(13):E329. doi:10.1503/cmaj.190603

12. Jin, Chua, Fones, Lim. Health beliefs and help seeking for depressive and anxiety disorders among urban Singaporean adults. Psychiatr Serv. 2008;59(1):105-108. doi:10.1176/ps.2008.59.1.105

13. Olfson, Marcus, Tedeschi, Wan GJ. Continuity of antidepressant treatment for adults with depression in the United States. Am J Psychiatry. 2006;163(1):101-108. doi:10.1176/appi.ajp.163.1.101

14. Ryan, Marley, Still, Lyons, Hood. Use of mental-health services by Australian medical students: a cross-sectional survey. Australas Psychiatry. 2017;25(4):407-410. doi:10.1177/1039856217715990

15. Zuvekas, Fleishman JA. Self-rated mental health and racial/ethnic disparities in mental health service use. Med Care. 2008;46(9):915-23. doi:10.1097/MLR.0b013e31817919e5

16. Mandrekar JN. Receiver operating characteristic curve in diagnostic test assessment. J Thorac Oncol. 2010;5(9):1315-1316. doi:10.1097/JTO.0b013e3181ec173d

17. Rezapour, Hansen. A machine learning analysis of COVID-19 mental health data. Sci Rep. 2022;12(1):14965. doi:10.1038/s41598-022-19314-1

18. Kumari, Goyal, Elmorr. Planning for a crisis: predicting anxiety in a population during COVID-19 using machine learning. Stud Health Technol Inform. 2023;309:13-17. doi:10.3233/shti230730

19. El Morr. Introduction to Health Informatics: A Canadian Perspective. 2nd ed. Canadian Scholars’ Press; 2023:354.

20. Ahmad, Wang, El Morr. Online mindfulness interventions: a systematic review. In: El Morr, ed. Novel Applications of Virtual Communities in Healthcare Settings. IGI Global; 2018: chapter 1.

21. El Morr, Maule, Ashfaq, Ritvo, Ahmad. A student-centered mental health virtual community needs and features: a focus group study. Stud Health Technol Inform. 2017;234:104-108.

22. Vollestad, Nielsen GH. Mindfulness- and acceptance-based interventions for anxiety disorders: a systematic review and meta-analysis. Br J Clin Psychol. 2012;51(3):239-260. doi:10.1111/j.2044-8260.2011.02024.x

23. Boettcher, Astrom, Pahlsson, Schenstrom, Andersson, Carlbring. Internet-based mindfulness treatment for anxiety disorders: a randomized controlled trial. Behav Ther. 2014;45(2):241-253. doi:10.1016/j.beth.2013.11.003

24. Hofmann, Sawyer, Witt. The effect of mindfulness-based therapy on anxiety and depression: a meta-analytic review. J Consult Clin Psychol. 2010;78(2):169-183. doi:10.1037/a0018555

25. Aikens, Astin, Pelletier, et al. Mindfulness goes to work: impact of an online workplace intervention. J Occup Environ Med. 2014;56(7):721-731. doi:10.1097/JOM.0000000000000209

26. Yogeswaran, El Morr. Effectiveness of online mindfulness interventions on medical students’ mental health: a systematic review. BMC Public Health. 2021;21(1):2293. doi:10.1186/s12889-021-12341-z

27. El Morr. Health care virtual communities: challenges and opportunities. In: Cruz-Cunha, Tavares, Simoes, eds. Handbook of Research on Developments in E-Health and Telemedicine. IGI Global; 2010:278-298, chapter 13.

28. El Morr. Virtual communities, machine learning and IoT: opportunities and challenges in mental health research. Int J Extreme Automation Connectivity Healthc. 2019;1(1):4-11. doi:10.4018/ijeach.2019010102

29. Bou-Hamad, Hoteit, Hijazi, Ayna, Romani, El Morr. Coping with the COVID-19 pandemic: a cross-sectional study to investigate how mental health, lifestyle, and socio-demographic factors shape students’ quality of life. PLoS One. 2023;18(7):e0288358. doi:10.1371/journal.pone.0288358

30. Gurevich, El Hassan, El Morr. Equity within AI systems: what can health leaders expect? Healthc Manage Forum. 2023;36(2):119-124. doi:10.1177/08404704221125368

Predictive Models for Canadian Healthcare Workers Mental Health During COVID-19
