Abstract
University students are experiencing a mental health crisis, and COVID-19 has exacerbated this situation. We surveyed students in 2 universities in Lebanon to gauge their mental health challenges and constructed a machine learning (ML) approach to predict symptoms of depression, anxiety, and stress based on demographics and self-rated health measures. Our approach involved developing 8 ML predictive models: Logistic Regression (LR), multi-layer perceptron (MLP) neural network, support vector machine (SVM), random forest (RF), XGBoost, AdaBoost, Naïve Bayes (NB), and K-Nearest Neighbors (KNN). Following their construction, we compared their respective performances. Our evaluation shows that RF (AUC = 78.27%), NB (AUC = 76.37%), and AdaBoost (AUC = 72.96%) provided the highest AUC scores for depression, anxiety, and stress, respectively. Self-rated health was the top feature in predicting anxiety, followed by age, while age was the top feature in predicting depression and stress, followed by self-rated health. Future work will focus on using data augmentation approaches and extending to multi-class anxiety predictions.
Introduction
The COVID-19 pandemic 1 had a drastic effect on people’s mental health across the globe, 2 mainly due to the impact of quarantine procedures (eg, suspension of many activities and social isolation). The long duration of associated measures had a negative effect on mental health conditions, including depression, anxiety, and stress.3,4 The impact was especially recognized among the young population, including students, and is a cause of concern. 5 Students’ distress impacts their lives and academic achievement, which is associated with societal impact in terms of loss in productivity and economic losses. 6 Several studies uncovered increased levels of anxiety, depression, and stress around the globe.7 -13
A multicountry cross-sectional survey conducted in Pakistan, China, India, Indonesia, Saudi Arabia, Malaysia, and Bangladesh showed 35.6% mild to severe anxiety. 14 Another study in the United States reported that 48.14% of university students had moderate-to-severe depression, 38.48% had moderate-to-severe anxiety, and 71.26% reported that their stress levels had increased during the pandemic. 15 Research on university students from Bangladesh, Egypt, Ethiopia, Lebanon, Turkey, and Brazil reported similar findings related to depression, anxiety, and stress; the prevalence of depression symptoms varied between 21.2% in Ethiopia and 82.4% in Bangladesh, symptoms of anxiety varied between 27.7% in Ethiopia and 87.7% in Bangladesh, and symptoms of stress varied between 12.7% in Lebanon and 57.5% in Brazil.16 -20 A systematic review and meta-analysis found that the prevalence of anxiety, depression, and stress among college students during the COVID-19 pandemic was 29%, 37%, and 23%, respectively. 7
The negative impact of the COVID-19 pandemic on mental health in Lebanon was demonstrated in the tertiary referral hospital population, 21 healthcare workers,22,23 refugees, 24 the general population,25,26 and the young population (18-35 years). 27 A study conducted at the onset of the pandemic in April 2020 reported that 17.9%, 13.8%, and 1.7% of students exhibited mild, moderate, and severe depressive symptoms, respectively; also, mild, moderate, severe, and extremely severe anxiety symptoms were found in 3.3%, 21.9%, 6.3%, and 2.3%, respectively; and 11% of students reported mild stress, while 1.7% reported moderate stress.
To document, understand, and plan for appropriate mental health programming for students after 2 years of the pandemic, we conducted a cross-sectional survey in Lebanon between November 2021 and February 2022 using an online survey among university students. We then developed multiple machine learning models to predict the level of depression, anxiety, and stress among university students; those models would be used by students’ counseling services to plan appropriate interventions. While machine learning models have been used to predict the effectiveness of an intervention on depression28,29 or change of anxiety levels 30 or stress31,32 in different populations, these studies focus on prevention or treatment rather than on predicting existing symptom levels. To our knowledge, this is the first time machine learning models have been used to predict the existing levels of depression, anxiety, and stress among university students based on standard socioeconomic status (SES), lifestyle, and education-related data without access to health-related ones (eg, blood pressure and heart rate). Developing predictive models will enable early detection of symptoms and, hence, early intervention, recognized as essential for mental health and symptom management.33,34 Besides, machine learning models have the ability to recognize the most important factors influencing prediction; such recognition will allow universities to tailor their engagement with students on mental health to these factors through their programming counseling services. Our guiding research questions are (1) what is the state of depression, anxiety, and stress in the Lebanese university student population, and (2) could we build machine learning predictive models for depression, anxiety, and stress based on sociodemographic and lifestyle data?
The remainder of this paper is organized as follows. Section II presents the methodology, from data collection to ML models. Section III presents the results of the ML models and their optimization. Section IV discusses the models’ results and the approach’s limitations, while Section V concludes the paper.
Methods
Data Collection
A cross-sectional survey was conducted in Lebanon using an online survey distributed to undergraduate and graduate students. Our study targeted students from 2 different universities in Lebanon to ensure a comprehensive representation of the student body. The American University of Beirut (AUB), a prestigious private university, and the Lebanese University (LU), Lebanon’s only public university, were chosen. This careful selection was made to cover a wide socioeconomic range. The Lebanese University serves a diverse student body, including students from low-income families and rural areas. The American University of Beirut, on the other hand, primarily serves a more affluent student demographic. The intentional choice of these universities significantly amplifies the study’s capability to capture a wide array of perspectives and socioeconomic factors that could influence mental health levels among university students in Lebanon.
An online survey was disseminated to undergraduate and graduate university students in Lebanon in both Arabic and English languages. The students were provided a detailed study description and a link to the survey through electronic platforms like WhatsApp and email. Two reminders were sent to the participants within a 2-week interval to maximize participation and response rates. The survey started with a consent form that provided the students with relevant information about their rights and responsibilities and guaranteed the confidentiality of their information. On average, the survey took students approximately 15 to 20 min to complete.
The data was collected between November 2021 and February 2022, when the Omicron variant was first identified and started spreading. The study participants were 329 students who were 18 years old or above, enrolled in either Spring 2020 to 2021 or Fall 2021 to 2022 at either the American University of Beirut (a private institution) or the Lebanese University (a public institution).
The participants provided their written informed consent online before completing the survey. Considering the evolving nature of the pandemic and to minimize the spread of the virus, an online convenience sampling strategy was adopted. This sampling approach has been commonly employed in numerous COVID-19-related studies.35 -37
The individuals involved in the study did not receive any financial rewards, and their identities were kept confidential to ensure the reliability and privacy of the collected data. This study was carried out per the Declaration of Helsinki’s guidelines for human subjects research. Ethics approval for the study was obtained from the Institutional Review Board at the American University of Beirut (SBS-2021-0256) and the Research Ethics Board at York University in Canada (Certificate # e2021-327).
Features: Sociodemographic and Lifestyle Practices
The measured sociodemographic factors are age, gender, income, current program, nationality, relationship status, and number of people living in the household. Lifestyle practices include cigarette and shisha smoking, alcohol intake, physical activity, sleeping patterns, internet usage, and overall health. Participants were also asked if they had sought private counseling or therapy from a clinical mental health professional, tried mindfulness meditation, followed COVID-19 preventive measures (wearing masks, handwashing, quarantining, etc.), received the COVID-19 vaccine, and if they had kept up with COVID-19 updates. Finally, participants were asked if they had COVID-19 infection, believed that Coronavirus and vaccination were the subjects of a conspiracy, and if religion is important in their daily lives. The feature list can be found in Table 1.
Table 1. Machine Learning Model’s Features.
Target Outcome Variables
Depression (PHQ-9)
The depression data were collected using the Patient Health Questionnaire (PHQ-9). 38 The PHQ-9 is a brief 9-item questionnaire. Each item is assessed for the prior 2 weeks: 0 = “not at all,” 1 = “several days,” 2 = “more than half the days,” and 3 = “nearly every day,” with a total score ranging from 0 to 27. A score of 0 to 4 indicates minimal depression; 5 to 9 mild depression; 10 to 14 moderate depression; 15 to 19 moderately severe depression; 20 to 27 severe depression. 38 Participants with a score of 10 or above were assigned to the Possible Major Depressive Disorder (MDD) group, while those with a score of 9 or less were assigned to the Non-MDD group. 38 With a sensitivity of 80% and specificity of 92%, a total score of 10 or above indicated the possibility of serious depression.39,40 Additionally, PHQ-9 is a self-rating scale with strong reliability and validity for students.41,42 The Cronbach’s alpha coefficient of the PHQ-9 was .901 in our study.
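The scoring rule described above can be illustrated with a minimal Python sketch. The function name `score_phq9` and the band table are illustrative assumptions, not the study's code; the cut-offs are the ones stated in the text.

```python
from typing import Sequence, Tuple

# Severity bands as described in the text: (inclusive upper bound, label).
PHQ9_BANDS = [(4, "minimal"), (9, "mild"), (14, "moderate"),
              (19, "moderately severe"), (27, "severe")]


def score_phq9(items: Sequence[int]) -> Tuple[int, str, bool]:
    """Return (total score, severity band, possible-MDD flag) for one respondent.

    `items` holds the 9 item responses, each coded 0..3.
    """
    if len(items) != 9 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("PHQ-9 expects 9 item responses, each in 0..3")
    total = sum(items)
    # First band whose upper bound is not exceeded by the total.
    band = next(label for upper, label in PHQ9_BANDS if total <= upper)
    # A total of 10 or above places the respondent in the possible-MDD group.
    return total, band, total >= 10
```

The boolean in the returned tuple corresponds to the binary target (possible MDD vs non-MDD) used for the depression models.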
Anxiety (Beck Anxiety Inventory (BAI))
Anxiety data were collected using the Beck Anxiety Inventory (BAI) questionnaire.43,44 The BAI is a 21-item questionnaire that measures anxiety symptoms. Participants rate each item on a 0 to 3 scale, with 0 indicating “Not at all” and 3 indicating “Severely-It bothered me a lot,” giving a total score between 0 and 63. Minimal anxiety is a score of 0 to 7, mild anxiety 8 to 15, moderate anxiety 16 to 25, and severe anxiety 26 to 63. 45 A score of 16 is considered the clinical cut-off for anxiety. 46 The BAI has demonstrated high internal consistency and acceptable reliability. 47 In our study, the Cronbach’s alpha coefficient of the BAI scale was .944.
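The internal-consistency figures reported for each scale are Cronbach's alpha coefficients. As a hedged illustration of how such a coefficient is obtained from item-level responses, the sketch below applies the standard formula alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and data layout are assumptions made for this example.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores.

    `responses` has one row per respondent and one column per item.
    Assumes at least 2 items, at least 2 respondents, and that the
    total scores are not all identical (otherwise the variance is 0).
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```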
Stress (Perceived Stress Scale (PSS))
Stress data was collected using the 10-item Perceived Stress Scale (PSS) questionnaire. 48 Participants must rate themselves on a 5-point Likert scale from 0 = never to 4 = very often. PSS-10 scores were obtained by reversing the scores on the 4 positive items; the items were 4, 5, 7, and 8. Total scores vary from 0 to 40, with 0 to 13 indicating mild stress, 14 to 26 indicating moderate stress, and 27 to 40 indicating high stress.
High perceived stress was defined as a score of 27 or above. This cut-off point has been used in a previous study. 49 PSS has been proven reliable and valid in various settings and languages.50 -53 The Cronbach’s alpha coefficient of the PSS-10 scale was .846 in this study.
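The PSS-10 scoring, including the reversal of the 4 positively worded items, can be sketched as follows. This is an illustration under the cut-offs given above; the function name `score_pss10` is hypothetical.

```python
from typing import Sequence, Tuple

# 1-based numbers of the positively worded items that are reverse-scored.
REVERSED_ITEMS = (4, 5, 7, 8)


def score_pss10(items: Sequence[int]) -> Tuple[int, str, bool]:
    """Return (total score, stress band, high-stress flag) for one respondent.

    `items` holds the 10 responses in questionnaire order, each coded 0..4.
    """
    if len(items) != 10 or any(v not in range(5) for v in items):
        raise ValueError("PSS-10 expects 10 item responses, each in 0..4")
    # Reverse-score items 4, 5, 7, and 8 (0<->4, 1<->3), then sum.
    total = sum(4 - v if n in REVERSED_ITEMS else v
                for n, v in enumerate(items, start=1))
    if total <= 13:
        band = "mild"
    elif total <= 26:
        band = "moderate"
    else:
        band = "high"
    # High perceived stress is defined as a total of 27 or above.
    return total, band, total >= 27
```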
Machine Learning Algorithms
In this study, 8 ML predictive models, namely Multi-Layer Perceptron (MLP), Logistic Regression (LR), K Nearest Neighbors (KNN), Support Vector Machine (SVM), Random Forest (RF), Ada Boosting (AdaBoost), eXtreme Gradient Boosting (XGBoost), and Naïve Bayes (NB), were developed and compared.
Logistic Regression (LR)
It is a linear method that models the probability of an outcome by expressing the log odds of the event as a linear combination of independent features. It is used in situations where the outcome is binary. 52 Its linearity makes it easier to implement, interpret, and explain than more complex models. Its coefficients are estimated using Maximum Likelihood Estimation (MLE) and are then used to predict the probability of an observation belonging to a class; in our case, a separate binary model is fitted for each of the 3 outcomes (depression, anxiety, and stress).
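In standard notation, the log-odds formulation and the resulting predicted probability can be written as below. The symbols are introduced here for illustration only: x_1, ..., x_m denote the features of one student, beta_0, ..., beta_m the fitted coefficients, and p(x) the probability of the positive class (eg, possible MDD).

```latex
\log\frac{p(\mathbf{x})}{1 - p(\mathbf{x})}
  = \beta_0 + \sum_{j=1}^{m} \beta_j x_j ,
\qquad
p(\mathbf{x})
  = \frac{1}{1 + \exp\!\left(-\beta_0 - \sum_{j=1}^{m} \beta_j x_j\right)}
```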
K Nearest Neighbors (KNN)
It is an algorithm that finds the K training instances closest to a given data instance and assigns the most frequent label among them to that instance. 54 Because KNN relies on distance computations (eg, Euclidean distance) over the training data rather than on an assumed parametric model, it can adapt to various data shapes and complexities.
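A minimal pure-Python sketch of this majority-vote rule with Euclidean distance is given below. It is an illustration only, not the scikit-learn implementation used in the study, and the names `knn_predict` and `euclidean` are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_predict(train: Sequence[Tuple[Point, Hashable]],
                query: Point, k: int = 3) -> Hashable:
    """Predict the label of `query` by majority vote among its k nearest
    training instances. `train` is a sequence of (features, label) pairs."""
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training rows")
    # Sort training instances by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```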
Support Vector Machine (SVM)
It is an algorithm that creates the optimal hyperplane that divides a dataset into 2 or more classes. The optimal hyperplane is at the maximum distance from the classes’ nearest data points. 52 SVM is deemed fit for prediction given its robustness and its use of the kernel trick, which allows it to handle non-linear decision boundaries.
Random Forest (RF)
It is an ensemble technique that builds multiple decision trees and merges them to obtain a more accurate and stable prediction. It selects a random sample with replacement from the dataset and creates a corresponding model. At each split, it selects a random subset of features to do the splitting, making it less likely to overfit. 55 Its ability to model non-linear interactions between the features and the target variable makes it a good fit for this case.
AdaBoost
It is an ensemble learning algorithm that builds a strong learner out of a combination of weak learners, such as decision stumps (ie, decision trees with 1 level). Each successive learner focuses on the training instances that its predecessor misclassified. 54 AdaBoost tends to resist overfitting on small datasets such as ours while providing insights into feature importance.
Extreme Gradient Boosting (XGBoost)
XGBoost uses decision trees as base learners and gradient boosting as a combination method (ie, Newton boosting). XGBoost is more efficient than decision trees and usually provides better prediction accuracy. 56 XGBoost has built-in L1 (Lasso) and L2 (Ridge) regularization, which prevents overfitting, especially when the dataset is small. XGBoost uses the gradient boosting algorithm to iteratively add weak learners, typically trees, to the model, where each tree corrects the errors of its predecessor.
Naïve Bayes
It is an algorithm that computes the conditional probability that a data instance belongs to a class, knowing the class characteristics. The instance would be assigned to a class with the highest conditional probability.54,57 Although it has a high bias, it is advantageous when the dataset is small. It might not capture all the intricacies of the data, but it makes it less prone to overfitting than complex models and can easily handle scenarios with more than 2 classes.
Multi-layer perceptron (MLP)
A multilayer perceptron is a deep learning approach that learns dependencies between the input layer (the features or variables) and the output layer (the classification decision). Between the input and the output layers, there can be 1 or more hidden layers, each with as many neurons as needed. The connections between neurons are weighted, and each neuron applies a nonlinear activation function. The MLP uses the backpropagation algorithm to update the weights so as to minimize the output layer’s error rate.58,59
Table 2 compares the advantages and disadvantages of the above predictive machine learning models.
Table 2. The Advantages and Disadvantages of the Above Predictive Machine Learning Models.
Hyperparameter Tuning
Hyperparameters are set before the training process begins and determine different aspects of it; hyperparameter tuning is the process of optimizing them to improve a machine learning algorithm’s performance. A grid search was performed on all the models to find the best hyperparameters, using the roc_auc value as the scoring metric. Overfitting was tackled using approaches appropriate for each algorithm (eg, regularization and tree depth reduction).
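The idea behind a grid search can be sketched in a few lines of standard-library Python: every combination in the grid is scored, and the best-scoring combination is kept. In the study this role would be played by scikit-learn tooling with roc_auc as the scoring metric; here `grid_search`, `toy_cv_auc`, and the grid values are hypothetical stand-ins, and the toy scoring function merely imitates a cross-validated AUC.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Tuple


def grid_search(param_grid: Dict[str, List[Any]],
                score_fn: Callable[[Dict[str, Any]], float]
                ) -> Tuple[Dict[str, Any], float]:
    """Exhaustively score every hyperparameter combination in `param_grid`
    and return the best combination with its score (higher is better)."""
    names = list(param_grid)
    best_params: Dict[str, Any] = {}
    best_score = float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_cv_auc(params: Dict[str, Any]) -> float:
    """Hypothetical stand-in for a cross-validated AUC: peaks at depth 4
    and rewards more estimators. A real scorer would train and validate."""
    return -abs(params["max_depth"] - 4) + params["n_estimators"] / 1000


# Illustrative grid; the values are assumptions, not the study's settings.
best, score = grid_search({"max_depth": [2, 4, 8],
                           "n_estimators": [50, 100]}, toy_cv_auc)
```

Restricting values such as tree depth in the grid is one way the overfitting controls mentioned above enter the search.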
Data Analysis
The dataset, which consisted of 329 records, was cleaned; missing values were replaced with the mode of the feature. Each ML algorithm was used to predict the students’ symptoms of depression, anxiety, and stress. Google Colaboratory was used for training, optimizing, and testing the ML models.
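Mode imputation, as described above, replaces each missing entry with the most frequent observed value of the same feature. The sketch below shows the idea for a single column in plain Python; representing missing values as `None` and the function name `impute_mode` are assumptions made for the example.

```python
from collections import Counter
from typing import Hashable, List, Optional


def impute_mode(column: List[Optional[Hashable]]) -> List[Hashable]:
    """Replace missing entries (None) with the most frequent observed value."""
    observed = [value for value in column if value is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    # most_common(1) returns the value with the highest count.
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if value is None else value for value in column]
```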
While AUC was the main performance measurement, other measures were computed, including sensitivity, specificity, precision, F-measure, and accuracy. 54 The equations to compute the performance measure are as follows:
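Writing TP, TN, FP, and FN for the counts of true positives, true negatives, false positives, and false negatives, the standard definitions of these measures are: sensitivity = TP / (TP + FN); specificity = TN / (TN + FP); precision = TP / (TP + FP); F-measure = 2 × precision × sensitivity / (precision + sensitivity); and accuracy = (TP + TN) / (TP + TN + FP + FN). AUC can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The following minimal Python sketch computes these quantities; the function names are illustrative and this is not the study's code.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, precision, F-measure, and accuracy for
    binary labels coded 0/1. Assumes both classes occur in `y_true` and
    that at least one positive is predicted (no zero denominators)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / len(pairs)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f_measure": f_measure,
            "accuracy": accuracy}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```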
Results
Implementation Procedure
Implementation is done using Python (3.7.13) and the Scikit-learn library (1.0.2). The CSV file is read into a data frame. For categorical values, imputation is performed by replacing the null values with the most frequent values. Each target variable has a corresponding data frame with 18 predictive features.
The obtained dataset has been split into training and test data using Stratified Shuffle Split, 70% for training and 30% for testing.
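A stratified split keeps the proportion of positive and negative cases approximately equal in the training and test sets. The sketch below is a standard-library stand-in that illustrates the idea for a single 70/30 split; it is not the scikit-learn routine named above, and the function name, the `seed` parameter, and the rounding rule are assumptions.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(labels: Sequence[Hashable], test_size: float = 0.3,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train indices, test indices) such that each class contributes
    roughly `test_size` of its members to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within each class
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)
```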
Sociodemographic Characteristics
Table 3 summarizes the descriptive statistics for the study participants’ characteristics. The participants’ mean (SD) age was 24.99 (7.39) years. The majority of participants were females (63.8%). Students were enrolled in various university programs, with undergraduate students accounting for 43% of the sample. More than three-quarters (77.5%) of participants had a monthly household income of 450 USD or less. Approximately 60% of students considered their overall health good, very good, or excellent. Sixty-four percent of the respondents stated that religion is important in their daily lives. According to 14% of participants, Coronavirus and vaccination were the subjects of a conspiracy. Furthermore, the majority of students (73.6%) followed COVID-19 prevention guidelines, and about a quarter of them were infected with COVID-19. Private counseling was received by more than half of the students (57.4%).
Table 3. Sociodemographic and Other Characteristics of University Students (N = 329).
Mental Health Outcomes
The mental health outcome analysis showed that the mean (SD) score for depression was 10.18 (6.83), anxiety was 18.81 (14.42), and stress was 21.97 (7.30). Figure 1 depicts the study participants’ levels of depression, anxiety, and stress. Mild to moderate depression, anxiety, and stress were reported by the majority of participants (52.3%, 42.9%, and 61.7%, respectively), while severe depression, severe anxiety, and high stress were reported by 24.6%, 29.3%, and 27.6%, respectively. In total, students reported moderate to severe levels of depression, anxiety, and stress at a rate of 75.9%, 72.2%, and 89.3% respectively.

Figure 1. Severity of mental health outcome.
Performance of the ML Models
The comparison between models’ performance rates used in predicting students’ depression, anxiety, and stress is shown in Table 4.
Table 4. Models’ Performance Measurements.
Abbreviations: AUC, area under curve; F1-score, harmonic mean between precision and recall.
Depression
For depression, Random Forest achieved the highest AUC (78.27%), followed by AdaBoost (76.25%), XGBoost (75.55%), Support Vector Machine (74.36%), Logistic regression and Naïve Bayes (74.12% each), MLP (73.90%), and KNN (66.63%).
Anxiety
For anxiety, Naïve Bayes achieved the highest AUC (76.37%), followed by Support Vector Machine (74.94%), AdaBoost and Logistic regression (74.89% each), MLP (72.60%), Random Forest (69.93%), XGBoost (67.67%), and KNN (61.05%).
Stress
For stress, AdaBoost achieved the highest AUC (72.96%), followed by Random Forest (72.42%), Support Vector Machine (72.36%), MLP (70.30%), XGBoost (66.87%), Logistic regression (66.51%), KNN (63.84%), and Naïve Bayes (63.36%).
In addition, we have performed a feature selection using the Random Forest feature importance ranking method (Table 5). For anxiety, self-rated health was the top-ranked feature (100% importance), followed by age (64%); the remaining features were below 30%.
Table 5. Feature Importance for Depression, Anxiety, and Stress.
For depression and stress, age was found to be the most important feature (100% for both), followed by self-rated health (89% for depression and 70% for stress). Sleeping hours during the pandemic and change in physical activity were the third and fourth most important features for depression (36% and 31%, respectively), and physical activity duration ranked fourth for depression at 31% and third for stress (32%). The remaining features were below 30%.
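Importance percentages in which the top feature sits at 100% are consistent with raw importance scores rescaled relative to the largest one. The sketch below shows that rescaling under this assumption; the raw values and feature names are hypothetical and are not taken from Table 5.

```python
from typing import Dict


def relative_importance(raw: Dict[str, float]) -> Dict[str, int]:
    """Rescale raw importance scores so that the top feature is 100%,
    returning features ordered from most to least important."""
    top = max(raw.values())
    ranked = sorted(raw.items(), key=lambda item: item[1], reverse=True)
    return {name: round(100 * value / top) for name, value in ranked}


# Hypothetical raw scores, eg, as produced by a fitted Random Forest.
example = relative_importance({"sleep_hours": 0.072,
                               "age": 0.20,
                               "self_rated_health": 0.178})
```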
Discussion
There is currently a scarcity of studies assessing the mental health of university students in Lebanon. This study aimed at understanding university students’ mental health, specifically depression, anxiety, and stress, during Lebanon’s extended COVID-19 pandemic based on the sociodemographic factors and lifestyle practices associated with it.
An AUC value between 70% and 80% is considered acceptable, while an excellent test would have an AUC value between 80% and 90%. 60 In our study, we aim to have a quasi-diagnostic model; in such cases, AUC is the best measurement for performance.
Random Forest achieved the best AUC at 78.27% for depression, Naïve Bayes at 76.37% for anxiety, and AdaBoost at 72.96% for stress. Because the best-performing algorithm differed across outcomes, no single model could be adopted as the predictor for all 3 outcomes; all 3 best AUC values fall within the acceptable range.
Several studies reported predicting PHQ-9 based on smartphone data, 61 gait abnormality, 62 and surveys, 63 with different levels of success. Compared with the sole study that considered the prediction of PHQ-9 during COVID-19 (AUC = 96%), 63 the AUC of the random forest model in our study is substantially lower (78.27%). The difference could be attributed to the nature of the questionnaire items used in the previous study, which included questions about financial stress, whether the participant lost someone close to them, whether they have a conflict with family and friends, whether they faced any life-threatening events, whether they had any suicidal thoughts, and whether they were physically, emotionally, or sexually abused. Such questions are more directly linked to one’s psychological condition than those in our study.
In the sole published study addressing anxiety within a cohort of 1172 university students in China, the Self-Rating Anxiety Scale 30 was employed for multiclassification using XGBoost. It is worth noting that this study used a distinct measurement scale and a multiclassification approach, in contrast to our binary approach. Furthermore, the study did not report the AUC in its findings.
A previous study used machine learning to predict PSS 64 among 206 students in India before COVID-19 and did not report the AUC. That pre-COVID study reported the highest accuracy for an SVM model (85.71%), which is numerically higher than the AUC of our AdaBoost classifier (72.96%), although accuracy and AUC are different measures; moreover, the researchers did not report the survey questions, which makes it impossible to compare the results of our study with theirs. The comparison is further complicated by the fact that pre- and post-COVID-19 attitudes and experiences differ drastically.
Feature Importance Implications
In relation to anxiety, the most important factor was self-rated health (100%), followed by age (64%). Conversely, age and self-rated health were the most important predictors for depression (100% and 89%, respectively) and stress (100% and 70%, respectively). Exploring the predictive capacity of these features independently or combined with a change in physical activity and sleeping hours on the models’ performance would be interesting in the future as it could lead to a robust predictive model using very few data items. Such a model could become an important tool to enhance universities’ engagement with students on mental health and programming counselling services. Machine learning holds significant potential in addressing mental health issues on university campuses. 65
Limitations of the Study
This study has several limitations. First, given the cross-sectional nature of the study design, the results are subject to confounding biases, such as the participants’ mental health status prior to the COVID-19 pandemic and other life stressors (eg, experiences of violence). Second, there is the possibility of selection bias as participation was voluntary. Third, the study relied on a convenience sample limited to students from 2 universities. While this sampling technique does not necessarily assure that results are generalizable, it can be a valuable tool for determining the likelihood of a potential relationship between the variables.66,67 Lastly, like any research conducted in an unstable environment with insecurity and instability and constantly changing circumstances, predicting and isolating the impact of these life factors is nearly impossible.
Although we had a relatively limited number of respondents in our study, the MLP neural network has the advantage that it can be trained effectively on small datasets and produce favorable performance.68,69 The model we developed showed promising performance in predicting the risk of depression, anxiety, and stress among university students, which can be helpful for university counselors in planning customized, scalable interventions such as e-mental health.
Recommendations
Model’s Performance Recommendations
In terms of the model’s performance, it is recommended that future studies develop an ensemble model that integrates the top-performing models in this study while exploring the possibility of collecting data for a more diverse sample, possibly from a broader range of universities.
Looking at the feature importance analysis and considering the significant role of self-rated health and age in predicting all 3 conditions, we recommend prioritizing these features in the training and tuning phases when developing the ensemble models.
Practical Applications Recommendations
Given the promising performance of machine learning models, there is an opportunity to integrate these models, especially regarding key predictive features, into virtual mental health care systems. This could benefit university counseling services to provide early identification and intervention for at-risk students. Additionally, the study’s findings can be used to improve policy-making at educational institutions in terms of raising awareness about the significant predictors of mental health issues and considering data-driven approaches in policy formulation to have more effective mental health strategies.
On a larger scale and based on our analysis of the study results, we recommend integrating the proposed predictive modeling solution with online mindfulness programs or similar scalable solutions to address the widespread mental health problem among university students.
Conclusion
We have outlined and discussed the initial stages of constructing a framework for forecasting depression, anxiety, and stress levels among university students. Random Forest, Naïve Bayes, and AdaBoost achieved the highest AUC for depression, anxiety, and stress, respectively, each within the acceptable range. Machine learning models, particularly those applied in virtual care, hold great potential for enhancing mental health interventions. Our upcoming research aims to employ data augmentation techniques to improve results and broaden the scope to include multi-class predictions. Scalable solutions, such as online mindfulness,70 -72 are also essential to investigate to alleviate the mental health crisis among university students in Lebanon.
Footnotes
Acknowledgements
We thank the Canadian Lebanese Academic Forum for facilitating the team-building effort.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
