
Abstract

Background

Computer vision syndrome (CVS) is a growing occupational health concern linked to extensive digital device use, yet robust predictive models for risk identification remain scarce. This study aimed to evaluate and compare traditional and machine learning-based predictive models for CVS by assessing their accuracy in identifying CVS cases as defined by the CVS-Q^© questionnaire, which served as the gold standard.

Methods

A cross-sectional study was conducted among 90 Portuguese workers regularly exposed to digital display devices. Data were collected via a self-administered questionnaire covering age, sex, refractive errors, daily screen time, and symptoms assessed with the validated CVS-Q^©. Participants were classified as symptomatic (CVS-Q^© score ≥6) or non-symptomatic. Predictive models were developed using logistic regression and four machine learning algorithms: random forest (RF), Gradient Boosting Machine (GBM), eXtreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM), with Synthetic Minority Over-sampling Technique (SMOTE) augmentation and repeated 5-fold cross-validation.

Results

Logistic regression and XGBoost demonstrated the highest accuracy (70.6%) and strong discriminative ability (AUC = 0.773 and 0.758, respectively). All models exhibited high sensitivity (≥72.7%), though specificity varied from 16.7% (random forest) to 50% (SVM, XGBoost). Calibration was acceptable across models (Brier scores 0.21–0.25). Female sex was the only statistically significant predictor in logistic regression (OR = 5.14, p = 0.012), while sex, screen time, and age were consistently the most influential variables in machine learning models. Refractive errors contributed minimally.

Conclusions

Our results indicate that predictive modeling shows promise for CVS risk identification in occupational settings, particularly using demographic and behavioral variables. However, the limited and unrepresentative sample warrants caution, and further research with larger cohorts is needed to confirm these preliminary findings and assess their scalability.

Keywords

Computer vision syndrome, predictive modeling, machine learning, logistic regression, occupational health, digital screen exposure, risk assessment

Introduction

The widespread integration of digital technologies into work, education, and leisure has profoundly transformed modern life. Computers, tablets, and smartphones are now indispensable, but their excessive use has been linked to a range of visual and musculoskeletal problems.^1–3 Among these, computer vision syndrome (CVS), also known as digital eye strain, is one of the most prevalent conditions.^1,4

CVS is characterized by blurred vision, dry eyes, headaches, and musculoskeletal discomfort, which collectively reduce quality of life and affect work and learning efficiency.^1,2,4–6

Epidemiological evidence suggests that CVS affects a large share of digital screen users worldwide, with prevalence estimates ranging from 50%^7,8 to over 90%⁹ in certain populations, including students and healthcare professionals.^10,11 The condition is driven by multiple, interacting risk factors such as uncorrected visual errors, prolonged screen exposure, poor ergonomics, and demographic influences.^12,13 Beyond its ocular symptoms, CVS has been associated with sleep disturbances, circadian rhythm disruption, and psychological distress, further underscoring its public health relevance.^14–17

Artificial intelligence (AI) and its subset, machine learning (ML), are increasingly revolutionizing diverse areas of healthcare. For instance, advanced ML techniques are being applied to hematology and hematopathology, shaping the future of blood disorder management through improved pattern recognition and data analysis.¹⁸ In diagnostic imaging, Porkar et al.¹⁹ demonstrated how ML, specifically self-organizing map neural networks, can significantly enhance the segmentation and diagnosis of cancer zones in MRI images, even in the presence of image noise and variability. In drug discovery, AI methodologies, including deep learning and language models, are accelerating the identification of novel antimicrobial agents, optimizing drug design, and predicting mechanisms of antimicrobial resistance (AMR), thereby addressing urgent global health threats such as AMR.²⁰ Together, these examples underscore the transformative impact of AI across multiple domains in medicine, from diagnostic imaging and disease management to drug development and personalized therapy.

Despite the high prevalence and impact of CVS, most existing research has been limited to cross-sectional studies with descriptive or simple statistical analyses. While these studies provide estimates of prevalence and associations with risk factors, they fall short in predicting which individuals are most at risk.^4,21–23 Currently, there is a lack of validated predictive models that integrate demographic, clinical, and behavioral variables, and only a few studies have compared traditional regression methods with advanced ML techniques in this context.^7,8,24,25

Developing and validating predictive models for CVS has implications that extend beyond the scope of clinical optometry. In occupational health, these tools can guide the implementation of targeted preventive measures such as personalized ergonomic adjustments, scheduled screen breaks, and visual hygiene education.^4,26 From a public health perspective, predictive models can contribute to policymaking by identifying vulnerable subgroups, estimating the burden of disease, and guiding resource allocation.^8,27 Additionally, within educational and remote work environments, especially post-pandemic, automated risk assessment tools may support institutional health policies and promote effective well-being strategies.^28,29

The present study aims to address this gap by developing and evaluating predictive models for CVS risk using the CVS-Q^© questionnaire as the gold standard for case identification. By comparing logistic regression with ML algorithms, including random forest, Gradient Boosting, XGBoost, and Support Vector Machines (SVM), we assessed and contrasted their accuracy in predicting CVS as defined by the CVS-Q^©. The objective was to determine which methods provide the most accurate, interpretable, and generalizable predictions relative to this reference standard. Beyond its methodological contribution, this work has practical implications for occupational health, education, and public health policy, where predictive tools could support early identification, preventive strategies, and resource allocation.

Methods

Study design and population

A cross-sectional exploratory observational study was conducted between January and March 2025 among Portuguese adults who regularly use digital display devices (DDDs), such as computers, tablets, or smartphones in their occupational settings. Eligible participants were aged 18 years or older and reported using DDDs for at least 4 h per workday as part of their occupational activities. This inclusion criterion is consistent with previous CVS studies, which typically define regular use as exposure to screens for at least 3–4 h per day.^2,30 Individuals with less than 4 h of daily screen exposure or incomplete data were excluded from the analysis. The study followed the ethical standards of the Declaration of Helsinki and was approved by the Ethics Committee of the Higher Institute of Education and Sciences of Lisbon (ISEC Lisbon) on 10 October 2024 (approval ID: CE/2024/10/10). All participants received detailed information about the study and provided written informed consent prior to inclusion.

Sample size and participant selection

A total of 100 adults were initially recruited during the period January–March 2025. Of these, 10 participants were excluded (6 due to incomplete questionnaire data and 4 for not meeting the minimum daily screen exposure criterion of 4 h per workday). All remaining eligible adults were included, resulting in a final sample of 90 participants. Recruitment aimed to maximize occupational diversity, including administrative, healthcare, education, and technical staff. A post-hoc statistical power analysis was conducted to assess the adequacy of the sample size for detecting the association between sex and CVS status. A chi-square test of independence applied to the 2 × 2 contingency table (sex × CVS status) yielded a statistically significant result (χ² = 7.83, df = 1, p = 0.005). The effect size was calculated using Cohen's w, defined as:

w = \sqrt{\frac{\chi^{2}}{N}}

resulting in w = 0.295, which corresponds to a moderate effect size according to Cohen's classification.³¹ Based on this effect size, a significance level of α = 0.05, and a total sample size of N = 90, the estimated post-hoc statistical power was 79.92%, calculated using the pwr.chisq.test() function from the pwr package in R.³²
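The effect size and power figures above can be retraced with a few lines of arithmetic. The sketch below is illustrative Python rather than the authors' R code; it relies on the fact that a chi-square statistic with one degree of freedom is the square of a normal variable, so no noncentral chi-square routine is needed. The function names are hypothetical, and rounding w to three decimals before computing power is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def cohens_w(chi2: float, n: int) -> float:
    """Effect size w = sqrt(chi2 / N) for a chi-square test."""
    return sqrt(chi2 / n)

def power_chisq_df1(w: float, n: int, alpha: float = 0.05) -> float:
    """Power of a chi-square test with one degree of freedom.

    The noncentrality parameter is N * w**2; with df = 1 the statistic is
    the square of a normal variable with mean w * sqrt(N).
    """
    z = NormalDist()
    delta = w * sqrt(n)                # square root of the noncentrality parameter
    z_crit = z.inv_cdf(1 - alpha / 2)  # square root of the chi-square critical value
    return z.cdf(delta - z_crit) + z.cdf(-delta - z_crit)

w = cohens_w(7.83, 90)
power = power_chisq_df1(round(w, 3), 90)  # assumption: w rounded to 0.295 first
print(f"w = {w:.3f}, power = {power:.4f}")
```

Under these assumptions the result should land on w ≈ 0.295 and a power close to the reported 79.92%.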

Data collection and variables

Data were collected through a self-administered online questionnaire distributed via digital platforms, such as Google Forms, WhatsApp, Facebook, and email. The questionnaire captured a range of variables:

Sociodemographic data: age (continuous, in years) and sex (binary: male or female);

Occupational factors: average daily screen time during the workday (continuous, in hours) and regular contact lens use (binary: yes or no);

Clinical history: self-reported refractive errors, including myopia, hyperopia, and astigmatism.

Symptom assessment was conducted using the Computer Vision Syndrome Questionnaire (CVS-Q^©), originally validated by Seguí et al.³⁰ and recently translated and culturally adapted into Portuguese (CVS-Q PT^©).³³ The Portuguese version was translated, back-translated, pre-tested, and validated in a sample of 280 workers, showing good internal consistency (α = 0.793), temporal stability (ICC = 0.847), sensitivity of 78.5%, specificity of 70.7%, and a discriminative ability AUC of 0.832. In the validation study, more than 96% of participants reported no difficulties completing the questionnaire, and 84% considered it clear and easy to understand. The CVS-Q^© includes 16 symptoms commonly associated with prolonged digital screen use, such as eye burning, dry eyes, blurred vision, headaches, and difficulty focusing, among others. Each symptom is rated along two dimensions: frequency (never, occasionally, or frequently/always) and intensity (moderate or severe). If a symptom is reported as “never,” it automatically receives a score of zero for intensity. The final score is calculated by multiplying frequency by intensity for each symptom and then summing the values across all items. Consistent with the original and Portuguese validation studies, a total score of ≥6 was used to classify participants as symptomatic, indicating clinically relevant CVS.
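The scoring rule described above can be sketched as follows. This is a minimal illustration, not the official scoring sheet: the numeric codes (0/1/2 for frequency, 1/2 for intensity) are assumptions, since the text names only the response labels, and any further recoding applied by the published instrument is not described in this section and is therefore not modelled. The function names are hypothetical.

```python
FREQUENCY = {"never": 0, "occasionally": 1, "frequently/always": 2}  # assumed codes
INTENSITY = {"moderate": 1, "severe": 2}                              # assumed codes
CUTOFF = 6  # threshold stated in the text

def cvs_q_score(responses):
    """Sum frequency x intensity over all symptoms.

    `responses` holds one (frequency, intensity) label pair per symptom;
    intensity is scored as zero whenever frequency is "never".
    """
    total = 0
    for frequency, intensity in responses:
        f = FREQUENCY[frequency]
        i = INTENSITY[intensity] if f > 0 else 0
        total += f * i
    return total

def is_symptomatic(responses):
    """Apply the >= 6 cut-off used to define a CVS case."""
    return cvs_q_score(responses) >= CUTOFF

# Hypothetical respondent: one frequent severe symptom, two occasional
# moderate symptoms, and 13 symptoms never experienced.
example = (
    [("frequently/always", "severe")]
    + [("occasionally", "moderate")] * 2
    + [("never", None)] * 13
)
print(cvs_q_score(example), is_symptomatic(example))
```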

Statistical analysis

Data analysis was conducted using R software (version 4.3.1). Descriptive statistics were used to characterize the sample. Group comparisons (with and without CVS) were performed using Student's t-tests for continuous variables and chi-square tests for categorical variables, with a significance level of p < 0.05.

To evaluate the predictive capacity of the variable set, a binary logistic regression model was fitted. This model included age, sex, refractive errors, and daily screen exposure time as predictors. Subsequently, ML algorithms (random forest, Gradient Boosting Machine (GBM), XGBoost, and SVM) were trained and evaluated to compare their performance with the traditional model.

All models were trained on 80% of the data and validated on the remaining 20%. To address class imbalance (presence/absence of CVS), the Synthetic Minority Over-sampling Technique (SMOTE) was applied, and 5-fold cross-validation repeated three times was used. Models were evaluated on accuracy (overall correct classification rate), sensitivity (true positive rate), specificity (true negative rate), area under the receiver operating characteristic curve (AUC), calibration (calibration curves and Brier score), and variable importance.
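To make the resampling scheme concrete, the sketch below combines a bare-bones SMOTE with repeated stratified 5-fold splitting in plain Python. It is an illustration under stated assumptions, not the authors' R pipeline: the text does not say whether oversampling was performed inside each training fold or whether folds were stratified, so both choices here are assumptions, as are all function names and the neighbour count k = 5. Oversampling only the training portion keeps synthetic points out of the held-out fold.

```python
import random

def smote(minority, n_new, k=5, rng=random):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority neighbours (needs >= 2 samples)."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic

def stratified_folds(labels, k, rng):
    """Assign every index to one of k folds, class by class."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def repeated_cv_with_smote(X, y, k=5, repeats=3, seed=0):
    """Yield (X_train, y_train, test_indices); only training data is oversampled."""
    rng = random.Random(seed)
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            X_tr = [X[i] for i in range(len(X)) if i not in held_out]
            y_tr = [y[i] for i in range(len(y)) if i not in held_out]
            counts = {c: y_tr.count(c) for c in set(y_tr)}
            minority = min(counts, key=counts.get)
            deficit = max(counts.values()) - counts[minority]
            pool = [x for x, c in zip(X_tr, y_tr) if c == minority]
            X_tr += smote(pool, deficit, rng=rng)
            y_tr += [minority] * deficit
            yield X_tr, y_tr, test_idx

# Toy data (illustrative only): (age, daily screen hours), 1 = symptomatic.
toy_rng = random.Random(1)
X = [(toy_rng.uniform(20, 60), toy_rng.uniform(4, 12)) for _ in range(30)]
y = [1] * 20 + [0] * 10
splits = list(repeated_cv_with_smote(X, y))
```

Each of the resulting 15 splits would then be passed to a model-fitting routine, with performance averaged over the held-out folds.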

Logistic regression, random forest, GBM, eXtreme Gradient Boosting (XGBoost), and SVM were selected as widely used, well-established approaches for binary classification in health research. Other, more complex algorithms were not applied due to the relatively small sample size.

Results

Sample characteristics

A total of 90 valid records were analyzed after excluding entries with missing values in the variables selected for the predictive models. The sample included 59 participants who were classified as symptomatic for CVS, while 31 were non-symptomatic.

The mean age in the CVS group was 41.5 ± 10.2 years, compared to 40.4 ± 11.7 years in the non-CVS group, with no statistically significant differences (t = –0.42; p = 0.673). However, a significantly higher proportion of women was observed in the CVS group (84.7%) compared to the non-CVS group (58.1%) (χ² = 6.46, p = 0.011).
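The chi-square value reported here for sex (6.46) differs from the one used in the power analysis (7.83). One plausible reading, offered as an assumption rather than something the article states, is that the two figures are the same 2 × 2 test with and without Yates' continuity correction. The sketch below checks this using cell counts inferred from the reported percentages (50 of 59 symptomatic and 18 of 31 non-symptomatic participants being women); these counts are reconstructions, not source data, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def chi2_2x2(a, b, c, d, yates=False):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], optionally
    with Yates' continuity correction; returns (statistic, p_value)."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = 2 * (1 - NormalDist().cdf(sqrt(stat)))  # exact for 1 degree of freedom
    return stat, p

# Inferred counts: CVS group 50 women / 9 men, non-CVS group 18 women / 13 men.
plain = chi2_2x2(50, 9, 18, 13)
corrected = chi2_2x2(50, 9, 18, 13, yates=True)
print(f"uncorrected: chi2 = {plain[0]:.2f}, p = {plain[1]:.3f}")
print(f"Yates:       chi2 = {corrected[0]:.2f}, p = {corrected[1]:.3f}")
```

With these inferred counts, the uncorrected statistic comes out near 7.83 and the corrected one near 6.46, which matches both reported values.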

Regarding refractive errors, myopia was reported by 51% of participants with CVS and 68% of those without, though the difference was not statistically significant (χ² = 1.72; p = 0.189). Hyperopia was more frequent in the CVS group (31%) than in the group without symptoms (13%), although this difference was not statistically significant (χ² = 2.52; p = 0.112). Astigmatism was observed in 66% of the CVS group and 71% of the non-CVS group, with no significant differences (χ² = 0.05; p = 0.817). Finally, participants with CVS reported longer daily screen exposure time (6.5 ± 2.9 h) compared to the group without symptoms (5.5 ± 2.9 h), although this difference also did not reach statistical significance (t = –1.73; p = 0.089).

Logistic regression model

A binary logistic regression model was applied to predict the presence of CVS based on age, sex, refractive errors (myopia, hyperopia, astigmatism), and daily screen time. The model demonstrated an acceptable overall fit (AIC = 94.4; residual deviance = 80.4, df = 66), with an AUC of 0.77, indicating moderate discriminative performance (Figure 1). The logistic regression model was trained on a subset of 73 participants (80%) and validated on an independent test set of 17 participants (20%).

Figure 1.

ROC curve of the logistic regression model.

Among the predictors included in the logistic regression model, sex emerged as the only statistically significant predictor (OR = 5.14, 95% CI: 1.42–18.56, p = 0.012), indicating that female participants had approximately five times higher odds of reporting CVS symptoms compared to males. The other predictors were not statistically significant but were retained in the model due to their theoretical relevance in the context of CVS risk. The confusion matrix revealed an overall accuracy of 64.7%, sensitivity of 33.3%, and specificity of 81.8%.
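As a quick consistency check, the odds ratio and its confidence interval can be converted back to the logistic regression coefficient and its p-value. The sketch below assumes a Wald-type interval (symmetric on the log-odds scale), which the article does not state explicitly; the function name is hypothetical.

```python
from math import log
from statistics import NormalDist

def wald_from_or(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover the log-odds coefficient, its standard error, the Wald z
    statistic and the two-sided p-value from an odds ratio and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    beta = log(odds_ratio)
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return beta, se, z, p

beta, se, z, p = wald_from_or(5.14, 1.42, 18.56)
print(f"beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under the Wald assumption the recovered p-value should sit close to the reported 0.012, and the midpoint of the log-scale interval should sit close to ln(5.14).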

Random forest model

A random forest classification model comprising 500 decision trees was implemented to evaluate the performance of an ML approach compared to traditional logistic regression. The model achieved an AUC of 0.71 (Figure 2), indicating moderate discriminatory performance.

Figure 2.

ROC curve of the random forest model.

The confusion matrix revealed an accuracy of 70.6%, with a sensitivity of 16.7%, specificity of 100%, and balanced accuracy of 58.3%. The model correctly classified all negative cases (specificity of 100%) but performed poorly in identifying positive cases (low sensitivity). The positive predictive value was 100%, while the negative predictive value was 68.8%. The Cohen's Kappa coefficient was 0.21, suggesting slight agreement beyond chance. The analysis of variable importance (Figure 3) indicated that sex, screen time, and hyperopia were the most influential predictors, followed by age, myopia, and astigmatism. These findings were consistent across the Mean Decrease in Accuracy and Gini indices.

Figure 3.

Variable importance plot from the random forest model.
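The random forest classification metrics reported above, including Cohen's Kappa and the predictive values, all follow from four confusion-matrix cells. The sketch below recomputes them from one set of counts that is consistent with the reported percentages for a 17-case test set (1 true positive, 5 false negatives, 0 false positives, 11 true negatives). These counts are inferred for illustration and do not appear in this section of the article; the function name is hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Summary metrics for a 2x2 confusion matrix, including Cohen's kappa."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }

# Inferred counts (an assumption, see the note above).
rf = binary_metrics(tp=1, fn=5, fp=0, tn=11)
for name, value in rf.items():
    print(f"{name}: {value:.3f}")
```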

Other models

The comparative performance of the predictive models is illustrated in the joint ROC plot (Figure 4). In terms of classification metrics, both GBM and XGBoost achieved an overall accuracy of 58.8%. Sensitivity values were 72.7% for GBM and 81.8% for XGBoost, while specificity values were 33.3% and 16.7%, respectively. The SVM model demonstrated a more balanced performance, with a sensitivity of 63.6%, specificity of 50.0%, and an accuracy of 58.8%.

The differences in AUC between the logistic regression model and the other approaches were not statistically significant, as assessed by DeLong's test: random forest (Z = 0.33, p = 0.739, 95% CI [–0.30, 0.42]), GBM (Z = 0.74, p = 0.458, 95% CI [–0.17, 0.39]), XGBoost (Z = 0.55, p = 0.580, 95% CI [–0.23, 0.41]), and SVM (Z = 0.75, p = 0.456, 95% CI [–0.12, 0.27]).

In terms of predictor importance, both GBM and XGBoost models identified age as the most influential variable, followed by daily screen time and sex, whereas refractive errors played a negligible role in model performance. A similar pattern was observed in the SVM model, which also assigned highest importance to age, screen time, and sex. Overall, across all ML models, demographic factors (particularly age and sex) and screen exposure time consistently emerged as the primary contributors to CVS risk in this cohort, while refractive errors showed limited predictive value.

Figure 4.

Comparative ROC curves.
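For readers less familiar with the AUC values compared above, the statistic can be read as the share of (symptomatic, non-symptomatic) pairs in which the symptomatic participant receives the higher predicted probability. The sketch below computes it that way on made-up scores; the data and the function name are purely illustrative and are not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Illustrative predicted probabilities (not study data).
symptomatic = [0.9, 0.8, 0.6, 0.55]
non_symptomatic = [0.7, 0.4, 0.3]
print(round(auc(symptomatic, non_symptomatic), 3))
```

On a small hold-out set the number of such pairs is limited, so each pair shifts the AUC noticeably, which is one reason differences between models can be hard to establish statistically.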

Calibration assessment was performed for all models using calibration plots and Brier scores (Figure 5). All models showed reasonably good calibration, with Brier scores ranging from 0.21 (logistic regression) to 0.25 (XGBoost), indicating similar average agreement between predicted probabilities and observed outcomes. Calibration curves further demonstrated that none of the models systematically overestimated or underestimated the risk of CVS across the spectrum of predicted probabilities, despite differences in classification metrics such as sensitivity and specificity. This suggests that the discriminative and probabilistic performance of the ML models was broadly comparable to that of logistic regression in this dataset.

Figure 5.

Calibration curves for all predictive models (logistic regression, random forest, GBM, SVM, and XGBoost). The dashed diagonal line represents perfect calibration. Brier scores for each model are displayed in the figure.
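The Brier score used for the calibration assessment is the mean squared difference between the predicted probability and the observed outcome (0 or 1), so lower values are better. The sketch below shows the calculation on made-up outcomes; the data and function name are illustrative only. A useful reference point when reading the reported range is that a constant prediction of 0.5 for every participant always yields a score of 0.25.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)

# Illustrative outcomes (not study data): 1 = symptomatic, 0 = non-symptomatic.
outcomes = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]

uninformative = [0.5] * len(outcomes)              # same probability for everyone
confident = [0.8 if y else 0.2 for y in outcomes]  # well-separated predictions

print(brier_score(uninformative, outcomes))
print(brier_score(confident, outcomes))
```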

Cross-validated performance and comparative model metrics

To improve model reliability and address the class imbalance in the dataset, all predictive models (logistic regression, random forest, GBM, XGBoost, and SVM) were retrained using SMOTE-augmented data and evaluated through repeated 5-fold cross-validation (3 repetitions). The final models were then tested on an independent hold-out set (20% of the sample).

Table 1 summarizes the performance metrics of each model, including accuracy, sensitivity, specificity, and AUC. Logistic regression and XGBoost achieved the highest accuracy (0.706) and strong AUC values (0.773 and 0.758, respectively), while GBM showed the weakest overall performance (AUC = 0.682). Most models exhibited high sensitivity, but specificity varied considerably, particularly for the random forest model (16.7%). These results highlight the inherent trade-offs between sensitivity and specificity depending on the predictive approach selected.

Table 1.

Performance metrics for each predictive model, including accuracy, sensitivity, specificity, and AUC.

Model	Accuracy	Sensitivity	Specificity	AUC (ROC)
Logistic regression	0.706	0.818	0.500	0.773
Random forest	0.588	0.818	0.167	0.742
GBM	0.647	0.818	0.333	0.682
XGBoost	0.706	0.818	0.500	0.758
SVM (radial kernel)	0.647	0.727	0.500	0.758

Confusion matrices and classification patterns

To complement the performance metrics previously reported, the confusion matrices for each predictive model are presented below. These matrices detail the number of true positives, false positives, true negatives, and false negatives obtained on the test set (n = 17), offering additional insight into how each model performed in classifying symptomatic and asymptomatic individuals.

Logistic regression correctly identified 12 out of 13 symptomatic individuals but misclassified all asymptomatic participants, resulting in very high sensitivity but no specificity. The random forest and XGBoost models also demonstrated strong sensitivity with 11 true positives each, although their specificity remained limited, correctly classifying only one non-symptomatic case. In contrast, the GBM and SVM models provided a more balanced classification pattern, each achieving 10 true positives and two true negatives.

The confusion matrix results for each model were as follows. The logistic regression model identified 12 true positives, 4 false positives, 0 true negatives, and 1 false negative. The random forest model yielded 11 true positives, 3 false positives, 1 true negative, and 2 false negatives. The GBM model produced 10 true positives, 2 false positives, 2 true negatives, and 3 false negatives. The XGBoost model also resulted in 11 true positives, 3 false positives, 1 true negative, and 2 false negatives. Lastly, the SVM model showed 10 true positives, 2 false positives, 2 true negatives, and 3 false negatives.
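The counts listed above can be converted back into summary metrics, which offers a simple way to cross-check them against Table 1. The sketch below does this in illustrative Python (the study itself used R); the counts are copied from the text, while the helper name `summarize` is hypothetical.

```python
# (true positives, false positives, true negatives, false negatives) per model,
# as listed in the text.
counts = {
    "Logistic regression": (12, 4, 0, 1),
    "Random forest": (11, 3, 1, 2),
    "GBM": (10, 2, 2, 3),
    "XGBoost": (11, 3, 1, 2),
    "SVM": (10, 2, 2, 3),
}

def summarize(tp, fp, tn, fn):
    """Test-set size, accuracy, sensitivity and specificity from four counts."""
    n = tp + fp + tn + fn
    return {
        "n": n,
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

summary = {model: summarize(*cells) for model, cells in counts.items()}
for model, m in summary.items():
    print(f"{model}: n={m['n']}, acc={m['accuracy']:.3f}, "
          f"sens={m['sensitivity']:.3f}, spec={m['specificity']:.3f}")
```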

Figure 6 displays the normalized confusion matrices for all predictive models.

Figure 6.

Normalized confusion matrices for all predictive models (logistic regression, random forest, GBM, XGBoost, and SVM). Each heatmap represents the proportion of predictions relative to the actual class, with rows corresponding to true labels and columns to predicted labels. Darker cells indicate higher classification proportions within each row.

Discussion

This study highlights the value of predictive modeling for CVS by integrating accessible demographic, clinical, and behavioral variables in a real-world occupational sample. The findings align with, and also extend, existing literature on CVS prevalence and risk assessment. A key strength of this work lies in the application and comparison of multiple ML algorithms and logistic regression with calibration analysis and cross-validation, offering a robust foundation for the development of technology-driven screening tools.

Consistent with prior research,^34,35 our findings revealed a significantly higher prevalence of CVS among women. This trend may stem from shared behavioral patterns, such as longer screen exposure or less frequent breaks, as well as physiological factors like hormonal fluctuations that influence tear film stability.³⁶ However, in contrast to Jalali et al.,³⁷ who reported that sex did not remain a significant factor after controlling for confounders, our model identified female sex as the only statistically significant predictor of CVS. This discrepancy may reflect contextual factors unique to our occupational sample, such as job roles, screen ergonomics, or health-seeking behavior, which could amplify gender-based vulnerability.

Screen exposure time emerged as another relevant, though not statistically significant, predictor in our models. On average, participants with CVS reported approximately one additional hour of daily screen time compared to those without symptoms. This trend aligns with the findings of Kahal et al.³⁸ and Jalali et al.,³⁷ both of whom identified excessive screen use as a key risk factor for CVS. Similarly, Fernández-Villacorta et al.¹³ and Behrens et al.³⁹ reported significant associations when daily screen use exceeded 7 h, particularly in remote work settings. In our sample, although screen time was longer among symptomatic individuals, the average duration remained below the high-risk threshold reported in those studies. This may partly explain the lack of statistical significance in the logistic regression analysis. However, in the random forest model, screen exposure time was ranked among the most influential predictors. This suggests that ML approaches may capture complex, non-linear interactions or synergistic effects between screen time and other factors, such as ergonomics, sex, or job demands, that are not readily detected by traditional statistical methods. These findings underscore the added value of ML in uncovering subtle but meaningful patterns in behavioral health data.

Unlike some previous studies, refractive errors (myopia, hyperopia, astigmatism) did not significantly contribute to the predictive performance of our models. Although hyperopia was slightly more frequent among participants with CVS, this difference was not statistically significant, nor did refractive errors rank highly in variable importance within the ML models. This contrasts with the findings of AlGhamdi et al.⁴⁰ and Coronel-Ocampos et al.,⁴¹ who reported associations between CVS and the use of glasses, particularly in individuals with uncorrected or inadequately corrected vision. This may be explained by adequate optical correction in our sample or by the greater relevance of behavioral and demographic factors for CVS risk in this population. Notably, our comparative evaluation of model performance also revealed that age, while not significant in logistic regression, was identified as a more important predictor in several ML approaches. This aligns with Moore et al.³⁶ and Fernández-Villacorta et al.,¹³ who found higher CVS prevalence in older age groups, while Sengo et al.³⁴ reported higher risk among younger individuals. These differences suggest that nonlinear relationships or interactions between predictors and outcomes may exist, which are better captured by advanced modeling techniques. Future research should explore additional clinical or biometric variables, such as those suggested by Jha et al.,⁴² to further refine CVS risk prediction.

This study presents several strengths, including the use of accessible sociodemographic, behavioral, and clinical variables, which enhances the scalability and applicability of our models in occupational health. By comparing multiple predictive modeling techniques, logistic regression and four ML algorithms (random forest, GBM, XGBoost, and SVM), we provide a broad evaluation of model performance for CVS risk prediction. The use of repeated cross-validation and SMOTE balancing addressed limitations related to sample size and class imbalance, improving internal validity. Calibration was assessed using Brier scores and calibration curves, offering insight into the reliability of probabilistic predictions.

However, certain limitations should be acknowledged. The small sample size and use of convenience sampling may affect the generalizability of our findings. Relevant predictors such as screen ergonomics or environmental factors were not included, and CVS classification relied on self-reported symptoms, introducing potential response bias. In addition, our models have not yet been externally validated in independent occupational cohorts, which is essential to confirm their utility in other settings.

Future research should focus on validating these predictive models in larger and more diverse populations and consider incorporating additional biometric and environmental variables to improve predictive accuracy. Longitudinal studies are recommended to establish causal relationships and assess model performance over time. Ultimately, integration of such predictive tools into digital health platforms or workplace wellness programs could support real-time risk assessment and early detection of CVS, contributing to more effective prevention in increasingly digital work environments.

Conclusions

This exploratory study provides initial evidence that predictive modeling, using accessible sociodemographic, behavioral, and clinical variables, may help identify individuals at risk of CVS in occupational settings. In our sample, demographic and behavioral factors, particularly sex, age, and daily screen time, consistently emerged as stronger predictors of CVS risk than clinical variables such as refractive errors. These findings align with previous research suggesting the multifactorial nature of CVS, but our results must be interpreted with caution.

Several limitations should be acknowledged, including the modest sample size, the lack of external validation, and the potential for selection bias inherent in convenience sampling. As such, the predictive performance and generalizability of the developed models remain to be confirmed. Our study is intended as a proof of concept and highlights both the opportunities and the current methodological challenges in applying ML approaches to occupational health data.

Future research should focus on validating these preliminary findings in larger, more representative, and independent cohorts. Incorporating additional variables, such as detailed ergonomic, environmental, or biometric data, may further enhance model performance. Ultimately, rigorous external validation is necessary before predictive modeling approaches can be recommended as scalable decision-support tools for occupational health professionals, employers, or policymakers in the context of CVS risk management.

Footnotes

ORCID iD

Clara Martinez-Perez

Ethics approval and consent to participate

The study was conducted in accordance with the ethical standards set forth in the Declaration of Helsinki and was approved by the Ethics Committee of the Higher Institute of Education and Sciences of Lisbon (ISEC Lisbon) on 10 October 2024 (approval ID: CE/2024/10/10). All participants received detailed information about the study and provided written informed consent prior to their inclusion.

Author contributions

Ana Paula Oliveira: conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing—original draft, writing—review & editing, visualization, supervision, project administration. Clara Martinez-Perez: conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing—original draft, writing—review & editing, visualization, supervision, project administration. Magda Isabel Sebinha: conceptualization, data curation, investigation, validation, writing—review & editing.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Availability of data and materials

All data generated or analyzed during this study are included in this published article and its supplemental information files.


Exploratory machine learning models for computer vision syndrome in occupational health
