Abstract
Background
This study aimed to use data from the PD-MDCNC database to develop a machine learning (ML) risk prediction model for the early identification of mild cognitive impairment in Parkinson's disease (PD-MCI) within a cohort of patients with Parkinson's disease (PD).
Methods
This study used assessment scales and blood test results from 523 PD patients in the PD-MDCNC database, collected from the Hubei Parkinson's Disease Clinical Research Center, to develop a predictive model for assessing the risk of mild cognitive impairment (MCI) in PD patients. Using these simple assessment scales and blood test data, we trained ten machine learning models to predict PD-MCI. The optimal model was determined through comparison, and its performance was evaluated using an external validation cohort of 139 PD patients from Taihe State Hospital in Shiyan City, Hubei Province.
Results
The area under the receiver operating characteristic curve (AUC) for the ten models ranged from 0.57 to 0.72. The best predictive performance was achieved with the random forest (RF) model (AUC = 0.72). The variable importance ranking indicated that the U3 score was the most important feature in predicting PD-MCI.
Conclusions
This study presents a robust machine learning model for the early detection of MCI in PD patients, which may provide a simple method for the early identification of PD-MCI patients.
Plain language summary
Cognitive impairment is common in people with Parkinson's disease, but it is often difficult to detect at an early stage. Many patients still function well in daily life, and early cognitive changes may be overlooked. In this study, we used computer-based machine learning methods to estimate the risk of mild cognitive impairment in people with Parkinson's disease. We analyzed clinical assessment data and blood test results from a large Parkinson's disease database in China. We compared ten different prediction models and found that the random forest model performed best in identifying patients at higher risk of cognitive impairment. Measures related to motor symptoms played an important role in the prediction. Our findings suggest that machine learning models may help clinicians identify cognitive risk earlier and support timely monitoring and intervention. This approach is intended as a screening tool rather than a diagnostic test and may help improve clinical decision-making in routine practice.
Keywords
Introduction
Parkinson's disease (PD) is the second most common neurodegenerative disorder globally. 1 The non-motor symptoms of PD can seriously affect the prognosis and quality of life of patients, 2 and Parkinson's disease-related cognitive impairment (PD-CI) is the most common and disabling non-motor symptom in PD patients. 3 Parkinson's disease with mild cognitive impairment (PD-MCI) is a cognitive syndrome in PD patients that lies between normal cognitive function (Parkinson's disease with normal cognitive function, PD-NC) and Parkinson's disease with dementia (PDD). Clinically, PD-MCI is primarily characterized by impairments in one or more cognitive domains, including memory, attention, working memory, executive function, language abilities, and visuospatial abilities. 4 PDD refers to PD patients with severe cognitive impairment or dementia, primarily characterized by severe deficits in executive function, visuospatial abilities, and attention. 5 As PD progresses, PD-CI also progresses. 6 The prevalence of PD-MCI is approximately 18.9%-38.2%, while its prevalence among non-dementia patients exceeds 50%.7,8 Within one year of diagnosis, 24.2%-27.8% of PD-MCI patients may revert to PD-NC; however, after 20 years, 83% of PD-MCI patients will progress to PDD. 9 PDD is associated with higher hospitalization rates and costs. As of 2022, nearly $1 trillion in annual global expenses can be attributed to dementia, which affects an estimated 57.4 million people worldwide. 10 It has been reported that the mortality and hospitalization rates of PDD patients are significantly higher than those of PD-NC patients. 11 Currently, there are no disease-modifying treatments or preventive measures available to improve outcomes for PDD patients.
As a result, once PDD patients begin to exhibit clinical symptoms, they not only endure higher morbidity and a severe decline in quality of life, but their primary caregivers also face significant caregiving burdens and financial strain. 12 PD-MCI represents the earliest detectable stage of cognitive impairment in PD patients that may progress to PDD. 13 Therefore, early identification of the risk of PD-MCI in PD patients is beneficial for early cognitive intervention and for slowing the progression to PDD. Numerous studies have reported risk factors for PD-MCI, including gender, age, education level, disease duration, and history of hypertension, diabetes, and hyperlipidemia.14–17 However, because the reported factors are inconsistent across the literature, a systematic study of PD-MCI risk factors is still lacking.
Machine learning (ML), which involves algorithms that iteratively learn from data to identify complex patterns and improve predictive performance, has been increasingly applied to the early prediction of PD-MCI. 18 Therefore, in this study, we used clinical assessment scales from the PD-MDCNC database of the Hubei Parkinson's Disease Research Center, together with blood test data from an affiliated hospital, to develop and compare ten ML-based models for predicting the risk of PD-MCI.
Materials and methods
Dataset
The data used in this study were obtained from the Hubei Parkinson's Disease Clinical Research Center database, which is part of the Parkinson's Disease & Movement Disorders Multicenter Database and Collaborative Network in China (PD-MDCNC). This database contains standardized clinical assessment data from Parkinson's disease patients collected across multiple tertiary hospitals in China. It includes more than 50 assessment scales covering motor and non-motor symptoms, quality of life, and other clinical characteristics of PD patients. The overall data completeness exceeds 90%, and all assessments were conducted by trained clinicians, which supports data quality and reliability. The database website is http://www.pd-mdcnc.com/index.html. Since this database is not open to the public, authorization for data use was obtained from the database administrator prior to data extraction. The study protocol was reviewed and approved by the Institutional Review Board of Xiangyang No.1 People's Hospital and conducted in accordance with the Declaration of Helsinki. The patient data extracted from the database include basic information, medical history, lifestyle history, current medical history, and multiple PD-related assessment scales, as detailed in Appendix Figure 1.
Study population and inclusion/exclusion criteria
Scale assessment data were extracted from 881 patients diagnosed with PD who were recorded in the database from its establishment to December 31, 2023. Corresponding blood test indicators were retrieved from the affiliated hospital's laboratory information system for analysis. The diagnosis of PD was based on the revised Movement Disorder Society (MDS) Parkinson's Disease Diagnostic Criteria, which require the absence of absolute exclusion criteria, the presence of at least two supportive criteria, and no red flags. 19 Detailed diagnostic criteria are provided in Appendices Tables 1–3. All PD diagnoses recorded in the database were made by experienced senior neurologists specializing in PD.
PD scale assessment data were extracted based on the inclusion and exclusion criteria outlined in Table 1. The scale assessment data extracted from the database comprised four parts. (1) Basic information: gender, age, education level, disease duration, and levodopa equivalent daily dose (LEDD) (general information section of the database). (2) Medical history: history of hypertension, diabetes, hyperlipidemia, cardiovascular diseases, etc. (medical history section of the database). (3) Lifestyle history: smoking history and alcohol consumption history (lifestyle assessment within the medical history section of the database), history of falls (first part of the Non-Motor Symptoms Questionnaire, NMSQ), history of constipation (Wexner Constipation Scoring System), etc. (4) Current medical condition: Parkinson's disease severity classification (Hoehn-Yahr scale, H-Y scale), depression severity (Hamilton Depression Scale), anxiety severity (Hamilton Anxiety Scale), presence of hallucinations (first part of the MDS Unified Parkinson's Disease Rating Scale, MDS-UPDRS), freezing of gait (third part of the MDS-UPDRS), olfactory dysfunction (Olfactory Function Assessment Scale), sleep status (REM Sleep Behavior Disorder Screening Questionnaire), motor function status (third part of the MDS-UPDRS, referred to as the U3 score), etc. Blood test data were extracted from the laboratory system of the affiliated hospital, using the patient's name and visit date as indexes to match the blood test results with the patients in the database. The extracted tests included TC, TG, HDL-C, LDL-C, Apo A1, Apo B, UC, and HCY.
PD scale assessment data inclusion and exclusion criteria.
Significant risk factors for PD-MCI identified by univariate analysis
Note: 1) Mann-Whitney U test; 2) Chi-square test.
The MDS Working Group Level I criteria were used to classify PD patients into the PD-MCI group and the PD-NC group. 20 Level I requires impairment on a global cognitive ability scale validated for PD, or impairment on a limited battery of neuropsychological tests (for example, a battery that includes only one test per cognitive domain, or one that assesses fewer than five cognitive domains). The Mini Mental State Examination (MMSE) 21 and the Montreal Cognitive Assessment (MoCA) 22 were used for cognitive testing in this study. The best cut-off points for the MoCA and MMSE scales in MCI screening were taken from the China Guidelines for Cognitive Disorders in Elderly People. For people aged ≤75 years with ≤6 years of education, aged >75 years with ≤5 years of education, aged ≤75 years with >6 years of education, and aged >75 years with >6 years of education, the best MoCA cut-off points were 19.5, 15.5, 24.5 and 24.5, respectively, and the corresponding MMSE cut-off points were 26.5, 22.5, 28.5 and 26.5, respectively. 23 The overall data extraction process is shown in Figure 1.

Data extraction process diagram. Abbreviations: logistic regression, LR; random forest, RF; classification and regression tree, CART; eXtreme Gradient Boosting, XGBoost; k-nearest neighbor algorithm, KNN; support vector machine, SVM; extremely randomized trees, Extra Trees; gradient boosting, GBDT; categorical boosting, CatBoost; light gradient boosting, LightGBM.
Data preprocessing
Before model construction, several preprocessing procedures were carried out to ensure data quality. The proportion of missing data varied among variables. Variables with a missing-value rate of ≥30% and low relevance to the target outcome were removed. For variables with a missing-value rate of <30%, missing entries were imputed using the Multiple Imputation (MI) method. Continuous variables were imputed using predictive mean matching, while categorical variables were imputed using a Random Forest–based approach. 24 No additional data transformation or feature scaling procedures were applied.
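As an illustration of the missingness screen described above, the following minimal sketch drops variables whose missing-value rate reaches the 30% threshold. It is not the authors' code: the table layout (a dictionary of columns with `None` for missing entries), the function names, and the toy values are assumptions made for this example. The additional "low relevance" criterion and the multiple imputation step itself are not reproduced here.

```python
def missing_rate(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)


def drop_sparse_variables(table, threshold=0.30):
    """Keep only variables whose missing-value rate is below the threshold."""
    return {name: vals for name, vals in table.items()
            if missing_rate(vals) < threshold}


# Hypothetical toy table: variable name -> list of values, None = missing.
toy = {
    "u3_score": [32, 41, None, 27, 55, 38, 44, 29, 36, 40],               # 10% missing
    "hcy": [None, None, 12.1, None, 9.8, 14.0, None, 11.2, 10.5, 13.3],  # 40% missing
}

kept = drop_sparse_variables(toy)
print(sorted(kept))  # ['u3_score']
```

Variables that pass this screen would then go on to imputation; in practice that step is usually delegated to an established multiple imputation package rather than written by hand.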
Model establishment
In this study, ten machine learning (ML)–based models were used to predict early PD-MCI. The models were logistic regression (LR), support vector machine (SVM), classification and regression trees (CART), XGBoost (XGB), random forest (RF), k-nearest neighbor (KNN), extremely randomized trees (Extra Trees), gradient boosting (GBDT), light gradient boosting (LightGBM), and categorical boosting (CatBoost). The dataset was divided into a training set and a testing set at an 80:20 ratio, with 80% of the data used for model training and 20% for independent testing. During model training, five-fold cross-validation (k = 5) was performed, and hyperparameters were optimized using grid search.
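The grid search step can be sketched as an exhaustive loop over every hyperparameter combination, keeping the combination with the best score. In the sketch below, `grid_search`, `toy_cv_score`, and the grid values are hypothetical. In a real pipeline the scoring function would return the mean five-fold cross-validated performance of a model trained with the given parameters; here it is a made-up function chosen only so that the example runs on its own.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_cv_score(params):
    """Stand-in for a cross-validated score; peaks at depth 5 and 100 trees
    purely by construction, not because of any real data."""
    depth_penalty = (params["max_depth"] - 5) ** 2
    trees_penalty = ((params["n_estimators"] - 100) / 50) ** 2
    return -(depth_penalty + trees_penalty)


grid = {"max_depth": [3, 5, 7], "n_estimators": [50, 100, 200]}
best_params, best_score = grid_search(grid, toy_cv_score)
print(best_params)  # {'max_depth': 5, 'n_estimators': 100}
```

Because the search is exhaustive, its cost grows with the product of the grid sizes, which is one reason to keep grids small when ten model families are tuned.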
Model prediction efficiency evaluation
For evaluating the predictive performance of different models, TP (true positive) represents cases correctly predicted as PD-MCI, FP (false positive) represents cases incorrectly predicted as PD-MCI, TN (true negative) represents cases correctly predicted as PD-NC, and FN (false negative) represents cases incorrectly predicted as PD-NC. The definitions of these indicators are shown in Appendix Figure 2. Based on the confusion matrix constructed from these values, multiple performance metrics were used to assess model performance, including the area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, F1 score, sensitivity, specificity, average precision (AP), precision–recall (PR) curves, and detection error tradeoff (DET) curves.
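The threshold-based metrics listed above follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the function name and the example counts are invented for illustration and are not results from this study. Note that recall and sensitivity are the same quantity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics; assumes no denominator is zero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # identical to sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up counts for a cohort of 100 patients (60 PD-MCI, 40 PD-NC).
m = classification_metrics(tp=40, fp=10, tn=30, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

AUC, AP, and the PR and DET curves differ from these metrics in that they are computed from the predicted scores across all thresholds rather than from a single confusion matrix.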
In addition, to further validate the robustness of the models, a univariate analysis was first conducted to examine the associations between the candidate variables and the outcome variable (PD-MCI). Variables with a p-value < 0.05 in the univariate analysis were then selected for external validation using data from hospitalized and outpatient PD patients at Taihe State Hospital in Shiyan City.
Feature importance ranking
SHAP (Shapley Additive Explanations), along with visualizations such as summary plots and dependence plots, was used to rank the variables included in the model. SHAP values are computed on the best-performing model and quantify the contribution of each variable to the model's predictions. 25 In terms of feature importance ranking, the larger the SHAP value of a feature and the higher its position in the ranking, the more important the feature is to the model.
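To make the idea behind these values concrete, the sketch below computes exact Shapley values for a tiny three-feature function by averaging each feature's marginal contribution over all feature orderings. This illustrates the underlying definition only: the SHAP library uses much faster model-specific algorithms, and `toy_model`, the input, and the all-zero baseline are assumptions made for this example rather than anything taken from the study.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    when features are switched from baseline to x in every possible order."""
    n = len(x)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            updated = predict(current)
            totals[i] += updated - previous
            previous = updated
    return [t / len(orders) for t in totals]


def toy_model(f):
    """Hypothetical score with an interaction between features 0 and 2."""
    return 2.0 * f[0] + 1.0 * f[1] + 0.5 * f[0] * f[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # [2.25, 1.0, 0.25]
```

The three values sum to the difference between the prediction at `x` and at the baseline (3.5 here), and the interaction term is split equally between the two features involved. Ranking features by the mean absolute value of such contributions across patients gives an importance ordering of the kind shown in a SHAP summary plot.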
External validation
To further evaluate the generalizability of the optimal model, an external validation was conducted using an independent cohort of 139 PD patients from Taihe State Hospital in Shiyan City, Hubei Province. This cohort was completely separate from the dataset used for model development. All predictors were processed using the same procedures applied to the training data, including missing-value imputation and feature preparation. The trained model was directly applied to the external cohort without any retraining or parameter adjustment. Model performance was assessed using the same metrics as in internal validation, including accuracy, precision, recall, F1-score, and AUC. The external validation results demonstrated that the model maintained satisfactory predictive performance, confirming its potential applicability in independent clinical settings.
Statistical analysis
All statistical analyses were performed using Python (version 3.6.8) and R statistical software (version 3.5.0). First, a univariate analysis was performed on all variables to identify potential risk factors for PD-MCI, and variables with statistical significance were included in the predictive model. Patients were then divided into two groups according to the presence or absence of PD-MCI.
For continuous variables, data following a normal distribution were expressed as mean ± standard deviation (
Results
Basic characteristics
This study extracted data from 523 PD patients in the database after applying the inclusion and exclusion criteria. According to the PD-MCI classification results, 301 patients (57.6%) were categorized into the PD-MCI group, while 222 patients (42.4%) were classified into the PD-NC group. Univariate analysis was conducted to identify potential risk factors for PD-MCI. The results showed that age at database entry, gender, educational level, years of education, smoking history, U3 score, and LEDD were significantly associated with PD-MCI (P < 0.05). The significant risk factors are summarized in Table 2, and the complete results, including non-significant variables, are provided in Appendix Table 4.
Tuning characteristics
During model training, the KNN model showed the best performance with 10 nearest neighbors. For the SVM model, a penalty parameter C value of 1 produced optimal results. In the CART model, the model performance was optimized when the maximum tree depth and minimum sample split were set to 5 and 10, respectively. The RF model achieved its best performance with a maximum tree depth of 5 and 100 trees. Similarly, the XGBoost model performed best when the maximum tree depth was 5 and the leaf node splitting threshold was set to 0.1. For the Extra Trees model, optimal results were obtained using 100 estimators. The CatBoost model showed the best performance with a border count of 300 and a learning rate of 0.1. In the GBDT model, setting the maximum tree depth and minimum sample split to 5 resulted in optimal performance. For LightGBM, the best performance was achieved with 5 leaves and a learning rate of 0.1. The detailed hyperparameter settings are summarized in Appendix Table 5.
Performance of developed models
Figures 2–4 summarize the performance of the machine learning models on the external validation set from complementary perspectives. Figure 2 presents the ROC curves, reflecting the overall discriminative ability of each model. Among all models, the RF model achieved the highest AUC (0.72), whereas the KNN model showed comparatively weaker discrimination. Figure 3 displays the precision–recall (PR) curves, which are particularly informative in the context of imbalanced data and screening tasks. All models demonstrated AP values above 0.5, with RF and XGBoost showing relatively stronger performance, indicating a favorable balance between sensitivity and precision for identifying individuals at risk of PD-MCI. Figure 4 illustrates the DET curves. The RF model curve was closest to the lower left corner, suggesting a lower overall prediction error and a more stable balance between false positive and false negative rates compared with the other models. A comprehensive comparison of recall, precision, accuracy, F1 score, and related metrics is provided in Table 3. Although CatBoost achieved the highest sensitivity, this was accompanied by reduced specificity, while XGBoost demonstrated higher specificity at the expense of sensitivity. Overall, the RF model showed the most balanced and stable performance across multiple evaluation metrics, supporting its potential utility as a screening-oriented tool for identifying PD patients at increased risk of mild cognitive impairment in new datasets.
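For readers less familiar with the AUC, it can be read as the probability that a randomly chosen PD-MCI patient receives a higher predicted risk than a randomly chosen PD-NC patient. The sketch below computes it by direct pairwise comparison; the function name, labels, and scores are invented for illustration and are not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks: 1 = PD-MCI, 0 = PD-NC.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889
```

On this reading, an AUC of 0.72 means that roughly 72% of such patient pairs are ordered correctly, while 0.5 corresponds to chance-level ranking.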

Evaluation of the prediction model for PD-MCI: average receiver operating characteristic curves from the ten machine learning-based models. Abbreviations: logistic regression (LR), support vector machine (SVM), classification and regression trees (CART), XGBoost (XGB), random forest (RF), k-nearest neighbor (KNN), extremely randomized trees (Extra Trees), gradient boosting (GBDT), light gradient boosting (LightGBM), and categorical boosting (CatBoost).

Evaluation of the prediction model for PD-MCI: average precision–recall curves, indicating the tradeoff between precision and recall. Abbreviations: logistic regression (LR), support vector machine (SVM), classification and regression trees (CART), XGBoost (XGB), random forest (RF), k-nearest neighbor (KNN), extremely randomized trees (Extra Trees), gradient boosting (GBDT), light gradient boosting (LightGBM), and categorical boosting (CatBoost).

Evaluation of the prediction model for PD-MCI: detection error tradeoff curves, indicating the false positive rate and false negative rate. Abbreviations: logistic regression (LR), support vector machine (SVM), classification and regression trees (CART), XGBoost (XGB), random forest (RF), k-nearest neighbor (KNN), extremely randomized trees (Extra Trees), gradient boosting (GBDT), light gradient boosting (LightGBM), and categorical boosting (CatBoost).
Performance of ten models.
Abbreviations: logistic regression (LR), support vector machine (SVM), classification and regression trees (CART), XGBoost (XGB), random forest (RF), k-nearest neighbor (KNN), extremely randomized trees (Extra Trees), gradient boosting (GBDT), light gradient boosting (LightGBM), and categorical boosting (CatBoost).
Feature importance ranking
For the RF model, the variable importance ranking is illustrated in Figures 5 and 6. The predictors are ranked, in descending order of importance, as follows: U3 score, LEDD, age, education level, years of education, smoking history, and gender. Among these variables, the U3 score contributed the most to the model's prediction of PD-MCI risk.

Variable importance ranking of RF model. Abbreviations: Third part of the MDS-UPDRS scale score (U3), levodopa equivalent daily dose (LEDD).

SHAP summary plot of feature importance.
Discussion
In this study, we developed ten models for predicting MCI status in PD patients using ML algorithms together with scale and blood test data from the Hubei Parkinson's Disease Clinical Research Center. Among the ten models, the RF model not only demonstrated a higher AUC, but also achieved better prediction accuracy and sensitivity for PD-MCI. Studies have shown that PD-MCI is also associated with higher hospitalization rates and healthcare utilization. The mortality and hospitalization rates of PD-MCI patients are significantly higher than those of PD-NC patients. 11 PD-MCI represents the earliest detectable stage of cognitive impairment in PD patients. 13 Therefore, early identification of the risk of PD-MCI in PD patients is particularly important. Most studies have shown that cognitive impairment in PD patients is positively correlated with U3 scores: the higher the U3 score, the worse the patient's cognitive function. 26 Numerous studies have also indicated that the worsening of PD motor symptoms leads to dysfunction of the cortico-striatal network, resulting in impaired frontal/executive and visuospatial functions in PD patients, thereby accelerating the onset of PD-MCI.27–29 Previous meta-analyses have reported that motor symptoms in PD patients are associated with the occurrence of cognitive impairment, and that PD-MCI patients have significantly higher U3 scores than non-PD-MCI patients. PD-MCI is related to more severe motor symptoms in PD patients. 30 A nomogram for predicting the risk of PD-MCI indicated that age, sleep disorders, more severe motor symptoms, and olfactory impairment are independent risk factors for PD-MCI. 31
Currently, most PD-MCI prediction models are based on traditional linear regression approaches. However, linear models have limited ability to capture complex nonlinear relationships, are sensitive to outliers, and perform poorly when handling categorical variables, which restricts their predictive performance in PD-MCI identification. In contrast, ML methods can effectively handle high-dimensional and nonlinear data, are less affected by outliers, and have shown clear advantages in disease risk prediction. In recent years, several studies have applied ML techniques to predict PD-MCI with encouraging results. Altham et al. constructed a PD-MCI prediction model using clinical variables such as hypertension history, motor scores, self-care ability, and coffee intake. Their results showed that the RF model achieved better performance than LR and CART models, with an AUC of 0.656. 32 Chen et al. developed an ML model based on voxel-level features, and reported that XGBoost achieved superior discrimination between PD-NC and PD-MCI patients (AUC = 0.940) compared with RF and CART models. 33 Hou et al. built a PD-MCI risk prediction model using demographic and clinical factors, including age, gender, disease duration, education level, LEDD, Barthel index, and H–Y stage, and found that the SVM model showed the best performance (AUC = 1.00) among the evaluated models. 34 Similarly, Ma et al. applied ML approaches to explore biological differences between PD-MCI and PD-NC patients, reporting that the SVM model achieved an accuracy of 85%, outperforming the RF model. 35 Overall, previous studies indicate that ML models can effectively improve the identification of PD-MCI. Importantly, the use of multiple ML models within the same framework helps reduce the bias associated with single-model validation, allows complementary perspectives on the data, and provides more robust support for cognitive risk assessment.
Cognitive impairment in patients with PD is difficult to recognize clinically and is often underdiagnosed. On the one hand, during the PD-MCI stage, patients usually maintain their daily living abilities or show only mild impairment. Therefore, both patients and their families tend to pay more attention to motor symptoms and may overlook early cognitive decline, which leads to delayed cognitive evaluation and intervention. On the other hand, cognitive impairment in PD is highly heterogeneous. The timing of onset, the affected cognitive domains, and the severity of impairment vary across individuals and disease stages, which increases the risk of misdiagnosis or missed diagnosis in the early stage of the disease. 36 Due to the difficulty of clinically identifying cognitive impairment in PD patients, many previous studies have relied on relatively small sample sizes and mainly used neuroimaging data or single serum biomarkers to predict cognitive decline. These data are often collected through on-site assessments conducted by researchers, which may be influenced by environmental conditions and assessment settings, potentially introducing measurement bias and variability.
In contrast, the present study is based on data from a large-scale, standardized clinical database, which allows for more efficient data acquisition and reduces errors caused by environmental or human factors. This approach enables large-scale cognitive assessment of PD patients and improves the stability and reliability of the predictive models. Therefore, training machine learning algorithms using large-scale database records is essential for improving the prediction of PD-MCI.
In our study, we applied multiple machine learning (ML) models to predict early PD-MCI based on demographic, clinical, and motor-related variables, including age, gender, education level, years of education, smoking history, and U3 score. All variables were obtained from standardized assessments conducted by professional clinicians and recorded in a large-scale database. To our knowledge, this is the first study to predict PD-MCI risk using patient data derived from the Chinese Parkinson's Disease Database. Our findings demonstrate the feasibility and potential value of ML-based approaches for the early identification of PD-MCI.
Among the ten ML models constructed, the RF model showed the most stable overall predictive performance. RF is an ensemble learning algorithm based on decision trees, in which multiple sub-datasets are generated through repeated random sampling. Each sub-dataset is used to construct an individual decision tree, and the final prediction is obtained by aggregating the outputs of all trees through voting or averaging. 37 A key advantage of RF lies in its feature selection strategy: instead of using all available features at each split, a random subset of features is selected, and the optimal feature is chosen from this subset. This mechanism helps reduce overfitting and improves the robustness and generalization ability of the model. 38 In addition, the XGBoost, CatBoost, and LightGBM models also achieved relatively high AUC values, with only minor differences among them. This similarity may be explained by the fact that these models are all based on the GBDT framework, which leads to comparable learning strategies and classification performance. In contrast, although the KNN model performs well in classification and regression tasks when the feature space is clear and the data distribution is uniform, its predictive performance was relatively limited in this study, possibly due to the complexity and heterogeneity of clinical data. 39
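The three ingredients of RF described above (bootstrap resampling, a random feature subset at each split, and majority voting) can be sketched in a few lines. This is a deliberately simplified toy, not the model used in the study: each "tree" is a single-split decision stump, and the function names, data, and parameter values are assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature_ids):
    """Choose the (feature, threshold, left label) with the fewest training
    errors, searching only the given candidate features."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            for left_label in (0, 1):
                preds = [left_label if r[f] <= t else 1 - left_label for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, left_label)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_label = stump
    return left_label if row[f] <= t else 1 - left_label


def fit_forest(rows, labels, n_trees=25, n_sub_features=2, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]           # bootstrap sample
        feats = rng.sample(range(n_features), n_sub_features)  # random feature subset
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Toy data: features 0 and 1 separate the classes, feature 2 is noise.
rows = [[1, 10, 0], [2, 20, 1], [3, 30, 0], [4, 40, 1],
        [6, 60, 0], [7, 70, 1], [8, 80, 0], [9, 90, 1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
print(predict_forest(forest, [1, 10, 0]), predict_forest(forest, [9, 90, 1]))
```

Because each stump sees a different resample and a different feature subset, individual stumps can be wrong while the vote remains stable, which is the variance-reduction effect the paragraph above refers to.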
To ensure reliable model evaluation, stratified five-fold cross-validation was used for internal validation. The dataset was divided into five folds with similar proportions of PD-MCI and PD-NC cases in each fold. In each iteration, four folds were used for training and one fold for testing. This approach allowed each sample to be used for both training and validation, improving data utilization while reducing the risk of overfitting and providing a more stable estimation of model performance.
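A stratified split of this kind can be sketched by shuffling the indices of each class separately and dealing them out to the folds in turn, so that every fold keeps approximately the same class proportions. The function name, seed, and toy class sizes below are assumptions made for illustration and do not reproduce the study's actual partition.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy labels: 30 PD-MCI (1) and 20 PD-NC (0).
labels = [1] * 30 + [0] * 20
folds = stratified_folds(labels)

for test_fold in range(5):
    test_idx = folds[test_fold]
    train_idx = [i for f in range(5) if f != test_fold for i in folds[f]]
    n_pos = sum(labels[i] for i in test_idx)
    print(f"fold {test_fold}: train={len(train_idx)}, test={len(test_idx)}, "
          f"PD-MCI in test={n_pos}")
```

Each sample appears in exactly one test fold and in the training set of the other four iterations, which is the property the paragraph above relies on.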
Regarding feature importance, the U3 score was identified as the most influential variable for predicting PD-MCI. The severity and progression of motor symptoms in PD patients appear to be closely associated with cognitive decline. Previous studies have shown that the cortico-striatal network plays a central role in regulating both motor and cognitive functions in PD. This network includes sensorimotor cortico-striatal connections as well as the prefrontal cortex. Alterations in sensorimotor cortico-striatal connectivity have been linked to motor symptom severity and are reflected in U3 assessment scores. Meanwhile, the prefrontal cortex is critical for working memory maintenance and executive control during complex tasks, and changes in its functional connectivity have been associated with cognitive performance measures such as MoCA scores.26,27 Therefore, as motor symptoms worsen and progress more rapidly, cognitive function may gradually decline, ultimately increasing the risk of PD-MCI. Although the U3 score is a well-known indicator of motor severity, its identification as the most important feature in our model provides quantitative evidence of its contribution to PD-MCI risk, as reflected in the SHAP-based feature importance ranking. We do not claim U3 as a novel predictive factor; rather, the novelty of this study lies in the data-driven integration of U3 with other clinical and laboratory variables in the RF model, which can capture complex interactions and allow individualized prediction of PD-MCI risk beyond what could be achieved using U3 alone. Furthermore, the prominence of U3 is biologically plausible, aligning with cortico-striatal network mechanisms linking motor severity and cognitive function in PD patients. 28
Previous studies have suggested that abnormal brain iron deposition may be involved in the development of cognitive impairment in PD. QSM(Quantitative susceptibility mapping)-based analyses have shown increased magnetic susceptibility in the nigrostriatal system of PD patients, with higher susceptibility values in the caudate nucleus observed in PD-MCI patients compared with PD-NC patients and healthy controls. 40 From a pathological perspective, excessive iron accumulation may promote oxidative stress and facilitate α-synuclein aggregation, leading to neuronal dysfunction in cognitive-related circuits. 41 Furthermore, iron accumulation has also been associated with metabolic dysfunction and other pathological substrates involved in cognitive decline, although spatial dissociations between iron deposition and amyloid or tau pathology have been reported. 42
This study has several limitations. First, this was a retrospective analysis, and although all clinical assessments were performed by experienced PD specialists, a certain degree of subjectivity in scale-based evaluations cannot be completely excluded. Second, while the model achieved the best performance among the evaluated algorithms, the highest AUC reached 0.72, indicating moderate discriminative ability. This level of performance may limit its use as a standalone diagnostic tool; however, it may still be clinically valuable as an auxiliary screening and risk stratification tool for early identification of PD-MCI, particularly in routine clinical settings using easily accessible variables.
In addition, although the model was developed using data from 523 PD patients, which is a relatively large sample for this population, the external validation cohort included only 139 PD patients. Methodological guidelines suggest that external validation of binary prediction models ideally includes at least 200 cases to ensure adequate statistical power, with a positive outcome rate exceeding 30%.39,43 Nevertheless, the external validation cohort in this study was completely independent from the training dataset, minimizing the risk of information leakage. Despite the relatively limited sample size, the model demonstrated stable predictive performance in the external cohort, supporting its potential generalizability. Future multicenter studies with larger external validation cohorts and incorporation of additional biomarkers may further enhance the robustness and clinical applicability of the model.
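The effect of validation cohort size on the precision of an AUC estimate can be sketched with the Hanley and McNeil approximation for the standard error of the AUC. This is an illustration under stated assumptions, not a re-analysis of the study: the 30% event rate is a hypothetical value chosen to match the guideline threshold mentioned above (the actual number of PD-MCI cases in the external cohort is not used here), the AUC of 0.72 is used only as a plug-in value, and the helper names are our own.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def wald_ci(auc, se, z=1.96):
    """Normal-approximation 95% confidence interval, clipped to [0, 1]."""
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


ASSUMED_EVENT_RATE = 0.30  # hypothetical; not the observed rate in the cohort
PLUG_IN_AUC = 0.72         # reported best AUC, used here as a plug-in value


def cohort_precision(n_total, event_rate=ASSUMED_EVENT_RATE, auc=PLUG_IN_AUC):
    """Return (standard error, 95% CI) for a cohort of n_total patients."""
    n_pos = round(n_total * event_rate)
    n_neg = n_total - n_pos
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return se, wald_ci(auc, se)


se_139, ci_139 = cohort_precision(139)  # size of the external cohort
se_200, ci_200 = cohort_precision(200)  # guideline-suggested minimum

print(f"n=139: SE={se_139:.3f}, 95% CI=({ci_139[0]:.2f}, {ci_139[1]:.2f})")
print(f"n=200: SE={se_200:.3f}, 95% CI=({ci_200[0]:.2f}, {ci_200[1]:.2f})")
```

Under these assumptions the interval for 139 patients is wider than the interval for 200 patients, which illustrates why a larger external cohort would give a more precise estimate of discrimination.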
Conclusion
This study used ML methods, PD patient assessment scales from the Hubei Parkinson's Disease Database, and blood data from an affiliated hospital to develop and compare ten ML models for the early prediction of PD-MCI risk in PD-NC patients. Among these models, the RF model showed the best overall predictive performance. The RF model can predict the risk of PD-MCI in PD patients at an early stage, suggesting potential clinical value for screening purposes. Additionally, this study not only provides practical evidence supporting the application of ML-based models in disease risk prediction but also offers a useful reference for clinicians to implement early cognitive interventions in PD patients with an increased risk of PD-MCI.
Supplemental Material
Supplemental material, sj-docx-1-pkn-10.1177_1877718X261424696 for Creation of a machine learning-based model for early identification of mild cognitive impairment risk in Parkinson's disease: A PD research center-based population study by Zhengting Yang, Sufang Liu, Li Wang, Yaoling Duan, Qiu Deng, Qiang Sun, Jing Tian, Puqing Wang and Min Zhou in Journal of Parkinson's Disease
Footnotes
Acknowledgment
This work was supported by the Science and Technology Program of Hubei Province (Grant No. 2023BCB140) and the Natural Science Foundation of Hubei Province (Grant Nos. 2022CFB341, 2023AFD045).
Author contributions
ZhengTing Yang and YaoLing Duan conceived and designed the study; SuFang Liu and Li Wang assisted in extracting data from the database; Qiang Sun contributed to the collection of external validation data; Jing Tian provided technical guidance for the statistical analysis; ZhengTing Yang wrote and edited the manuscript; PuQing Wang and Min Zhou participated in the overall study design and conception and assisted in revising the initial draft of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Science and Technology Program of Hubei Province (grant number 2023BCB140) and the Natural Science Foundation of Hubei Province (grant numbers 2022CFB341 and 2023AFD045).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Supplemental material
Supplemental material for this article is available online.
