A Study on Machine Learning Models in Detecting Cognitive Impairments in Alzheimer’s Patients Using Cerebrospinal Fluid Biomarkers

Abstract

Several research studies have demonstrated the potential use of cerebrospinal fluid biomarkers, such as amyloid beta 1-42, T-tau, and P-tau, in the early diagnosis of Alzheimer’s disease stages. The levels of these biomarkers, in conjunction with dementia rating scores, are used to empirically differentiate dementia patients from normal controls. In this work, we evaluated the performance of standard machine learning classifiers using cerebrospinal fluid biomarker levels as the features to differentiate dementia patients from normal controls. We employed various types of machine learning models, including Discriminant, Logistic Regression, Tree, K-Nearest Neighbor, Support Vector Machine, and Naïve Bayes classifiers. The results demonstrate that these models can distinguish cognitively impaired subjects from normal controls with an accuracy ranging from 64% to 69% and an area under the curve of the receiver operating characteristics between 0.64 and 0.73. In addition, we found that the levels of 2 biomarkers, amyloid beta 1-42 and T-tau, provide a modest improvement in accuracy when distinguishing dementia patients from healthy controls.

Keywords

machine learning models; Alzheimer’s disease; cerebrospinal fluid; amyloid beta 1-42; P-tau; T-tau; cognitive dementia ratings; Mini-mental state score

Introduction

Alzheimer’s disease (AD) is 1 of the most prevalent neurodegenerative diseases and is characterized by progressive cognitive decline. This cognitive deterioration also poses a significant burden on caregivers. Although Alzheimer’s disease affects over 55 million people globally,¹ there remains no reliable and straightforward method for diagnosing the condition. Mild impairment of episodic memory is typically 1 of the first indications of early-stage AD. Although these patients may meet the requirements for a mild cognitive impairment (MCI) diagnosis, because their daily life activities are unaffected and their global cognitive functioning is intact, they are not yet considered demented^2,3 and are not detected in the early stages.

Recent research on cerebrospinal fluid (CSF) biomarkers has revealed the potential to detect AD at its early stages.^4,5 Initial research revealed that T-tau and amyloid beta 1-42 (Aβ_1-42) together had a good predictive potential to detect early-stage AD in MCI cases.⁴ According to recent studies with prolonged periods of clinical follow-up, the combination of all 3 CSF biomarkers (T-tau, P-tau, and Aβ_1-42) may have a predictive value as high as 95% to distinguish MCI cases with progression toward AD from stable MCI cases and MCI cases with other types of underlying pathology.⁶ Several large multi-center studies have confirmed that these CSF biomarkers have a strong predictive value for detecting early-stage AD.^2,7,8

Two types of rating instruments are used to identify dementia subjects. The clinical dementia rating (CDR), first published in 1982,⁹ is considered the gold standard global rating scale for staging dementia patients. The CDR classifications are zero (no dementia), 0.5 (questionable dementia), 1 (mild dementia), 2 (moderate dementia), and 3 (severe dementia). While the CDR is highly valid and reliable, it depends on a comprehensive structured interview with both the patient and the physician.^10,11 In addition, the process becomes complicated if a trustworthy and knowledgeable caregiver is unavailable, limiting its utility in everyday practice.¹² In contrast, simpler and shorter assessment instruments such as the Mini-Mental State Examination (MMSE) are more appropriate for primary and secondary care levels.¹³ The MMSE score, ranging from zero to 30, is calculated based on a series of questions designed to assess the patient’s cognitive skills. A score of 26 or higher represents normal cognition, scores between 20 and 26 indicate mild dementia, scores between 10 and 20 indicate moderate dementia, and scores less than 9 indicate severe dementia. Numerous studies have demonstrated the reliability and validity of the MMSE score.¹⁴
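The MMSE cut-offs described above can be written as a small lookup. The sketch below is illustrative only: the function name `mmse_stage` is hypothetical, and because the text leaves the boundary values ambiguous (26, 20, and the gap between 9 and 10), the way those values are assigned here is an assumption rather than a published rule.

```python
def mmse_stage(score):
    """Map an MMSE total (0 to 30) to a cognitive stage.

    Boundary handling is an assumption: 26 counts as normal, 20 as mild,
    and scores of 9 or below as severe.
    """
    if not 0 <= score <= 30:
        raise ValueError("MMSE score must lie between 0 and 30")
    if score >= 26:
        return "normal"
    if score >= 20:
        return "mild"
    if score >= 10:
        return "moderate"
    return "severe"


for example in (29, 23, 15, 7):
    print(example, mmse_stage(example))
```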

Previous research has primarily concentrated on machine learning (ML)-based classification of dementia stages by integrating CSF biomarkers with various other parameters, rather than relying on CSF biomarkers alone.¹⁵ These studies have explored the synergistic potential of combining CSF biomarkers such as amyloid-beta and tau proteins with clinical assessments, cognitive scores, genetic factors, and imaging data. The primary imaging modalities include Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET). For instance, the Alzheimer’s Biomarkers in Daily Practice (ABIDE) research project utilized the Amsterdam Dementia Cohorts, a longitudinal cohort from a tertiary referral center comprising 525 individuals.¹⁶ Each patient’s initial appointment occurred at a clinic between September 1, 1997, and August 31, 2014. The study employed Cox regression (proportional hazards regression) analysis to develop prognostic models of progression in patients with mild cognitive impairment. The models incorporated MRI biomarkers (hippocampal volume and normalized whole-brain volume), CSF biomarkers (Aβ_1-42, tau), patient characteristics such as age and gender, and MMSE scores as inputs.

In another work, a support vector machine (SVM) with multi-task learning was applied to predict the 2-year conversion from MCI to AD using baseline MRI and CSF measurements.¹⁷ The model achieved 73.9% accuracy, 68.6% sensitivity, and 73.6% specificity. Another group of researchers trained a multi-modal Gated Recurrent Unit model to predict the conversion from MCI to AD using longitudinal cognitive performance and CSF biomarkers data, along with cross-sectional neuroimaging and demographic data.¹⁸ Although these studies demonstrate good accuracy of ML models, their performance relies on imaging data, which are expensive to acquire and require specialized expertise to interpret. Moreover, these imaging modalities often fail to distinguish Alzheimer’s from other neurodegenerative disorders.¹⁹ In contrast, CSF biomarkers have become increasingly attractive for clinical use. Several research studies demonstrate the potential of Aβ_1-42 and tau biomarkers in detecting AD pathology in its early stages.²⁰

In this work, we studied the performance of ML models for the classification of dementia patients from normal controls based on the CSF biomarkers alone. Various machine learning models such as Discriminant, Logistic Regression, Tree, K-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Naïve Bayes classifiers were studied. With 3 CSF biomarkers (Aβ_1-42, T-tau and P-tau), the medium Gaussian SVM provided the highest accuracy of 67% with an area under the receiver operating characteristic (AUROC) of 0.72, a true positive rate (TPR) of 0.78 and a false positive rate (FPR) of 0.43. With 2 CSF biomarker levels (Aβ_1-42 and T-tau), the KNN classifier provided the highest accuracy of 69% with an AUROC of 0.73, a TPR of 0.76 and an FPR of 0.37.

Methods

Description of Data

We analyzed data from electronic health records of AD patients collected from the National Alzheimer’s Coordinating Center (NACC) database.²¹ CSF biomarker levels from subjects with both CDR and MMSE scores were considered for further analysis, excluding any subjects with missing or erroneous values. CSF biomarker levels were obtained from multiple subjects over several years, and repeated measurements were excluded, resulting in a total of 711 subjects with unique identification numbers. Among the 711 subjects, 356 were categorized as normal controls (NC), 219 as mild, 109 as moderate, and 27 as severe according to MMSE scores. For the same set of subjects, CDR scores categorized 216 as NC, 178 as mild, 29 as moderate, 20 as severe, and 268 as questionable. To make a conservative classification of groups, we included a subject in the dementia group only if they met the criteria for both CDR and MMSE scores, resulting in 210 subjects in the dementia group. To balance the dataset, which is necessary for efficient training and validation of machine learning models, we randomly selected 210 subjects from the NC group who met the criteria for NC using both CDR and MMSE scores.
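The group selection and balancing step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record fields `mmse_stage` and `cdr_stage`, the function name, and the choice to treat the mild, moderate and severe stages as "dementia" on both scales are assumptions, since the text does not state how questionable CDR ratings were handled.

```python
import random

IMPAIRED = {"mild", "moderate", "severe"}  # assumed dementia stages


def build_balanced_cohort(subjects, seed=0):
    """Return (dementia, controls) lists of equal size.

    A subject joins the dementia group only if BOTH rating scales place
    them in an impaired stage, and joins the control pool only if BOTH
    scales rate them as normal ("NC"). Anyone else is left out.
    """
    dementia = [s for s in subjects
                if s["mmse_stage"] in IMPAIRED and s["cdr_stage"] in IMPAIRED]
    control_pool = [s for s in subjects
                    if s["mmse_stage"] == "NC" and s["cdr_stage"] == "NC"]
    if len(control_pool) < len(dementia):
        raise ValueError("not enough controls to balance the groups")
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    controls = rng.sample(control_pool, len(dementia))
    return dementia, controls


# Toy records, not NACC data.
toy_subjects = [
    {"id": 1, "mmse_stage": "mild", "cdr_stage": "mild"},
    {"id": 2, "mmse_stage": "moderate", "cdr_stage": "questionable"},
    {"id": 3, "mmse_stage": "NC", "cdr_stage": "NC"},
    {"id": 4, "mmse_stage": "NC", "cdr_stage": "NC"},
    {"id": 5, "mmse_stage": "NC", "cdr_stage": "mild"},
    {"id": 6, "mmse_stage": "severe", "cdr_stage": "moderate"},
    {"id": 7, "mmse_stage": "NC", "cdr_stage": "NC"},
]
dementia_group, control_group = build_balanced_cohort(toy_subjects)
print(len(dementia_group), len(control_group))
```

Subjects on whom the two scales disagree (ids 2 and 5 in the toy data) fall into neither group, which mirrors the conservative inclusion rule described above.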

Machine Learning Classification Models

We used classification models in MATLAB (MathWorks, R2023a) to train and validate on the balanced data set, treating the differentiation of dementia subjects from normal controls (NC) as a binary classification problem. Because the data set is small, we used 5-fold cross-validation to train different types of machine learning models belonging to the discriminant, logistic regression, tree, K-nearest neighbor, support vector machine and naïve Bayes families. Models were compared using performance metrics such as accuracy, area under the receiver operating characteristic (AUROC), true positive rate (TPR) and false positive rate (FPR).
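To make the validation scheme concrete, the sketch below runs 5-fold cross-validation with a plain K-nearest-neighbor classifier in Python. It is not the MATLAB workflow used in the study: the data are synthetic, the features are not standardized, the helper names are hypothetical, and the choice of 10 neighbors is an assumed setting.

```python
import random
from collections import Counter


def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into n_folds lists with similar class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % n_folds].append(index)
    return folds


def knn_predict(train_X, train_y, x, k=10):
    """Majority vote among the k nearest training points (Euclidean)."""
    by_distance = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, n_folds=5, k=10, seed=0):
    """Pool held-out predictions over all folds; label 1 is the positive class."""
    tp = fp = tn = fn = 0
    for held_out in stratified_folds(y, n_folds, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in held_out:
            predicted = knn_predict(train_X, train_y, X[i], k)
            if y[i] == 1:
                tp += predicted == 1
                fn += predicted != 1
            else:
                fp += predicted == 1
                tn += predicted != 1
    return {
        "accuracy": (tp + tn) / len(y),
        "tpr": tp / (tp + fn),
        "fpr": fp / (fp + tn),
    }


# Synthetic two-feature data (not patient data): two well-separated clusters.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
X += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(40)]
y = [0] * 40 + [1] * 40
metrics = cross_validate(X, y)
print(metrics)
```

Every subject is held out exactly once, so the pooled counts give one accuracy, TPR and FPR estimate for the whole data set without ever scoring a subject the model was trained on.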

Results and Discussion

We first studied whether the biomarker levels are statistically different between the dementia and NC groups by considering the balanced data set. Then we studied the correlation of these biomarker levels with the CDR and MMSE scores. Finally, we trained and validated different machine learning models on the data set. By categorizing subjects into dementia and NC groups only when they meet the criteria for both CDR and MMSE scores, we can eliminate any variability in classification based on these scores. Such an approach ensures that the performance of the machine learning models relies only on the biomarker levels. A preliminary analysis of machine learning models, including all subjects and the repeated measurements with CDR and MMSE scores considered separately, was reported in our archived work.²²

Table 1 shows the mean and standard error (SE) of age, CDR, MMSE and the 3 biomarker values of the dementia and NC groups. These features differ significantly between the groups according to a Student’s t-test. Table 2 provides the correlation (denoted by r) of biomarker levels with the CDR and MMSE scores and shows a significant but weak correlation. As expected, the correlation coefficient values are opposite in sign between CDR and MMSE scores.

Table 1.

Mean (SE) of Dementia and NC Groups.

Variables	Dementia (n = 210)	NC (n = 210)
Age (years)	70.20 (0.67)	65.95 (0.51)
MMSE	17.91 (0.42)	29.01 (0.05)
CDR	1.32 (0.04)	0.17 (0.01)
Aβ_1-42 (pg/ml)	227.8 (10.29)	417.93 (12.36)
T-tau (pg/ml)	312.26 (36.06)	157.02 (8.48)
P-tau (pg/ml)	44.68 (2.03)	37.39 (1.17)
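As a rough check on the group comparison, a two-sample t statistic can be recovered from the means and standard errors in Table 1. The sketch below is an illustration, not the authors' analysis: the function name is hypothetical, the SEs are entered as magnitudes, and it relies on the two groups being the same size (n = 210 each), in which case the pooled-variance Student's t reduces to the difference in means divided by the root of the summed squared SEs.

```python
import math


def t_from_summary(mean_a, se_a, mean_b, se_b):
    """Two-sample t statistic from group means and standard errors.

    With equal group sizes this matches the pooled-variance Student's t.
    """
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# (dementia mean, dementia SE, NC mean, NC SE); SEs entered as magnitudes.
table_1 = {
    "Age": (70.20, 0.67, 65.95, 0.51),
    "Abeta_1-42": (227.8, 10.29, 417.93, 12.36),
    "T-tau": (312.26, 36.06, 157.02, 8.48),
    "P-tau": (44.68, 2.03, 37.39, 1.17),
}
t_values = {name: t_from_summary(*row) for name, row in table_1.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

A negative value means the dementia group has the lower mean, as is the case for Aβ_1-42.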

Table 2.

Correlation of Biomarker Levels With MMSE and CDR Scores. All Features are Significantly Different Using a Student’s t-test Statistic With a p-value Less Than 0.001.

Biomarkers	MMSE	CDR
Aβ_1-42 (pg/ml)	r = 0.34, P < 0.001	r = −0.38, P < 0.0001
T-tau (pg/ml)	r = −0.15, P < 0.001	r = 0.21, P < 0.0001
P-tau (pg/ml)	r = −0.12, P < 0.01	r = 0.11, P < 0.01
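The opposite signs in Table 2 follow from the direction of the two scales: MMSE falls as cognition worsens, whereas CDR rises. A minimal sketch of the Pearson coefficient, using hypothetical values for five subjects rather than study data, shows the effect.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five subjects (not taken from the study).
abeta = [450.0, 380.0, 300.0, 240.0, 180.0]
mmse = [29, 27, 22, 18, 12]      # higher score = better cognition
cdr = [0.0, 0.5, 1.0, 2.0, 3.0]  # higher score = more severe dementia

r_mmse = pearson_r(abeta, mmse)
r_cdr = pearson_r(abeta, cdr)
print(round(r_mmse, 2), round(r_cdr, 2))
```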

Table 3 outlines the performance of machine learning models in classifying NC from dementia patients. Various models using 3 biomarker levels show comparable accuracy, ranging from 64% to 67%. AUROC, TPR and FPR metrics are also comparable. Hence, any of these models can be used to differentiate dementia from NC. When the models were trained exclusively with MCI subjects and NC, while further balancing the dataset, the results remained consistent. This suggests that the models’ performance is driven by the biomarker levels of MCI subjects, who are considerably more prevalent in the dementia group. Adding age, gender, sex, and race as additional features did not significantly alter the model performance metrics.

Table 3.

Performance of Machine Learning Models for Two Classes (NC Versus Dementia).

	Three Biomarkers NC Vs Dementia				Two Biomarkers NC Vs Dementia
Model	Accuracy (%)	AUROC	TPR	FPR	Accuracy (%)	AUROC	TPR	FPR
Linear discriminant	66	.71	.70	.39	67	.72	.71	.38
Efficient logistic regression	65	.72	.69	.38	67	.72	.71	.37
Coarse tree	65	.66	.80	.50	63	.64	.78	.49
Medium Gaussian SVM	67	.72	.78	.43	67	.73	.70	.36
Medium KNN	64	.68	.73	.45	69	.73	.76	.37
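For readers less familiar with the metrics in Table 3, the sketch below shows one way to compute AUROC, TPR and FPR from classifier scores. The scores are hypothetical and the function names are illustrative. AUROC is computed here as the probability that a randomly chosen dementia subject receives a higher score than a randomly chosen control, which equals the area under the ROC curve.

```python
def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def rates_at_threshold(pos_scores, neg_scores, threshold):
    """TPR and FPR when scores at or above the threshold are called positive."""
    tpr = sum(s >= threshold for s in pos_scores) / len(pos_scores)
    fpr = sum(s >= threshold for s in neg_scores) / len(neg_scores)
    return tpr, fpr


# Hypothetical classifier scores (higher = more likely dementia).
dementia_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2]
print(auroc(dementia_scores, control_scores))
print(rates_at_threshold(dementia_scores, control_scores, 0.5))
```

AUROC summarizes performance over all thresholds, whereas each TPR and FPR pair describes a single operating point on the curve.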

Feature importance scores ranked using the Analysis of Variance (ANOVA) algorithm showed that Aβ_1-42 had a score of 12.23 and T-tau had a score of 12.89, whereas P-tau had the lowest score of 3.73, suggesting that Aβ_1-42 and T-tau levels contribute most to the classification performance of the machine learning models. This was further confirmed by the feature importance scores ranked using the Kruskal-Wallis algorithm, in which Aβ_1-42 had a score of 14.88, T-tau had a score of 14.02 and P-tau had a score of 6.15.
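The idea behind ANOVA-based feature ranking can be sketched with a one-way F statistic computed per feature. This is only an illustration under stated assumptions: the feature values are made up, the function names are hypothetical, and the scores quoted above come from the authors' software and may be on a different scale (for instance, a transformed p-value), so the numbers below are not comparable to them.

```python
def mean(values):
    return sum(values) / len(values)


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of values."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def rank_features(features):
    """Sort feature names by F statistic, most discriminative first."""
    scores = {name: anova_f(groups) for name, groups in features.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical values: each feature maps to [dementia values, NC values].
toy_features = {
    "feature_a": [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]],
    "feature_b": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]],
}
ranking, scores = rank_features(toy_features)
print(ranking, scores)
```

A feature whose group means are far apart relative to the spread within each group receives a larger F value and is ranked higher.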

Based on the findings from the feature importance scores, we trained and validated the models with 2 biomarker levels, Aβ_1-42 and T-tau, as shown in Table 3. The performance metrics of these models are comparable with those of the models with 3 biomarker levels, confirming the earlier observations.⁴ Figure 1(A) shows the ROC curve for differentiating NC from dementia based on 2 biomarkers, with the operating point (shown as the dot) set at a TPR of 0.76 and FPR of 0.37. The model achieves an AUROC of 0.73. Figure 1(B) shows the confusion matrix, highlighting the TPR for detecting dementia and NC groups, along with their FNR. The TPR indicates that dementia patients can be detected with a sensitivity of 76%, and the FPR corresponds to a specificity of 63%.

Figure 1.

Performance metrics of the best-performing model using two biomarker levels. (A) Receiver operating characteristic and (B) Confusion matrix along with TPR and FNR values.
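The relationship between the reported operating point and the sensitivity, specificity and accuracy figures can be verified with a few lines of arithmetic. The sketch below uses the TPR, FPR and group sizes stated in the text; the function name is hypothetical.

```python
def operating_point_summary(tpr, fpr, n_pos, n_neg):
    """Derive sensitivity, specificity and accuracy from one ROC point."""
    true_pos = tpr * n_pos
    true_neg = (1 - fpr) * n_neg
    return {
        "sensitivity": tpr,
        "specificity": 1 - fpr,
        "accuracy": (true_pos + true_neg) / (n_pos + n_neg),
    }


# Operating point and group sizes reported in the text.
summary = operating_point_summary(tpr=0.76, fpr=0.37, n_pos=210, n_neg=210)
print({name: round(value, 3) for name, value in summary.items()})
```

Because the two groups are the same size, accuracy is simply the mean of sensitivity and specificity, (0.76 + 0.63) / 2 = 0.695, which is in line with the 69% accuracy reported for the 2-biomarker KNN model.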

In summary, CSF biomarkers alone offer potential in the early detection of AD. The results are encouraging from a future standpoint because they show that accuracy is maintained (up to 69%) with fewer input features, in this case 2 biomarkers. The model’s accuracy could be improved by including more patient data and real-time input features. There is evidence in the literature that a strong correlation exists between blood biomarkers and CSF biomarkers.²³ Recent research indicates that a blood test measuring tau and the Aβ₄₂/Aβ₄₀ ratio demonstrates high diagnostic accuracy (range, 88%-92%) for identifying Alzheimer’s disease in both primary and secondary care settings.²⁴ Hence, future research could also involve developing a point-of-care blood sampling device interfaced with machine-learning classification models for early diagnosis and prognosis of Alzheimer’s disease patients.

Footnotes

Author Contributions

S. T. conceptualized and supervised the project. V. K. T. designed the experiments and implemented the machine learning models. P. I. performed the analysis of the results, including model evaluation, statistical analysis, and interpretation of outcomes. S. T. and V. K. T. wrote the initial version of the manuscript. All authors contributed to reviewing and revising the manuscript.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Shawana Tabassum

Data Availability Statement

The datasets generated and/or analyzed during this study may be available from the corresponding author upon reasonable request.

References

1. Gustavsson, Norton, Fast, et al. Global estimates on the number of persons across the Alzheimer’s disease continuum. Alzheimer’s Dementia. 2023;19(2):658-670. doi:10.1002/alz.12694.

2. Gauthier, Reisberg, Zaudig, et al. Mild cognitive impairment. Lancet. 2006;367(9518):1262-1270. doi:10.1016/S0140-6736(06)68542-5.

3. Petersen. Mild cognitive impairment as a diagnostic entity. J Intern Med. 2004;256(3):183-194. doi:10.1111/j.1365-2796.2004.01388.x.

4. Blennow, Hampel. CSF markers for incipient Alzheimer’s disease. Lancet Neurol. 2003;2(10):605-613. doi:10.1016/S1474-4422(03)00530-1.

5. Bouwman, Frisoni, Johnson, et al. Clinical application of CSF biomarkers for Alzheimer’s disease: from rationale to ratios. Alzheimers Dement (Amst). 2022;14(1):e12314. doi:10.1002/dad2.12314.

6. Hansson, Zetterberg, Buchhave, Londos, Blennow, Minthon. Association between CSF biomarkers and incipient Alzheimer’s disease in patients with mild cognitive impairment: A follow-up study. Lancet Neurol. 2006;5(3):228-234. doi:10.1016/S1474-4422(06)70355-6.

7. Visser, Verhey, Knol, et al. Prevalence and prognostic value of CSF markers of Alzheimer’s disease pathology in patients with subjective cognitive impairment or mild cognitive impairment in the DESCRIPA study: A prospective cohort study. Lancet Neurol. 2009;8(7):619-627. doi:10.1016/S1474-4422(09)70139-5.

8. Mattsson, Zetterberg, Hansson, et al. CSF biomarkers and incipient Alzheimer disease in patients with mild cognitive impairment. JAMA. 2009;302(4):385-393. doi:10.1001/jama.2009.1064.

9. Hughes, Berg, Danziger, Coben, Martin. A new clinical scale for the staging of dementia. Br J Psychiatr. 1982;140(6):566-572. doi:10.1192/bjp.140.6.566.

10. Morris. The clinical dementia rating (CDR). Neurology. 1993;43(11):2412-2412. doi:10.1212/WNL.43.11.2412-a.

11. Khan. Biomarkers in Alzheimer’s Disease. Academic Press; 2016.

12. Perneczky, Wagenpfeil, Komossa, Grimmer, Diehl, Kurz. Mapping scores onto stages: Mini-mental state examination and clinical dementia rating. Am J Geriatr Psychiatr. 2006;14(2):139-144. doi:10.1097/01.JGP.0000192478.82189.a8.

13. Chokesuwattanaskul, Jiang, Bond, et al. The architecture of abnormal reward behaviour in dementia: Multimodal hedonic phenotypes and brain substrate. Brain Communications. 2023;5(2):fcad027. doi:10.1093/braincomms/fcad027.

14. Tombaugh, McIntyre. The mini-mental state examination: A comprehensive review. J Am Geriatr Soc. 1992;40(9):922-935. doi:10.1111/j.1532-5415.1992.tb01992.x.

15. Saleem, Zahra, et al. Deep learning-based diagnosis of Alzheimer’s disease. J Personalized Med. 2022;12(5):815. doi:10.3390/jpm12050815.

16. van Maurik, Zwan, Tijms, et al. Interpreting biomarker results in individual patients with mild cognitive impairment in the Alzheimer’s Biomarkers in Daily Practice (ABIDE) project. JAMA Neurol. 2017;74(12):1481-1491. doi:10.1001/jamaneurol.2017.2712.

17. Zhang, Shen. Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer’s disease. Neuroimage. 2012;59(2):895-907. doi:10.1016/j.neuroimage.2011.09.069.

18. Lee, Nho, Kang, Sohn, Kim. Predicting Alzheimer’s disease progression using multi-modal deep learning approach. Sci Rep. 2019;9:1952. doi:10.1038/s41598-018-37769-z.

19. Johnson, Fox, Sperling, Klunk. Brain imaging in Alzheimer disease. Cold Spring Harb Perspect Med. 2012;2(4):a006213. doi:10.1101/cshperspect.a006213.

20. Lee, Kim, Hong, Kim. Diagnosis of Alzheimer’s disease utilizing amyloid and tau as fluid biomarkers. Exp Mol Med. 2019;51(5):1-10. doi:10.1038/s12276-019-0250-2.

21. National Alzheimer’s Coordinating Center (NACC). National Institute on Aging. Available at: https://www.nia.nih.gov/research/dn/national-alzheimers-coordinating-center-nacc.

22. Tiwari, Indic, Tabassum. Machine learning classification of Alzheimer’s disease stages using cerebrospinal fluid biomarkers alone. 2024: arXiv:2401.00981.

23. Olsson, Lautner, Andreasson, et al. CSF and blood biomarkers for the diagnosis of Alzheimer’s disease: a systematic review and meta-analysis. Lancet Neurol. 2016;15(7):673-684. doi:10.1016/S1474-4422(16)00070-3.

24. Palmqvist, Tideman, Mattsson-Carlgren, et al. Blood biomarkers to detect Alzheimer disease in primary care and secondary care. JAMA. 2024;332(15):1245-1257. doi:10.1001/jama.2024.13855.