Abstract
Background:
In the era of disease-modifying therapies, empowering the clinical neuropsychologist’s toolkit for timely identification of mild cognitive impairment (MCI) is crucial.
Objective:
Here we examine the clinimetric properties of the Montreal Cognitive Assessment (MoCA) for the early diagnosis of MCI due to Alzheimer’s disease (MCI-AD).
Methods:
Data from 48 patients with MCI-AD and 47 healthy controls were retrospectively analyzed. Raw MoCA scores were corrected according to the conventional Nasreddine’s 1-point correction and demographic adjustments derived from three normative studies. Optimal cutoffs were determined while previously established cutoffs were diagnostically reevaluated.
Results:
The original Nasreddine’s cutoff of 26 and normative cutoffs (non-parametric outer tolerance limit on the 5th percentile of demographically-adjusted score distributions) were overly imbalanced in terms of Sensitivity (Se) and Specificity (Sp). The optimal cutoff for Nasreddine’s adjustment showed adequate clinimetric properties (≤23.50, Se = 0.75, Sp = 0.70). However, the optimal cutoff for Santangelo’s adjustment (≤22.85, Se = 0.65, Sp = 0.87) proved to be the most effective for both screening and diagnostic purposes according to Larner’s metrics. Post-test probability analyses revealed that an individual testing positive using Santangelo’s adjustment combined with a cutoff of 22.85 would have an 84% post-test probability of receiving a diagnosis of MCI-AD (LR+ = 5.06).
Conclusions:
We found a common (mal)practice of bypassing the applicability of normative cutoffs in diagnosis-oriented clinical practice. In this study, we identified optimal cutoffs for MoCA to be allocated in secondary care settings for supporting MCI-AD diagnosis. Methodological and psychometric issues are discussed.
INTRODUCTION
Every 3 seconds, someone in the world develops dementia. Every year, almost 10 million new cases are recorded. Nowadays, over 55 million people are living with dementia. Among older people, dementia is one of the primary causes of disability and dependency, and the seventh leading cause of death.1–3 This scenario is further compounded by the dramatic congestion of healthcare and socio-assistance services, as well as by the significant financial impact of the disease. The annual cost of dementia, including both direct and indirect expenses, cumulatively exceeds 1 trillion dollars.2,3
Alzheimer’s disease (AD) is the most prevalent cause of dementia, contributing to 60–80% of cases. 1 Amyloidogenesis, namely, the process leading to abnormal aggregation of amyloid proteins resulting in the formation of insoluble fibrils, represents a pivotal pathophysiological mechanism in AD. Consequently, in the fervent pursuit of disease-modifying therapies, most AD-related drugs currently undergoing clinical trials aim to weaken amyloid protein aggregates. 4 With this in mind, the imperative for early identification of individuals at risk of conversion towards AD has never been more critical. In particular, those diagnosed with Mild Cognitive Impairment (MCI) warrant special mention.
MCI is traditionally considered a boundary stage between healthy aging and overt dementia. It is characterized by a slight cognitive decline with minimal or no functional impairment in daily activities. 5 The amnestic phenotype, which predominantly affects memory domains, is a major risk factor for the subsequent onset of AD dementia (ADD), with a conversion rate ranging from 8% to 15% within 1 year, reaching 80% at 6 years. 6 However, MCI can stabilize over time and demonstrates reversibility in 26% of cases.5,7 Although the clinical phenotype is relevant, conversion towards ADD is primarily related to the etiopathogenic profile. Therefore, it is crucial to detect MCI patients with a biological diagnosis of AD (i.e., MCI due to AD). 8
According to the National Institute on Aging-Alzheimer’s Association (NIA-AA), an accurate evaluation of patients with suspected MCI due to AD (MCI-AD) should involve a comprehensive neurocognitive assessment combined with gathering evidence of AD-like pathophysiology. 8 In particular, based on the AT(N) paradigm (amyloid-β deposition, pathologic tau, and neurodegeneration), a patient with MCI is classified as being on the AD continuum if they exhibit biomarker evidence of Aβ deposition (abnormal amyloid PET scan, low cerebrospinal fluid Aβ42, or low Aβ42/Aβ40 ratio), with pathologic phosphorylated tau strengthening the diagnostic likelihood. 9 Interestingly, relying solely on biomarkers seems to reduce the predictive value of the diagnosis. 10
While acknowledging the importance of integrated approaches that combine etiological and clinical diagnosis, the management of patients with dementia faces limitations in terms of time, costs, and availability of experienced staff. To give a few examples, waiting times for undergoing instrumental examinations can be quite long. Additionally, significant logistical resources are required for administering extensive neuropsychological batteries, which is particularly problematic in outpatient settings where time constraints are rigid. Moreover, the costs associated with procedures like the amyloid PET scan can be notably high, as can those of neuropsychological assessments performed by private providers rather than through the public healthcare system. Finally, those actively practicing clinical neuropsychology in Europe have a very heterogeneous educational background and skill level, compounded by the scarcity of academic training programs and/or clinical training opportunities.11,12 In light of this, there is a pressing need for brief, flexible tools with high diagnostic power, particularly in secondary care settings, where the objective is to screen patients and, when necessary, direct them towards further investigations. 13
Among the tests used in memory clinics, the Mini-Mental State Examination (MMSE) is widely acknowledged as the gold standard neuropsychological battery for assessing global cognitive functioning in moderate/advanced stages of dementia. 14 Instead, the Montreal Cognitive Assessment (MoCA) has been specifically designed to evaluate general cognition in patients with MCI and mild ADD. 15 As compared to MMSE, MoCA covers a wider range of cognitive domains, including sustained attention, visuospatial, and visuoconstructive abilities. Furthermore, MoCA is less affected by patient’s linguistic capabilities and has demonstrated utility in predicting conversion from MCI towards dementia. In particular, some studies have shown that patients with MCI exhibiting low MoCA scores at baseline were more likely to convert to ADD within a timeframe of 1.5 to 3.5 years.16,17 In addition to providing an overview of general cognitive functioning, it is of particular interest to inquire whether MoCA holds sufficient
The architecture of diagnostic research: key questions
Let us imagine that a young researcher has devised a long-term visuospatial memory task requiring the examinee to memorize, and then recall, the spatial arrangement of tokens placed on a chessboard, namely, the ‘Chessboard Test’ (CBT). In particular, the researcher is interested in determining whether CBT could be considered a reliable marker for AD. To establish this, the researcher refers to one interesting chapter within the seminal manual by Knottnerus and Buntinx titled ‘The Evidence Base of Clinical Diagnosis’, 18 so as to identify the appropriate research questions to pose. Phase 1 question: Do patients with AD achieve significantly lower scores on CBT than healthy individuals? Phase 2 question: Are individuals getting lower CBT scores more likely to be diagnosed with AD than individuals getting a higher CBT score? Phase 3 question: Among individuals for whom there is a clinical suspicion of AD, can CBT score effectively discriminate between those with and without AD? Phase 4 question: Do patients tested with CBT have better health outcomes, such as functional autonomy, quality of life, or mortality rate, compared to those who do not undergo the test?
Here we focus on Phase 1 and 2 questions. This choice is prompted by the presence of several threats to validity in Phase 3 studies and limitations in their applicability to clinical research. As for the Phase 4 question, interventions for AD remain currently confined to cognitive stimulation and palliative pharmacological therapies. 19
Consider once more our enterprising researcher. Picture them now, eager to delve into a Phase 1 question. To determine whether CBT may be clinically meaningful, the test should be administered to demographically-matched samples of patients with AD and healthy controls. If a significant difference is detected in CBT score’s distribution between the two groups, the researcher may conclude that CBT is a useful diagnostic tool. Regrettably, this finding does not really ensure that CBT can be confidently translated into clinical practice for diagnostic purposes. Indeed, if the Phase 1 question receives an affirmative response, the next step is conducting a clinimetric study to address a Phase 2 question. 19
Here,
To answer a Phase 2 question, the researcher sets up a supplementary study. This time, CBT is administered under standardized (ideal) conditions. Furthermore, to discern the presence of AD, the researcher relies on established gold standard references, e.g., cerebrospinal fluid (CSF) Aβ levels and performance on the Rey Auditory Verbal Learning Test (RAVLT).20,21 Upon collecting CBT scores, its discriminative capability is estimated, typically using Receiver Operating Characteristic (ROC) curve analysis. Still, indexes of diagnostic power, such as sensitivity and specificity, are computed with respect to a specified cutoff point. 19 The study results suggest that CBT exhibits adequate discriminatory power. Moreover, the identified
Stopping at the first rung: the normative studies
Phase 1 studies are relatively simple, quick, and cost-effective. These advantages have captivated researchers in neuropsychology, leading to an oversimplification of the aforementioned diagnostic architecture:
The diagnostic significance of a test relies on its ability to discriminate between ‘normal’ and ‘abnormal’ conditions. Accordingly, the definition of normality is pivotal. In normative studies, the interpretation of an individual’s test score involves comparing it with scores from a normative/healthy sample, assumed to be representative of the population from which the individual comes. There are different methods to quantify the relative standing of an individual’s score within a normative distribution. For instance, one may use the percentile rank, but it only indicates the score’s ordinal position within the distribution, without assuming univariate normality. Instead, if one posits that scores follow the normal/gaussian distribution, it is possible to proceed in terms of equal intervals using measures of central tendency and dispersion. Specifically, the individual’s raw score may be converted into
As sociodemographic variables, such as age and education, can affect cognitive performance, the ES method entails statistically weighing their contributions to score variability using linear regression. Subsequently, correction coefficients are derived. Upon application of these correction factors, it becomes possible to easily compare individuals with different age and education levels. For instance, the performance of a young university student can be compared with that of a septuagenarian with only primary education. Following this, the demographically-adjusted normative distribution is standardized using a 5-point ordinal scale, from ES0 to ES4. Conventionally, ES0 corresponds to an adjusted score that is equal to or lower than the outer non-parametric tolerance limit on the 5th percentile with 95% confidence. ES4, conversely, corresponds to performance equal to or better than the median value. ES1, ES2, and ES3 are obtained by dividing the distribution between ES0 and ES4 into three parts.24,26 The outer non-parametric tolerance limit on the 5th percentile aligns with
On the one hand, it may represent a step backward compared to standardized scores, resulting in a loss of information. On the other hand, only the tolerance region around the 5th percentile holds potentially inferential value. Indeed, while it is reasonable to use the median as a designated measure of central tendency due to the self-styled non-parametric nature of the method, the portion of the distribution between ES0 and ES4 remains a black hole. Assuming that the left tail of the adjusted score distribution is comparable to that of the normal distribution, the ES0-ES4 interval is divided into three sections using the space between
From a diagnostic perspective, an additional flaw of the ES method concerns the establishment of the nominal cutoff. Now it is clear that using ESs to compare an individual’s score to normative data is traditionally grounded in the idea that normative distributions approximate a Gaussian distribution. In healthy individuals, some psychological test scores fit the normal distribution. Conversely, many neuropsychological scores typically show negatively skewed and leptokurtic distributions in normative datasets, as a result of a significant ceiling effect. Consequently, scores are condensed into a limited set of discrete values at the upper extreme of the score range, with only a few observations at the left tail of the distribution. 18 In such instances, setting nominal cutoffs at the 5th percentile may be a procedure devoid of meaning.
Also, it is crucial to emphasize that even in the presence of a normal distribution, using such a ‘low’ cutoff may be disadvantageous for several reasons. It is customary to classify as not normal those scores that fall within the lower 5% of the population, accepting an error risk < 5%. This approach stems from inferential statistics, where it is common practice to assume a nominal alpha level equal to 0.05 to mitigate type 1 error inflation, namely, the rejection of the null hypothesis when it is true. In diagnostic terms, this implies maximizing the test specificity, decreasing the risk of false positives, i.e., the risk of mistakenly rejecting the null hypothesis that an individual is free from cognitive impairment. Concurrently, this entails a decrease in the test sensitivity, which is a crucial diagnostic parameter. 33 This is especially true in the context of serious health conditions that can be delayed (or treated) if correctly managed in the early stages. 34 Ultimately, the test may become primarily beneficial for general screening purposes.
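The trade-off described above can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that control scores follow a standard normal distribution and that patient scores follow the same distribution shifted downward by a hypothetical effect size `d`; neither the function name nor the effect sizes come from the study data.

```python
from statistics import NormalDist


def se_sp_at_percentile_cutoff(effect_size_d, percentile=0.05):
    """Return (sensitivity, specificity) when scores at or below the given
    percentile of the control distribution are classified as impaired.

    Illustrative assumption: controls ~ N(0, 1), patients ~ N(-d, 1).
    """
    controls = NormalDist(0.0, 1.0)
    patients = NormalDist(-effect_size_d, 1.0)
    cutoff = controls.inv_cdf(percentile)      # e.g., z = -1.645 for the 5th percentile
    specificity = 1.0 - controls.cdf(cutoff)   # controls scoring above the cutoff
    sensitivity = patients.cdf(cutoff)         # patients scoring at or below the cutoff
    return sensitivity, specificity


for d in (0.5, 1.0, 1.5):  # hypothetical patient-control differences in SD units
    se, sp = se_sp_at_percentile_cutoff(d)
    print(f"d = {d:.1f}: Se = {se:.2f}, Sp = {sp:.2f}")
```

Under these assumptions, specificity is fixed at 0.95 by construction, while sensitivity remains low unless the patient distribution lies far below that of controls.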
To conclude, the crux of the matter lies in the stark contrast between the statistical and diagnostic definitions of normality. According to Bayes’ theorem, the probability (P) that an individual who tests positive (T) actually suffers from the disease (D), i.e., the positive predictive value P(D|T), is equal to $P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T)}$, where P(T|D) is the sensitivity of the test and P(D) is the prevalence of the disease.
In the clinical neuropsychology literature, there is the habit of omitting the investigation of normative thresholds’ diagnostic applicability to target conditions. As previously outlined, the extent of neuropsychological deficits may be differently captured by normative data, depending on how they are handled psychometrically. Furthermore, their diagnostic significance may be negligible. However, it is important to stress that a great deal of work has been done recently within the Italian scenario to identify disease-specific cut-offs on demographically-adjusted scores. This is particularly the case for tests conceived originally for cognitive screening purposes.33,35–39 As a general rule, following Phase 1 and/or normative studies, it should be imperative to conduct robust clinimetric studies answering the Phase 2 question. The best algorithm is still to administer the test to be validated to a large normative sample, adjust the score distribution, and then calculate optimal cutoffs for sensitivity and specificity based on a specific target clinical population. 40
Aims
The MoCA is the focus of this paper. Several normative studies on the MoCA exist, involving cognitively intact individuals with different geographic backgrounds, e.g., Czechoslovakia, 41 Italy, 42 Japan, 43 Norway, 44 Portugal, 45 and Sweden. 46 Just as many clinimetric studies have been conducted on the MCI population.13,47–57 However, in these studies, patients were selected based on different algorithms for clinical diagnosis only, i.e., using Petersen’s criteria, the Diagnostic and Statistical Manual of Mental Disorders (DSM), or NIA-AA 2011 guidelines.8,58–61 This approach, on the one hand, guarantees less conservative inclusion criteria, hence allowing the enrollment of large patient cohorts in line with the prevailing big data ‘culture’. On the other hand, however, it carries risks of misdiagnosis, false positives, and limited generalizability to individuals with a biologically confirmed diagnosis of neurodegenerative disease. This may represent a significant methodological flaw. Surprisingly, only one Czech study investigated the clinimetric properties of MoCA in patients with MCI-AD, thus addressing such a crucial issue. 62
Recently, some of us devised a study that highlighted how the historic overconfidence in the effectiveness of normative data among Italian neuropsychologists may constitute a significant challenge in diagnostic settings. 33 Specifically, the study included patients with MCI and early dementia of mixed etiology (i.e., AD, mixed AD, cerebrovascular disease, frontotemporal degeneration, dementia with Lewy bodies). This cohort was compared with a control group consisting of healthy participants matched for sociodemographic characteristics. Regardless of correction factors and geographic extraction of normative datasets, we demonstrated that the available Italian normative cutoffs exhibited excellent specificity.42,63,64 However, these cutoffs showed very poor sensitivity, ranging from 0.09 to 0.24, in distinguishing between individuals with mild neurodegeneration and normal cognition. 33 Moreover, we determined the optimal cutoffs for each of the Italian normative adjustments, as well as for the conventional Nasreddine’s 1-point adjustment method. 15 In this replication study, we aimed to assess the clinimetric properties of MoCA in a sample of patients with MCI-AD. In particular, we (i) tested the diagnostic properties of previously identified cutoffs, and (ii) computed new optimal cutoffs, weighted for sensitivity and specificity.
METHODS
Retrospective data collection was performed for a consecutive series of patients of either sex with suspected MCI who were referred to the Memory Centre of Trieste University Hospital (Neurological Unit, Azienda Sanitaria Universitaria Integrata Giuliano Isontina, ASUGI, Trieste, Italy) and the Dementia Clinic of C.T.O. Hospital (Neurological Unit, AORN Ospedali ‘Dei Colli’, Naples, Italy). All eligible patients underwent a comprehensive neurological and neuropsychological examination by experienced clinicians. Patients were included in the study if they received a clinical diagnosis of MCI according to Petersen’s algorithm, 60 and a concurrent biomarker-driven diagnosis of MCI-AD. The latter was performed by harmonizing the NIA-AA 2011 criteria with the AT(N) framework.8,9 Specifically, the diagnosis of MCI-AD was supported by neurobiological evidence indicating ongoing AD-like pathophysiological mechanisms. This evidence encompasses markers of Aβ deposition (e.g., lower CSF Aβ42 levels, lower CSF Aβ42/Aβ40 ratio, positive results at amyloid PET imaging). In addition, pathologic tau biomarkers (i.e., elevated CSF phosphorylated tau levels) and markers of neuronal injury (e.g., hippocampal and medial temporal lobe atrophy detected in MRI, hypometabolic clusters affecting the temporoparietal and/or the posteromedial parietal cortex highlighted in FDG-PET, elevated CSF total tau levels) were taken into account. Patients were thus classified according to both diagnostic categories outlined in the NIA-AA 2011 guidelines (i.e., low, intermediate, and high diagnostic likelihood) and AT(N) profiles. The minimum inclusion criterion was set at intermediate likelihood in combination with an
A group of participants with normal cognition (normal controls, NCs) was assembled by recruiting, on a voluntary basis, individuals from various districts in Friuli-Venezia Giulia, Veneto, Trentino-Alto Adige, and Campania regions, ensuring demographic comparability with the patient group. None of the control participants reported cognitive complaints. Exclusion criteria for both patient and control groups were age > 75 years, < 5 years of formal education according to the Italian schooling system, history of learning disabilities, acquired brain injuries, psychiatric disorders (e.g., major depression), other major health conditions (e.g., cancer, severe obesity), alcohol/drug abuse, and ongoing treatments with psychoactive medications (e.g., antidepressants, neuroleptics, anxiolytics). Furthermore, while patients showing chronic cerebrovascular lesions (Fazekas grade ≤ 2) were retained, those with severe vascular encephalopathy (Fazekas grade = 3) or multi-infarct dementia, which may justify the clinical picture, were excluded. 65 Note that participants over 75 years of age were excluded in accordance with Italian consensus recommendations for biomarker-based etiological diagnosis in patients with MCI, 66 owing to the high variability of the potential clinical impact of amyloid biomarkers assessment in this population (e.g., to minimize unnecessary investigations and age-related false positives in amyloid biomarkers). Participants with pharmacologically well-compensated chronic medical illnesses (e.g., hypertension, type II diabetes, gastrointestinal diseases) were included to minimize the risk of ‘hyper-normality’. All participants had normal or corrected-to-normal vision. All participants were Caucasian and native Italian speakers.
Both patients and controls completed the Italian version of the MoCA. This was not used within the diagnostic process. In particular, the clinicians involved in the diagnostic process were uninformed about the individual’s MoCA score. Additionally, the neuropsychologists administering the MoCA were unaware of the presence of a clear diagnostic suspicion. Finally, data analysis was performed in a blinded fashion concerning group membership (dummy: 0 = Group A, 1 = Group B). Raw MoCA scores were adjusted according to (i) Nasreddine’s 1-point correction, entailing the addition of 1 point for individuals with ≤ 12 years of education 15 and (ii) age-and-education correction factors derived from the three available Italian normative studies.42,63,64 Therefore, four distinct MoCA scores, each subjected to independent adjustments, were obtained. Subsequently, we examined whether these adjusted scores fell below or exceeded the reference threshold values. Specifically, as concerns Nasreddine’s method, the conventional cutoff of 26 was employed as the gold standard, 15 in combination with the cut-point of 23.50 proposed by Ilardi et al. 33 Regarding Italian normative data, adjusted MoCA scores were compared with the respective nominal cutoffs (Conti = 17.36, Santangelo = 15.50, Aiello = 18.58), i.e., the upper limits of ES0.23–25 In addition, for each of the three Italian adjustment methods, the cutoffs proposed by Ilardi et al. were reassessed (Conti = 20.97, Santangelo = 22.85, Aiello = 22.29). 33 Optimal cutoffs for MCI-AD were finally calculated.
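As a minimal sketch of the simplest of these adjustments, Nasreddine’s 1-point correction followed by a comparison against the 23.50 cut-point could be expressed as below. The function names are hypothetical, and capping the adjusted total at the test maximum of 30 is an assumption of this sketch rather than a rule stated in the text; the regression-based Italian corrections are not reproduced here because their coefficients are not reported in this section.

```python
MAX_MOCA_SCORE = 30  # assumption: the adjusted total is capped at the test maximum


def nasreddine_adjusted(raw_score, education_years):
    """Add 1 point for individuals with 12 or fewer years of education."""
    bonus = 1 if education_years <= 12 else 0
    return min(raw_score + bonus, MAX_MOCA_SCORE)


def is_test_positive(adjusted_score, cutoff=23.50):
    """A score at or below the cutoff counts as a positive (impaired) result."""
    return adjusted_score <= cutoff


# Hypothetical examinees: same raw score, different schooling.
for raw, edu in ((23, 8), (23, 16)):
    adjusted = nasreddine_adjusted(raw, edu)
    print(raw, edu, adjusted, is_test_positive(adjusted))
```

In this illustration, two examinees with the same raw score of 23 end up on opposite sides of the 23.50 cut-point once the education bonus is applied.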
The current study was approved by the Comitato Etico Unico Regionale of Friuli-Venezia Giulia (CEUR-FVG; decree n. 438 of 8 June 2018; study protocol n.95/2018) and performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. Informed consent was obtained from all participants included in this study. Based on the EQUATOR (Enhancing the QUAlity and Transparency Of health Research) network library, the STARD 2015 guidelines for reporting diagnostic accuracy studies were followed.
Statistical analyses
For descriptive purposes, nominal variables were presented as frequency while quantitative ones as mean (
Summary measures of diagnostic accuracy employed in this study
TP, true positive; TN, true negative; FP, false positive; FN, false negative.
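For reference, the standard textbook definitions of the accuracy measures built from these four cell counts can be sketched as follows. The function name and the example counts are hypothetical and do not come from the study data; ratios with a zero denominator are returned as `None`, mirroring the idea of a value that is not computable.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute common diagnostic accuracy measures from a 2x2 table."""

    def ratio(numerator, denominator):
        # Return None when the measure is not computable.
        return numerator / denominator if denominator else None

    se = ratio(tp, tp + fn)    # sensitivity
    sp = ratio(tn, tn + fp)    # specificity
    fpr = ratio(fp, fp + tn)   # false positive rate (1 - Sp)
    fnr = ratio(fn, fn + tp)   # false negative rate (1 - Se)
    return {
        "Se": se,
        "Sp": sp,
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "FPR": fpr,
        "FNR": fnr,
        "ACC": ratio(tp + tn, tp + fn + tn + fp),
        "LR+": ratio(se, fpr) if se is not None and fpr is not None else None,
        "LR-": ratio(fnr, sp) if fnr is not None and sp is not None else None,
        "J": se + sp - 1 if se is not None and sp is not None else None,
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=30, fn=10, tn=32, fp=8)
for name, value in metrics.items():
    print(name, None if value is None else round(value, 3))
```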
RESULTS
Power analysis
The results of a priori power analysis indicated that, at a nominal alpha level of 0.05, statistical power set to 0.80, minimum expected AUC of 0.70, and an allocation ratio equal to 1, the required total sample size was 48, i.e., 24 patients with MCI-AD and 24 NCs. 72
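One widely used approximation for this kind of calculation is Obuchowski’s variance function for the AUC under a binormal model. The sketch below applies it with a null AUC of 0.50 and a one-sided alpha; both are assumptions of this illustration, and the procedure in the cited reference may differ in its details. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def obuchowski_variance(auc, kappa=1.0):
    """Approximate variance function of the AUC under a binormal model.

    kappa is the ratio of controls to cases (allocation ratio).
    """
    a = NormalDist().inv_cdf(auc) * 1.414
    return 0.0099 * math.exp(-a ** 2 / 2) * ((5 * a ** 2 + 8) + (a ** 2 + 8) / kappa)


def cases_needed(auc_alt, auc_null=0.50, alpha=0.05, power=0.80,
                 kappa=1.0, one_sided=True):
    """Number of cases required to detect auc_alt against auc_null."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha) if one_sided else z(1 - alpha / 2)
    z_beta = z(power)
    numerator = (z_alpha * math.sqrt(obuchowski_variance(auc_null, kappa))
                 + z_beta * math.sqrt(obuchowski_variance(auc_alt, kappa))) ** 2
    return math.ceil(numerator / (auc_alt - auc_null) ** 2)


n_cases = cases_needed(0.70)
print(n_cases, "cases and", n_cases, "controls")
```

Under these assumptions the sketch yields 24 cases per group, in line with the figure reported above; a two-sided alpha would give a larger requirement.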
Sample characteristics
Forty-eight patients with MCI-AD (23 females, 36 from northern Italy,
Descriptive statistics in patient and control groups
MCI-AD, Mild cognitive impairment due to Alzheimer’s disease; MoCA, Montreal Cognitive Assessment; NIA-AA, National Institute on Aging and Alzheimer’s Association; AT(N), ATN classification system (Amyloid, Tau, Neurodegeneration). Among the AT(N) profiles, the asterisk (*) indicates that the biomarker group was untested. aChi-squared test. bStudent’s
ROC curve analysis
Regardless of the adjustment method, the MoCA demonstrated an adequate discriminative capability (MoCA-Nasreddine: AUC = 0.802,

Cutoff analysis
The results of cutoff analysis are summarized in Table 3. According to

Results of cutoff analyses for diagnosis of MCI due to AD
T+, positive test results; T–, negative test result; MCI-AD, mild cognitive impairment due to Alzheimer’s disease; PPV, positive predictive value; NPV, negative predictive value; FPR, false positive rate; FNR, false negative rate; ACC, overall accuracy; LR+, positive likelihood ratio; LR–, negative likelihood ratio; J, Youden index; CZ, concordance probability method; ER, Closest to (0, 1) criteria. aConventional MoCA’s cutoff. bNominal normative cutoffs. cCutoffs from Ilardi et al. (2023). dOptimal cutoffs for MCI due to AD. ■ Not computable.
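The three cut-point criteria named in the table note can be sketched with their usual definitions: the Youden index J = Se + Sp − 1 (maximized), the concordance probability CZ = Se × Sp (maximized), and the Euclidean distance ER from the point (0, 1) in ROC space (minimized). The scores below are hypothetical toy data, not study data, and a result is treated as positive when the score is at or below the candidate cutoff.

```python
import math


def optimal_cutoffs(scores, has_disease):
    """Scan every observed score as a candidate cutoff and return the best
    cutoff under the J, CZ, and ER criteria."""
    n_pos = sum(1 for d in has_disease if d)
    n_neg = len(has_disease) - n_pos
    rows = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, has_disease) if d and s <= cutoff)
        fp = sum(1 for s, d in zip(scores, has_disease) if not d and s <= cutoff)
        se = tp / n_pos
        sp = 1 - fp / n_neg
        rows.append({
            "cutoff": cutoff,
            "J": se + sp - 1,                # Youden index
            "CZ": se * sp,                   # concordance probability
            "ER": math.hypot(1 - se, 1 - sp),  # distance from (0, 1)
        })
    return {
        "J": max(rows, key=lambda r: r["J"])["cutoff"],
        "CZ": max(rows, key=lambda r: r["CZ"])["cutoff"],
        "ER": min(rows, key=lambda r: r["ER"])["cutoff"],
    }


# Hypothetical adjusted scores: six patients followed by six controls.
toy_scores = [18, 20, 21, 22, 23, 25, 22, 24, 25, 26, 27, 28]
toy_labels = [True] * 6 + [False] * 6
print(optimal_cutoffs(toy_scores, toy_labels))
```

On real data the three criteria need not agree, which is one reason to report them side by side.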
Although originally set on a more heterogeneous clinical population, Ilardi’s cutoff for Aiello’s adjustment maintained good diagnostic performance in MCI-AD. In comparison to the optimal cutoff for MCI-AD, Ilardi’s cutoff for Aiello’s adjustment demonstrated increased
Table 4 shows the NNSU and LDM values for each examined cutoff. These newly-developed metrics express the utility of MoCA for screening and the rate of diagnosis versus misdiagnosis of MCI-AD. In line with the canonical cutoff analysis results, the presented optimal cutoffs demonstrated acceptable performance in both screening and diagnosis (NNSU values < 1.02, LDM values > 1), with Santangelo’s and Aiello’s adjustments showing a certain superiority over Conti’s. However, a dissociation persisted between Santangelo’s and Aiello’s adjustments. While they behaved similarly when measuring MoCA’s screening utility, the former surpassed the latter in terms of diagnoses over misdiagnoses (MoCA-Santangelo, optimal cutoff: NNSU = 0.86, LDM = 2.17; MoCA-Aiello, optimal cutoff: NNSU = 0.89, LDM = 1.96). Ilardi’s cutoffs for Conti’s and Aiello’s adjustments performed comparably to Nasreddine’s cutoffs. The Italian nominal normative cutoffs were found to be unsatisfactory for both screening and diagnostic aims.
‘Number needed for screening utility’ and ‘likelihood to be diagnosed or misdiagnosed’ for each cutoff
NNSU, Number Needed for Screening Utility; LDM, Likelihood to be Diagnosed or Misdiagnosed. *Screening utility for ruling in and ruling out diagnosis. **Diagnosis prevails over misdiagnosis. aConventional MoCA’s cutoff. bNominal normative cutoffs. cCutoffs from Ilardi et al. (2023). dOptimal cutoffs for MCI due to AD.
DISCUSSION
Clinical neuropsychology is a discipline marked by considerable ‘volatility’. This likely stems from the lack of universally agreed-upon standards in diagnostic clinical practices. Neuropsychological examinations may be strongly affected by the clinician’s subjectivity, especially when their ‘style’ or professional experience collides with a standardized approach. Neuropsychologists may introduce biases in the selection of appropriate psychometric tools because of time constraints and overwork. In addition, differences in demographic profile, sociocultural and financial backgrounds, educational quality, language/communication style, cognitive reserve, emotional and personality factors may significantly moderate patients’ performance on cognitive testing.73,74 However, even considering these limitations, the utility of clinical neuropsychology should not be questioned, and this is certainly true in the framework of AD.
In 2018, the AT(N) paradigm attempted to exclude clinical expertise from AD diagnosis. 9 Nevertheless, it has been demonstrated that a mere biological definition of AD has poor predictive accuracy. Likely, AD does not align with an at-risk model, such as that of prostate cancer, where screening and treating an asymptomatic patient can ensure a better prognosis. 10 Here, neuropsychology comes into play. It can outperform neuroradiology in predictive power for MCI and AD diagnoses. 75 Also, it covers methods and techniques to quantify the therapeutic outcomes in terms of cognitive and functional performance. The contribution of neuropsychology is highly relevant in the context of MCI due to AD (MCI-AD), especially in view of future disease-modifying treatments.
MoCA is a brief pencil-and-paper neuropsychological battery originally devised to identify patients with MCI and early-stage dementia. 15 Over time, its application has extended to exploring cognitive deficits and monitoring rehabilitation/treatment outcomes across different clinical populations, ranging from Parkinson’s disease to chronic obstructive pulmonary disease. 16 MoCA has been translated into over 50 languages, and normative data are available for many countries.44,46 Regrettably, a common (mal)practice in the interpretation of neuropsychological test scores involves primarily relying on normative data, without delving into whether the designated normative ‘pathological’ ranges truly hold diagnostic significance. In fact, the conventional psychometric approaches to extract normative cutoffs render neuropsychological tools highly specific but inadequately sensitive for MCI. This applies to both short cognitive batteries like MoCA and more elaborate, domain-specific tests. Paradoxically, it may be hypothesized that irrespective of the length or comprehensiveness of cognitive assessment, neuropsychological clinical practice leans heavily towards
In this study, we examined the diagnostic properties of MoCA in patients with MCI-AD. In particular, we aimed at identifying optimal cutoffs, balanced for sensitivity and specificity, when four demographic adjustments were applied: the conventional 1-point correction by Nasreddine et al. 15 and correction factors derived from three Italian normative studies.42,63,64 To the best of our knowledge, only one previous study shared a similar goal, 62 wherein a cutoff of 24 was found to be the optimal threshold for differentiating patients with MCI-AD from healthy controls sampled from the Czech population. 62 Consistently, our optimal cutoffs ranged from 22.53 to 23.50. The slight discrepancy between the Czech study’s results and ours might be attributed to differences in sampling procedures or geographic extraction of participants.
In accordance with previous research,33,77 we highlighted that the original Nasreddine cutoff of 26 led to an increased false positive rate due to high sensitivity but poor specificity. Instead, the optimal Nasreddine’s cutoff we proposed, at 23.50, is close to that recommended in a recent meta-analysis on the matter, 77 and demonstrated adequate clinimetric outcomes.
As concerns normative adjustments, among the optimal cutoffs we identified, those related to Santangelo’s 42 and Aiello’s 63 demographic adjustments demonstrated the highest diagnostic performance. Aiello’s cutoff of 23.35 was more sensitive while Santangelo’s of 22.85 was more specific. Furthermore, according to the ‘Number Needed for Screening Utility’ (NNSU) and ‘Likelihood to be Diagnosed or Misdiagnosed’ (LDM) metrics, Santangelo’s adjustment, combined with our optimal cutoff, restores robust performance to MoCA when used as a screener and even more so as a diagnostic-oriented tool. Still, it is crucial to emphasize that, based on our estimates of the likelihood ratio, an individual testing positive on MoCA with Santangelo’s adjustment and a cutoff of 22.85 will have, in the presence of a diagnostic suspicion, 84% post-test probability of being diagnosed with MCI-AD.
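As a rough illustration of the arithmetic behind this post-test estimate, the following Python sketch applies the odds form of Bayes’ theorem. The function names are illustrative, and the pre-test probability is an assumption: it is taken here as the proportion of patients in the sample (48 of 95), which the text does not state explicitly. Under that assumption, the reported LR+ of 5.06 yields a post-test probability of about 84%.

```python
def likelihood_ratio_positive(sensitivity: float, specificity: float) -> float:
    """LR+ = Se / (1 - Sp): how much a positive result raises the odds of disease."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in [0, 1)")
    return sensitivity / (1.0 - specificity)


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability into a post-test probability via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# ASSUMPTION: pre-test probability equals the sample proportion of patients
# (48 MCI-AD patients out of 95 participants); not stated in the text.
PRE_TEST_PROBABILITY = 48 / 95

# Santangelo's adjustment, cutoff 22.85, using the reported LR+ of 5.06.
santangelo_post = post_test_probability(PRE_TEST_PROBABILITY, 5.06)

# Nasreddine's adjustment, cutoff 23.50, with LR+ recomputed from the
# rounded Se = 0.75 and Sp = 0.70 (so it is only approximate).
nasreddine_lr = likelihood_ratio_positive(0.75, 0.70)
nasreddine_post = post_test_probability(PRE_TEST_PROBABILITY, nasreddine_lr)

print(f"Santangelo, cutoff 22.85: post-test probability = {santangelo_post:.2f}")
print(f"Nasreddine, cutoff 23.50: LR+ = {nasreddine_lr:.2f}, "
      f"post-test probability = {nasreddine_post:.2f}")
```

Because the post-test probability depends directly on the assumed pre-test probability, the same positive result would carry a lower post-test probability in a setting where MCI-AD is less prevalent than in this sample.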
In a recent clinimetric study by Ilardi et al., MoCA’s discriminatory power was assessed in patients with MCI and early dementia of mixed etiology compared to individuals with normal cognitive functioning. Sensitivity- and specificity-weighted cutoffs were also computed. 33 Here, we tested the generalizability of these cutoffs in the MCI-AD population. Interestingly, the optimal cutoffs for Nasreddine’s and Santangelo’s adjustments coincided between the two studies. This evidence suggests a certain degree of clinimetric flexibility of MoCA along the continuum of dementia, independently of the underlying pathology. Compared with the earlier study, all cutoffs showed lower diagnostic sensitivity here (including those related to Nasreddine’s and Santangelo’s methods). This likely reflects the inclusion in the earlier study of patients with mild dementia, who scored ∼1.5 points lower on MoCA than patients with MCI. However, it is worth noting that the diagnostic performance of Santangelo’s cutoff remained largely unchanged. 33
In light of the above considerations, we advocate for the preferential adoption of Santangelo’s demographic adjustment alongside an optimal cutoff of 22.85 for a comprehensive evaluation of cognitive functioning in patients with suspected amnestic MCI-AD. Nevertheless, despite its limitation in controlling the covariance of sociodemographic variables, 77 clinicians may consider correcting MoCA scores with the rapid Nasreddine’s procedure and interpreting them with a cutoff of 23.50. This may represent a valuable resource in memory clinics with high attendance in the earlier steps of assessment. 33
As previously shown, 33 we found that cutoffs extracted from normative datasets exhibited very low sensitivity, despite excellent specificity, revealing their limited utility in diagnosis-oriented clinical settings for dementia. This finding is further corroborated by poor LDM values. Given the equally poor NNSU values, normative cutoffs demonstrate limited relevance even for screening purposes. Basing clinical decisions on normative data alone would be akin to flipping a coin. To sum up, we confirm the constraints of depending solely on normative data when correcting and interpreting neuropsychological test scores for clinical purposes. Their usefulness in gauging the degree of possible cognitive impairment along the dementia continuum is a different matter, although much depends on the reliability of the psychometric extraction technique.
The present study has some limitations. Despite the favorable a priori power analysis results, the sample size was relatively small, potentially undermining the external validity of the study. Additionally, the demographic attributes of the clinical and control samples (e.g., income, employment status, occupation, parental status) were not sufficiently specified. Lastly, although this paper centers exclusively on the MoCA, the latter serves merely as a case in point. The theoretical and methodological aspects expounded upon herein can be extended to all neuropsychological tests grounded in a quantitative framework. Clearly, tests relying on qualitative assessment, such as cancellation tasks, where a single error is deemed symptomatic, are exempt from our considerations.
Conclusions
Italy is leading the debate on the need for extra efforts to define disease-specific cutoffs for routinely used tools in clinical neuropsychology practice. These endeavors should prioritize the accurate classification of patients with a variety of clinical conditions. One major challenge is the significant fluctuation of cutoffs across different diseases (e.g., MoCA cutoff of 22.82 for patients with stroke or 19.94 for patients with extra-pyramidal disorders).35,36 This highlights the importance of conducting methodologically sound clinimetric research that carefully considers intergroup homogeneity from as many perspectives as possible, including age, education, and number of comorbidities, and that ideally involves different clinical cohorts in addition to healthy controls.
In this study, we started by identifying optimal MoCA cutoffs for the early detection of patients with MCI-AD. The imperative of enhancing the diagnostic properties of neuropsychological tests should remain a focus in future research, as neuroimaging techniques are unlikely to ever completely replace a comprehensive and
AUTHOR CONTRIBUTIONS
Ciro Rosario Ilardi (Conceptualization; Formal analysis; Methodology; Validation; Visualization; Writing – original draft); Alina Menichelli (Conceptualization; Data curation; Investigation; Resources; Writing – review & editing); Marco Michelutti (Data curation; Investigation; Resources); Tatiana Cattaruzza (Data curation; Investigation; Resources; Writing – review & editing); Giovanni Federico (Conceptualization; Methodology; Writing – review & editing); Marco Salvatore (Funding acquisition; Project administration); Alessandro Iavarone (Resources; Writing – review & editing); Paolo Manganotti (Resources; Supervision).
ACKNOWLEDGMENTS
The authors have no acknowledgments to report.
FUNDING
This work was funded by the Italian Ministry of Health by “Progetti di Ricerca Corrente”.
CONFLICT OF INTEREST
The authors have no conflict of interest to report.
DATA AVAILABILITY
The data supporting the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
