Abstract
Background:
Cognitive and biological markers have shown varying degrees of success in identifying persons who will develop dementia.
Methods:
Neuropsychological assessment, genetic testing (apolipoprotein E, APOE), and structural magnetic resonance imaging (MRI) were performed for 418 older individuals without dementia (60–97 years) from a population-based study (SNAC-K). Participants were followed for six years.
Results:
Cognitive, genetic, and MRI markers were systematically combined to create prediction models for dementia at six years. The most predictive individual markers were perceptual speed and carrying at least one APOE ɛ4 allele (AUC = 0.875 for each). The most predictive model (AUC = 0.924) included variables from all three modalities (category fluency, general knowledge, any ɛ4 allele, hippocampal volume, white matter hyperintensity volume).
Conclusion:
This study shows that combining markers within and between modalities leads to increased predictivity for future dementia. However, minor increases in predictive value should be weighed against the cost of additional tests in larger-scale screening.
INTRODUCTION
Dementia and Alzheimer’s disease (AD) have a long preclinical phase during which a range of cognitive and biological markers may be used to identify people in the early stages of disease development [1]. Many clinical trials and interventions for dementia focus on this period, when they are most likely to prevent or delay dementia onset. In the present study, we evaluate the ability of a set of commonly used markers from three modalities (cognitive, genetic, and structural magnetic resonance imaging [MRI]) to identify individuals with a high probability of developing dementia within the next 6 years.
Cognitive deficits can present years or decades before a clinical diagnosis of dementia [2, 3], with perceptual speed, executive function, and episodic memory being most consistently impaired in the preclinical phase [4, 5]. Of the dementia risk genes, carrying the ɛ4 allele of apolipoprotein E (APOE) is the strongest risk factor for AD [6]. In addition, MRI markers, such as grey matter volumes, can predict future AD up to 10 years before clinical diagnosis [7] and white matter hyperintensities (WMHs) have been associated with increased dementia risk [8, 9].
Whereas all of these markers hold some predictive value, the power of individual markers is limited. Recent research has therefore focused on the possible benefits of combining various prodromal markers. Increased predictivity has been reported from combining neuropsychological tests with MRI markers [10, 11] or by combining MRI markers and APOE [12]. Some [13], but not all [14], studies suggest that combining across modalities yields higher predictive value than combining within modalities. Moreover, several studies report a plateau in accuracy, after which the benefit of adding further markers is limited [10, 15]. From a practical perspective, financial, availability, or time constraints may make it impossible to include a large number of predictors if screening is to be implemented on a larger scale.
In this study, we perform a systematic investigation of the added value of combining markers from cognitive, genetic and MRI domains, both within and across modalities, to identify those at increased risk of developing dementia.
MATERIALS AND METHODS
Participants
Data were collected from participants involved in a longitudinal population-based study, the Swedish National Study on Aging and Care in Kungsholmen (SNAC-K). Baseline assessment was conducted on 3,363 individuals, belonging to specific age cohorts. Older age groups (≥78 years) were re-examined after 3 and 6 years and younger age groups (60–72 years) after 6 years. The assessment at each wave consisted of a nurse interview, a medical examination, and a neuropsychological testing session.
The present study focuses on a subgroup (n = 555) from the baseline study sample that underwent MRI scanning. Due to exclusion (poor quality images/technical issues, n = 52; missing cognitive data, n = 12; infarct/tumor/neural abnormality, n = 31; neurological disorder, n = 7; autoimmune disorder, n = 1) and drop-out, follow-up data were available for 418 participants. Of those, 354 remained dementia free, 28 developed dementia, and 36 died during the 6-year follow-up (see Fig. 1).

Fig. 1. Flowchart of study participants.
Compared to the full sample, the MRI sample was significantly younger, more educated, had a higher Mini-Mental State Examination (MMSE) score, and included more women (p < 0.01).
All stages of SNAC-K have been approved by the Karolinska Institutet ethical committee or the regional ethical review board and written informed consent was collected from all participants. In cases where participants had severe cognitive impairment, a proxy was asked for consent.
Dementia diagnosis
Dementia diagnoses were made according to the Diagnostic and Statistical Manual of Mental Disorders, 4th edition [16]. A preliminary diagnosis was made by the examining physician, followed by a secondary diagnosis based on computerized data from the medical examination. In cases of disagreement, a final decision was made by a senior physician. The cognitive assessment used for diagnosis included the MMSE [17], the Clock test [18] and items regarding memory, executive functioning, problem solving, orientation, and interpretation of proverbs. Neuropsychological, genetic, and neuroimaging information was not used for diagnostic purposes. For those who died before receiving a dementia diagnosis in SNAC-K, death certificates and medical records were reviewed to identify additional dementia cases.
Cognitive assessment
A detailed description of the included tasks has been previously published [19], except the trail making test (TMT) which is described below.
Assessment of episodic memory involved free recall of a 16-item wordlist and a 32-item yes/no recognition test [20]. Two tasks of semantic memory were administered, a general knowledge task and a vocabulary task [21, 22]. Verbal fluency was assessed with letter (‘F’ and ‘A’) and category (‘animals’ and ‘professions’) fluency. Tasks of perceptual speed included digit cancellation [23], pattern comparison [24], and TMT-A [25]. TMT-A involved connecting 13 encircled digits in numeric order as fast and accurately as possible. The TMT-A score was calculated by dividing the number of correct connections by completion time. Executive function was measured using TMT-B [25], where circles with numbers and letters were connected based on numeric and alphabetical order, alternating between the two categories (1-A, 2-B, etc.). The TMT-B score was calculated by dividing the number of correct connections by completion time, after which TMT-A performance was regressed out to minimize the influence of motor speed.
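The TMT scoring described above can be illustrated with a short sketch. It assumes that "regressed out" means taking the residuals of a simple least-squares regression of TMT-B scores on TMT-A scores; the paper does not spell out the exact procedure. The function names and all data values are hypothetical.

```python
from statistics import mean

def tmt_score(correct_connections, completion_time_s):
    """Correct connections per second of completion time."""
    return correct_connections / completion_time_s

def residualize(y, x):
    """Residuals of y after a simple least-squares regression on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical data for four participants: (correct connections, seconds).
tmt_a = [tmt_score(c, t) for c, t in [(12, 20.0), (12, 35.0), (11, 50.0), (12, 26.0)]]
tmt_b = [tmt_score(c, t) for c, t in [(12, 60.0), (12, 95.0), (10, 140.0), (11, 70.0)]]

# TMT-B performance with the linear contribution of TMT-A removed.
tmt_b_adjusted = residualize(tmt_b, tmt_a)
```

By construction, the adjusted TMT-B scores have zero mean and are uncorrelated with TMT-A, so they reflect switching performance beyond what simple speed explains.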
Genotyping
DNA was obtained from peripheral blood samples and genotyping was performed using MALDI-TOF analysis on the Sequenom MassARRAY platform [26]. Because very few individuals carried two ɛ4 alleles, the APOE (rs429358) polymorphism was analyzed as a binary variable, i.e., ‘ɛ4 versus no ɛ4’. In addition, there was no significant effect of having two ɛ4 alleles (n = 15) over having only one ɛ4 allele (n = 89; p = 0.355) with regard to future dementia risk.
MRI assessment
Acquisition
MRI data were acquired using a 1.5T scanner (Philips Intera, Netherlands). The protocol included an axial 3D T1-weighted fast field echo (FFE) sequence with repetition time (TR) 15 ms, echo time (TE) 7 ms, flip angle (FA) 15°, field of view (FOV) 240, 128 slices with slice thickness 1.5 mm and in-plane resolution 0.94×0.94 mm, no gap, matrix 256×256, and an axial turbo FLAIR sequence (TR 6000 ms, TE 100 ms, inversion time 1900 ms, FA 90°, ETL 21, FOV 230, 22 slices with slice thickness 5 mm and in-plane resolution 0.90×0.90 mm, gap 1 mm, matrix 256×256).
Post-processing
The T1-weighted images were first segmented into grey matter, white matter, and cerebrospinal fluid (CSF) using the unified segmentation approach [27] and SPM12b (Statistical Parametric Mapping, Wellcome Trust Centre for Neuroimaging, http://www.fil.ion.ucl.ac.uk/spm/). Further removal of odd voxels from the segments was achieved through the ‘light clean-up’ option. Total intracranial volume (ICV) was obtained by adding grey matter, white matter, and CSF volumes.
Automatic segmentation of hippocampal volumes was performed using the Freesurfer image analysis suite (v. 5.0.1, Martinos Center for Biomedical Imaging, Harvard-MIT, Boston, USA; http://surfer.nmr.mgh.harvard.edu/). This procedure has previously been described by Gerritsen et al. [28].
WMHs were manually delineated on the FLAIR images by a single rater. For details about the procedure, see Köhncke et al. [29].
All volumes were corrected for ICV, using the analysis of covariance approach [30].
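As an illustration of the ICV correction, the sketch below uses one common formulation of the covariance approach: the adjusted volume equals the raw volume minus the regression slope of volume on ICV multiplied by the participant's deviation from the mean ICV. Whether this matches the exact procedure in [30] is an assumption; the function name and data values are hypothetical.

```python
from statistics import mean

def icv_adjust(volumes, icvs):
    """Adjusted volume = raw volume - b * (ICV - mean ICV),
    where b is the least-squares slope of volume on ICV."""
    m_vol, m_icv = mean(volumes), mean(icvs)
    sxx = sum((i - m_icv) ** 2 for i in icvs)
    sxy = sum((i - m_icv) * (v - m_vol) for i, v in zip(icvs, volumes))
    slope = sxy / sxx
    return [v - slope * (i - m_icv) for v, i in zip(volumes, icvs)]

# Hypothetical hippocampal volumes and ICVs (both in mL) for four participants.
hippocampus = [7.1, 6.4, 7.8, 6.9]
icv = [1450.0, 1380.0, 1600.0, 1500.0]

hippocampus_adj = icv_adjust(hippocampus, icv)
```

The adjustment keeps the sample mean volume unchanged while removing the linear dependence of volume on head size.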
Statistical analysis
All statistical analyses were conducted in IBM SPSS 23. Baseline differences between incident dementia and no dementia groups were determined using χ2 tests for dichotomous variables and ANOVAs for continuous variables.
Multinomial logistic regressions were employed to investigate how well various markers, or combination of markers, predicted future dementia, with three outcomes possible: no dementia (reference group), incident dementia, and death. The third outcome was included to take into account mortality as a competing risk. However, as the outcome of interest was dementia, only estimates from the reference and incident dementia groups are reported in this paper. Age, sex, and education were included as covariates in all models and all variables were entered simultaneously. To determine which variable or combination of variables best predicted future dementia, the estimated probabilities from the multinomial regressions were saved and receiver operating characteristics (ROC) were calculated for the dementia outcome using the no dementia group as reference.
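The two steps above (three-outcome probabilities, then ROC analysis restricted to dementia versus no dementia) can be sketched as follows. This is an illustrative reimplementation, not the SPSS procedure used in the study; the function names, linear predictor values, and predicted probabilities are all hypothetical.

```python
import math

def multinomial_probs(eta_dementia, eta_death):
    """Outcome probabilities for a 3-category logit model with
    'no dementia' as the reference category (linear predictor fixed at 0)."""
    e_dem, e_death = math.exp(eta_dementia), math.exp(eta_death)
    denom = 1.0 + e_dem + e_death
    return {"none": 1.0 / denom, "dementia": e_dem / denom, "death": e_death / denom}

def roc_auc(case_scores, control_scores):
    """AUC as the probability that a random case has a higher score
    than a random control (ties count as 0.5)."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Example: probabilities for one participant with hypothetical linear predictors.
probs = multinomial_probs(-2.0, -2.5)

# Hypothetical saved probabilities of incident dementia, by observed outcome.
records = [("none", 0.02), ("none", 0.10), ("dementia", 0.35), ("death", 0.08),
           ("none", 0.05), ("dementia", 0.07), ("none", 0.07), ("dementia", 0.60),
           ("none", 0.20)]

# Participants who died are modeled but excluded from the ROC comparison.
cases = [p for outcome, p in records if outcome == "dementia"]
controls = [p for outcome, p in records if outcome == "none"]
auc = roc_auc(cases, controls)
```

Keeping death as a separate outcome lets the model account for the competing risk, while the AUC is computed only over participants whose dementia status at follow-up is known.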
The predictive value of individual variables was determined first. Significant individual measures, based on the regression analyses, with the highest area under the curve (AUC) value within their domain were entered into subsequent models. This was done to reduce the number of variables and to address issues of collinearity. The threshold for statistical significance was set to p < 0.05.
Predictive models were built within and between modalities (cognitive, genetic, MRI). Models were created by starting with the best predictor (based on AUC value) and adding a second variable from the same or a different modality, systematically testing all available combinations. The 2-variable model with the highest AUC was then used as the base for testing a possible 3-variable model using the same method. When no remaining predictor added further unique variance in the regression analyses, the model was considered final. The statistical significance of differences in AUC between models was assessed using DeLong’s test. The Bayesian information criterion (BIC) was used as a measure of model fit.
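The stepwise model-building procedure described above can be sketched as a greedy forward selection. The sketch assumes that "adding unique variance" means the added predictor is statistically significant in the fitted model, and it takes model fitting as a caller-supplied function rather than implementing the regression itself. The names `forward_select`, `evaluate`, and `bic`, the predictor labels, and the toy AUC values are all hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def forward_select(base, candidates, evaluate):
    """Start from `base`; at each step add the candidate with the highest
    AUC among those that are significant in the enlarged model.
    `evaluate(variables)` must return (auc, added_variable_is_significant)."""
    model = [base]
    remaining = [c for c in candidates if c != base]
    history = []
    while remaining:
        trials = []
        for c in remaining:
            auc, significant = evaluate(tuple(model + [c]))
            if significant:
                trials.append((auc, c))
        if not trials:
            break  # no remaining predictor adds unique variance
        best_auc, best = max(trials)
        model.append(best)
        remaining.remove(best)
        history.append((tuple(model), best_auc))
    return model, history

# Toy lookup table standing in for fitted models (values are illustrative only).
toy_results = {
    frozenset({"A", "B"}): (0.80, True),
    frozenset({"A", "C"}): (0.84, True),
    frozenset({"A", "D"}): (0.78, False),
    frozenset({"A", "B", "C"}): (0.86, True),
    frozenset({"A", "C", "D"}): (0.85, False),
    frozenset({"A", "B", "C", "D"}): (0.87, False),
}

def evaluate(variables):
    return toy_results[frozenset(variables)]

final_model, steps = forward_select("A", ["A", "B", "C", "D"], evaluate)
```

In the toy example, predictor D is never added even though it raises the AUC, because it is not significant in any model. Comparing `bic` values across the steps would flag models whose additional parameters are not justified by improved fit.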
All non-dichotomous variables were standardized and all scores where a higher value was related to a decreased risk were reversed so that odds ratios (ORs) represent increased risk per SD-unit change in the predictor.
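A minimal sketch of this scaling step is given below, assuming standardization to z-scores using the sample standard deviation (the paper does not state which SD estimator was used). The function names and the example scores are hypothetical.

```python
import math
from statistics import mean, stdev

def standardize(values, higher_is_protective=False):
    """Convert to z-scores; reverse the sign when higher raw values
    indicate lower risk, so that +1 SD always means higher risk."""
    m, s = mean(values), stdev(values)
    z = [(v - m) / s for v in values]
    return [-zi for zi in z] if higher_is_protective else z

def odds_ratio(beta):
    """Odds ratio per 1-SD increase for a logistic regression coefficient."""
    return math.exp(beta)

# Hypothetical word recall scores; higher recall relates to lower risk.
word_recall = [5, 7, 9, 6, 8]
z_plain = standardize(word_recall)
z_risk = standardize(word_recall, higher_is_protective=True)
```

After reversal, an OR above 1 consistently reads as increased dementia risk per SD of worse performance or less favorable volume.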
RESULTS
Background
Descriptive characteristics across follow-up status are shown in Table 1. Persons who developed dementia were significantly older, had fewer years of education, and lower MMSE scores at baseline compared to the no dementia group (p < 0.001). Both groups included more women but there was no difference in sex distributions between groups (p = 0.20).
Table 1. Descriptive characteristics across dementia status at follow-up
a Follow-up refers to the 6-year follow-up for the no dementia group and to the time of diagnosis (3-year, n = 6, or 6-year, n = 22, follow-up) for the incident dementia group.
Raw scores and baseline differences in predictor variables between those who developed dementia and those who did not are presented in Supplementary Table 1. Correlations among all variables included in the prediction models are available in Supplementary Table 2.
Individual predictors
Results from multinomial logistic regressions and ROC analyses showed that the pattern comparison task (perceptual speed) and the presence of at least one ɛ4 allele were the strongest individual predictors of future dementia up to six years later (Table 2). Word recall (episodic memory), TMT-B (executive function), category fluency (verbal fluency), and general knowledge (semantic memory) were the best predictors in their respective cognitive domains and were therefore entered in the combined models. Hippocampal and WMH volume were the only significant MRI variables and both were kept for model development. However, results from DeLong’s test showed that none of the individual models were significantly more predictive of future dementia than the covariate model (model 0), which included age, sex, and education.
Table 2. Multinomial logistic regressions for individual variables
a Incident dementia versus no dementia. AUC, area under the curve; CI, confidence intervals; OR, odds ratio; ROC, receiver operating characteristic curve; WMH, white matter hyperintensity.
As pattern comparison, carrying any ɛ4 allele, and hippocampal volume were the strongest individual predictors of their respective modalities, they were chosen as bases for the combined models.
Intra-modality models
Adding further variables from within the base variable’s modality numerically increased predictivity (Table 3). Of the intra-modality models, the highest predictive value was obtained by adding a test of episodic memory (word recall) to the perceptual speed predictor (pattern comparison) (AUC = 0.901). Among the MRI variables, hippocampal volume and WMHs both contributed unique variance and remained significant within the same model (AUC = 0.878). The final model for the cognitive modality was significantly more predictive (DeLong’s, p = 0.01) than the model including only the covariates (model 0); however, there was no significant increase in predictivity from model 0 to the final MRI model (p = 0.080).
Table 3. Multinomial logistic regressions for intra-modality models
a Incident dementia versus no dementia. AUC, area under the curve; CI, confidence intervals; OR, odds ratio; BIC, Bayesian information criterion; ROC, receiver operating characteristic curve; WMH, white matter hyperintensity. Model 0 includes sex, age, and education.
Inter-modality models
The models with the numerically highest predictive values were obtained by combining variables across modalities (Table 4).
Table 4. Multinomial logistic regressions for inter-modality models
a Incident dementia versus no dementia. AUC, area under the curve; CI, confidence intervals; OR, odds ratio; BIC, Bayesian information criterion; ROC, receiver operating characteristic curve; WMH, white matter hyperintensity. Model 0 includes sex, age, and education.
The model starting with the strongest individual cognitive predictor (pattern comparison, AUC = 0.875) was most improved by adding the word recall test (AUC = 0.901). The final model also included hippocampal volume (AUC = 0.913). Although the addition of pattern comparison alone did not increase predictivity compared to model 0 (p = 0.222), both the 2-variable model (p = 0.012) and the 3-variable model (p = 0.007) were significantly more predictive than model 0.
Even though the presence of at least one ɛ4 allele was a strong predictor in itself (AUC = 0.875), adding a cognitive test, word recall, further improved dementia prediction (AUC = 0.908). The addition of the general knowledge task resulted in a final model where all three variables gave independent contributions to dementia prediction (AUC = 0.922). Compared to the covariate model, the addition of a single variable did not significantly increase predictive value (p = 0.171), whereas predictivity for models with two or more variables increased from model 0 (p = 0.001).
A model starting with hippocampal volume, the strongest MRI predictor, was most improved by adding the category fluency test (AUC = 0.895). Adding the presence of at least one ɛ4 allele resulted in a 3-variable model including tests from each modality (AUC = 0.911). Prediction was further increased by the inclusion of WMHs (AUC = 0.921). The addition of the general knowledge task resulted in a final model of 5 variables spanning all modalities, which had the highest predictivity of all models tested (AUC = 0.924). As with the other modality bases, there was no significant increase in predictivity from the inclusion of only one variable (p = 0.476), whereas all models with three or more variables showed a significant increase in predictive value (p < 0.05). Although the highest predictive value was obtained using all 5 variables, BIC increased after three variables, suggesting poorer model fit.
All model bases showed a significant increase in AUC from model 0 to the final models (cognitive base, p = 0.007; genetic base, p = 0.001; MRI base, p = 0.005). However, there was no significant difference in predictivity between any of the final models. The final inter-modality models were also not significantly different from the final intra-modality models.
For ROC curves for all model bases, see Supplementary Figure 1.
DISCUSSION
The present study demonstrates that markers commonly used in clinical practice from cognitive, genetic, and MRI modalities can be used for predicting future dementia. Including additional markers in the models increased predictivity, and at least two predictors were required to obtain a significant increase in predictivity over age, sex, and education. Adding markers from a different modality led to a higher numerical increase in predictivity, and the model with the highest predictive value included markers from all three modalities. However, it should be noted that none of the final models were significantly different from each other. When choosing which and how many predictors to include, economic and practical aspects should also be considered.
Individual predictors
The numerically strongest predictor of future dementia within the cognitive modality was perceptual speed (pattern comparison). This was also, numerically, the strongest predictor overall (alongside APOE). Previous research has often indicated episodic memory [10, 32] or executive function [33, 34] as the most predictive cognitive domains. Nevertheless, our current findings are in line with results suggesting that perceptual speed does equally well in differentiating persons with preclinical AD from controls [4]. As AD and dementia are preceded by multiple changes in neural structure and function, both in the hippocampus and beyond [35], it may be that global cognitive deficits reflect the wide-ranging brain changes in the preclinical phase.
The current results are in line with previous research showing deficits in multiple cognitive domains in preclinical AD and MCI [4, 5]. Possible underlying mechanisms for these cognitive deficits are hippocampal atrophy, which may primarily affect episodic memory [36], and alterations in the white matter, which have been linked to speed [37]. The findings that hippocampal volume and WMHs were strong predictors of dementia are consistent with these observations.
Alongside perceptual speed, the presence of any APOE ɛ4 allele was the strongest predictor of future dementia. APOE is known to be the strongest genetic risk factor for AD [6]; thus, our findings support previous research regarding the predictivity of the ɛ4 allele [14, 38]. However, some studies have shown no significant predictive value of APOE when competing with other variables, such as hippocampal volume, cognitive tests, and CSF markers [33].
Among the MRI variables included, hippocampal volume and WMHs were significantly predictive of dementia. Hippocampal integrity is a well-established neuroimaging marker for dementia, particularly AD [39]. While more wide-spread or whole-brain atrophy has been shown to predict future dementia [7], this is not a consistent finding [40] and was not the case in the current study. This may be due to the relatively long distance from diagnosis, as atrophy in preclinical AD begins in the entorhinal and hippocampal regions before spreading to other parts of the brain [41]. Vascular burden may also lead to brain atrophy; WMHs have been found to increase the rate of hippocampal atrophy in persons with MCI [42].
Studies that have used WMHs as a sole predictor of future dementia have produced mixed findings [8, 9]. While traditionally associated with vascular dementia [43], WMHs hold some predictive value for AD [8] as the effects of vascular lesions can exacerbate AD pathology [42, 43]. In this context it is important to note that in the general older population, persons with dementia will commonly present with a mix of vascular and AD pathology [44, 45]. Furthermore, many markers have been shown to predict vascular dementia and AD in a similar way [43, 46].
Although the numerically best individual predictors were the presence of any ɛ4 allele and a test of perceptual speed, there was homogeneity in predictivity across the variables tested. Previous studies have sometimes claimed that cognitive markers are the best predictors of future dementia [14, 48], although there are conflicting results [12, 34], potentially due to the inclusion of highly predictive and specific CSF markers. However, the current results show only minor variation in predictive ability among the included cognitive, genetic, and MRI variables, suggesting that no individual modality is clearly superior at predicting future dementia. The fact that no single test provided a significant increase in predictivity above that of the covariates highlights the need for predictor models that include multiple variables.
Combined models
Adding variables from the same modality (cognition or MRI) led to increased dementia prediction, consistent with previous findings [10, 38], although the increase in predictivity relative to the covariate model was only significant for the cognitive modality.
An important observation was that the highest predictive values were obtained by combining across modalities. All of the final models included variables from at least two modalities, with the numerically best model combining variables from all three modalities. Collinearity among variables within domains might partly explain why predictors from a different modality were more likely to contribute unique variance (Supplementary Table 2). Previous work supports the added benefit of combining modalities such as MRI and cognition [14, 34] or cognition and genetics [38]. Dukart et al. [12] found that combining cognitive, genetic, and MRI markers conveyed greater predictive value than individual predictors or any combination of two modalities. The pattern of results from our study is consistent with these previous findings. However, when formally testing for differences in predictivity between models, the addition of structural MRI variables to a model of cognition and APOE increased predictive value only numerically.
Moreover, a model of only cognitive markers did not differ significantly from models including multiple modalities. It is also worth noting that, although predictive value increased in the final models of the MRI base, the BIC value also increased after the 3-variable model, indicating a worsening of model fit. This suggests that the models with more than three predictors may be over-fitted to this specific dataset, leading to an artificial increase in predictivity. Thus, adding more predictors may not be optimal, especially considering the relatively small increase in AUC.
Strengths and limitations
A major strength of the current study is the population-based sample, making the results generalizable outside a clinical setting. The fact that all individuals were assessed, not only those with subjective cognitive impairment, and that the dementia diagnosis was based on a clinical examination, without making use of the included predictors, minimizes the risk of circularity often present in clinical environments. Potential limitations are that precise information on time of dementia onset was lacking and that the MRI sample was slightly positively biased relative to the full SNAC-K sample. However, the bias in the MRI sample, in terms of younger age and higher education compared to the full sample, may have led to lower predictivity than would be expected in the general population, as smaller variance in the predictors leads to weaker associations with the outcome (i.e., dementia).
Implications
Our results show that a range of widely available markers may be used to identify persons in the general population who have an increased dementia risk. The predictors included in these models may all be suited as screening tools for selecting individuals who should take part in preventive interventions. Indeed, several ongoing trials are multimodal, targeting multiple dementia types [49]. Although the results show that dementia prediction could be improved by including additional predictors, there was no strong evidence that any one combination was clearly better than another. The marginal increases in predictive value seen beyond the 2- or 3-variable models should also be weighed against potential inconveniences of adding further assessments. However, cognitive tests may be particularly beneficial for pre-dementia screening as they were found to be as predictive as a combination of markers from multiple modalities, while being easy to implement at a low cost.
ACKNOWLEDGMENTS
We thank the participants as well as all staff involved in the data collection and management of the SNAC-K study. SNAC-K is financially supported by the Swedish Ministry of Health and Social Affairs, the participating County Councils and Municipalities, and the Swedish Research Council. This work was further funded by grants from the Swedish Council for Working Life and Social Research (EL, LF, LB), the Swedish Research Council (EL), Swedish Alzheimer Foundation (EL), Osterman Foundation (EL), and Gamla Tjänarinnor Foundation (EL). This study was accomplished while NP was affiliated with the Swedish National Graduate School for Competitive Science on Aging and Health (SWEAH), which is funded by the Swedish Research Council.
