Abstract
Background
Accurate and accessible risk assessment tools are essential for effective dementia management. The Cardiovascular Risk Factors, Aging, and Incidence of Dementia (CAIDE) model is a widely used tool for assessing mid-life dementia risk.
Objective
To determine whether adding resting heart rate (RHR), a simple, readily measurable, non-invasive vital sign, improves dementia risk prediction within the CAIDE model using machine learning methods.
Methods
Data from 27,768 participants in the Comprehensive cohort of the Canadian Longitudinal Study on Aging were analyzed to predict 3-year dementia risk. Predictive models were developed using random forest and support vector machine algorithms. Performance was assessed using key metrics, including area under the receiver operating characteristic curve (AUC), sensitivity, specificity, Matthews correlation coefficient (MCC), and Brier score. Internal cross-validation was used to ensure model robustness.
Results
Among the 18,013 participants with complete data for analysis, 516 (2.86%) exhibited dementia. Incorporating RHR into the CAIDE model led to a significant improvement in predictive accuracy. Random forest models with RHR achieved an AUC of 0.67 and an MCC of 0.32 in training data, compared to 0.65 and 0.29 in the test data. Similarly, support vector machines demonstrated a 2–3% increase in both AUC and MCC with the inclusion of RHR.
Conclusions
Incorporating RHR modestly but significantly improves the predictive performance of the CAIDE model using machine learning methods. This approach may support earlier identification of at-risk individuals using non-invasive, routinely available data, representing a step toward scalable and practical dementia risk screening in clinical and community settings.
Introduction
Dementia is a rapidly growing public health concern, particularly as global populations age. By 2050, it is projected that over 71% of older adults will be affected by some form of dementia, including Alzheimer's disease (AD), which currently lacks effective treatments to halt or reverse its progression.1,2 Modifiable risk factors, particularly those related to cardiovascular health, have been shown to significantly influence the development and progression of dementia.3 As a result, early risk identification remains essential for guiding preventive interventions, even in the absence of curative treatments, because addressing these risk factors can reduce the overall disease burden.
The Cardiovascular Risk Factors, Aging, and Incidence of Dementia (CAIDE) model is one of the most widely used prognostic tools and was specifically designed to estimate 20-year dementia risk among middle-aged individuals.4 It integrates a comprehensive array of predictors, including age, education level, hypertension, physical activity, sex, cholesterol, obesity, and the APOE ε4 allele biomarker.4 The tool has proven instrumental in clinical decision-making and patient counseling, supporting management and prognostic expectations for dementia. Moreover, it has been validated both internally and externally,5 and it consistently demonstrates strong predictive capabilities, affirming its utility across various clinical settings and policy-making contexts.5
In efforts to enhance this foundational tool, several studies have incorporated additional biomarkers and modifiable risk factors into the CAIDE model. These efforts have yielded variable performance,6–9 and a notable challenge is that the added biomarkers require laboratory measurements that may not be readily available during initial clinical evaluations.6,7,9,10 Consequently, there is an urgent need for more effective dementia risk tools that can capture the impact of physiological and lifestyle factors on brain structure and function.
Resting heart rate (RHR) is an accessible and non-invasive marker of cardiovascular health, with studies increasingly highlighting its potential role in dementia risk.11–14 Abnormal heart rates may signal underlying cardiovascular dysfunction, potentially contributing to suboptimal cerebral perfusion and neurodegenerative processes.15,16 Despite its promise, RHR has not been widely included in existing dementia risk models, making it a potentially valuable addition to enhance predictive accuracy.
Recent advancements in machine learning (ML) methods have revolutionized risk prediction in various medical fields. 17 Unlike traditional statistical models, ML techniques such as random forest and support vector machines can capture complex, nonlinear relationships among variables, offering superior accuracy in handling large and diverse datasets. This study seeks to leverage these strengths by integrating RHR into the established CAIDE model and applying machine learning methods to assess dementia risk in a large cohort of older adults from the Canadian Longitudinal Study on Aging (CLSA).18–22
By combining the simplicity of RHR measurement with the advanced capabilities of machine learning, this study aims to improve the accuracy and clinical utility of dementia risk prediction tools, potentially contributing to more effective early identification and prevention strategies.
Methods
This study followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines, which outline best practices for prediction modeling and ML research in the clinical domain (see Supplemental Table 1).23
Data source
The data were obtained from the CLSA. 18
Ethical consideration
Ethical approval for this study was granted by the Brock University Health Sciences Research Ethics Board (REB #: 23-195).
Canadian Longitudinal Study on Aging (CLSA)
The CLSA is a large, national, randomly sampled study focused on understanding the determinants of aging and the factors contributing to health changes and quality of life over time.18 The study initially recruited 50,388 men and women aged 45–85 at baseline (2010–2015). The CLSA consists of two cohorts, the Tracking and Comprehensive cohorts, which together form the full study population. Participants are followed every three years for at least 20 years or until death or loss to follow-up.18 The Comprehensive cohort, comprising 30,097 participants, was randomly selected from those living within 25–50 km of one of the eleven Data Collection Sites (DCS) located across Canada: St. John's, Halifax, Vancouver, Montreal, Calgary, Sherbrooke, Winnipeg, Ottawa, Hamilton and Victoria. Demographic, clinical, psychological, social, biological, and economic data were collected through 60-minute home interviews. Additionally, clinical tests and blood and urine samples were obtained at the DCS following standard protocols agreed upon by the participants. This study used data from the baseline and the first follow-up of the Comprehensive cohort, as it provided relevant health-related information. Details of these data have been extensively published elsewhere.24,25 The dataset is available to researchers who meet access criteria from the Canadian Longitudinal Study on Aging (www.clsa-elcv.ca). The flow diagram of eligible participants is presented in Supplemental Figure 1.
Outcome definition
The primary outcome in this study was dementia, defined by the COG_OVERALL6_IMP_BELOW5PT8_COM parameter from the CLSA dataset, indicating global cognitive impairment if scores fell below the 5.8% threshold. 26 Cognitive function was assessed using six dementia diagnosis protocols recommended by CLSA investigators, with a comprehensive neuropsychological battery conducted at baseline and follow-up. Memory was evaluated using the Rey Auditory Verbal Learning Test (REYI for immediate recall, REYII for delayed recall), while executive function was measured with the Animal Fluency (AF), Mental Alternation Test (MAT), FAS, and Stroop tests.27–29 Test scores were standardized into Z-scores, adjusting for age, sex, and education, and composite cognition sub scores were calculated at both time points. Dementia status was defined as scoring below 5.8% on the overall cognitive test battery. 26
Selection of predictors
We selected our predictors based on the variables included in the original CAIDE model. Specifically, the socio-demographic factors were age (years), sex (male versus female), and education (less than secondary school graduation, secondary school graduation, some post-secondary education, and post-secondary degree/diploma). Established cut-offs were used for age (45–54, 55–64, and ≥65 years), in line with the CLSA cohort. For education, “some post-secondary education” and “post-secondary degree/diploma” were categorized as >10 years of education, “less than secondary school graduation” as 0–6 years, and “secondary school graduation” as 7–9 years.4
Chronic medical conditions
Comorbid factors were obtained from a physical examination previously conducted by a physician following the structure of the interview. Blood pressure was measured six times by a physician, and the average measurement was used for this study (≥140 mmHg for hypertension versus <140 mmHg for no hypertension).4 Total cholesterol (mmol/L) was also measured; values <6.5 mmol/L were considered low and values >6.5 mmol/L were considered high.4
Resting heart rate
Resting heart rate, in beats per minute (bpm), was measured six times, and the average pulse was used for this study. Resting heart rate was categorized as <60 bpm (low), 60–90 bpm (normal), and >90 bpm (high), following the classification used in previously published studies.30,31
Physical activity
Physical activity was assessed using a modified Physical Activity Scale for the Elderly (PASE),32 a validated tool specifically designed for older adults to assess their physical activity, lifestyle behavior, and quality of life. Participants were asked about their engagement in leisure-time, household, and/or work-related activity during the past seven days. Responses were categorized as never, seldom (1–2 days), sometimes (3–4 days), and often (5–7 days).33,34 The tool's internal consistency, measured by Cronbach's alpha, was 0.73.33,34 For this analysis, “sometimes” (3–4 days) and “often” (5–7 days) were categorized as active, while “never” and “seldom” (1–2 days) were categorized as inactive.
Body mass index
Participants' height and weight were collected to derive the body mass index (BMI). For context, BMI categories are defined as underweight (<18.5 kg/m²), normal (18.5–24.9 kg/m²), overweight (25.0–29.9 kg/m²), and obesity (≥30 kg/m²).35 In this analysis, we applied the CAIDE cutoff, where BMI ≥30 kg/m² was considered a risk factor for dementia.4
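The cut-offs described in the predictor subsections above can be summarized as a small coding function. The following is a minimal Python sketch for illustration only (the study's analyses were run in R). The function names, argument names, and dictionary keys are hypothetical, the blood pressure input is assumed to be the averaged reading, and treating a cholesterol value of exactly 6.5 mmol/L as "low" is an assumption, since the text does not say how that boundary was handled.

```python
def categorize_rhr(bpm):
    """Resting heart rate bands from the text: <60 low, 60-90 normal, >90 high."""
    if bpm < 60:
        return "low"
    if bpm <= 90:
        return "normal"
    return "high"


def code_predictors(bp_mmhg, chol_mmol_l, bmi, rhr_bpm, activity):
    """Code one participant's measurements with the cut-offs stated in the text.

    `activity` is one of "never", "seldom", "sometimes", "often"
    (days of activity in the past week, as described for the PASE item).
    """
    allowed = {"never", "seldom", "sometimes", "often"}
    if activity not in allowed:
        raise ValueError(f"activity must be one of {sorted(allowed)}")
    return {
        "hypertension": bp_mmhg >= 140,            # >=140 mmHg
        # Assumption: a value of exactly 6.5 mmol/L is coded as not high.
        "high_cholesterol": chol_mmol_l > 6.5,
        "obese": bmi >= 30,                        # CAIDE BMI cut-off
        "rhr_category": categorize_rhr(rhr_bpm),
        "inactive": activity in {"never", "seldom"},
    }


# Hypothetical participant, not taken from the CLSA data.
example = code_predictors(
    bp_mmhg=142, chol_mmol_l=5.9, bmi=31.2, rhr_bpm=94, activity="seldom"
)
```

Keeping the cut-offs in one place makes it easier to check that the same coding is applied to training and test data.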
Statistical analysis
Descriptive statistics, including percentages, median, interquartile range (IQR), and frequencies, were used to summarize participants’ demographic and clinical characteristics according to their 3-year dementia status for both the training and test datasets. Univariate associations between dementia status and the categorical and continuous variables were assessed using the chi-square test and Wilcoxon rank-sum test, respectively. Because the total missing data for each of the categorical and continuous variables were minimal (<5%), multiple imputation was not adopted. In addition, given the low 3-year incidence of dementia, we addressed class imbalance during model training using the Synthetic Minority Over-sampling Technique (SMOTE), implemented in the DMwR package in R. SMOTE synthesizes minority examples by interpolating between a sample and its k nearest minority neighbors. We used k = 5 and oversampled to approximately 1:1 class balance. To avoid information leakage, SMOTE was applied only within the training portion of each resampling split (never to validation or test data). All other preprocessing (encoding of categorical variables and scaling of numeric predictors) was fit on the training data and applied to the corresponding validation/test data.
Two strategies were used to quantify the impact of adding RHR as a clinical modifying characteristic to the CAIDE tool. First, multivariable logistic regression was used to measure the association with the risk of dementia, with the CAIDE variables as Model 1 and the CAIDE variables plus RHR as Model 2. We conducted a likelihood ratio test, comparing the CAIDE model with the model that includes RHR, to determine whether the categories of RHR were significantly associated with the risk of dementia. The purpose of this analysis was to identify which categories of the variable, rather than the variable as a whole, were associated with dementia. Adjusted odds ratios (ORs) with 95% confidence intervals (CIs) were reported, and associations were considered significant if the 95% CI excluded the null value of 1. We interpreted the ORs as relative risks (RRs), as appropriate given the low observed incidence.36
Second, dementia risk prediction models were developed and trained using random forest37 and support vector machine38 algorithms in the CLSA study. Each model used participants’ demographic and clinical characteristics, including age, sex, physical activity, body mass index, cholesterol, level of education, RHR, and hypertension, as input variables to predict dementia status. Feature scaling was performed to ensure all predictors were on the same scale, and the predictive accuracy of each model was assessed using internal cross-validation and a held-out test (validation) set. Specifically, the CLSA dataset was partitioned into two datasets (80% and 20%). The 80% portion (i.e., the training data) was used for model development, and the remaining 20% was used for testing (validation). The prediction models were developed via repeated 10-fold cross-validation. This process involves randomly splitting the data in a 9:1 ratio, training the model on nine-tenths of the data, and estimating the accuracy on the remaining one-tenth. This process was repeated 500 times. The predictive accuracy of each model was measured using sensitivity, specificity, negative predictive value, positive predictive value, the area under the curve (AUC), and Matthews correlation coefficient (MCC). MCC is a robust metric that measures the agreement between observed and predicted values.39 MCC ranges between −1 and +1, where +1 signifies a perfect prediction, zero corresponds to a random prediction, and −1 indicates a completely opposite prediction.
Additionally, the Brier score40 and calibration plots were used to assess how closely the predicted probabilities matched the actual outcomes (i.e., model agreement) for all models developed. Higher Brier scores indicate poorer discrimination/calibration of the model. A perfectly calibrated plot should have all points on the 45° diagonal line; large deviations of the curve from the diagonal indicate a poorly calibrated model. Statistical significance was set at α = 0.05. All analyses were conducted using R software version 4.4.1.41
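To make the interpolation step concrete, the following is a simplified Python sketch of SMOTE-style oversampling. It is not the DMwR implementation used in the study: it handles numeric features only, picks base samples at random rather than cycling through every minority case, and the function name `smote`, its arguments, and the toy data are all hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy training fold (hypothetical values). Only the training portion of a
# split is oversampled; validation and test data are left untouched.
train_majority = [[float(i), float(i % 3)] for i in range(12)]
train_minority = [[0.0, 0.0], [1.0, 0.5], [0.5, 1.0], [1.5, 1.5]]
new_points = smote(train_minority, len(train_majority) - len(train_minority))
balanced_minority = train_minority + new_points  # approximately 1:1 balance
```

Because each synthetic point lies on a line segment between two real minority samples, it stays inside the region already occupied by the minority class, which is also why synthetic points can overlap the majority class where the two classes are not well separated.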
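The nested-model comparison and the reporting of odds ratios can be illustrated with a short worked calculation. This is a Python sketch under stated assumptions, not the study's R code: the log-likelihood values are hypothetical, the helper names are invented for illustration, and the test assumes RHR enters the model as two indicator variables (three categories with one as reference), giving 2 degrees of freedom. The chi-square tail probability uses a closed form that holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m the survival function is exp(-x/2) * sum_{i<m} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports positive even df only")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Compare nested models: statistic = 2 * (logL_full - logL_reduced)."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf_even_df(stat, df)


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and 95% CI from a logistic regression coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical log-likelihoods for Model 1 (CAIDE) and Model 2 (CAIDE + RHR).
stat, p_value = likelihood_ratio_test(-2300.0, -2295.5, df=2)

# Hypothetical coefficient and standard error for one RHR category.
or_est, ci_low, ci_high = odds_ratio_ci(beta=0.40, se=0.15)
significant = ci_low > 1.0 or ci_high < 1.0  # CI excludes the null value of 1
```

With these made-up inputs the statistic is 9.0 on 2 degrees of freedom, so the p-value equals exp(−4.5), which is below 0.05.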
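The performance metrics named above all derive from the confusion matrix or from the ranking of predicted scores. The following Python sketch shows the standard formulas on a toy example; the function names and data are hypothetical, and returning 0 when a denominator is zero is a convention chosen here for illustration rather than something stated in the text.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = dementia)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(num, den):
    """Return 0.0 instead of failing when a denominator is zero (a convention)."""
    return num / den if den else 0.0


def classification_metrics(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "ppv": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
        "mcc": safe_div(tp * tn - fp * fn, denom),
    }


def auc(y_true, scores):
    """Probability that a random case outscores a random non-case (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels, predicted classes, and predicted scores (hypothetical).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.35]
metrics = classification_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
```

Unlike sensitivity or specificity alone, MCC uses all four cells of the confusion matrix, which is why it is informative when one class is rare.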
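As an illustration of these two checks, the Brier score is the mean squared difference between predicted probabilities and observed outcomes, and a calibration plot compares the mean predicted probability with the observed event rate within bins of predictions. The Python sketch below uses equal-width bins; the function names, the number of bins, and the toy data are assumptions made for illustration and are not taken from the study.

```python
def brier_score(y_true, probs):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not y_true or len(y_true) != len(probs):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def calibration_bins(y_true, probs, n_bins=5):
    """Return (mean predicted probability, observed rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    points = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            rate = sum(y for _, y in members) / len(members)
            points.append((mean_p, rate, len(members)))
    return points


# Toy outcomes and predicted probabilities (hypothetical).
outcomes = [0, 1, 1, 0]
predicted = [0.1, 0.9, 0.8, 0.3]
score = brier_score(outcomes, predicted)
curve = calibration_bins([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9], n_bins=2)
```

Plotting the first two elements of each tuple against each other gives the calibration curve; points on the diagonal mean that predicted and observed risks agree within that bin.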
Results
Of the 18,013 participants included in the analysis, 516 (2.86%) were identified as having dementia at 3 years (i.e., the first follow-up). Table 1 presents the categorical analysis of the demographic and clinical characteristics of participants with and without dementia. Individuals who developed dementia over the 3-year follow-up had a higher proportion of participants aged ≥65 years (44%) compared to those who did not develop dementia (38%). Additionally, a greater proportion of individuals in the dementia group had a body mass index (BMI) ≥30 kg/m² (30%) and a resting heart rate (RHR) ≥90 bpm (9.1%) than those in the no dementia group (28% and 5.7%, respectively). However, when the variables were analyzed as continuous variables (Supplemental Table 2), the most notable differences between the two groups were observed in physical inactivity (477 individuals in the dementia group, 92.5%) and in systolic blood pressure (an average of 121.83 mmHg in the dementia group).
Table 1. Descriptive characteristics of individuals with and without a dementia diagnosis in the CLSA study.
CLSA: Canadian Longitudinal Study on Aging; n: sample size; Grad: graduation; bpm: beats per minute.
Table 2 presents the results of the multivariable analysis of clinical variables and 3-year dementia risk with and without the inclusion of RHR. There was no notable difference between the results obtained with and without RHR. However, age, RHR, and physical inactivity were identified as significant predictors of dementia. Adding RHR to the model (i.e., Model 2) resulted in a modest increase in the estimated risk of dementia. Specifically, participants with a heart rate greater than 90 bpm had a risk ratio of 1.62 (95% CI: 1.19–2.21), indicating that individuals with higher RHR were 1.62 times more likely to develop dementia compared to those with lower RHR. Hypertension was also found to be significantly associated with an increased dementia risk (see Supplemental Figures 2 and 3). Although other variables were not statistically significant in either model, they were retained to maintain consistency with the original CAIDE model predictors. Furthermore, the likelihood ratio test confirmed that RHR was significantly associated with dementia risk (see Supplemental Table 3).
Table 2. Multivariable analysis of clinical variables and 3-year dementia risk (95% CI) with and without resting heart rate.
Grad: graduation; CI: confidence interval; N/A: not applicable; bpm: beats per minute.
Figures 1 and 2 show the feature importance of the predictor variables for 3-year dementia diagnosis in the random forest and support vector machine models fitted to the CLSA data. Education, RHR, and physical activity were ranked as the three most important predictors of 3-year dementia diagnosis in both models.

Figure 1. Feature importance of random forest model predictors. edu: education; rhr: resting heart rate; phy: physical activity; bmi: body mass index; chol: cholesterol; hyper: hypertension.

Figure 2. Feature importance of support vector machine predictors. edu: education; rhr: resting heart rate; phy: physical activity; bmi: body mass index; chol: cholesterol; hyper: hypertension.
Table 3 presents the predictive accuracy of the machine learning algorithms for predicting the 3-year dementia risk in the CLSA training data. The inclusion of RHR in Model 2 resulted in a significant improvement in predictive accuracy compared to Model 1. For instance, the random forest model in Model 2, which included RHR, achieved a higher AUC of 0.67 (95% CI: 0.65–0.68), compared to an AUC of 0.65 (95% CI: 0.63–0.66) in Model 1. Similar patterns were observed with the support vector machine, with Model 2 showing an average 2% increase in AUC compared to Model 1. All models exhibited higher sensitivity, meaning they correctly identified individuals with dementia, but lower specificity, indicating less accuracy in identifying individuals without dementia.
Table 3. Predictive accuracy (95% CI) of machine learning models in the CLSA data (training).
95% CI: 95% confidence interval; AUC: area under the receiver operating characteristic curve; RF: random forest; SVM: support vector machine; MCC: Matthews correlation coefficient; NPV: negative predictive value; PPV: positive predictive value; CLSA: Canadian Longitudinal Study on Aging.
Table 4 shows the predictive accuracy of the machine learning models in the CLSA test data. The differences in predictive accuracy between the two machine learning methods were negligible. For example, the AUCs for random forest and support vector machine in Model 2 were 0.65 (95% CI: 0.63–0.66) and 0.64 (95% CI: 0.62–0.65), respectively, representing an average 3% improvement over Model 1 in both models. However, random forest showed higher MCC and lower Brier scores in the result, indicating better discrimination and calibration compared to support vector machine (see Table 4). Similar to the training data, all models had higher sensitivity but lower specificity. Supplemental Figure 4 illustrates the discriminative performance of both models, where Model 2 demonstrates improved performance over Model 1 in the two methods investigated. Figures 3 and 4 display the calibration plots for the machine learning models, revealing that Model 2 was better calibrated than Model 1.

Figure 3. Calibration plots for CAIDE and CAIDE + resting heart rate for random forest model (test data).

Figure 4. Calibration plots for CAIDE and CAIDE + resting heart rate for support vector machine model (test data).
Table 4. Predictive accuracy (95% CI) of machine learning models in the CLSA data (test).
95% CI: 95% confidence interval; AUC: area under the receiver operating characteristic curve; RF: random forest; SVM: support vector machine; MCC: Matthews correlation coefficient; NPV: negative predictive value; PPV: positive predictive value; CLSA: Canadian Longitudinal Study on Aging.
Discussion
Accurately predicting dementia risk is vital for effective prevention. This study introduces new evidence supporting the integration of RHR into the CAIDE dementia risk score. RHR, a simple, routinely measured cardiovascular marker that reflects autonomic nervous system function and cardiovascular health, has been increasingly found to be associated with dementia risk but remains absent from existing models,7,8,42 making it a valuable candidate for improving predictive accuracy. In this study, we evaluated the impact of adding RHR as a modifying variable to the CAIDE model to predict 3-year dementia risk. To our knowledge, this is the first study to integrate RHR into the CAIDE model using machine learning methods.
CAIDE-RHR was designed for use in clinical or care settings using routinely collected data, without the need for specialized information like results from comprehensive neuropsychological testing or invasive cerebrospinal fluid analysis. Despite the shorter 3-year follow-up period compared to the original CAIDE model's 20-year horizon, our models showed moderate discrimination and strong calibration across the machine learning approaches tested. For example, the random forest algorithm outperformed the support vector machine (SVM), yielding approximately 3% improvement in predictive performance and enhanced calibration. To put this into context, an AUC of ≈0.65 means that the random forest model assigns a higher predicted risk to a randomly chosen individual with dementia than to one without about 65% of the time, which could be clinically and personally relevant to patients. This suggests that RHR provides additive value to the existing CAIDE model by capturing underlying physiological stress and subclinical cardiovascular burden that may not be fully accounted for by traditional metrics.
Previous attempts to improve the CAIDE model by adding modifying factors and biomarkers have produced mixed results. Some studies found no change in predictive accuracy, while others reported a decrease.6,7,9,43 For example, Trares et al. found no performance improvement when adding biomarkers to the CAIDE model in a German prospective cohort.6 Stephan et al. reported poor predictive accuracy (AUC: 0.53–0.63) in low- and middle-income countries.42 Exalto et al. added factors such as depression, diabetes, head trauma, poor lung function, and smoking to the CAIDE model, but saw no improvement.7 Similarly, Geethavi et al. compared the CAIDE model with two other risk scores in an Australian cohort, and while the CAIDE model had the lowest accuracy (AUC: 0.54), a hybrid model yielded moderate discriminative accuracy.10 These inconsistencies may result from varying clinical characteristics, lack of detailed data reporting, and methodological heterogeneity. Unlike previous studies, our research demonstrates robustness through internal cross-validation, as well as model calibration.
While our predictive models demonstrated statistically significant results, the AUC values in the mid-0.60s indicate only moderate discriminative performance. This level of accuracy suggests limited utility for precise individual-level prediction, especially in the context of a relatively rare outcome. Moreover, the model's high sensitivity, though advantageous for identifying at-risk individuals, was accompanied by low specificity, leading to a high rate of false positives. This trade-off may be partly attributable to the relatively short follow-up period (3 years) in our cohort, which contrasts with the 20-year prediction horizon commonly used in models like CAIDE. The abbreviated timeframe may limit the model's ability to distinguish between true and false positives, thereby affecting its specificity and sensitivity. This limitation may reduce the feasibility of applying the model in routine clinical practice without supplementary screening or confirmation steps.
Nonetheless, models with modest AUC values can still hold meaningful utility in public health and preventive contexts. For example, the CAIDE risk score has shown AUCs around 0.60 in some external validation studies,44–48 yet continues to be applied effectively for population-level dementia risk stratification. The CAIDE model's clinical utility stems from its simplicity, emphasis on modifiable midlife risk factors, and its value in guiding early intervention strategies, particularly in low-resource or primary care settings. Similarly, the present model may serve as a pragmatic tool for identifying individuals who could benefit from lifestyle modification or further monitoring in community or screening programs, even if not suited for standalone diagnostic use. Future work should focus on improving model accuracy through integration of more specific clinical, genetic, or imaging biomarkers and evaluating model performance prospectively in diverse populations.
Another important implication of our findings is the importance of model selection in dementia risk prediction. The random forest algorithm outperformed the support vector machine across key metrics, including AUC, Brier score, and MCC, with MCC offering a balanced evaluation of model performance. Random forest also provides interpretable insights into predictor importance, making it a practical choice for clinical applications. On the other hand, while random forest outperformed SVM, model choice should also consider factors like the model's intended use, complexity, number of predictors, and whether external validation is available. Additionally, relying solely on AUC is insufficient; metrics like the Brier score, which reflects both calibration and discrimination, provide a more complete evaluation of model performance.
Our study shows promise in estimating 3-year dementia risk using routinely collected health data. While this represents a step toward clinical innovation, it is not yet a recommendation to change current assessment practices. Additionally, this model was intentionally designed to be simple, practical, and non-invasive, relying only on variables commonly available in primary care. As such, it may serve as a first-step screening tool to flag individuals at higher risk of dementia, who can then be referred for more detailed assessment and monitoring. This tiered approach may support earlier detection and more efficient use of specialized healthcare resources. However, before such models can be fully integrated into clinical practice, further research is needed. This includes external validation in other populations, careful consideration of ethical and legal issues, and studies assessing how best to incorporate predicted risk into clinical decision-making using machine learning methods.
Strengths and limitations
This study has several unique strengths. First, it includes robust internal validation (training and test data) and model calibration, addressing the limitations of previous studies highlighted in systematic reviews.5,49 Second, while most clinical prediction studies have relied on regression-based models, we used machine learning methods, which offer advantages in capturing non-linear relationships.5 Additionally, this study adheres to the TRIPOD checklist,23 ensuring transparent and reliable reporting of prediction models.
However, there are limitations that may affect the generalizability of our findings. First, machine learning models such as random forest and support vector machines are often considered “black box” methods50 due to their complex mathematical formulations, which may not be easily interpretable by clinicians. While they provide greater accuracy, this lack of transparency could hinder clinical adoption. CAIDE-RHR shows lower discrimination than some advanced models;51,52 this is expected, as it was intentionally designed to rely only on routinely collected, non-invasive data. This makes it practical for use in primary care and community settings, where access to specialized tests like neuropsychological assessments or cerebrospinal fluid biomarkers is limited. Its simplicity enables broader early risk screening. Future studies will explore whether adding advanced biomarkers can enhance predictive accuracy, as prior work suggests such additions may achieve C-statistics of 0.85 or higher.53,54 Second, our analysis is based on a single cohort, and although the dataset is large and reliable, external validation in other cohorts, including more diverse populations, is needed. Additionally, the prevalence of dementia in our cohort was relatively low, which may not accurately reflect the true prevalence in Canada and globally. Third, our use of SMOTE to synthetically balance the dataset likely influenced the model's performance characteristics, contributing to the observed high sensitivity and low specificity. By augmenting the minority class, SMOTE improves the model's ability to identify individuals with dementia but may also increase false positives because the synthetic samples can overlap with the majority class distribution. This trade-off is important to consider when interpreting the clinical utility of the model, especially in screening contexts where balancing false positives and false negatives has practical consequences.
Fourth, the selection of predictors was based on the original CAIDE model, and using different sets of predictors could potentially alter the predictive accuracy. Fifth, the CAIDE model is specifically designed to estimate 20-year dementia risk among middle-aged individuals. However, in the current CLSA dataset, only outcomes at 3 years (i.e., first follow-up data) were at our disposal. This limits the model's long-term predictive power and results in only moderate prediction sensitivity. Furthermore, because data collection in the Canadian Longitudinal Study on Aging (CLSA) is ongoing, there is an opportunity to derive more accurate results by considering long-term dementia risk factors and predictive power as further follow-up data become available. Finally, our study focused on machine learning approaches and did not explore survival risk models such as Cox proportional hazards or accelerated failure time models, which may offer further insights into dementia risk over time.
Conclusions
The addition of RHR modestly but significantly improves the predictive performance of the CAIDE model in the CLSA cohort, particularly when applied through machine learning approaches such as random forest. Future research, including external validation and real-world clinical studies, is needed to confirm the model's utility and guide its implementation.
Supplemental Material
Supplemental material, sj-docx-1-alz-10.1177_13872877251390562 for Enhancing dementia risk prediction with heart rate and machine learning in the Canadian Longitudinal Study on Aging by Shakiru A Alaka, So-Fong Cam Ngan, Rebecca EK MacPherson, William Pickett, Christopher P Chen and Siu Kwan Sze in Journal of Alzheimer's Disease
Footnotes
Acknowledgements
This research was made possible using the data/biospecimens collected by the Canadian Longitudinal Study on Aging (CLSA). Funding for the Canadian Longitudinal Study on Aging (CLSA) is provided by the Government of Canada through the Canadian Institutes of Health Research (CIHR) under grant reference: LSA 94473 and the Canada Foundation for Innovation, as well as the following provinces, Newfoundland, Nova Scotia, Quebec, Ontario, Manitoba, Alberta, and British Columbia. This research has been conducted using the CLSA dataset: Baseline Comprehensive Dataset – Version 7.0 and Follow-up 1 Comprehensive Dataset – Version 5.0 under Application Number 2401008. The CLSA is led by Drs. Parminder Raina, Christina Wolfson and Susan Kirkland.
Ethical considerations
The use of CLSA data for the present secondary analysis was approved by the Brock University Health Sciences Research Ethics Board (REB #: 23-195).
Consent to participate
All participants in the CLSA study provided informed consent at the time of data collection.
Consent for publication
This manuscript was reviewed and approved for publication by the CLSA. The opinions expressed in this manuscript are the authors’ own and do not reflect the views of the Canadian Longitudinal Study on Aging.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the Canadian Institutes of Health Research Tier1 Canada Research Chair (CRC-2020-00263), Canadian Institutes of Health Research Project Grant (PJT-186091), The Natural Sciences and Engineering Research Council of Canada Discovery Grant (RGPIN-2023-04304), Canada Foundation for Innovation Grant (41454 and 44115), Ontario Research Fund, and start-up research grant from Brock University.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The data that support the findings of this study are available from the Canadian Longitudinal Study on Aging (CLSA), subject to approval. Researchers may apply for access through the CLSA's online application system, Magnolia. All applications are subject to review by the CLSA's Data and Sample Access Committee, and data are released only after ethics approval and signing of data use agreements.
Supplemental material
Supplemental material for this article is available online.
