Abstract
Background:
We explored the feasibility of linking research datasets to electronic health records to identify acute stroke computed tomography (CT) brain features associated with post-stroke dementia.
Methods:
We linked data from two existing research datasets of people who had a stroke. These datasets contained expert-coded features from CT brain scans. Participants were followed up by linking to their electronic health records. Survival analyses were performed to identify prognostic factors associated with increased risk of post-stroke dementia.
Results:
Twenty-one participants (11%, n = 21/185) were identified as having dementia after stroke (median follow-up: 9 years and 8 months). Presence of cerebral atrophy and moderate-to-severe white matter hyperintensities on acute stroke CT scans were associated with an increased risk of post-stroke dementia.
Conclusion:
Linkage to electronic health records is a feasible method for studying dementia outcomes after stroke. This method can be applied to larger stroke populations to explore acute stroke imaging predictors in more detail.
Introduction
Cognitive impairment, including dementia, is common after a stroke. The term ‘cognitive impairment’ covers any severity of cognitive decline, including dementia. 1 Dementia is diagnosed when cognitive decline is severe enough to impact a person’s ability to undertake daily activities. 1 Determining the prevalence of post-stroke dementia can be challenging. For example, a previous systematic review calculated that the prevalence of post-stroke dementia varied between 7.4% and 41.3% depending on the research methods (e.g. a population-based or hospital-based study) and the inclusion criteria used (e.g. first-ever stroke only, or exclusion of pre-stroke dementia). 2 Electronic health records contain data that have already been collected in clinical practice. Linking to health records therefore offers a less resource-intensive method of studying many individuals over a substantial follow-up period. It also avoids the bias that arises when people cannot be recruited to research studies because stroke severity, cognitive impairment, illness or death prevents them from providing consent.
In our study, we identified two existing Scottish research studies that recruited people with acute stroke and coded features of acute and pre-stroke changes on computed tomography (CT) brain scans. Data from these two research studies were linked to the participants’ electronic health records to identify whether participants received a diagnosis of dementia after their stroke. Our study focuses on developing the methodology for using electronic health records to identify dementia after stroke. Importantly, we explore the feasibility of using data-linkage methods to identify acute stroke CT brain features associated with post-stroke dementia.
Methods
This study is reported according to STROBE cohort reporting guidelines (Supplementary Material). 3
Study population
This data linkage study is formed of a subset of participants originally recruited to two previous research studies.4,5 We will refer to these cohorts as the Stroke-Fatigue cohort (Kutlubaev et al. 4 ) and the Stroke-Delirium cohort (Barugh 5 ). Inclusion and exclusion criteria are presented in Table 1.
Table 1. Population inclusion and exclusion criteria.
CT: computed tomography; MMSE: mini-mental state examination; TIA: transient ischaemic attack.
Data linkage to electronic health records
In Scotland, each person has a unique 10-digit Community Health Index (CHI) number. The Electronic Data Research and Innovation Service (eDRIS), Public Health Scotland, linked the research datasets (data from N = 201 participants) to the participants’ electronic health records using the participants’ CHI numbers (Figure 1). Cases of dementia were subsequently identified in hospital admissions data, death records and prescribing data, thus avoiding attrition bias (Figure 1).

Figure 1. Timeline showing linkage of existing research study data to electronic health records.
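As an illustration only, the linkage step can be thought of as a keyed join between the research dataset and routinely collected records. The sketch below is not the eDRIS process: the identifiers, field names and events are made up, and real linkage is performed inside a secure environment on de-identified extracts.

```python
from datetime import date

# Hypothetical research records keyed by a made-up 10-digit identifier
# standing in for the CHI number.
research = {
    "0101501234": {"cohort": "Stroke-Fatigue", "index_stroke": date(2010, 5, 3)},
    "0202451235": {"cohort": "Stroke-Delirium", "index_stroke": date(2011, 2, 14)},
}

# Hypothetical health-record events: (identifier, source, event, event_date).
events = [
    ("0202451235", "hospital_admission", "dementia", date(2015, 6, 1)),
    ("0202451235", "prescribing", "dementia", date(2015, 9, 20)),
    ("9999999999", "death_record", "dementia", date(2016, 1, 1)),  # not a participant
]


def link_by_id(research, events):
    """Attach every health-record event to the matching research participant.

    Participants with no events are kept (with an empty list), so nobody is
    lost to follow-up; events for people outside the research data are ignored.
    """
    linked = {pid: {**record, "events": []} for pid, record in research.items()}
    for pid, source, event, when in events:
        if pid in linked:
            linked[pid]["events"].append((source, event, when))
    return linked


linked = link_by_id(research, events)
```

Keeping participants without any matching event is the point of the design: everyone in the research dataset remains in the denominator whether or not a dementia record is found.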
Outcomes
We provided eDRIS with relevant diagnostic codes and drug names and they identified the primary and secondary clinical outcomes (Supplementary Table 1). eDRIS also provided the Scottish Index of Multiple Deprivation 2009/2012.
Primary clinical outcome
The primary clinical outcome of interest in this study is a diagnosis of dementia (International Classification of Diseases (ICD) codes are presented in Supplementary Table 1). Cases of dementia were also identified by looking at prescribed medications (Supplementary Table 1). We identified dementia diagnoses at least 3 months after the index stroke to allow sufficient time for symptoms of dementia to develop.
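A minimal sketch of the 3-month rule, assuming each participant has an index stroke date and a list of dates on which a dementia code or dementia drug appears in any linked source. The function names and the calendar-month interpretation of "3 months" are assumptions made for illustration, not the study's code.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date forward by whole calendar months, clamping the day
    to the last day of the target month (e.g. 30 Nov + 3 months = 28 Feb)."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def first_post_stroke_dementia(index_stroke, dementia_dates, lag_months=3):
    """Return the earliest dementia record dated at least `lag_months`
    after the index stroke, or None if there is no such record."""
    earliest_eligible = add_months(index_stroke, lag_months)
    eligible = [d for d in dementia_dates if d >= earliest_eligible]
    return min(eligible) if eligible else None


# Hypothetical participant: one record inside the 3-month window, two after it.
outcome_date = first_post_stroke_dementia(
    date(2010, 1, 15),
    [date(2010, 2, 1), date(2014, 3, 1), date(2012, 6, 1)],
)
```

In this sketch, records dated before the 3-month threshold are simply ignored; in the study itself, participants with a dementia diagnosis before the index stroke were excluded from the cohort.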
Secondary clinical outcome
Secondary clinical outcomes in this article are stroke, myocardial infarction, diabetes and death (Supplementary Table 2).
Risk factors
The CT scans had been coded by experts using the structured form presented in Appendix 6 of Barugh, 2018 as part of the original research studies.4,5 We focussed on exploring the association between the following CT features and risk of post-stroke dementia: (1) atrophy, (2) white matter hyperintensities (WMH), (3) acute stroke features, (4) old vascular lesions. Due to the small sample size in this study, we condensed the four severities of atrophy and WMH into none/mild and moderate/severe. The complete list of neuroimaging and other risk factors included in this study is provided in Supplementary Table 3.
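The condensing of the four severity grades into two groups can be sketched as a simple lookup. The grade labels below are assumed for illustration; the original coding form may use different names or numeric codes.

```python
# Assumed four-level labels mapped to the two groups used in the analysis.
SEVERITY_TO_BINARY = {
    "none": "none/mild",
    "mild": "none/mild",
    "moderate": "moderate/severe",
    "severe": "moderate/severe",
}


def dichotomise(severity):
    """Map a four-level atrophy or WMH grade to none/mild or moderate/severe.

    Raises KeyError for an unrecognised label rather than guessing a group.
    """
    return SEVERITY_TO_BINARY[severity.strip().lower()]


grades = ["None", "mild", "Moderate", "severe"]
groups = [dichotomise(g) for g in grades]
```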
Statistical analysis
Prior to pooling the two cohorts in our analyses, we presented descriptive statistics for each cohort. Kaplan–Meier survival analysis was also performed. Log-rank tests were performed to assess the statistical significance of associations between categorical prognostic factors and the development of post-stroke dementia. Cox proportional hazards (CPH) regression was subsequently performed to estimate the size of the association between prognostic factors and post-stroke dementia. CPH regressions were also adjusted for age and age-squared (due to violation of the proportional hazards assumption) and stratified by sex. Since older adults have a higher risk of both dementia and death, we used univariable Fine-Gray competing risk regression (CRR) models, with death as the competing risk, and reported the subdistribution hazard ratio (SHR) for these models. Analyses were performed using R (Version 4.0.5) (R Core Team, 2021). 6
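To make the Kaplan–Meier step concrete, the sketch below implements the product-limit estimator in plain Python. It is an illustration under assumed inputs (follow-up times and an event flag per participant), not the R code used in the study, and the example data are invented.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of dementia-free survival.

    times  : follow-up duration for each participant (any consistent unit)
    events : 1 if dementia was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # dementia events at time t
        leaving = 0   # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            # Participants censored at t still count as at risk at t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Invented follow-up times (years) for five participants.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 0, 1, 0, 1])
```

With these invented data the estimate steps down at years 2, 3 and 8. Note that this estimator treats death as censoring, which is why the study also fitted Fine-Gray models with death as a competing risk.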
Study approvals
The project received approval through Lothian Research Safe Haven, including Caldicott Guardian approval, ACCORD sponsorship (AC20012) and favourable ethical opinion under Lothian Research Safe Haven’s delegated authority (reference: 17/NS/0072).
Results
Participant characteristics
Sixteen of 201 participants (8%) were excluded because they either had received a dementia diagnosis prior to their index stroke or died within 3 months of the index stroke, giving a final cohort of 185 stroke survivors (mean age: 71.9 years, 40% female). Most brain scans were performed within 1 day of hospital admission for the index stroke (mode: 0 days).
Comparing cohorts
In comparison to the Stroke-Delirium cohort, participants in the Stroke-Fatigue cohort had less severe strokes, were younger and a lower proportion of individuals had a myocardial infarction prior to their index stroke (Supplementary Table 4).
Cases of dementia
Data from both cohorts were linked to electronic health records on 17 December 2020. The median length of follow-up from stroke to date of data linkage was 9 years and 8 months (IQR = 2 years and 9 months). Eleven percent (N = 21/185) of participants were identified as having dementia. Median time to dementia diagnosis was 4 years and 2 months (IQR: 3 years and 3 months). The 10-year probability of survival without dementia was 87% (Supplementary Table 5).
Risk factors for post-stroke dementia
CT neuroimaging features
All participants who were identified as having dementia had presence of atrophy on their CT brain scan at the time of the index stroke (χ² = 4.4, p = 0.04; Supplementary Table 6). Following unadjusted analysis, moderate/severe WMH (CPHun: HR = 3.41, 95% CI = 1.24–9.39, p = 0.018; CRRun: SHR = 3.26, 95% CI = 1.19–8.98, p = 0.022) were associated with an increased risk of developing post-stroke dementia (Supplementary Table 6). This association remained after performing adjusted Cox proportional hazards analyses (Supplementary Table 6).
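The reported hazard ratios, confidence intervals and p-values can be cross-checked against one another. The sketch below assumes the intervals are Wald intervals, symmetric on the log scale, which is a common default but is not stated in the text; the function name is ours.

```python
import math


def wald_p_from_ci(ratio, lower, upper, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p-value
    from a hazard ratio and its 95% confidence interval.

    Assumes the interval was built as exp(log(ratio) +/- z_crit * se).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return se, z, p


# Moderate/severe WMH, unadjusted Cox model (values from the text).
se_cph, z_cph, p_cph = wald_p_from_ci(3.41, 1.24, 9.39)

# Moderate/severe WMH, unadjusted Fine-Gray model (values from the text).
se_crr, z_crr, p_crr = wald_p_from_ci(3.26, 1.19, 8.98)
```

Under that assumption, the recovered p-values come out close to the reported 0.018 and 0.022, so the estimates, intervals and p-values are internally consistent. Small differences are expected because the published figures are rounded.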
Demographic, vascular and acute stroke risk factors
Post-stroke dementia was more likely following an acute stroke lesion on the left side of the brain (CPHun: HR = 3.53, 95% CI = 1.29–9.64, p = 0.014; CRRun: SHR = 3.55, 95% CI = 1.31–9.66, p = 0.013; Supplementary Tables 7 and 8).
Discussion
This study highlights the feasibility of linking research data to electronic health records to follow-up participants and shows the potential for the same methodology to be repeated on a larger scale.
When used in conjunction with other demographic and vascular factors, acute stroke brain imaging could help healthcare professionals to better target support and provide advice about prevention to those at risk. Understanding who is at risk of developing dementia after a stroke could enable healthcare professionals to tailor care to those most at risk, for example informing patients about signs and symptoms to be aware of, how to seek support and to consider the potential impact of lifestyle choices on their cognition as well as their physical recovery from stroke.
Given the limited sample size of this feasibility study, repeating these data-linkage methods on a larger scale will improve the generalisability of findings and reduce the risk of sampling bias. Automated processing of brain images offers the potential for future studies to identify brain features associated with post-stroke dementia at scale. For example, deep learning models have been developed to classify cerebral small vessel disease-related brain atrophy on brain images. 7 Future work should also consider how cognitive impairment (without a dementia diagnosis) is considered and coded.
It is also important to highlight that a significant proportion of people who have dementia are undiagnosed. 8 Cognitive screening in people with language impairment following a stroke also raises challenges when diagnosing dementia. Although we used multiple health data sources to identify dementia cases, individuals with undiagnosed dementia will have been missed.
Conclusions
This study has developed the methodology for using electronic health records to study dementia outcomes after stroke. Future work could use these data-linkage methods, alongside automated processing of acute stroke brain images, to identify risk factors for post-stroke dementia on a much larger scale. For example, stroke audits, which collect data from all patients admitted to hospital with stroke, would be an ideal resource to link to in order to study dementia outcomes and avoid the biases associated with recruitment to clinical research studies. This will help to ensure findings are representative of the general stroke population.
Supplemental Material
Supplemental material, sj-pdf-1-rcp-10.1177_14782715251358924 for Using electronic health records to identify computed tomography brain features associated with post-stroke dementia: A feasibility study by Emily L Ball, Gillian E Mead, Terence J Quinn, Dorota Religa, Joanna M Wardlaw and Susan D Shenkin in Journal of the Royal College of Physicians of Edinburgh
Footnotes
Acknowledgements
This work uses data provided by patients and collected by the NHS as part of their care and support. This project has been facilitated by the Lothian Research Safe Haven service (reference: SH2019_023). Lothian Research Safe Haven (now DataLoch) enables access to de-identified extracts of health care data from the South-East Scotland region to approved applicants: dataloch.org. We would like to thank the following people for their advice on this project: Professor Will Whiteley for peer reviewing the protocol; Dr. Tim Wilkinson and Dr. Guy Holloway for providing advice on identifying dementia diagnoses using routinely collected healthcare data and the process of prescribing dementia drugs; Dr. Hong Xu for providing advice on statistical methods; and Professor Gillian Mead and Dr. Amanda Barugh for providing data from the original Stroke-Fatigue and Stroke-Delirium cohorts, which are used in this data linkage study. We would like to acknowledge the participants and researchers involved in the original Stroke-Fatigue and Stroke-Delirium projects. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission.
Author contributions
All authors provided input into the design of the study. ELB conducted the statistical analysis and drafted the manuscript. All authors reviewed and provided feedback on the manuscript.
Consent to participate
Caldicott Guardian approval was obtained for this data-linkage study.
Consent for publication
Not applicable.
Data availability statement
All analyses were performed using de-identified data within a secure data environment approved by NHS Lothian. Data may be accessed through DataLoch (dataloch.org) following successful application and approvals.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Ethical considerations
The project received approval through Lothian Research Safe Haven, including Caldicott Guardian approval, ACCORD sponsorship (AC20012) and favourable ethical opinion under Lothian Research Safe Haven’s delegated authority (reference: 17/NS/0072).
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: ELB was supported by the MRC, University of Edinburgh and University of Glasgow, as part of the Precision Medicine Doctoral Training Programme (MR/N013166/1). GEM receives occasional honoraria for lectures on stroke, royalties from Elsevier for a book on Exercise after stroke, and payment for consultancy to Imperative Care; these are all paid to University of Edinburgh and are used to support further research. DR is supported by Swedish Research Council (2020-06101). JMW: UK Dementia Research Institute funded by UK MRC, Alzheimer’s Society, Alzheimer’s Research UK. TJQ: none specific to this work. SDS: none specific to this work.
Supplemental material
Supplemental material for this article is available online.
