Abstract
Background
Gold standard diagnosis of Alzheimer's disease (AD) relies on invasive, expensive, and non-scalable methods (cerebrospinal fluid lumbar puncture and amyloid-positron emission tomography). Blood biomarkers present a scalable, accessible, and resource-efficient diagnostic alternative.
Objective
To investigate the diagnostic and differential diagnostic performance of three clinically relevant plasma biomarkers: phosphorylated tau-217 (pTau217), glial fibrillary acidic protein (GFAP), and neurofilament light chain (NfL) for biologically confirmed AD patients in real-world, clinical settings.
Methods
A systematic search was conducted across 5 databases for peer-reviewed studies between January 2019-January 2025. A narrative synthesis was conducted for eligible studies.
Results
13 studies (n = 4686 participants) were included. All studies were cross-sectional, and investigated populations recruited from memory clinics, neurology departments, or clinical cohorts. Diagnostic performance of pTau217 was consistently high (AUC > 0.90 across all comparisons). GFAP showed stronger and more consistent diagnostic performance as compared to NfL, though both demonstrated moderate and variable accuracy (AUCs ranging from <0.75 to >0.90). No studies assessed combinations of all 3 biomarkers. Methodological and assay heterogeneity was common.
Conclusions
Plasma pTau217 demonstrated strong diagnostic accuracy and promise for diagnosis of AD. GFAP and NfL displayed inconsistent results, but could provide complementary information, particularly for differential diagnosis. Further standardized studies in underrepresented populations are required to validate and enable blood biomarker implementation in clinical settings.
Introduction
Gold standard diagnostic techniques for Alzheimer's disease (AD), like positron emission tomography (PET) neuroimaging and cerebrospinal fluid (CSF) lumbar puncture, are invasive, expensive and time intensive. Consequently, their use is limited to high-income countries and specialist tertiary-care clinics, but rare in low- and middle-income countries (LMICs) and community-based clinics, which often lack advanced healthcare infrastructure. 1 This highlights the need for more cost-effective and scalable alternatives to facilitate timely and accurate diagnoses across diverse clinical settings. 2
Recent advancements in highly sensitive detection methods have enabled accurate detection of relevant biomarkers in peripheral biofluids such as blood. This is a timely development, as the approval of anti-amyloid therapies for AD, aducanumab in 2021 and lecanemab in 2023, has underscored the need for early and accurate diagnoses, and subsequent streamlining of patients into the appropriate clinical trial. 3
The clinical promise of AD blood biomarkers is highlighted in the National Institute on Aging-Alzheimer's Association's (NIA-AA) most recent guidelines, “Revised criteria for diagnosis and staging of Alzheimer's disease”.4,5 Updating the 2018 ATN Research Framework (key differences summarized in Table 1), 5 this document points to the imminent clinical applicability of blood biomarkers, particularly those that are likely to receive regulatory approval. These are phosphorylated tau-217 (pTau217), glial fibrillary acidic protein (GFAP), and neurofilament light chain (NfL). 4
Differences in biomarker categorization between NIA-AA Research Framework (2018) and Clinical Criteria for Staging and Diagnosis (2024).
Note: the following biomarkers are included for conceptual purposes but have not undergone the same degree of testing as other Core biomarkers: P-tau231, p-tau205, MTBR-tau243, and non-phosphorylated tau.
Study characteristics.
Diagnostic criteria and blood biomarkers.
Outcomes.
Current evidence suggests that these biomarkers are clinically relevant: plasma pTau217 has shown diagnostic performance superior to other phosphorylated tau isoforms in both preclinical and clinical stages of AD dementia.6–8 Additionally, both GFAP and NfL are known to be elevated in AD brains, 9 and CSF,10–13 and have been shown to differentiate AD from other dementias.14–16 Further, blood biomarkers have arguably the most utility as screening and diagnostic tools in clinical settings, where patients present with a variety of co-pathologies. In such contexts, a combination of AD-specific biomarkers like pTau217, and non-specific biomarkers of associated neurodegenerative and inflammatory pathological processes (NfL and GFAP, respectively), could prove to have immense clinical significance.
Considering the above, this systematic review evaluates the diagnostic and differential diagnostic utility of pTau217, GFAP, and NfL in AD. Specifically, this review focuses on studies conducted in real-world clinical populations (e.g., memory clinics), where blood biomarker roll-out is most likely, and will offer an accessible alternative to present diagnostic techniques, that can be scaled to diverse, under-resourced settings including LMICs.
Methods
Search and screening
The systematic review was undertaken in accordance with the PRISMA guidelines, designed according to the PICOS framework (Figure 1), and was registered on the International Prospective Register of Systematic Reviews (PROSPERO: https://www.crd.york.ac.uk/PROSPERO/view/CRD42024524962/), registration number CRD42024524962.

Review design according to PICOS framework.
The systematic search was conducted in two phases: the first phase included studies published between 1 January 2019 and 31 December 2023, and the second was an update search carried out from 31 December 2023 to 27 January 2025. Both searches included publications across multiple databases (PubMed, EMBASE, Global Health, Journals@OVID and PsycINFO) and were limited to articles published in English in peer-reviewed journals.
Articles were selected as per the specific inclusion criteria: (1) the outcome of the article was to assess diagnostic or differential diagnostic ability of a biomarker of interest, or any combinations of the biomarkers of interest, (2) the population was partially or completely recruited from a clinical setting (for instance, a memory clinic or neurology department in a hospital), (3) the study had a clearly defined AD patient group, in which participants had both a clinical diagnosis consistent with AD (i.e., etiological confirmation) and biological confirmation of AD pathology via CSF or PET biomarkers.
Articles were excluded if (1) the study design and outcomes were in line with risk prediction or prognosis only, without any diagnostic element, (2) the population was exclusively cognitively normal, consisted of individuals with memory complaints without an AD-confirmed group, or was subject to strict recruitment criteria (for instance, a cohort study which categorically excluded patients with AD or other dementia diagnoses), (3) the population represented genetic variants of AD alone, (4) the biomarker of interest was measured in combination with a biomarker other than GFAP, NfL or pTau217, or was measured using a non-scalable method (5) the biomarker of interest was used as a reference, to assess efficacy of another biomarker or an intervention, (6) the article was associative, or aimed at elucidating underlying mechanisms of disease, neuropathology, (7) the study was carried out in a biofluid other than blood, or not in a human population (e.g., cell lines, mice), (8) any grey literature (conference abstract, poster presentation, book chapter, theses, etc.), meta-analyses or reviews.
Search results were imported into Rayyan, and duplicates were removed. Title- and abstract- screening, and subsequent full text screening were done independently by SS and LM. Any disagreements during the screening process were resolved by a consensus between SS, LM, SB, and VR.
Quality assessment and data extraction
Quality assessment of all included articles was conducted using the QADAS-2 tool by SS and LM. Signaling questions in all 4 domains of the QADAS questionnaire (patient selection, index test, reference standard and flow and timing) were tailored to ensure review-specific assessment of bias. Relevant data on study characteristics, biomarker- and diagnostic information, and outcomes were extracted by SS and LM. Narrative synthesis was conducted by SS.
Results
Study selection
The search identified 4272 articles across five databases (PubMed, Journals @ OVID, Embase, Global Health, and PsycINFO) (Figure 2). After eliminating 2296 duplicates, 1976 articles remained for title- and abstract screening. At this stage, 1908 articles were excluded as they were irrelevant to the review question, leaving 68 articles for full-text review. Of these, the full-text version of 2 reports was not available, and 49 reports were excluded for having ineligible diagnostic measures, biomarkers, outcomes, patient populations, or study design, and 4 were excluded as they were poster presentations or conference abstracts. Consequently, 13 articles were included in the review.17–29

PRISMA flowchart for Systematic Review search and screening. *OVID advanced search included the following databases: Journals @ Ovid, Embase, Global health, PsycINFO.
Study characteristics
Of the thirteen included studies, eight studies were conducted directly in specialist memory clinics or hospital neurology departments, 4 analyzed data from clinical cohorts, and 1 included patient data collected partially from cohorts, and partly from clinic (Table 2). Four studies explicitly mentioned consecutive patient recruitment, enhancing representation of real-world clinical settings by minimizing selection bias.
A total of 4686 participants were studied across all included articles. In some studies, only specific sub-cohorts relevant to the research question were assessed. Sample sizes were varied across studies, ranging from 98 in Cecchetti et al. (2024) 18 to 700 in Shen et al. (2023). 23 The average/ median age of the AD group was higher than or equal to that of the cognitively unimpaired (CU) group in nine cases and lower in three cases. One study (Cecchetti et al. 18 ) lacked a CU group. All studies had a fair representation of both males and females, assessed differences in age between diagnostic groups, and controlled for these factors in analyses if significant effects were found. There was some variability in demographic reporting: 8 studies recorded age as mean (SD) while 4 studies reported median (IQR).
The thirteen studies that met the inclusion criteria of the review represented nine countries. Three studies reported ethnicity: Kivisakk et al. (2023) 20 reported 93.7% non-Hispanic whites, Thijssen et al. (2021) 28 reported an 85% white population, and Shen et al. (2023) 23 reported exclusively Chinese participants.
Diagnostic criteria and blood biomarkers
There was substantial variability in the methods used to confirm AD diagnosis was observed, both clinically and biologically (Table 3). Six studies established clinical etiology according to the 2011 NIA-AA criteria, two according to the 2018 NIA-AA criteria, two studies used the DSM criteria, and three studies did not mention any specific guidelines. Despite 11 of 13 selected studies using CSF biomarkers to confirm AD diagnosis, the specific biomarkers varied ranged from only amyloid markers (CSF Aβ1−42, Aβ1−42/Aβ1−40 ratio) to a combination of amyloid and tau markers (such as CSF total tau, p-tau, p-tau181), and hybrid ratios (Aβ42/p-tau ratio). Amyloid PET was utilized in four studies, and genetic confirmation, and pathological diagnosis on autopsy were employed in addition to CSF/PET biomarkers in some cases.
Six studies evaluated more than one blood biomarker of interest, and no study evaluated all three biomarkers. All selected articles assessed other biomarkers, beyond the scope of this review. These were most commonly plasma pTau181 and amyloid markers in both plasma and CSF.
The diagnostic performance of pTau217 was assessed in three studies, all using different assays and platforms for analysis (Automated CLEIA on LUMIPULSE G600II system, Lilly plasma pTau217 on the MSD platform, commercial/homebrew kit on the Quanterix HD-1 analyzer). Along a similar vein, the eight studies that measured GFAP and NfL also used various assays for biomarker measurement. Individual measurements of NfL were either reported using the NF-light advantage kit, or a “commercial kit by Quanterix”, 27 or a “homebrew or commercially available kit”. 28 Additionally, four studies used multiplex assays for the simultaneous measurement of plasma GFAP and NfL. Kivisakk et al. (2023) 20 used an ultrasensitive prototype S-PLEX assay on the MSD platform, Thijssen et al. (2022) 29 and Shen et al. (2023) 23 used the Neurology 4-plex E kit to for simultaneous measurement of GFAP, NfL, Aβ1−42 and Aβ1−40, and Sarto et al. (2023) 22 used the Neurology 4-plex A kit to analyze GFAP, NfL, UCH-L1 and t-tau.
Overall, several assays were used to measure all biomarkers on both MSD and SiMOA platforms. Different versions of the SiMOA platform were used in different studies (e.g., SR-X, HD-1), leading to less standardization while measuring the same biomarker. Taken together, this reflects heterogeneity in biomarker measurement techniques.
No study measured the biomarkers relevant to the review in a panel with one another. Combinations of biomarkers of interest with those that are not within the scope of this review were excluded to maintain consistency in the relevant biomarkers.
Outcomes
Eleven out of the selected thirteen studies reported both diagnostic (AD versus CU) and differential diagnostic (AD versus other neurodegenerative disease(s)) comparisons, two reported differential diagnosis alone (Table 4). Studies whose outcome was solely the detection of amyloid positivity were systematically excluded from this review to maintain focus on the dementia phase of the AD continuum.
Of the three biomarkers of interest, only pTau217 consistently had an accuracy of >0.90 across all studies, and all twelve diagnostic comparisons. Although GFAP generally showed stronger diagnostic performance than NfL, especially in distinguishing AD from, the performance of both these biomarkers was moderate and varied. Of the thirty diagnostic comparisons in which GFAP was evaluated, six had an AUC > 0.90, sixteen had an AUC >0.75–0.90, and eight reported AUC < 0.75. Twenty-four comparisons were investigated for NfL, of which only one had an AUC > 0.90.
Discussion
This systematic review evaluates the utility of three blood-based biomarkers, p-tau217, GFAP, and NfL, in the diagnosis and differential diagnosis of AD dementia in patients recruited from clinical settings, such as memory clinics and hospital neurology departments. The development of highly sensitive blood-based tests to detect and diagnose AD presents a highly resource-efficient, non-invasive and accessible alternative to traditional CSF and PET neuroimaging techniques, that are expensive and non-scalable. 30 These biomarkers were recognized in the Alzheimer's Association Workgroup's recent “Revised Criteria for Diagnosis and Staging of Alzheimer's disease”, which emphasized their contribution in recent AD dementia research, and highlighted the likelihood of their regulatory approval imminently. 4 The revised criteria will play an important role in laying the groundwork for rollout of commercial plasma biomarker tests in coming years, particularly in clinical services, given their likely upscale given the advent of disease modifying therapies.31,32 In light of this, the current review is timely in synthesizing the current literature, pointing out gaps in the literature, and laying the groundwork for future studies.
One of the key recommendations of the Alzheimer's Association's 2024 Revised Criteria demonstrate an accuracy of >90%, or equivalent to its CSF counterpart. All studies assessing diagnostic utility of plasma p-tau217 are consistent with this recommendation. One of these studies, conducted by Thijssen et al. (2021), 28 also demonstrated the superior performance of p-tau217 as compared to NfL across all diagnostic comparisons. This is expected as NfL is a non-specific biomarker of neurodegeneration, especially in disorders of cognitive dysfunction.12,33,34 While the combined performance of p-tau217 and NfL was not explored, individual evaluation of NfL was investigated. Although two independent evaluations of NfL reported moderate diagnostic accuracy, both demonstrated that the diagnostic performance of plasma NfL was equivalent to its CSF counterpart, suggesting increased clinical utility and scalability.26,27 Similar findings have been reported in longitudinal studies of a cohort of non-demented participants (cognitively normal, or with mild cognitive impairment). 35 However, a meta-analysis evaluating the overall pooled correlation coefficient estimate between NfL in both these biofluids reported only moderate correlations. 36
In contrast to NfL, GFAP showed >90% accuracy in AD diagnosis23,24 and differential diagnosis, specifically against non-AD pathology, 22 as well as amyotrophic lateral sclerosis and spinocerebellar ataxia. 23 Notably, the study by Shen et al. 23 is the only selected study to report findings from an LMIC and could provide valuable insights into the global applicability of GFAP in clinical settings. In the four studies that assessed accuracy of both NfL and GFAP, GFAP outperformed NfL in most diagnostic comparisons. Analogously, GFAP outperformed NfL in predicting AD incidence in population-based community cohort representing AD and other neurodegenerative disorders followed for 17 years, suggesting similar trends in studies aimed at risk prediction and prognosis. 37
The major strengths of this review are the specific focus on clinical populations, evaluation of biomarkers of high clinical relevance, and the validation of blood biomarkers against gold-standard diagnostic methods (PET or CSF). However, some of the initial hypotheses of the review were not met. For instance, combinations or panels of the selected biomarkers only were not performed in any of the included studies. This is surprising for a few reasons. First, it was anticipated that the inclusion of differential diagnosis as an outcome would prompt the selection of articles that looked at AD-specific (p-tau217), along with non-specific (GFAP and NfL) biomarkers as these are expressed differentially in neurodegenerative conditions and might, intuitively, provide a more holistic insight into heterogenous pathologies represented in real-world, memory clinic settings. 38 For instance, NfL has been shown to identify dementia in Down syndrome and among psychiatric conditions, and even between demented and non-demented Parkinson's disease patients.39,40 This gap might, in part, be due to the exclusion of comparisons of a selected biomarker with another biomarker like p-tau181 or plasma Aβ42/40. This biomarker selection trade-off was considered during the conceptualization phase of the review, and was justified since, (a) p-tau217 has shown superior performance to other p-tau isoforms, 8 41–44 (b) plasma Aβ42/40 assays have shown limited robustness and diagnostic range,45,46 and (c) an intention to keep biomarker selection consistent with the Revised Criteria was a major driving force for the current review.
A similar trade-off was also considered while excluding data from exclusively cognitively healthy, pre-clinical populations, or those from population-based- or community- cohorts. Importantly, as community- or population- cohort enrolment hinges on voluntary participation, often, at-risk groups are inherently excluded, and this limits generalizability in clinical populations. Yet, as much of clinical research has utilized data from cohorts and health registries, articles were included if the cohort mentioned at least partial recruitment from clinical services. Notably, through the screening process, it was observed that cohorts including minority- or non-White ethnicities recruited exclusively through community outreach initiatives and not through clinical services, suggesting a lack of healthcare utilization and access for ethnic minority groups.
Eleven studies represented patient populations from World-Bank designated high income countries: France, Germany, Italy, Netherlands, Portugal, Sweden, Spain, and USA, and two studies presented data from Chinese populations. Adjacently, only three included articles reported ethnicity, of which two reported >85% White participants. Geographical representation was also limited, with only one study being conducted in a non-High-income country. This limitation reflects disparities in resources and infrastructure, which perpetuate the use of clinical judgement alone, without biomarker-backed diagnoses, in resource-deficient settings.47,48 This creates a paradox: while the use of blood biomarker tests might bridge this disparity, their validation requires data from the very populations that currently lack access to currently-approved, gold standard methods. Efforts to improve inclusion and ethnicity reporting in clinical cohorts must be emphasized in future research. In addition to ethnicity, many analyses in the included article failed to consider social determinants of health, such as social contact, socioeconomic status, and discrimination, that have increasingly shown an association with dementia outcomes.49,50 This omission also reduces the applicability and relevance of the above findings to historically marginalized groups. Additionally, it is important to note that while the review's exclusive focus on cross-sectional studies in memory clinic settings provides increased relevance in routine clinical practice, it restricts the scope of blood biomarker utility in earlier phases of the AD continuum, which has been widely demonstrated.43,51
In this review, plasma pTau217 demonstrated the highest diagnostic accuracy for AD. Both GFAP and NfL displayed moderate results, that varied depending on the cohort, analytical platform, and diagnostic comparison. Despite the growing body of evidence showcasing the effectiveness and advantages of plasma biomarkers, significant gaps in the literature persist, especially in terms of representation of real-world, diverse, clinical settings, and the absence of contextual variables. These issues need to especially considered in the context of blood biomarkers, given that they provide a highly accessible, globally scalable, and resource-effective alternative for current diagnostic methods.
Supplemental Material
sj-docx-1-alz-10.1177_13872877251408510 - Supplemental material for Blood biomarkers for diagnosis and differential diagnosis of Alzheimer's disease in real-world clinical populations: A systematic review
Supplemental material, sj-docx-1-alz-10.1177_13872877251408510 for Blood biomarkers for diagnosis and differential diagnosis of Alzheimer's disease in real-world clinical populations: A systematic review by Shivani Suresh, Luciana Maffei, Sarah Bauermeister and Vanessa Raymont in Journal of Alzheimer's Disease
Footnotes
Acknowledgements
The authors have no acknowledgments to report.
Ethical considerations
This systematic review synthesizes the findings of previously published research studies, without including any new data from human participants or animals. Therefore, informed consent was not required.
Consent to participate
Not applicable
Consent for publication
Not applicable
Author contribution(s)
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
Data sharing is not applicable to this article as no datasets were generated or analyzed during this study.
Supplemental material
Supplemental material for this article is available online.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
