Abstract
Background:
Brain atrophy (BA) is a useful predictor of clinical outcomes in people with multiple sclerosis (PwMS). For this reason, MAGNIMS (Magnetic Resonance Imaging in Multiple Sclerosis), an expert consensus group, recommended that global brain volume loss (BVL) is included as a secondary outcome in therapeutic clinical trials. However, there has not been a recent review of the evidence of the association, or strength of association, between global BA and disability in PwMS.
Objectives:
Our aim is to review articles from 2013 onward measuring the associations between percentage of brain volume loss (PBVL), normalized brain volumes (NBV) or normalized brain parenchymal volume (NBPV), and the Expanded Disability Status Scale (EDSS), or disability progression (DP) measured by EDSS in PwMS.
Design:
Systematic review.
Methods:
We searched Medline, Embase, Cochrane Library, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews and Cumulative Index to Nursing and Allied Health Literature for observational studies, clinical trials and modelling studies measuring the association between global BVL, PBVL, NBV or NBPV, and EDSS score or DP in PwMS. We included people with clinically isolated syndrome and excluded studies with a population greater than 20% primary progressive multiple sclerosis patients.
Results:
We found 58 studies were eligible for the review. Most longitudinal studies (19/23) observed a significant association between global BVL and change in EDSS score or DP. Similarly the majority of cross-sectional studies (26/29) observed an association between baseline BV measures and EDSS. Most studies investigating the association between baseline brain volume (BV) measures and follow-up EDSS, that is, asking if baseline BV is a predictor of DP, or future EDSS score, did not find an association (4/15 observed an association).
Conclusion:
Around a 1% (range 0.4%–1.3%) decrease in global BV per year was associated with DP, but caution in comparing studies is recommended due to variations in the definition of DP.
Introduction
Multiple sclerosis (MS) is a widespread neuroinflammatory disease associated with progressive neurodegeneration. Typically onset is at a young age. 1 Symptoms include altered sensation such as numbness, tingling, itching or pins and needles, impaired coordination and balance, weakness, muscle spasms or cramps, fatigue, pain, and visual disturbances.2,3
Brain atrophy (BA) in people with multiple sclerosis (PwMS) was first observed in the early 1960s. 4 By the mid-1990s magnetic resonance imaging (MRI) was being used to quantify BA, and its association with physical disability. 5 These early studies used two-dimensional scans to measure BA.5,6 More recent studies using three-dimensional scans with analysis by Structural Image Evaluation Using Normalization of Atrophy (SIENA) software suggest that PwMS will lose around 0.5%–1.35% of brain volume (BV)/year, whereas for an average person without MS around 0.1%–0.3% of BV/year is lost due to normal ageing.7 –9 Measurements of BA are thought to represent the net effect of all the degenerative processes occurring in MS10 –12 and seem to be better predictors of clinical outcomes than other MRI measures.6,13 –22 For example, BA seems to be a better marker of clinical disability than conventional lesion measures,23,24 and people with higher levels of BA are more likely to progress from clinically isolated syndrome (CIS) to MS.15,25,26 BA has been associated with increased disability and worsening cognition,11,17,20,21,27 –41 poorer quality of life,42,43 increased fatigue 44 and poorer economic outcomes.45,46 Several reviews (2009–2016) have described associations between BA and cognition but just three considered physical disability.23,47 –50 The MAGNIMS (Magnetic Resonance Imaging in Multiple Sclerosis) group recently recommended that global brain volume loss (BVL) is used to define and predict MS severity and is included as a secondary outcome in therapeutic clinical trials. 51
Over the last four decades, the most widely used instrument to assess disability progression (DP) in people with MS (PwMS) has been the Expanded Disability Status Scale (EDSS). 52 It is an ordinal rating system ranging from 0 to 10. A frequent criticism of the EDSS is the unequal interval distances between points, for example, the difference between the values 1 and 2 has a different relevance to 6 and 7. This is because the lower end of the scale (0–4) measures neurological impairments, 4–6 indicates impacts on walking ability, 6–7 indicates loss of walking other than a few metres and by 7–7.5 wheelchair dependency. 53 Furthermore, EDSS is only weakly correlated with neuropsychological impairment or patient-reported outcomes. 53
There are several approaches to, and technical challenges in, quantifying global BA using MRI. 24 A longitudinal approach analyses two or more MRI scans on the same individual at different time-points that are spatially matched and subtracted. The brain surface is automatically modelled and registered to the outer skull to compare and normalize serial images. Percentage of brain volume change (PBVC) is based on changes in the brain surface relative to the skull and normalized to the skull size. SIENA is the most frequently used image-processing software. 54 BA can also be estimated or inferred from a single MRI scan. The image is registered to a standard space and normalized to the skull surface and tissue is segmented into regions of brain and non-brain tissue. 55 The most frequently used software is Structural Image Evaluation, using Normalization of Atrophy-Cross-Sectional (SIENAX). 54 The normalized brain volume (NBV) measurements can then be categorized as increased or decreased in volume relative to controls or a standard brain atlas and compared with EDSS scores, or compared between groups such as PwMS with or without DP defined by EDSS. Alternatively, the image may be processed using a proportion-based method to estimate the proportion of intracranial volume occupied by brain tissue, that is, the brain parenchymal fraction or normalized brain parenchymal volume (NBPV). 56
An earlier review looked at the associations between BA and physical disability using studies published prior to 2013. 48 Our aim is to build on this review by systematically reviewing the associations between global BA and physical disability measured by EDSS in people with MS, particularly CIS/RRMS, from 2013 onwards. Our primary objective is to compare longitudinal studies measuring the association between change in whole BV and change in EDSS score or DP measured by EDSS. Our secondary aim is to compare cross-sectional studies measuring the association between normalized whole BV, or fraction, and EDSS score. A further aim is to compare longitudinal studies measuring the association between baseline normalized whole BV, or fraction, and future EDSS score at follow-up. As BA is increasingly used as an outcome in clinical trials, 51 we need to improve our understanding of the relationship between BA and physical disability.
Methods
Protocol
The protocol was designed according to the preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement 57 and finalized in January 2022 before the final literature search was undertaken.
Inclusion criteria
People aged 18 or over, diagnosed with CIS or MS.
Longitudinal observational studies, clinical trials and modelling studies measuring the association between change in whole BV and EDSS score or DP defined by EDSS score. Confirmed or non-confirmed DP was included.
Cross-sectional studies measuring the association between NBV, or fraction, and EDSS score or DP defined by EDSS score.
Original article written in English and published between 1 January 2013 and 3 February 2022 in a peer reviewed journal.
Exclusion criteria
Study population includes 20% or more people with primary progressive multiple sclerosis (PPMS). This was a pragmatic decision to keep the review focused on CIS/RRMS.
Studies measuring the association with cord or regional atrophy only.
Studies where EDSS was only reported as part of the composite ‘No Evidence of Disease Activity’ (NEDA). (NEDA-3 is defined by no new MRI lesions, no new relapses and no change in EDSS over 1 year. NEDA-4 is defined by no clinical relapse, no confirmed disability progression, no new or enlarging T2 lesion on brain MRI and no brain volume loss ⩾0.4% per year.)
Studies where the association was not reported separately for PwMS and healthy controls.
The review is focused on CIS/RRMS because BVL is thought to proceed at a faster rate in the earlier stages and disease modifying therapies (DMTs) are more effective at this time. However, many studies were not restricted cleanly to CIS/RRMS and sometimes patients progressed to secondary progressive multiple sclerosis or PPMS during follow-up. We wanted to include as many papers as possible that focused on people with CIS/RRMS so made a pragmatic, a priori, decision to exclude studies with more than 20% of people with PPMS diagnoses.
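The NEDA-3 and NEDA-4 composites named in the exclusion criteria can be read as simple rule checks over one year of follow-up. The following is a minimal sketch of those two rules as worded above, not an implementation taken from any included study; the record type and its field names are hypothetical, and brain volume loss is assumed to be expressed as a positive percentage per year.

```python
from dataclasses import dataclass


@dataclass
class YearOfFollowUp:
    """Hypothetical one-year summary for a single patient."""
    new_or_enlarging_lesions: bool   # new MRI activity during the year
    relapse: bool                    # any clinical relapse during the year
    edss_changed: bool               # any change in EDSS (used by NEDA-3)
    confirmed_dp: bool               # confirmed disability progression (NEDA-4)
    annual_bvl_percent: float        # brain volume loss, positive = loss


def meets_neda3(year: YearOfFollowUp) -> bool:
    # No new MRI lesions, no new relapses and no change in EDSS.
    return not (year.new_or_enlarging_lesions or year.relapse or year.edss_changed)


def meets_neda4(year: YearOfFollowUp) -> bool:
    # No relapse, no confirmed DP, no new or enlarging T2 lesion,
    # and brain volume loss below 0.4% per year.
    no_activity = not (year.relapse or year.confirmed_dp
                       or year.new_or_enlarging_lesions)
    return no_activity and year.annual_bvl_percent < 0.4


stable_but_atrophic = YearOfFollowUp(False, False, False, False, 0.5)
print(meets_neda3(stable_but_atrophic), meets_neda4(stable_but_atrophic))
```

In this illustration a patient with no clinical or lesion activity still fails NEDA-4 because of the brain volume loss threshold, which is why EDSS cannot be separated out when only the composite is reported.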
Electronic searches
We searched Medline, Embase, Cochrane Library, Cochrane Central Register of Controlled Trials (CENTRAL), Cochrane Database of Systematic Reviews (CDSR) and Cumulative Index to Nursing and Allied Health Literature (CINAHL). The full search strategy is listed in Table 1.
Literature search terms for review of the association between global brain atrophy and the EDSS score in people with multiple sclerosis.
Duplicates between Embase and MEDLINE removed.
EDSS, Expanded Disability Status Scale; MS, multiple sclerosis; PPMS, primary progressive multiple sclerosis; RRMS, relapsing-remitting multiple sclerosis; SPMS, secondary progressive multiple sclerosis.
Screening process
PROQUEST was used to remove duplicate articles from Medline and Embase, which were then combined with articles from Cochrane and CINAHL. Additional duplicates were removed by searching within Excel. The initial title and abstract screening consisted of two researchers independently completing a standard data collection form. Any discrepancies between the researchers were resolved by discussion. Full-text articles were obtained for all the abstracts selected for the second screen. The second screening of the full-text articles was undertaken independently by two researchers followed by discussion to reconcile any differences. A third reviewer was available in case the two reviewers did not reach agreement but was not required.
Data extraction
A standardized Excel spreadsheet was used to capture the data listed in Box 1. The data were extracted by one researcher and validated by a second researcher.
Data extracted from each study.
BA, brain atrophy; EDSS, Expanded Disability Status Scale; MRI, magnetic resonance imaging; MS, multiple sclerosis; PwMS, people with MS.
Data analysis
Studies were grouped first according to study design, that is, longitudinal or cross-sectional. Longitudinal studies, which measured change or loss in BV as percentage of brain volume loss (PBVL), were then grouped according to the way EDSS was used to describe increase in disability: either as change in EDSS score or as DP. If EDSS scores were transformed into DP then further groupings were used to count how many studies had equivalent definitions of DP given the range of baseline EDSS reported for the study. Cross-sectional studies were also further grouped into those comparing baseline BA with baseline EDSS and those comparing baseline BA with follow-up EDSS. Within these four groups, the studies were described and compared qualitatively. Results that were methodologically aligned and used similar outcome measures were reported individually to allow for comparisons.
Results
Screening process
The screening process is shown in Figure 1. Briefly, 3114 abstracts were screened resulting in 715 papers and abstracts for full screening. At this point, the review was limited to full-text articles published from 2013 onwards resulting in 320 articles for full-text screening. In total 164 articles published before 2013 and 231 conference abstracts published from 2016 onwards were not considered further. The full-text screening resulted in 197 articles measuring BA; 131 of which also focused on disability outcomes. Studies only addressing regional BA, or not reporting on the associations with EDSS or DP, or with a study population that included more than 20% PPMS diagnoses were excluded. The remaining 58 studies were included in this review.
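The counts in the screening paragraph can be checked for internal consistency with a few lines of arithmetic. This is only a sketch of such a check using the figures quoted above; the variable and function names are our own and nothing here reproduces Figure 1.

```python
# Counts quoted in the screening paragraph.
abstracts_screened = 3114
selected_for_full_screening = 715
articles_before_2013 = 164
conference_abstracts = 231
full_text_screened = 320
measuring_ba = 197
with_disability_outcomes = 131
included = 58


def remaining_after_limits(selected: int, pre_2013: int, conference: int) -> int:
    """Records left once pre-2013 articles and conference abstracts are set aside."""
    return selected - pre_2013 - conference


def is_non_increasing(counts: list[int]) -> bool:
    """Each screening stage should retain no more records than the one before."""
    return all(a >= b for a, b in zip(counts, counts[1:]))


expected_full_text = remaining_after_limits(
    selected_for_full_screening, articles_before_2013, conference_abstracts
)
stages = [abstracts_screened, selected_for_full_screening, full_text_screened,
          measuring_ba, with_disability_outcomes, included]

print(expected_full_text == full_text_screened)  # 715 - 164 - 231 matches 320
print(is_non_increasing(stages))
```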

Flowchart of the selection process.
Study designs and patient characteristics
The study designs and patient characteristics are summarized in Table 2. Twenty-three were longitudinal analyses and 37 were cross-sectional. Two studies included both cross-sectional and longitudinal analyses.58,59 The longitudinal studies had slightly larger study populations and included slightly younger PwMS with slightly shorter disease duration compared to cross-sectional and follow-up studies (Table 2).
Summary of study population characteristics by study design.
EDSS, Expanded Disability Status Scale; PPMS, primary progressive multiple sclerosis; PwMS, people with multiple sclerosis.
Methods and results of longitudinal studies
The methods and results for the longitudinal studies are described in Table 3. The patient characteristics are summarized in Table 4. All studies measured BA as PBVL but one study categorized PBVL for analysis. 60 Eight studies investigated the association between PBVL and change in EDSS; 6 of which observed a significant association (p < 0.05). Two small studies were inconclusive, one observed a non-significant association (n = 38) 61 and another observed no change in EDSS (n = 16). 62 Fifteen studies investigated the association between PBVL and DP defined by changes in EDSS score; 13 of which reported a significant association between PBVL and DP. One study observed a non-significant association (n = 62), 63 and one study observed no difference in NBV (n = 82). 64 There were six different definitions of DP (see 1–6, Table 5). Twenty longitudinal studies were retrospective analyses of existing cohorts and clinical studies. Three longitudinal studies were prospective, two of which observed a significant association between PBVL and DP46,65 and one observed a significant association between PBVL and EDSS with Icobrain software but not SIENA. 59 All except one of the longitudinal studies used SIENA software (one used FreeSurfer 66 ) but two used other software in addition to SIENA (Icobrain 59 and SPM12 62 ). Eighteen studies used regression or survival models adjusted by various combinations of sex, age, education, disease phenotype, disease duration, study cohort, lesion volume, field strength and other baseline variables. The majority of the study populations were treated with DMT and just one study included mostly untreated patients. 46 Many studies also reported measures of regional BA (14/23, 61%) and regional atrophy was the main focus for many studies. 
A few studies reported additional measures of physical disability, specifically 9HPT and T25FW (4/23, 17%).61,62,67,68 Most studies did not report an annualized PBVL, tending instead to report the PBVL over the observation period or by categories, but for the five that did report annualized PBVL, the range was −0.48% ± 0.93 to −0.9% ± 1.0 with observation periods ranging from 2 to 7.5 years.
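Because most studies reported PBVL over the whole observation period, comparing them requires some way of converting a cumulative change into an annual rate. The sketch below shows two common conversions, a simple linear average and a compound rate; both are our own illustrative assumptions rather than the method of any included study, and the input of a cumulative −3% over 4 years is used only as an example.

```python
def annualized_linear(cumulative_pbvc: float, years: float) -> float:
    """Average percentage change per year, assuming a constant absolute rate."""
    if years <= 0:
        raise ValueError("years must be positive")
    return cumulative_pbvc / years


def annualized_compound(cumulative_pbvc: float, years: float) -> float:
    """Percentage change per year, assuming the same relative loss every year."""
    if years <= 0:
        raise ValueError("years must be positive")
    if cumulative_pbvc <= -100:
        raise ValueError("cumulative change must be greater than -100%")
    remaining_fraction = 1 + cumulative_pbvc / 100
    return (remaining_fraction ** (1 / years) - 1) * 100


linear = annualized_linear(-3.0, 4)      # -0.75% per year
compound = annualized_compound(-3.0, 4)  # very close to the linear value
print(round(linear, 2), round(compound, 2))
```

Over observation periods of a few years the two conversions differ only in the second decimal place, which is small relative to the standard deviations reported above.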
Description of analysis and outcomes of longitudinal studies reporting associations between PBVC and change in EDSS scores (raw EDSS score or categorized as disability progression).
CI, confidence interval; DI, disability improvement; DMT, disease modifying therapy; DP, disability progression; EDSS, Expanded Disability Status Scale; HR, hazard ratio; IQR, interquartile range; MRI, magnetic resonance imaging; NBV, normalized brain volume; OR, odds ratio; PBVC, per cent brain volume change; PBVL, percentage of brain volume loss; RCT, randomized controlled trial; SIENA, Structural Image Evaluation, Using Normalization, of Atrophy.
Patient characteristics at baseline for longitudinal studies reporting associations between PBVC and change in EDSS scores (raw EDSS score or categorized as disability progression) and other study data collected including treatment, measures of physical disability and regional brain atrophy.
Australia, Belgium, Canada, CZ, Finland, France, Germany, Greece, Israel, Lithuania, NL, Poland, Russia, Slovakia, SA, Switzerland, Sweden, Turkey, UK.
Germany, Italy, NL, Spain, Switzerland, UK (MAGNIMS).
Austria, Canada, Chile, Czech Republic, Denmark, Estonia, Finland, France, Germany, Italy, NL, Norway, Poland, Portugal, Russian Federation, Sweden, Switzerland, Turkey, Ukraine, UK, USA.
Australia, Austria, Belgium, Brazil, Bulgaria, Croatia, Czech Republic, Denmark, Finland, France, Germany, Greece, Italy, Lithuania, Latvia, Morocco, NL, Poland, Portugal, Russia, Saudi Arabia, Serbia and Montenegro, Turkey, Ukraine, USA; n/s not stated.
AZA, azathioprine; CIS, clinically isolated syndrome; CTh, cortical thickness; DGM, deep grey matter; DMF, dimethyl fumarate; DMT, disease modifying therapy; EDSS, Expanded Disability Status Scale; GA, Glatiramer acetate; GMF, grey matter fraction; GMV, grey matter volume; 9HPT, 9 hole peg test; IFNB-1a, interferon beta-1a; IQR, interquartile range; LVV, lateral ventricle volume; MAGNIMS, Magnetic Resonance Imaging in Multiple Sclerosis; MS, multiple sclerosis; NAWMV, normal appearing white matter volume; nCGMV, normalized cortical grey matter volume; nCV, normalized cortical volume; nGMV, normalized GMV; nLVV, normalized LVV; nSDGMV, normalized sub-cortical deep grey matter volume (includes Caudate nucleus, Putamen, Globus pallidus, Thalamus, Hippocampus, Nucleus accumbens, Amygdala); nWGMV, normalized whole grey matter volume; nWMV, normalized white matter volume; PBVC, percentage of brain volume change; PCVC, percentage cortical volume change; PFHWC, percentage frontal horn width change; PGMVC, percentage grey matter volume change; PHVC, percentage hippocampus volume change; PICDC, percentage intercaudate distance change; PPMS, primary progressive multiple sclerosis; PSCDGMVC, percentage sub-cortical deep grey matter volume change; PTVC, percentage thalamic volume change; PTVWC, percentage third ventricle width change; P4VWC, percentage fourth ventricle width change; PVVC, percentage ventricular volume change; RRMS, relapsing-remitting multiple sclerosis; T25FW, timed 25 feet walk; WMF, white matter fraction; WMV, white matter volume.
Definitions of DP.
DP, disability progression; EDSS, Expanded Disability Status Scale.
Methods and results of cross-sectional and follow-up studies
The methods and results for the 37 cross-sectional studies are described in Table 6. The patient characteristics are summarized in Table 7. Twenty-nine studies investigated cross-sectional associations between baseline BA measured by NBV, NBPV or WBV and baseline EDSS. Of these 29 studies, 26 observed a significant association (p < 0.05) but 2 of them had mixed results with different methods giving a different result.56,80 The three studies failing to observe an association were small (n = 29, 38 and 95),61,81,82 as were the studies with mixed results (n = 20 and 61).56,80 The correlations measured by Spearman’s rho, Pearson’s r or Kendall’s tau were weak for all studies (0.25–0.5) and the coefficients from regression models tended to be small. 83
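Spearman's rho, the correlation measure named above, is Pearson's r computed on ranks, which suits an ordinal scale such as EDSS. The following is a minimal, dependency-free sketch of that calculation with tied values given their average rank; the brain volume and EDSS values are invented for illustration and do not come from any included study.

```python
def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x: list[float], y: list[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x: list[float], y: list[float]) -> float:
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical normalized brain volumes (mL) and EDSS scores for six people.
nbv = [1580.0, 1545.0, 1510.0, 1495.0, 1460.0, 1430.0]
edss = [1.0, 2.0, 1.5, 3.5, 2.0, 6.0]
print(round(spearman_rho(nbv, edss), 2))
```

A negative coefficient here would indicate that lower brain volume accompanies higher EDSS, which is the direction of association the cross-sectional studies describe.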
Description of analysis and outcomes of cross-sectional and follow up studies reporting associations between global brain volume and EDSS scores (raw EDSS score or categorized as DP).
BPV, brain parenchymal volume; BV, brain volume; CS, clinically stable; CW, clinically worsened; DP, disability progression; EDSS, Expanded Disability Status Scale; HR, hazard ratio; MRI, magnetic resonance imaging; NBPV, Normalized brain parenchymal volume; NBV, normalized brain volume; NCSFV, normalized cerebrospinal fluid volume; OASIS, Open Access Series of Imaging Studies; PBVL, percentage brain volume loss; SIENAX, Structural Image Evaluation, using Normalization of Atrophy-Cross-Sectional; WBV, whole brain volume.
Patient characteristics for cross-sectional and follow-up studies reporting associations between global brain volume and EDSS scores (raw EDSS score or categorized as DP) and other study data collected including treatment, measures of physical disability and regional brain atrophy.
Germany, Italy, NL, Spain, Switzerland, UK (MAGNIMS).
Denmark, Finland, France, Germany, Italy, Poland, Portugal, Spain, Switzerland, UK.
Argentina, Australia, Austria, Belgium, Brazil, Canada, Egypt, France, Germany, Greece, Hungary, Italy, Republic of Korea, Portugal, Spain, Switzerland, UK, USA.
AI, ambulation index; ALEM, alemtuzumab; AZA, azathioprine; CCA, corpus callosum area; CCf, corpus callosum fraction; CCI, corpus callosum index; CCV, corpus callosum volume; CGM, cortical grey matter; CIS, clinically isolated syndrome; CTh, cortical thickness; CYC, cyclophosphamide; DGM, deep grey matter (thalamus, putamen, globus pallidus, caudate and amygdala); DMF, dimethyl fumarate; DMT, disease modifying therapy; EDSS, Expanded Disability Status Scale; FTY, fingolimod; GA, Glatiramer acetate; GMf, grey matter fraction; GMV, grey matter volume; 9HPT, 9 hole peg test; IFNB-1a, interferon beta-1a; IQR, interquartile range; LVV, lateral ventricle volume; MAGNIMS, Magnetic Resonance Imaging in Multiple Sclerosis; MEDW, medullary width; MS, multiple sclerosis; MSWS, MS walking scale; MTX, methotrexate; NAT, natalizumab; NAWMV, normal appearing white matter volume; nBstV, normalized brain stem volume (includes Midbrain, Pons, Medulla oblongata); nCGMV, normalized cortical grey matter volume; nCV, normalized cortical volume; nDGMV, normalized deep grey matter volume; nGMV, normalized GMV; nLVV, normalized LVV; nSDGMV, normalized sub-cortical deep grey matter volume (includes Caudate nucleus, Putamen, Globus pallidus, Thalamus, Hippocampus, Nucleus accumbens, Amygdala); nWMV, normalized white matter volume; PPMS, primary progressive multiple sclerosis; RRMS, relapsing-remitting multiple sclerosis; RTX, rituximab; Tf, thalamic fraction; TFM, teriflunomide; TV, Thalamic volume; SPMS, secondary progressive multiple sclerosis; TVW, third ventricle width; T25FW, timed 25 feet walk; WMf, white matter fraction; WMV, white matter volume.
Fifteen studies investigated the association between baseline BA and follow-up EDSS or DP. Of these 15 studies, 5 observed a significant association (p < 0.05),59,79,89,91,94 8 did not and 2 had mixed results according to the DP outcome definition 105 or the follow-up time. 110
Sixteen of the 37 cross-sectional studies reported NBV (a registration-based method), 17 studies reported NBPV (a proportion-based method) and 2 studies reported BV or whole BV but were unclear if this was normalized.103,108 Two studies reported normalized cerebrospinal fluid volume (NCSFV) which is directly related to NBPV. NBPV is calculated as (WMV + GMV)/(WMV + GMV + CSFV), where WMV, GMV and CSFV are the white matter, grey matter and cerebrospinal fluid volumes; therefore, as CSFV increases NBPV decreases. Two studies defined BA as part of the analysis, one referenced to healthy controls 90 and the other to the Open Access Series of Imaging Studies cohort.91,113
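The relationship between NBPV and cerebrospinal fluid volume described above can be made concrete with a short calculation. This is a minimal sketch of the proportion-based ratio only; the function name and the tissue volumes (in mL) are hypothetical, and real pipelines derive these volumes from image segmentation.

```python
def nbpv(wmv: float, gmv: float, csfv: float) -> float:
    """Brain parenchyma as a proportion of parenchyma plus CSF volume."""
    if min(wmv, gmv, csfv) < 0:
        raise ValueError("volumes must be non-negative")
    parenchyma = wmv + gmv
    total = parenchyma + csfv
    if total == 0:
        raise ValueError("total volume must be positive")
    return parenchyma / total


# Hypothetical volumes in mL: the same intracranial space, with tissue
# replaced by CSF at the second time point.
earlier = nbpv(wmv=500.0, gmv=600.0, csfv=300.0)
later = nbpv(wmv=490.0, gmv=585.0, csfv=325.0)
print(round(earlier, 3), round(later, 3))  # the fraction falls as CSF rises
```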
The majority of cross-sectional studies used SIENAX (20), followed by SPM5 or SPM8 (6), FreeSurfer (5), Icobrain (2), ScanView (1), SyMap (1), MSmetrix (1), BRAIM (1), cNeuro (1) and 2 did not state the image analysis software used. Some studies used more than one type of image analysis software. The majority of the study populations reported treatment with DMT (23/37, 62%) but many studies did not report treatments (14/37, 38%). Most studies also reported measures of regional atrophy (30/37, 81%). Some studies reported additional measures of physical disability, specifically 9HPT, T25FW, MSWS, postural sway and ambulation index (10/37, 27%). Of the 29 cross-sectional studies investigating baseline associations, 2 dichotomized EDSS scores and 27 used EDSS score directly. Nine studies used regression analysis, and the most frequently used measure of correlation was Spearman’s rho (12/29) or Pearson’s r (8/29). Many studies adjusted for combinations of age, gender, disease duration, phenotype and scanner (11/29). Of the 15 follow-up studies 9 categorized EDSS scores for analysis, 6 according to DP (definitions 2, 4, 7–9 in Table 5) and 3 in other ways, such as reaching EDSS = 4, or time to reach EDSS = 4. Ten follow-up studies used regression or survival analysis adjusting for combinations of age, gender, T2 lesion volume and disease duration.
Summary of longitudinal findings
The analytical approaches varied between studies, making between-study comparisons difficult other than the observation of an association, or not. However, for those studies amenable to comparisons, that is, reported associations between annual or annualized PBVC and EDSS, the outcomes were consistent as listed below (Table 3).
0.13% BVL per year associated with a 1 point increase in EDSS. 11
0.4% BVL per year distinguished patients with zero mean annual change in EDSS from those with a mean annual EDSS increase of 0.14. 49
1.0% BVL per year in patients with DP versus 0.28% without DP. 71
1.1% BVL per year in patients with DP versus 0.8% without DP or 0.7% with disability improvement. 67
1.26% BVL per year with DP and 0.19% without DP. 66
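The three comparisons above that report annual BVL separately for patients with and without DP can be placed side by side to show the size of the gap. This is a simple tabulation of the figures as listed, with hypothetical variable names; it is not a pooled estimate, since the studies define DP differently.

```python
# (reference number, annual BVL % with DP, annual BVL % without DP), as listed above.
reported = [
    (71, 1.0, 0.28),
    (67, 1.1, 0.8),
    (66, 1.26, 0.19),
]


def excess_loss(rows):
    """Difference in annual BVL between patients with and without DP."""
    return [(ref, round(with_dp - without_dp, 2)) for ref, with_dp, without_dp in rows]


def dp_range(rows):
    """Lowest and highest annual BVL reported for patients with DP."""
    with_dp = [row[1] for row in rows]
    return min(with_dp), max(with_dp)


for ref, gap in excess_loss(reported):
    print(f"ref {ref}: {gap} percentage points more BVL per year with DP")
print("annual BVL with DP ranges from", *dp_range(reported))
```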
Discussion
Main findings
Our main finding is that the majority (19/23) of longitudinal studies observed a significant association (p < 0.05) between PBVL and change in EDSS or DP defined by EDSS score. Looking more closely at the four studies failing to observe an association between PBVL and change in EDSS or DP, all were inconclusive, possibly as a consequence of being underpowered. Our second main finding is that the majority of cross-sectional studies observed a significant association between NBV or NBPV and EDSS scores (26/29), although two studies reported mixed results.56,80 The three studies failing to observe any association between baseline NBV81,82 or NBPV 62 and EDSS were all small studies and potentially underpowered.
As far as we are aware, the previous reviews on this topic were published in 2016 but only cover the literature up to 2013.23,48 These reviews both observe a consistent association, but there is a range of approaches to measuring regional and global atrophy as well as a range of disability outcome measures. A substantial body of literature has grown over the last decade, and this review provides increasing evidence of a consistent association between PBVL and change in EDSS or DP, and BA measured by NBV or NBPV and EDSS scores, that is, between objective and clinically meaningful measurements. Furthermore, this review has made some progress in quantifying the magnitude of the association between PBVL and EDSS in PwMS. Two prospective studies observed a consistent and significant association between PBVL and DP (−3% at 4 years and −3.8% at 5 years)65,78 and three retrospective analyses also had similar findings, for example, 1.0%, 71 1.1% 67 and 1.26% 66 annual PBVC with DP.
Implication of multiple definitions of DP
Most studies were predominantly RRMS diagnoses (due to studies with >20% PPMS patients being excluded) and typically had mean EDSS scores around 2. The range of baseline EDSS scores was 0–9. The variation in definition of DP (Table 5) may be a substantial source of bias for this review. (However, the 8 longitudinal studies reporting change in EDSS and the 29 cross-sectional studies reporting baseline EDSS would not be affected by this bias.) For example, 8/15 longitudinal studies using DP included people with EDSS of zero at baseline yet 5/15 studies defined DP as a 1.5 point change from baseline, while 10/15 defined DP as a 1 point change. At the other end of the EDSS range 10/15 studies varied the baseline EDSS score between ⩾5, ⩾5.5 or ⩾6, where a 0.5 point increase rather than a 1 point increase defined DP. Patients with EDSS score below 5 are fully ambulatory and the functional systems scores are the main determinants of EDSS. Given that ambulation is the main determinant of change in EDSS when scores are above 5, and scores are less sensitive to change in patients with higher EDSS scores, 53 categorizing EDSS into DP might be a more useful outcome measure than change in EDSS. However, if the definitions are not aligned it will lead to misclassification and difficulty in comparing studies. Altogether, there were nine definitions of DP in the papers included in this review and future studies would benefit from a universally agreed definition of DP. The European Medicines Agency recommends 1 point on the EDSS scale with a baseline EDSS score less than or equal to 5.5 and 0.5 points in an EDSS score over 5.5. 53
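The misclassification that arises from unaligned definitions can be illustrated with a small rule-based sketch. The function below is our own hypothetical parametrization covering the kinds of variation described above (the step required from a baseline of zero, and the baseline score at which a 0.5 point increase counts); its defaults follow the European Medicines Agency wording quoted above, and it does not reproduce the nine definitions in Table 5.

```python
def is_dp(baseline: float, follow_up: float, *,
          step_at_zero: float = 1.0,
          high_cutoff: float = 5.5,
          cutoff_inclusive: bool = False) -> bool:
    """Return True if the EDSS increase counts as disability progression.

    Defaults: 1 point if baseline is 5.5 or below, 0.5 points if above 5.5.
    step_at_zero lets a definition demand a larger step from a baseline of 0;
    cutoff_inclusive switches the 0.5 point rule on at the cutoff itself.
    """
    if cutoff_inclusive:
        high_baseline = baseline >= high_cutoff
    else:
        high_baseline = baseline > high_cutoff
    if high_baseline:
        required = 0.5
    elif baseline == 0:
        required = step_at_zero
    else:
        required = 1.0
    return follow_up - baseline >= required


# Same patient, different definitions, different classification.
print(is_dp(0.0, 1.0), is_dp(0.0, 1.0, step_at_zero=1.5))
print(is_dp(5.5, 6.0), is_dp(5.5, 6.0, cutoff_inclusive=True))
```

In both pairs the same change in EDSS is counted as DP under one rule and not under the other, which is the source of the difficulty in comparing studies.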
Statistical methods and covariates
Most studies reported PBVL over a time period rather than annualized. This is probably because BVL does not progress linearly; 114 nonetheless, over a short time period annualization is a helpful way to compare study outcomes. Some statistical analyses (Tables 3 and 6) assumed a linear relationship but not all. Most studies used models that adjusted for various combinations of sex, age, education, disease phenotype, disease duration, study cohort, lesion volume, DMT or other treatment and MRI field strength. We hypothesize that DMTs mostly have an impact on PBVL or NBV or NBPV which in turn affects EDSS or DP rather than directly affecting EDSS via another mechanism. This is speculative but as none of the studies have consistent usage of DMT or other treatments, we have to be pragmatic and suggest that the best approach is to report associations both with and without adjustment for treatments. If our hypothesis is correct and treatments are affecting EDSS mainly, or solely, through modifying BVL then models adjusted for treatment or unadjusted will provide similar results. Most studies reported both univariate and multivariate associations, and these were generally in agreement (Tables 3 and 6). Ideally a study would adjust for variables associated with EDSS, change in EDSS or DP but not variables solely affecting PBVL, NBV or NBPV.
Regional atrophy considered in the studies
We only considered global BA in this review but 44/58 included studies also considered regional atrophy, often as the main focus of the study, plus a further 39 studies were identified in screening that only addressed regional atrophy. The quantity of studies identified by the literature search pointed to the need for a separate review to consider the association between EDSS and regional BA. Different regions of the brain show different degrees of atrophy relative to global BA. 115
The most frequently reported regional measures were the corpus callosum area or volume, cortical thickness, grey matter (GM) and deep GM volumes, white matter volume, lateral ventricle volumes, thalamic volumes and third ventricle width (Tables 3 and 6). BA in the deep GM can be detected earlier than whole BA 62 and the mid-sagittal cross sectional area of the corpus callosum seems to be particularly affected by MS.116 –118 Atrophy in the thalamus appears to proceed faster than in other brain regions.27,98,119 Volume loss in the GM is more related to neuro-axonal loss and neuronal shrinkage than demyelination.120,121 This may make GM changes less susceptible to the confounding that can affect PBVL, NBV or NBPV, which may be altered by active inflammation, vasogenic oedema, dehydration and gliosis, or decreased by anti-inflammatory drugs.12,24,122,123 All the studies in this review addressed potential confounding by pseudo-atrophy by excluding PwMS with recent relapse or use of anti-inflammatory drugs and observing changes over at least 1 year.
Baseline BA as a predictor of future DP
Most studies found no association between baseline NBV or NBPV and DP at follow-up. NBV or NBPV at a single point in time may reflect the prior BA rate but the future atrophy rate may differ. Many studies adjusted for age, gender and disease duration, but disability and BA at follow-up may also be influenced by treatment and baseline EDSS.
Limitations of the review
Although we find evidence of a consistent association between BVL and EDSS, we are unable to compile an estimated effect size because of the wide range of methodological approaches. None of the longitudinal studies had the primary aim of measuring the association between BVL and EDSS. Furthermore, most studies were retrospective analyses of data collected for other purposes. Quality is generally evaluated with respect to the primary outcome of the study. The best way to estimate the effect size using these data would be a meta-analysis of the original data. We could potentially divide the studies into groups, where DP happened to be equivalent by coincidence of the definition and the baseline EDSS, but this would make several small groups and is unlikely to be meaningful.
Conclusion
Our review of 58 studies consistently finds an association between PBVL, NBV or NBPV and EDSS scores or DP defined by EDSS score. It supports the MAGNIMS recommendation that global BA should be included as a secondary outcome in therapeutic clinical trials. 51 We further recommend that a consensus on the definition of DP is agreed to allow for more accurate quantification of the magnitude of the relationship between DP defined by EDSS scores and MS going forward. Baseline NBV or NBPV does not appear to reliably predict future DP or EDSS score.
