Abstract
Background:
Existing severity measurements in multiple sclerosis (MS) are often cross-sectional, making longitudinal comparisons of disease course between individuals difficult.
Objective:
The objective of this study is to create a severity metric that can reliably summarize a patient’s disease course.
Methods:
We developed the nARMSS – normalized ARMSS (age-related MS severity score) over follow-up, using the deviation of individual ARMSS scores from the expected value and integrated over the corresponding time period. The nARMSS scales from −5 to +5; a positive value indicates a more severe disease course for a patient when compared to other patients with similar disease timings.
Results:
Using Swedish MS registry data, the nARMSS was tested using data at 2 and 4 years of follow-up to predict the most severe quartile during the subsequent period up to 10 years total follow-up. The metric used was area under the curve of the receiver operating characteristic (AUC-ROC). This resulted in measurements of 0.929 and 0.941. In an external Canadian validation cohort, the equivalent AUC-ROCs were 0.901 and 0.908.
Conclusion:
The nARMSS provides a reliable, generalizable and easily measurable metric which makes longitudinal comparison of disease course between individuals feasible.
Introduction
Multiple sclerosis (MS) is a chronic, neurodegenerative disorder of the central nervous system which requires lifelong care. As MS has a heterogeneous disease course, it is important to determine a patient’s current and potential future severity in order to understand if interventions affect the disease course and to mitigate progression of disease. The most commonly used disability measure in both MS clinical practice and clinical trials is the expanded disability status scale (EDSS). The EDSS is an ordinal scale ranging from 0 (no neurological deficits) to 10 (death due to MS) and includes an assessment of eight functional systems by a neurologist during a clinical examination. 1 Due to limitations of the cross-sectional nature of the EDSS score, the MS severity score (MSSS) 2 and age-related MS severity score (ARMSS) 3 were created which enable a patient’s EDSS to be ranked by years since disease onset or age, respectively. Compared to the MSSS score, the ARMSS may be more versatile as, instead of the date of MS symptom (which is often missing or imprecise), an individual’s age is used. This may allow for inclusion of more patients within a given cohort. Also, use of the ARMSS score could eliminate any bias or imprecision introduced by the need for data on disease duration. However, there is still a need for more accurate disease severity and progression measurements that capture a complete overview of a patient’s disease, regardless of when they are measured.
There has been a notable lack of a severity score to directly compare patient’s overall disease course without undue sensitivity to follow-up time. The combinatorial weight-adjusted disability (COMBIwise) score was one notable effort to create a measure to predict future disability based on several criteria measured early in a patient’s follow-up. 4 Although it made some progress towards a predictive score, some limitations were present such as (1) the requirement of precise information for the calculation, thereby reducing patient inclusion and (2) its comparatively low accuracy at predicting future disability. 5 In addition, an early effort to use serial EDSS measurements to construct a single metric based on area under the curve as an outcome for clinical trials did not attain widespread use. 6
Hence, we sought to use ARMSS to build a more comprehensive metric that would give a cross-sectional view of disease. The aim of this study was to investigate the performance of this new measure, normalized ARMSS (nARMSS), which we developed in order to create an instant overview of a patient’s course as well as to provide a potentially enhanced ability to predict future disability. This score can be calculated solely from EDSS/ARMSS early in as well as throughout follow-up and shows strong correlation between early and future disability.
Materials and methods
This study included individuals with MS from two large clinical cohorts. The first, the Swedish MS registry (Swedish cohort) is a National registry of patients diagnosed with MS based on the McDonald criteria. 7 The registry is voluntary, and the vast majority of neurologists in Sweden electively participate, resulting in the inclusion of nearly 85% of MS patients in Sweden. 8
To be included in the study, individuals were required to have complete data on sex, date of birth, date of disease onset (first manifestation of MS) as well as >1 EDSS score (and the date of EDSS capture). These data were used to calculate the ARMSS against a published reference matrix of global values, 3 using the R package ms.sev version 1.0.4.
In this study, 20,025 individuals in the Swedish cohort with 121,616 clinical visits which included an EDSS measurement were eligible, of which 14,160 individuals were included based on the above criteria.
The second cohort for validation was from British Columbia, Canada (Canadian cohort) and has been previously described.3,9,10 This cohort comprised 5989 eligible individuals.
Calculation of the nARMSS
The nARMSS was constructed based on serial EDSS scores for a patient and required at least two scores for calculation. Follow-up was defined as the time from the first recorded EDSS to the most recently recorded EDSS for all patients. First, an ‘ARMSS integral’ was calculated as the difference in the integrated area under the ARMSS scores from the expected median ARMSS of 5. The ARMSS integral can therefore be thought of as the area under the curve for all ARMSS measurements during follow-up in relation to the median value of 5. Positive values indicate an accumulation of disability greater than average for the patient’s age(s) of disease. The equation for ARMSS integral is
where
The ARMSS integral can be considered a relative measure for the total disability experienced by a patient over follow-up years. In order to compare patients when ages and follow-up time are different, the nARMSS is the ARMSS integral which has been normalized using follow-up time. Figure 1 illustrates a sample calculation based on a single patient’s EDSS and ARMSS scores.

Example of the construction of nARMSS using serial ARMSS scores for a given patient. The ARMSS integral (left) is determined by the areas in green less the areas in red. The nARMSS can be calculated (right) and is shown longitudinally.
The formula for the nARMSS is
As such, the score varies between +5 and −5, since the ARMSS is scaled between 0 and 10 and the ARMSS integral is determined based on the deviation from 5. The nARMSS is therefore the normalized ARMSS over follow-up relative to median, and each increase in one unit gives the average increase in the ARMSS (and thus decile) from the average patient with identical disease timings.
An R shiny app for calculating nARMSS directly from serial EDSS scores for an individual can be found online at https://aliman.shinyapps.io/nARMSS/.
We also determined the ranges of ARMSS and nARMSS for all individuals in the Swedish cohort, defined as the difference in maximum and minimum values over follow-up. This was calculated to determine if any reduction in the range of nARMSS was present over ARMSS. These were then compared to determine the compression of the range in the nARMSS relative to ARMSS.
nARMSS association to future disability
We sought to determine if nARMSS early in follow-up could predict nARMSS for a patient’s next years until 10 years of follow-up, without overlapping data. The following procedures were therefore conducted in both the Swedish and Canadian cohorts.
nARMSS at 2 and 4 years were then used to predict nARMSS during the next 8 and 6 years of follow-up, respectively. The outcomes were recoded as binary where the most severe quartile of patients was coded as 1, and patients with scores in the remaining quartiles were coded as 0. To maximize the number of patients included, at 2 and 4 years of follow-up, the closest chronological EDSS was used (ranging from 0.5 to 3 and 2 to 5 years of follow-up, respectively). We used area under the curve (AUC) of the receiver operating characteristic (ROC), a metric constructing using the curve of sensitivity and specificity, to determine how well the nARMSS was associated with future disability. This procedure was repeated for nARMSS at 8 years to predict the following 7 years of follow-up, that is, 15 total years, as a final check of the reliability of this method.
Similar methods were conducted using both the MSSS and EDSS values at 2 and 4 years to predict the most severe nARMSS quartile over the next years to 10 years total follow-up, resulting in four separate AUC-ROCs per cohort. This was repeated using the average EDSS for all visits up to the timepoint used in the 2 and 4 year cutoffs to predict identical Q4 nARMSS binary outcomes. The average MSSS was also tested in an identical manner to average EDSS, in order to determine the Q4 nARMSS predictive ability.
Finally, the same approach was applied using secondary progressive MS (SPMS) as the outcome, coded as 1 for SPMS and 0 for those who had not converted to SPMS, as determined retrospectively by a neurologist. This included all previous tests using both the 2 and 4 year timepoints, with the outcome being SPMS status after 10 years of follow-up. Individuals who were SP at the 2 or 4 year timepoint were removed from the analysis.
nARMSS comparison to other MS outcomes
The various quartiles of nARMSS were compared to other measurements that are often used as severity outcomes in MS studies. The mean values of nARMSS, EDSS, first SDMT (Symbol Digit Modalities Test) ever completed at any point during follow-up time and first ever MSIS-29 (multiple sclerosis impact scale-29) physical and psychological scores were compared for patients in nARMSS quartiles, defined as the last nARMSS value. These were repeated for quartiles of EDSS and MSSS, defined as the average values over entire the follow-up. The SDMT provides a standardized measurement of cognitive ability. 11 Since a learning effect from repeated testing has been noted, only the first measurement was used. The MSIS-29 is a self-assessment consisting of 29 questions covering physical, psychological and well-being. 12 Similarly, the first scores for physical and psychological symptoms were measured separately. The SDMT and MSIS-29 were only available for the Swedish cohort, as these have become part of the routine clinical assessment in Sweden since the introduction of the second generation of DMTs in 2006. 13
Data availability
Data from the Swedish MS registry used in this article can requested from the Karolinska Institutet. This requires both a data transfer agreement and required ethical permission facilitated between Karolinska Institutet and the institution requesting access to the data in accordance with the data protection legislation governing Europe, GDPR (General Data Protection Regulation). Researchers who are interested in obtaining data access should contact the corresponding author.
Results
Overall, we included 14,160 patients from the Swedish cohort and 5989 patients from the Canadian cohort. Sub-analyses used reduced sets of patients according to available data as indicated. Characteristics of the study population are presented in Table 1.
Characteristics of the Swedish and Canadian cohorts.
EDSS: expanded disability status scale; IQR: interquartile range; SD: standard deviation; MS: multiple sclerosis.
Swedish cohort data
The relationship between nARMSS from 2 to 10 years can be visualized in Figure 2. For each stratum of individuals after 2 years of follow-up, defined by nARMSS in 1-year intervals and shown in black, the corresponding values for the subsequent period are presented. The light bar gives the median values of the next 8 years of nARMSS for individuals in a given strata; that is, not using the initial 2 years of data. As can be seen in the figure, median nARMSS for the first 2 years of follow-up are very similar to the median values for the next 8 years of follow-up.

nARMSS comparison between 2 and 10 years of follow-up (n = 3556), without overlap.
The range of nARMSS for all individuals had a median of 1.17 points over all patient visits, compared with a median range of 3.24 for the underlying ARMSS recorded. This corresponds to a ratio of the nARMSS range of 0.36 to that of ARMSS, meaning that nARMSS ranges varied less than ARMSS over the same number of measurements.
Moving to the individual level, the ability of the nARMSS at 2 and 4 years of follow-up to predict the most severe patients in the fourth quartile of the nARMSS at 10 years are shown in Figure 3. The AUC-ROC values were 0.929 (95% CI = 0.920–0.939, n = 3419) for the nARMSS at 2 years predicting the next 8 years and 0.941 (95% CI = 0.932–0.949, n = 3489) for the nARMSS at 4 years predicting the next 6 years.

AUC for ROC of the nARMSS at (top) 2 years predicting Q4 of the nARMSS for next 8 years (n = 3419), (middle) 4 years predicting Q4 of the nARMSS for next 6 years and (bottom) 8 years predicting Q4 of nARMSS for next 7 years (n = 3489).
A determination of AUC-ROC stability over longer time periods in Figure 3 using the ROC of nARMSS at 8 years predicting Q4 of the nARMSS over the next 7 years resulted in an AUC of 0.951 (95% CI = 0.938–0.965, n = 1006).
The ability of nARMSS at 2 and 4 years to determine whether SPMS had been reached after 10 years of follow-up is shown in Supplemental Figure 1. The nARMSS at 2 years had a 0.662 (95% CI = 0.639–0.686, n = 877) AUC-ROC at 10 years, while the nARMSS at 4 years had a 0.663 (95% CI = 0.638–0.687, n = 772) AUC-ROC. These accuracies were surpassed by other predictors, with the EDSS at 2 years having an AUC-ROC of 0.754 (95% CI = 0.734–0.775, n = 2646) and the EDSS at 4 years having an AUC-ROC of 0.763 (95% CI = 0.741–0.784, n = 2540).
Complete data on the AUC-ROC for each test made in both cohorts are given in Supplemental Table 1. Supplemental Figure 1 illustrates all additional AUC-ROC plots using both Q4 of nARMSS and SP at 10 years of follow-up.
Validation in the Canadian cohort
Similar AUC-ROC curves were calculated in the Canadian cohort for nARMSS at 2 and 4 years to predict the most severe quartile after the subsequent years to 10 years of follow-up. These values were very similar to those of the Swedish cohort, with an AUC-ROC of 0.901 (95% CI = 0.877–0.924 for 2 years to predict next 8 years, n = 948) and 0.908 (95% CI = 0.886–0.929 for 4 years to classify the next 6 years, n = 904). All other variables used to predict the most severe nARMSS quartile showed slight reductions from the AUC-ROC values obtained in the Swedish cohort (Supplemental Table 1).
When predicting SPMS in the Canadian cohort, all variables showed slight increases compared with the Swedish cohort with the AUC-ROC obtained by nARMSS at 2 years being 0.732 (95% CI = 0.701–0.764) and at 4 years being 0.744 (95% CI = 0.712–0.775). The largest value of AUC-ROC in the Canadian cohort using SP status at 10 years was obtained with EDSS at 4 years (0.818, 95% CI = 0.790–0.845).
Comparison to other measures of disease severity
MS cases in the most severe nARMSS quartile had substantially lower first SDMT scores (mean: 42.7, n = 1341) when compared to the lowest nARMSS quartile (mean: 53.3, n = 2130). MSIS-29 physical and psychological scores were consistently lower in nARMSS Q4 (mean physical score of 2.84, mean psychological score of 2.64, n = 1436) than those in nARMSS Q1 (mean physical score of 1.57, mean psychological score of 1.95, n = 2265). Similar trends in lower SDMT scores in the most severe quartile compared to the least severe quartile were observed with EDSS (Q4 = 38.7, n = 813 compared to Q1 = 54.7, n = 2577) and MSSS (Q4 = 41.6, n = 1196 compared to Q1 = 53.6, n = 2261).
A sub-analysis of nARMSS for those with and without missing SDMT indicated that mean nARMSS was significantly different for all quartiles. Q4 showed the largest increase in nARMSS between non-missing and missing data (quartile: available, missing, p value – Q1: −1.43, −1.68, p < 0.001, Q2: −0.33, −0.34, p = 0.001, Q3: −0.004, 0.004, p = 0.01, Q4: 0.65, 1.17, p < 0.001). This indicates that more severe disability in Q4 is likely under-reported and that interquartile differences might increase with more complete data. This tendency towards less favourable outcomes with increasing quartiles of nARMSS is presented in Table 2.
Comparison of nARMSS, EDSS and MSSS by quartiles with the first recorded SDMT and MSIS-29 scores.
ARMSS: age-related MS severity score; EDSS: expanded disability status scale; SDMT: Symbol Digit Modalities Test; MSIS-29: MS impact scale-29; SD: standard deviation.
Discussion
The AUC-ROCs show that the nARMSS has a strong capacity to predict future disability, even when only 2 years of follow-up is available. This allows the nARMSS to be used when categorizing individuals in severity studies with an improvement in accuracy when compared to the use of cross-sectional metrics such as the first or the last available EDSS/ARMSS or MSSS. A potential use of the nARMSS is as an outcome measure in studies such as genome wide association studies (GWAS) for disease severity and progression. In this setting, where severity is not stable and therefore noisy, isolating the signal using the nARMSS may allow for more accurate and reproducible results. Patients can also be included in research studies regardless of their age at measurement and follow-up time, without affecting the results due to large fluctuations in these factors.
When comparing the ranges for the ARMSS and nARMSS for patients over the entire course of their follow-up, the nARMSS in effect compresses the variability of values. The median nARMSS variation is approximately a third of that of the variation in ARMSS measurements, denoting close to a two-third reduction in the instability of serial EDSS/ARMSS measurements. It is precisely this reduction in variation, despite the similar scales, which gives the nARMSS increased utility as an overall marker of disease progression even when determined early in follow-up.
The reasons for using the ARMSS to construct such a metric, instead of the MSSS are (1) the potential increased size of the available patient pool, since recorded patient information may lack onset date but not age and (2) the elimination of the risk of systematic bias due to retrospectively assigning the date of onset. While the nARMSS has some power to predict SPMS after 10 years, EDSS scores alone have increased AUC-ROCs, likely due to the fact that individuals with high EDSS after 2 or 4 years of follow-up are more likely to convert to SP within 10 years. Since we have removed individuals with SP at the measurement point, these patients who remained and had high EDSS are closer to phenotype conversion. Other more accurate methods of predicting SPMS exist, which exceed the accuracy demonstrated here. 14
Similarly, in a clinical setting, the nARMSS may be useful as an additional data point for neurologists to use in combination with other factors and might be used to determine if a patient is likely to have a milder or more severe disease course early in treatment. Given data showing the importance of early treatment, this could be useful in clinical practice when making treatment decisions for early diagnosed patients. 15 However, since there are always exceptions to strong trends, clinical application should be undertaken with caution.
The main limitation of the nARMSS is that it is constructed from EDSS measurements and thus biased towards mobility, especially at higher scores (>5) which consist solely of physical factors. Cognitive disability, for example, is less well represented. Additional outcomes might better represent all aspects of the disease. For example, income and sickness-absence data are available in Sweden, and both are correlated with cognitive decline. 16 However, comparisons between the nARMSS quartiles and first SDMT scores show negative correlation (Table 2), which as expected implies that cognitive decline is associated with the nARMSS as both indicate disability and SDMT may have a predictive role on motor disability. Similarly, both MSIS-29 physical and psychological scores showed correlation with the nARMSS, providing further confirmation of utility beyond only physical disability. Nevertheless, composite scores with SDMT, MSIS-29 and other metrics might lead to greater accuracies. However, it should be noted that relying on EDSS has the benefit of including nearly all patients due to the large data availability of such a metric.
While the correlation between the nARMSS early in follow-up and after 10+ years of follow-up is high, it should be noted from Figure 1 that the nARMSS does vary over a patient’s disease course, ultimately reaching a nearly steady-state level. Therefore, it can be inferred that the metric becomes more accurate with more follow-up time, likely due to more complete information. Individuals enrolled into severity studies should use the most recent clinical measurement available to calculate nARMSS for greatest accuracy.
This metric could aid in the search for factors which are correlated with MS disease severity, such as genetic markers which are associated with increased disease severity. Furthermore, early identification of the potential future severity of an individual’s disease could inform the most appropriate treatment option(s) for that patient. Finally, any alterations in disease progression could be more accurately captured, so that interventions and factors which improve disease course could be identified.
Supplemental Material
sj-pdf-1-msj-10.1177_1352458520988632 – Supplemental material for A multiple sclerosis disease progression measure based on cumulative disability
Supplemental material, sj-pdf-1-msj-10.1177_1352458520988632 for A multiple sclerosis disease progression measure based on cumulative disability by Ali Manouchehrinia, Elaine Kingwell, Feng Zhu, Helen Tremlett, Jan Hillert and Ryan Ramanujam in Multiple Sclerosis Journal
Supplemental Material
sj-pdf-2-msj-10.1177_1352458520988632 – Supplemental material for A multiple sclerosis disease progression measure based on cumulative disability
Supplemental material, sj-pdf-2-msj-10.1177_1352458520988632 for A multiple sclerosis disease progression measure based on cumulative disability by Ali Manouchehrinia, Elaine Kingwell, Feng Zhu, Helen Tremlett, Jan Hillert and Ryan Ramanujam in Multiple Sclerosis Journal
Footnotes
Acknowledgements
The authors wish to thank neurologists, nurses and multiple sclerosis patients in Sweden and Canada, as well as the Swedish Multiple Sclerosis Register for providing data for this study. They gratefully acknowledge the BC MS Clinic neurologists who contributed to the study through patient examination and data collection (current members at the time of data extraction listed here by primary clinic): UBC MS Clinic: A. Traboulsee, MD, FRCPC (UBC Hospital MS Clinic Director and Head of the UBC MS Programs); A.-L. Sayao, MD, FRCPC; V. Devonshire, MD, FRCPC; S. Hashimoto, MD, FRCPC (UBC and Victoria MS Clinics); J. Hooge, MD, FRCPC (UBC and Prince George MS Clinic); L. Kastrukoff, MD, FRCPC (UBC and Prince George MS Clinic); J. Oger, MD, FRCPC; Kelowna MS Clinic: D. Adams, MD, FRCPC; D. Craig, MD, FRCPC; S. Meckling, MD, FRCPC; Prince George MS Clinic: L. Daly, MD, FRCPC; and Victoria MS Clinic: O. Hrebicek, MD, FRCPC; D. Parton, MD, FRCPC; K Atwell-Pope, MD, FRCPC. The views expressed in this paper do not necessarily reflect the views of each individual acknowledged.
Author group information
The BeAMS Study Group: Long-Term Benefits and Adverse Effects of Beta-Interferon for Multiple Sclerosis: A Shirani, Y Zhao, C Evans, ML van der Kop, G Gustafson, J Petkau and J Oger. Role: facilitated gaining funding applications and creation of the study cohort used in the validation-related analyses in this manuscript.
Declaration of Conflicting Interest
The author(s) declared the following potential conflicts of interest with respect to the research, authorship and/or publication of this article: A.M. was supported by the Margaretha af Ugglas Foundation. R.R. and F.Z. have nothing to disclose. J.H. has received honoraria for serving on advisory boards for Biogen, Sanofi-Genzyme and Novartis and speaker’s fees from Biogen, Novartis, Merck-Serono, Bayer-Schering, Teva and Sanofi-Genzyme. He has served as P.I. for projects or received unrestricted research support from Biogen, Merck-Serono, TEVA, Sanofi-Genzyme and Bayer-Schering. This MS research was funded by the Swedish Research Council and the Swedish Brain foundation. E.K. was supported through research grants from the Canadian Institutes of Health Research and the Multiple Sclerosis Society of Canada. In addition, during the last 5 years, she has received travel expenses to give presentations or attend CME conferences, from ACTRIMS, ECTRIMS and the MS Society of Canada. H.T. is the Canada Research Chair for Neuroepidemiology and Multiple Sclerosis. Current research support was received from the National Multiple Sclerosis Society, the Canadian Institutes of Health Research, the Multiple Sclerosis Society of Canada and the Multiple Sclerosis Scientific Research Foundation. In addition, in the last 5 years, H.T. has received research support from the UK MS Trust and travel expenses to present at CME conferences from the Consortium of MS Centres (2018), the National MS Society (2016 and 2018), ECTRIMS/ACTRIMS (2015, 2016, 2017, 2018, 2019 and 2020) and American Academy of Neurology (2015, 2016 and 2019). Speaker honoraria are either declined or donated to an MS charity or to an unrestricted grant for use by H.T.’s research group.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by Canadian Institutes of Health Research (CIHR) (MOP-93646) and the US National MS Society (#RG 4202-A-2; 2009–2012; P.I.: HT).
Supplemental material
Supplemental material for this article is available online.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
