Abstract
Background
There is heterogeneity of aerobic fitness (VO2peak) changes with a standardized exercise training stimulus in the general population (i.e. some participants demonstrate improvements, others no change, and some a reduction in VO2peak).
Objectives
This secondary, exploratory analysis of data examined the heterogeneity of VO2peak responses and possible correlates among persons with progressive multiple sclerosis (PMS) from the CogEx trial.
Methods
CogEx was a multi-site, multi-arm, randomized, double-blinded, and sham-controlled trial undertaken by 11 sites in six different countries. Participants were randomized into one of four conditions with different combinations of exercise training and cognitive rehabilitation including respective sham conditions. The analysis focuses primarily on VO2peak change for the pooled exercise training intervention conditions compared with the pooled sham exercise control conditions.
Results
Waterfall plots for change in VO2peak suggested greater heterogeneity with exercise training than sham, and the proportions of VO2peak change categories (i.e. improvement/worsening) differed significantly between the exercise training and sham conditions (p < 0.05). The multivariable analysis indicated that lower baseline VO2peak (p < 0.001) was the only statistically significant correlate of increases in VO2peak with exercise training.
Conclusion
Our results highlight the heterogeneity of change in VO2peak with exercise training that is correlated with initial aerobic capacity in PMS, and such results may inform hypothesis testing in future clinical trials of exercise training.
Introduction
Randomized controlled trials (RCTs) indicate that exercise training (ET) yields benefits for a variety of outcomes, notably aerobic fitness, in people with multiple sclerosis (MS).1–3 For example, meta-analyses of RCTs examining ET effects on aerobic fitness in people with MS have reported a ½ standard deviation improvement in peak oxygen consumption (VO2peak).4,5 Such evidence supported the development of prescriptive guidelines for yielding ET effects on aerobic fitness and other outcomes in MS.1,6 The underlying assumption of the prescriptive guidelines is that people with MS will on average accrue similar benefits with a standardized ET stimulus (i.e. homogeneity of responses with exercise training).
Importantly, there is increasing recognition of within-study heterogeneity of changes in outcomes, including aerobic fitness, with ET in the general population7–9 and people with MS. 10 This indicates some participants will demonstrate improvements in aerobic fitness, others no change in aerobic fitness, and some a reduction in aerobic fitness with a standardized aerobic ET stimulus. The pattern of variation in responses with ET, including an apparent detraining with aerobic ET, has been characterized as not solely representing measurement error based on an NIA NIH Workshop summary. 9 Of note, the re-analysis of published data from an RCT of 8–10 weeks of ET in 42 people with progressive MS (PMS) 10 indicated interindividual variability of changes in VO2peak based on waterfall plots and chi-square tests across three, standardized aerobic ET programs. 11
The heterogeneity of aerobic fitness change with ET in MS may be explainable by a set of core factors. 9 The core factors may include central nervous system damage, disease burden, and sample demographic/clinical characteristics 10 as well as physiological function, adherence/compliance, and physical activity. 12 For example, one secondary analysis of data from an RCT of a 24-week period of multimodal ET in 54 people with moderate MS-disability reported that there was response heterogeneity for change in peak work rate (Wpeak; performance-based metric of aerobic fitness) that was associated with baseline fitness. 12 We note that the NIA NIH Workshop summary did not identify a single, analytic approach as optimal for examining heterogeneity of change and its correlates within RCTs of ET. 8
The premise and evidence for heterogeneity of aerobic fitness changes that is explainable by other factors opens the door for the application of precision medicine in MS. The premise of precision medicine focuses on identifying factors that explain heterogeneity of change in an outcome for a given treatment, and then using that information for delivering an individually-centered treatment stimulus. 13 Such an approach should optimize the benefits of ET and minimize the variability in treatment response. Within the context of ET, this often starts with aerobic fitness, as outcomes of ET often depend on adaptations in physiological systems that can translate into secondary benefits.9,14
This paper involved a secondary, exploratory analysis of data from the CogEx trial.15,16 We have published results regarding group-level changes in outcomes, notably aerobic fitness based on VO2peak and Wpeak from a maximal, incremental exercise test. 16 The current paper focuses on (a) examining the heterogeneity of individual-level changes in VO2peak (primary outcome) and Wpeak (secondary outcome) within the ET intervention condition compared with the sham, exercise control condition, and (b) exploring bivariate and multivariable correlates of aerobic fitness changes within the ET condition based on a published model 10 and other research7–9,12; we then examine the bivariate results from the ET condition in the sham condition.
Methods
Trial description
The CogEx trial was a multi-site, multi-arm, randomized, double-blinded, and sham-controlled clinical trial undertaken by 11 sites across six countries: Canada (one site), the USA (two sites), the United Kingdom (two sites), Denmark (one site), Belgium (one site), and Italy (four sites).15,16 Participants with PMS were randomized into one of four conditions with different combinations of ET and cognitive rehabilitation including respective sham conditions. This paper focuses on aerobic fitness changes for the pooled ET intervention conditions compared with the pooled sham exercise control conditions (i.e. ET versus sham exercise conditions collapsed across cognitive rehabilitation and sham cognitive rehabilitation conditions); this was reasonable as there were no differences for changes in outcomes between the cognitive rehabilitation conditions. 16
Participants
The inclusion/exclusion criteria for participants are provided in Table 1. 15 Of note, the diagnosis of primary or secondary PMS was confirmed by a neurologist, and impaired cognition was defined as a Symbol Digit Modalities Test (SDMT) score of at least 1.282 standard deviations below published normative data (10th percentile), and physical inactivity was based on a Godin Leisure-Time Exercise Questionnaire Health Contribution Scale score of < 24.
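The link between the 1.282 standard deviation cutoff and the 10th percentile can be checked with the standard normal distribution. The snippet below is a minimal sketch using Python's standard library. It assumes the normative SDMT scores are treated as normally distributed, which the text implies but does not state, and the variable names are illustrative.

```python
from statistics import NormalDist

# z-score below which 10% of a standard normal distribution falls.
z_10th_percentile = NormalDist().inv_cdf(0.10)

# Rounded to three decimals for comparison with the cutoff quoted in the text.
cutoff = round(z_10th_percentile, 3)
print(cutoff)
```

Under that assumption, a participant scoring 1.282 or more standard deviations below the normative mean falls at or below the 10th percentile.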
Table 1. Inclusion and exclusion criteria for the CogEx trial as reported in our protocol and primary outcomes papers.
Exercise training program
The ET intervention condition was fully described in the protocol 15 and outcomes 16 papers, and involved a standardized program of supervised aerobic ET on a recumbent arm-leg stepper (Nustep T5XR, Nustep Inc, Ann Arbor, MI). The ET protocol was designed based on several important features. The first is that the program is largely consistent with evidence-based guidelines for physical activity in MS. The second is that the combination of continuous, moderate-intensity ET and high-intensity interval ET on separate days of the week includes two modalities of aerobic ET that elicit acute and chronic adaptations in MS. The third is that an international team of experts designed the ET program based on both research and personal experiences with ET in MS. Briefly, the aerobic ET consisted of bouts of continuous, moderate-intensity exercise along with high-intensity interval training (HIIT) performed on alternating days, 2 times per week for 12 weeks. The continuous bouts progressed from 10 minutes of exercise at a work rate associated with 50%–60% VO2peak in Week 1 towards 30 minutes of exercise at a work rate associated with 70%–80% VO2peak in Week 12. The HIIT bouts progressed from 5, 1-minute intervals at a work rate associated with 80%–90% VO2peak interspersed with 1-minute rest periods (i.e. lightly exercising at 15 W) in Week 1 towards 10, 2-minute intervals at a work rate associated with 90% VO2peak interspersed with 2-minute rest periods in Week 12. The sham exercise condition consisted of supervised stretching and balance exercises that were performed 2 times per week, and this too has been described in the protocol 15 and primary outcomes 16 papers.
Procedure
Study procedures were approved by site-specific, institutional review boards, and participants provided written informed consent. Participants initially completed baseline assessments in the laboratory, and then were randomized into one of the four study conditions; 50% of participants were randomly assigned to the ET intervention conditions and 50% of participants were randomly assigned to the sham exercise control conditions. Participants completed the follow-up assessments in the laboratory following the 12-week study period.
Outcomes
The details of the outcomes included in this secondary analysis are reported in the protocol paper. 15 We note that all outcomes were collected using the same procedures with similar equipment across sites.
Data analysis
The analyses included participants who were randomized, began the intervention, and had adherence/compliance data for the exercise interventions. Waterfall plots were created to visualize the shape of the data distributions. Levene's statistic tested the homogeneity of variances between the treatment and sham conditions. Additionally, we categorized VO2peak change into improved (10% or more increase over baseline value), worsened (10% or more decrease below baseline value), or no change (i.e. < 10% increase or decrease) using the guideline for interpreting VO2peak change in MS. 27 The proportions of VO2peak change categories (improvement/worsening) were compared between the ET and sham conditions using a chi-square test. We evaluated associations between correlates and change in VO2peak and Wpeak for the ET and sham control conditions using bivariate and multivariable methods. The bivariate analyses examined associations between baseline variables and change in VO2peak and Wpeak using Pearson product-moment correlations (r) for continuous variables, point-biserial correlations (rpb) for dichotomous variables (e.g. sex), and Spearman rank-order correlations (ρ) for multi-level categorical variables. Cohen's guidelines of 0.1, 0.3, and 0.5 indicated small, moderate, and strong correlations, respectively. 28 The multivariable analysis involved linear regression with direct entry of variables that demonstrated univariate associations (p < 0.2) with the outcome variables, and this was followed by a sensitivity analysis removing the respective baseline factor (e.g. baseline VO2peak for VO2peak change). All statistical analyses were conducted in SAS (Version 9.4, Cary, NC, USA).
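The 10% categorization rule and the ordering behind a waterfall plot can be sketched in a few lines. The example below is illustrative only: the analyses in this paper were run in SAS, the function name `classify_change` is hypothetical, and the VO2peak pairs are made-up values rather than CogEx data.

```python
from collections import Counter


def classify_change(baseline, follow_up, threshold=0.10):
    """Label a change as improved, worsened, or no change relative to baseline."""
    relative_change = (follow_up - baseline) / baseline
    if relative_change >= threshold:
        return "improved"
    if relative_change <= -threshold:
        return "worsened"
    return "no change"


# Hypothetical (baseline, follow-up) VO2peak pairs in mL/kg/min, for illustration only.
pairs = [(15.0, 18.0), (20.0, 20.5), (22.0, 19.0), (18.0, 19.9)]

labels = [classify_change(b, f) for b, f in pairs]
counts = Counter(labels)

# Waterfall ordering: absolute changes sorted from largest gain to largest loss.
waterfall = sorted((f - b for b, f in pairs), reverse=True)

print(labels)
print(dict(counts))
```

Plotting `waterfall` as a bar chart, one bar per participant, gives the kind of display used to inspect individual-level variability.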
Results
Sample characteristics and group-level change in fitness outcomes
The characteristics for the overall sample (n = 304) and the subsamples who completed the ET (n = 152) and sham exercise control conditions (n = 152), irrespective of cognitive training, are provided in Table 2a and b. There were no baseline differences between the ET and sham conditions for all included variables except adherence and compliance (p < 0.001) that were both lower for ET than sham. The adherence data in this study for ET were strong when compared with a recent review, 29 but compliance was weaker and might suggest some issues with delivering the ET program, notably the HIIT program. Both VO2peak and Wpeak change scores were significantly different between conditions (p = 0.003, p < 0.001, respectively), as shown in Table 2c, with higher mean change for ET.
Table 2a. Demographic characteristics of the samples in exercise training and sham conditions regardless of cognitive training group assignment.
EDSS: Expanded Disability Status Scale; School: total years of schooling; * Self-identified sex. ** n = 303. *** n = 302.
Table 2b. Clinical characteristics of the samples in exercise training and sham conditions regardless of cognitive training group assignment.
SDMT: symbol digit modalities test; CVLT: California Verbal Learning Test-II; BVMT-R: Brief Visuospatial Memory Test-Revised; FAMS: functional assessment of multiple sclerosis; 6MWT: 6-minute walk test; MSWS: 12-item MS walking scale; PDQ: Perceived Deficits Questionnaire; MFIS: Modified Fatigue Impact Scale.
* p < 0.001, two-sample t-test.
Table 2c. Outcomes of the samples in exercise training and sham conditions regardless of cognitive training group assignment.
*Two-sample t-test. **n = 272. ***n = 271.
Response heterogeneity for individual-level change in exercise training and sham conditions
The waterfall plots for change in VO2peak and Wpeak with the 12-week ET and sham conditions are provided in Figures 1 and 2, respectively. Levene's test did not identify statistically significant differences in the variances of changes in VO2peak and Wpeak between conditions, but inspection of the waterfall plots suggested greater heterogeneity (i.e. variability of the changes) with ET than sham. Among those in the ET condition, 43.7% (n = 59) demonstrated improvement in VO2peak (i.e. ≥ 10% increase in mL/kg/min), 34.1% (n = 46) demonstrated no change, and 22.2% (n = 30) demonstrated worsening in VO2peak (i.e. ≥ 10% decrease in mL/kg/min). Among those in the sham condition, 29.2% (n = 40) demonstrated improvement in VO2peak, 38.0% (n = 52) demonstrated no change, and 32.8% (n = 45) demonstrated worsening in VO2peak. The proportions of VO2peak change categories (improvement/worsening) differed significantly between the ET and sham conditions (p = 0.03).
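The category counts reported above are enough to recompute a chi-square test by hand. The sketch below assumes the test was applied to the full 2 × 3 table (condition by improved/no change/worsened). The text does not state this explicitly, so treat the table layout as an assumption. It uses only the Python standard library, and the function name is illustrative.

```python
import math

# Counts reported in the text, ordered as improved, no change, worsened.
observed = {
    "ET": [59, 46, 30],
    "sham": [40, 52, 45],
}


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as rows of counts."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (obs - expected) ** 2 / expected
    return stat


chi2 = chi_square_statistic(observed)

# With 2 degrees of freedom, the chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2), round(p_value, 3))
```

Under this assumed layout the statistic is about 7.0 and the p-value is about 0.03, in line with the reported value. A 2 × 2 table restricted to the improved and worsened categories would give a different p-value.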

Figure 1. Waterfall plots for change in peak aerobic power based on peak oxygen consumption (VO2peak; mL/kg/min) in the exercise training (Panel A) and sham exercise training (Panel B) conditions.

Figure 2. Waterfall plots for change in peak aerobic power based on peak work rate (Wpeak; watts) in the exercise training (Panel A) and sham exercise training (Panel B) conditions.
Bivariate correlates of response heterogeneity for exercise training
The bivariate analysis is provided in Table 3a, and only lists variables associated with either change in VO2peak or Wpeak in the ET condition. Baseline VO2peak (r = –0.21, p = 0.02), average minutes/day of MVPA (r = 0.16, p = 0.07), average compliance (r = 0.15, p = 0.07), 6MWT distance (r = 0.15, p = 0.09), SDMT number correct (r = 0.14, p = 0.10), and baseline Wpeak (r = 0.14, p = 0.11) were correlates of VO2peak change for the ET condition. EDSS (r = –0.16, p = 0.06), average adherence (r = 0.16, p = 0.07), years of school (r = 0.14, p = 0.10), FAMS total (r = 0.15, p = 0.10), and 6MWT distance (r = 0.11, p = 0.19) were correlates of Wpeak change for the ET condition. We further report the correlation between the same variables identified for the ET condition with change in VO2peak or Wpeak for the sham control condition in Table 3b, and notably baseline VO2peak was significantly and inversely correlated with change in VO2peak.
Table 3a. Correlation analysis of significant factors (p < 0.2) associated with VO2peak and Wpeak change in the exercise training condition
EDSS: Expanded Disability Status Scale; FAMS: functional assessment of multiple sclerosis; MVPA: moderate-to-vigorous physical activity; 6MWT: 6-minute walk test; SDMT: symbol digit modalities test.
Table 3b. Correlation analysis of significant factors (p < 0.2) associated with VO2peak and Wpeak change identified in the exercise condition that were examined in the sham condition
EDSS: Expanded Disability Status Scale; FAMS: functional assessment of multiple sclerosis; MVPA: moderate-to-vigorous physical activity; 6MWT: 6-minute walk test; SDMT: symbol digit modalities test.
Multivariable correlates of response heterogeneity for exercise training
The multivariable analysis is provided in Table 4. Regarding VO2peak, we included baseline VO2peak, baseline Wpeak, average compliance, SDMT number correct, 6MWT distance, and average minutes/day in MVPA as factors, and baseline VO2peak (–0.37, p < 0.001) was the only correlate of VO2peak change that was statistically significant. Regarding Wpeak, we included average adherence, FAMS total, EDSS score, years of school, and 6MWT distance as factors, and the analysis identified none of the factors as statistically significant correlates of Wpeak change. Sensitivity analysis for the removal of baseline VO2peak from the VO2peak change regression is displayed in Table 5, yielding no statistically significant predictors of VO2peak change.
Regression results for VO2peak change as outcome in the exercise training condition.
SDMT: symbol digit modalities test; MVPA: moderate-to-vigorous physical activity; 6MWT: 6-minute walk test.
Regression results for Wpeak change as outcome in the exercise training condition.
FAMS: functional assessment of multiple sclerosis; EDSS: Expanded Disability Status Scale; 6MWT: 6-minute walk test.
Table 5. Regression results for VO2peak change as outcome, baseline VO2peak excluded as sensitivity analysis.
SDMT: symbol digit modalities test; MVPA: moderate-to-vigorous physical activity; 6MWT: 6-minute walk test.
Discussion
This article involved a secondary, exploratory analysis of data from the CogEx trial15,16 and examined response heterogeneity in aerobic fitness with ET and its possible correlates in persons with PMS. We observed group-level mean changes in VO2peak and Wpeak as markers of aerobic fitness favoring the ET condition, and there was individual-level variability of the changes within the ET condition. The change in VO2peak was primarily correlated with baseline aerobic fitness levels in the multivariable analyses—those with the lowest levels of aerobic fitness demonstrated the largest improvement in aerobic fitness with ET. Our results collectively highlight the presence of response heterogeneity in aerobic fitness with ET in PMS, and that this is correlated with initial levels of aerobic fitness. These analyses and results might directly inform hypothesis generation and testing in future clinical trials of response heterogeneity with ET in MS.
The ET condition resulted in statistically significant improvements in both VO2peak and Wpeak compared with no change in the sham control exercise condition, and this result argues for fitness adaptations with ET rather than practice effects with the assessment of aerobic fitness. The results further indicated heterogeneity of changes in VO2peak and Wpeak for the ET condition. The presence of this heterogeneity indicated that with the standardized aerobic ET stimulus included in the CogEx trial, some participants demonstrated improvements in aerobic fitness, others no change, and some exhibited reductions in aerobic fitness. This is consistent with the results of a secondary analysis of data from an RCT of a 24-week period of multimodal ET in 54 people with moderate MS disability that documented response heterogeneity for change in Wpeak. 12 The re-analysis of published data from an RCT of 8–10 weeks of ET in 42 people with PMS 11 further indicated interindividual variability of changes in VO2peak based on waterfall plots and chi-square tests across three, standardized aerobic ET programs. 10 This convergence of results supports the presence of heterogeneity of changes in VO2peak and Wpeak with standardized ET conditions in persons with MS, and extends such data for the first time into PMS.
This study further examined bivariate and multivariable correlates of heterogeneity of change in VO2peak and Wpeak with the standardized ET condition. The variables were selected based on a published model for MS 10 and other research.7–9,12 The correlation analysis identified baseline VO2peak, average minutes/day of MVPA, average compliance, 6MWT distance, SDMT number correct, and baseline Wpeak as small/weak, bivariate correlates of VO2peak change for the ET condition; baseline VO2peak had a significant and marginally larger correlation with change in VO2peak for the sham control than ET condition. EDSS score, average adherence, years of school, FAMS total, and 6MWT were small/weak, bivariate correlates of Wpeak change for the ET condition. The multivariable regression analysis indicated that only baseline VO2peak correlated with change in VO2peak, and no variables correlated with change in Wpeak. Those with lower baseline VO2peak had larger changes in the respective outcome than those who had higher baseline levels, and this result was comparable with those seen for non-MS samples 8; this may indicate that factors influencing oxidative metabolism with ET might not be influenced by MS-disease processes per se. We do not believe this represents a regression to the mean, as those in the ET intervention condition had larger changes in VO2peak and Wpeak than those in the sham control condition. We further note that this result might reflect an artifact of the analyses that involved examining baseline aerobic fitness as a correlate of change in aerobic fitness, notably as we observed a significant correlation between baseline VO2peak and change in VO2peak for both the sham control and ET conditions, although there is no agreed-upon approach for studying response heterogeneity. 9 Our results suggest, in principle, that baseline aerobic fitness levels might be an important consideration as a biomarker for precision trials of aerobic ET in PMS, particularly given that lower levels of aerobic fitness have been associated with worse outcomes in MS. 27 This implies that future research might target those with lower aerobic fitness levels, rather than physical inactivity, for inclusion in aerobic ET interventions that target beneficial outcomes in MS.
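The concern that a baseline-versus-change correlation can partly be an analytic artifact can be illustrated with a simple measurement-error model. This is a textbook-style sketch and not an analysis of CogEx data. The symbols $T$, $e_1$, $e_2$, and $\rho$ are assumptions introduced here, and the model assumes no true change and independent, equal-variance errors.

```latex
% Assumed model: both observed scores share a stable true score T plus independent errors.
X_1 = T + e_1, \qquad X_2 = T + e_2, \qquad
\operatorname{Var}(T) = \sigma_T^2, \quad
\operatorname{Var}(e_1) = \operatorname{Var}(e_2) = \sigma_e^2

% Change score, its covariance with baseline, and its variance.
\Delta = X_2 - X_1 = e_2 - e_1, \qquad
\operatorname{Cov}(X_1, \Delta) = -\sigma_e^2, \qquad
\operatorname{Var}(\Delta) = 2\sigma_e^2

% Resulting correlation, with reliability \rho = \sigma_T^2 / (\sigma_T^2 + \sigma_e^2).
\operatorname{Corr}(X_1, \Delta)
  = \frac{-\sigma_e^2}{\sqrt{(\sigma_T^2 + \sigma_e^2)\, 2\sigma_e^2}}
  = -\sqrt{\frac{1 - \rho}{2}}
```

Under these assumptions, a hypothetical test-retest reliability of 0.9 would by itself produce a baseline-change correlation of about –0.22 with no intervention effect at all, which is why comparing the correlation against the sham condition is informative.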
There are important limitations of this article. The CogEx trial itself was not designed for examining response heterogeneity with ET in PMS, and hence our results are largely data-driven and hypothesis-generating rather than confirmatory. The sample inclusion criteria resulted in a homogeneous sample, particularly regarding processing speed impairment and physical inactivity, and the results might not be generalizable beyond the study sample. The sample size further might be too small, as we did not power the original study for this purpose. 9 The examination of correlates of heterogeneity was restricted to the variables included in CogEx; other unmeasured factors might explain the variability of changes. The data analysis was exploratory, involving correlation analysis and multiple regression, and might have capitalized on chance features of the data set. We lastly analyzed the VO2peak and Wpeak changes between conditions differently than the main article, 16 although we do not believe this misrepresented the results. The sample size was based on a power analysis for the primary outcome, but not for examining response heterogeneity, and future researchers might apply the results from the current article when guiding sample size calculations in confirmatory research involving predictors of heterogeneity. We did not collect data on smoking behavior, and this might have been a key influence on aerobic fitness adaptations and heterogeneity.
Overall, we observed group-level mean change in aerobic fitness favoring the ET condition, and there was heterogeneity of the change within the ET condition. The change in aerobic fitness was primarily correlated with baseline fitness levels in the multivariable analyses, such that those with the lowest levels of VO2peak had the largest change in VO2peak. The expanding evidence of heterogeneity of outcomes with ET interventions supports future hypothesis-driven research in this area, particularly if the field is interested in the design and testing of precision medicine approaches for optimizing outcomes with ET in MS.
Footnotes
Data availability
To promote data transparency, anonymized data will be available 1 year after the publication of this article, upon reasonable request. Please make the request to the corresponding author, RWM. The request will be reviewed for approval by a CogEx committee, and a data-sharing agreement will be put in place before any data are shared.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Multiple Sclerosis Society of Canada (grant number #EGID3185).
