Abstract
Background:
Accurate clinical assessment in multiple sclerosis (MS) is challenging. The Assess MS system is being developed to automatically quantify motor dysfunction in MS, including upper extremity function (UEF) and mobility.
Objective:
To determine to what extent combinations of standardized movements included in the Assess MS system explain accepted measures of UEF and mobility.
Methods:
MS patients were recruited at four European MS centres. Eight movements were selected, including tasks of activities of daily living (ADL) and classical neurological tests. Movements were recorded on video and rated by experienced neurologists (n = 5). Subsequently, multivariate linear regression models were fitted to explain the variance of the Nine-Hole Peg Test (9HPT), Arm Function in Multiple Sclerosis Questionnaire (AMSQ) and Timed 25-foot Walk Test (T25WT).
Results:
In total, 257 patients were included. The movements explained 62.9% to 80.1% of the variance of the 9HPT models, 43.3% and 44.3% of the AMSQ models and 70.8% of the T25WT. In all models, tasks of ADL contributed most to the variance.
Conclusion:
Combinations of movements are valuable to assess UEF and mobility. Incorporating ADL tasks into daily clinical practice and clinical trials may be more valuable than the classical neurological examination of UEF and mobility.
Introduction
Assessment of disability in multiple sclerosis (MS) is traditionally performed with the Expanded Disability Status Scale (EDSS), which is a physician-based method. However, there are several limitations to the EDSS. Some of these are related to the heterogeneous nature of MS, others are inherently a consequence of methodological aspects of the scale itself, for example, a high inter- and intra-rater variability and a disproportional impact of ambulatory function on the total score. 1 To improve the clinical assessment of MS disability, various performance-based tests were introduced. Widely accepted performance-based tests are the Nine-Hole Peg Test (9HPT), 2 the Timed 25-foot Walk Test (T25WT) 3 and the Symbol Digit Modalities Test. 4 Also, patient-reported outcome measures contribute to clinical assessment by giving insight into the patient-perspective of a certain aspect, such as upper extremity function (UEF) with the Arm Function in Multiple Sclerosis Questionnaire (AMSQ). 5
A potentially valuable improvement in clinical assessment would be the automatic quantification of disability with Machine Learning Algorithms (MLA). With this in mind, the Assess MS system is being developed to automatically quantify motor functioning by capturing standardized movements of patients recorded by the Microsoft Kinect® camera (Microsoft, Redmond, WA, USA). 6 Several of these movements are used for the assessment of UEF and mobility, which are important functional domains in MS since the majority of patients experience UEF and mobility impairment at some point in the course of their disease.7,8 Furthermore, impaired UEF and mobility can affect the ability to perform activities of daily living (ADL), general health perception,9,10 and quality of life and social participation.11,12
The standardized movements used to develop Assess MS include several classical neurological tests and tasks of ADL, which can easily be administered in daily practice. However, it is unclear how much each of these tests contributes to determining UEF and mobility, and presumably not all tests are required to assess these functions. In the current study, we investigate to what extent combinations of standardized movements explain accepted measures of UEF and mobility.
Methods
Patients
Patients were recruited at four large European MS centres in Amsterdam, Basel, Bern and Lucerne. Inclusion criteria were age older than 18 years, a diagnosis of MS or a clinically isolated syndrome suspicious for MS according to the 2010 revised McDonald criteria, and an EDSS score between 0 and 7. Exclusion criteria were inability to follow procedures or read the informed consent due to psychological disorders, dementia or insufficient ability to speak the local language or English. Each patient provided written informed consent prior to study entry and the study was approved by the respective ethics committees.
Procedure
An example of the experimental setup is illustrated in Figure 1. All patients were recorded with the Kinect® camera, which simultaneously captures depth and colour videos. Eight standardized movements covering the trunk and the upper and lower extremities were chosen; some are based on the classical neurological examination and others are movements typical of ADL. Three movements covering UEF were performed: the finger-to-nose test (FNT) and pronator drift test (PDT), as classical neurological tests, and drinking from a cup (CUP), as an ADL movement. For CUP, patients had to take a sip from a standardized plastic cup that was at least half-full of water. To assess mobility, the following five movements were performed: the Romberg test (ROM) and tight-rope-walking (TRW), as classical neurological tests, and sit-to-stand (STS), turning-on-the-spot (TOS) and walking a distance of 25 feet (GAT), as ADL movements. For STS, patients were instructed to get up from a standardized chair without touching it. Schematic representations of the movements can be found in Figure 2.

Figure 1. Experimental setup. In this example, a patient sits on a chair and performs the finger-to-nose test. The Assess MS machine is placed perpendicular to the patient and displays an audio-guided instruction video of the movement on the large screen. A physician on the other side operates the machine with a tablet that in this example has been turned towards the patient for demonstration purposes. After showing the instruction video, the patient performs the movement after a beep. This is recorded by the Kinect® camera and stored locally on the machine. The people seen in this picture are members of the study group that gave their consent.

Figure 2. Schematic representations of the standardized movements.
All colour videos were rated by two independent neurologists with experience in MS. Each patient video was given a score based on a predetermined rating scale (see Table 1). Some of these scales (FNT, PDT and ROM) were derived from the functional system subscores from the Neurostatus-EDSS. 13 For the ADL movements, a 0 to 4 scale was created, in which 0 is a normal performance, 1 mildly impaired (minor interference with function), 2 moderately impaired (clear interference with function), 3 severely impaired (severe interference with function) and 4 is impossible to perform. In addition, the videos were also presented as sets which the neurologists ordered from least affected to most affected. Using an algorithm similar to the one described by Sarkar et al. 14 that takes into account individual rater bias, the videos were then assigned a consensus score. This consensus score was subsequently used in the statistical analysis. Videos of insufficient quality or videos that were not performed according to the protocol were excluded from further analysis.
Table 1. Rating scales of movements.
In the current study, only the video ratings of movements were analysed. The development of MLA is part of another study that is currently being performed.
All patients received a standardized Neurostatus-EDSS assessment 13 on the day of recording, performed by an examiner other than the aforementioned neurologists who rated the videos. Furthermore, the 9HPT and T25WT were administered as performance-based measures of UEF and mobility. All patients were asked to complete the Arm Function in Multiple Sclerosis Questionnaire (AMSQ), 5 as a patient-reported outcome measure for UEF.
Data analysis
Statistical analyses were performed in IBM SPSS Statistics for Macintosh, Version 24. A p-value of <0.05 was considered statistically significant. The normality of each variable was assessed using histograms and normality plots. For variables with a normal distribution, mean values with standard deviation (SD) were calculated; for variables that were not normally distributed, median values with interquartile range (IQR) were calculated. For the movements and tests that were performed multiple consecutive times (FNT three times for both sides, and 9HPT and T25WT two times), the best performance was used for statistical analyses. Spearman’s rho correlation was used for assessing the relation between the 9HPT and AMSQ.
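The descriptive steps above can be illustrated with a minimal Python sketch. This is not the SPSS procedure used in the study; the function names are hypothetical, and it assumes that the "best performance" on a timed test is the shortest time and that the IQR is reported as the distance between the first and third quartiles.

```python
from statistics import mean, median, quantiles, stdev


def summarize(values, normal):
    """Return (centre, spread): mean and SD if normal, else median and IQR."""
    if normal:
        return mean(values), stdev(values)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q3 - q1


def best_trial(times):
    """Assumption: for timed tests (9HPT, T25WT) the best trial is the fastest."""
    return min(times)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

With per-patient lists of best 9HPT times and AMSQ scores, `spearman_rho(nine_hpt, amsq)` would give a coefficient of the kind reported in the Results; both list names are placeholders.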
After confirming the absence of strong collinearity with partial regression (collinearity present if r ⩾ 0.9), combinations of the eight movements were used in stepwise multivariate linear regression models to determine how much the clinical ratings of the movements contribute to the variance of the 9HPT and the AMSQ for UEF, and of the T25WT for mobility. For UEF, different models were used for the left and right sides, and for the dominant and non-dominant hand. The rating scales were categorized into groups (i.e. dummy variables were created), because the relation between the outcome variables and the rating scales of the movements was not linear.
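To show why dummy variables help when the relation between a rating and an outcome is not linear, here is a minimal sketch in plain Python. It fits an ordinary least-squares model and returns R², the proportion of variance explained. It does not perform stepwise selection, it is not the SPSS procedure used in the study, and the function names and the example data are made up for illustration.

```python
def dummy_code(ratings, levels):
    """One 0/1 column per level, omitting the first (reference) level."""
    return [[1.0 if r == lv else 0.0 for lv in levels[1:]] for r in ratings]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(rows, y):
    """Fit y ~ intercept + predictors by least squares and return R^2."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up data: a 0-2 rating whose highest level has a disproportionate effect.
ratings = [0, 0, 1, 1, 2, 2]
times = [20.0, 20.0, 22.0, 22.0, 40.0, 40.0]

r2_linear = r_squared([[float(r)] for r in ratings], times)
r2_dummy = r_squared(dummy_code(ratings, levels=[0, 1, 2]), times)
```

In this toy example the dummy-coded model reproduces each group mean and so explains more variance than the model that treats the rating as a single linear predictor.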
Results
In total, 257 patients were included in this study, of whom 171 (66.5%) were women; the mean age was 46.6 years (SD 12.8). The mean disease duration was 14.9 years (SD 11.7). Clinical phenotypes were distributed as follows: clinically isolated syndrome 11 (4.3%), relapsing–remitting MS 186 (72.4%), secondary progressive MS 45 (17.5%) and primary progressive MS 15 (5.8%) patients. Twenty-four (9.3%) patients experienced a relapse within 3 months prior to inclusion. The median EDSS score was 3.0 (IQR 2.0). Baseline characteristics and results of the 9HPT, T25WT and questionnaires are shown in Table 2. Correlation coefficients between the AMSQ and the 9HPT were 0.60 for the right side, 0.46 for the left side, 0.61 for the dominant hand and 0.44 for the non-dominant hand. The video ratings of the eight movements are summarized in Table 3.
Table 2. Baseline characteristics.
SD: standard deviation; EDSS: Expanded Disability Status Scale; 9HPT: Nine-Hole Peg Test; T25WT: Timed 25-foot Walk Test; AMSQ: Arm Function in Multiple Sclerosis Questionnaire; IQR: interquartile range; RRMS: relapsing-remitting MS; SPMS: secondary progressive MS; PPMS: primary progressive MS; CIS: clinically isolated syndrome.
Table 3. Assessments of movements.
FNT: finger-to-nose test; PDT: pronator drift test; CUP: drinking from a cup; ROM: Romberg test; TRW: tight-rope-walking; STS: sit-to-stand; TOS: turning-on-the-spot; GAT: walking a distance of 25 feet; n.a.: not applicable.
Two ambidextrous patients were defined as unrateable of which one-CUP movements were unrateable.
Regression models for UEF and mobility
No collinearity was found between the video ratings of the movements. Results of the regression models are displayed in Table 4. CUP, PDT and FNT explained 73.2% of the variance of the right-sided 9HPT and 78.2% of the left-sided 9HPT. In the dominant and non-dominant hand models, they explained 80.1% and 62.9% of the variance of the 9HPT, respectively. In all models, CUP contributed most to the variance of the 9HPT, with only a minor contribution of PDT and FNT.
Table 4. Regression models.
9HPT: Nine-Hole Peg Test; T25WT: Timed 25-foot Walk Test; FNT: finger-to-nose test; PDT: pronator drift test; CUP: drinking from a cup; ROM: Romberg test; TRW: tight-rope-walking; STS: sit-to-stand; TOS: turning-on-the-spot; GAT: walking a distance of 25 feet; D: dominant hand; ND: non-dominant hand; AMSQ: Arm Function in Multiple Sclerosis Questionnaire.
p-value of ANOVA test.
p-value of F-change.
In the AMSQ model in which CUP and FNT were stratified according to side (left and right side), 44.3% of the variance was explained by CUP and FNT of the right side. In the other AMSQ model, in which CUP and FNT were stratified according to dexterity (dominant and non-dominant hand), 43.3% of the variance was explained by CUP of the dominant hand and FNT of the non-dominant hand. In these models again, CUP contributed most to the variance of the AMSQ, and FNT contributed only a minor proportion.
The six movements in the model for mobility explained 70.8% of the variance of the T25WT. The STS contributed most to the variance in this model, and the other movements to a minor extent.
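For readers less familiar with the quantities behind "variance explained" and the F-change p-values that accompany Table 4, the standard textbook definitions are sketched below. The notation is introduced here and is not taken from the original analysis: n is the number of patients in a model, k the number of predictors in the larger model, and q the number of predictors added in a step.

```latex
R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}}, \qquad
\Delta R^2 = R^2_{\mathrm{full}} - R^2_{\mathrm{reduced}}, \qquad
F_{\mathrm{change}} = \frac{\Delta R^2 / q}{\left(1 - R^2_{\mathrm{full}}\right) / (n - k - 1)}
```

Under these definitions, a movement that "contributed most" is one whose entry into the model produced the largest increase in R².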
Discussion
Combinations of standardized movements that are used in the Assess MS system explained UEF to a large extent as defined by the 9HPT as a measure of performance, and to a lesser extent as defined by the AMSQ as a measure of patient-reported outcome. Mobility, as defined by the performance-based T25WT, was also explained to a large extent by a combination of movements. The ADL tasks CUP and STS contributed more to the variance of UEF and mobility than classical neurological tests such as FNT and ROM.
The 9HPT was used as a measure of UEF since it is the most widely used tool to assess UEF in MS studies so far. 15 It has good psychometric properties and clinical relevance concerning the ability to perform ADL tasks and quality of life. 2 Although the AMSQ has not yet been used as frequently as the 9HPT, it has good psychometric properties to assess UEF as well,5,16 and additionally gives insight into the patients’ perspective of UEF.
A large percentage of the variance of the 9HPT was explained by a combination of movements, of which CUP contributed most in all models. Various explanations may be given for this. First, CUP is a typical ADL movement, and the 9HPT is known to correlate with the ability to perform ADL tasks. 2 Second, the 9HPT primarily quantifies hand function (i.e. distal arm function), which is relevant for the ability to hold a cup and drink from it. 2 Finally, one study found that approximately 53% of the variance of the 9HPT was explained by muscle strength, tactile sensitivity of the thumb and intention tremor, 17 which are all relevant in performing CUP.
Only 43.3% and 44.3% of the variance of the AMSQ could be explained. The AMSQ covers a variety of patient-perceived ADL tasks, ranging from gross (such as holding a plate) to fine movements (such as using a keyboard), and covering both proximal and distal arm function. 5 Therefore, the AMSQ score probably represents more than what is covered by CUP, FNT and PDT. The strong contribution of CUP, being a typical ADL movement, in these models is in line with the focus of the AMSQ on patient-perceived ADL tasks. Although the FNT and PDT are valuable in the neurological examination for localisation purposes, our results indicate that these tests are less sensitive for assessing UEF, as defined by the 9HPT or AMSQ.
In our study, we found a lower correlation between the 9HPT and AMSQ (r = 0.44–0.61) than in another study (r = 0.77). 16 This supports the idea that different constructs were tested with the 9HPT and AMSQ. This is in line with our finding that combinations of movements explained different proportions of the variances of the 9HPT and AMSQ. The difference in correlation coefficients between left versus right and non-dominant versus dominant hand may be explained by the AMSQ being a measure of perceived upper extremity ADL tasks. Objective impairment of the dominant hand, which is most frequently the right hand, probably influences perceived UEF more strongly.
With regard to the assessment of mobility, the T25WT was chosen because it has good psychometric properties to assess ambulatory function. 3 It is primarily a measure of walking speed, which seems clinically relevant because walking speed relates to the capacity to perform outdoor activities important in daily life 18 and to employment status. 19 However, since walking speed is often preserved in less disabled patients, measures of walking distance or endurance may be better suited to these patients.
The movements used to assess mobility in Assess MS explained 70.8% of the variance of the T25WT. In previous studies, the T25WT correlated with the ability to perform ADL tasks, 18 which is in line with our finding that STS contributed most to the variance. Furthermore, there are similarities between STS and the Timed Up & Go test (in which a patient gets up from a chair), which correlated strongly with the T25WT. 3
The GAT also contributed significantly to the variance of the T25WT. Although these tests are very similar, GAT is principally a qualitative measure of ambulation (i.e. ‘how well does a patient walk?’) and the T25WT only measures walking speed. However, the relation of walking speed with spatial and temporal gait parameters has been previously described. 12
A strong point of our study is the use of a combination of simple movements to assess UEF and mobility that can be performed quickly and easily in a clinical setting. Our study has some limitations. First, patients included in our cohort were relatively mildly disabled with a median EDSS of 3.0, and this hampers generalization to a more disabled population. This is also reflected in the distribution of assessments of the movements (Table 3). Results of our models might have been different if more severely disabled patients were included. This applies particularly to the T25WT, because of its limited sensitivity to detect abnormalities in patients with mild ambulatory impairment. 19 For these patients, it may be more appropriate to assess walking endurance with longer walking distances (e.g. with the 6-minute walking test). 20 Second, our construct of mobility is probably not entirely covered by the T25WT. Our construct includes standing up from a chair, turning on a spot, walking a straight line and the ROM. With the T25WT, only the time that a patient takes to walk straight for a distance of 25 feet is measured. This may explain why TRW and TOS did not contribute significantly to the model. Using a measure other than the T25WT alone as a surrogate for the construct of mobility would likely have given different results. Finally, the rating scales of the ADL tasks have not been validated yet. Future research should consider the assessment of psychometric properties of these tests, such as validity and reliability. Nevertheless, the neurologists who performed the video rating found the ADL scales much easier to apply than the scales derived from the Neurostatus-EDSS.
We conclude that UEF and mobility can be assessed with a combination of standardized movements. ADL tasks contributed most to these assessments, which indicates that including ADL tasks (such as CUP and standing up from a chair) in daily clinical practice may be more valuable than the classical neurological examination (such as placing a finger on one’s nose) to assess UEF and mobility. Also, incorporating ADL tasks in clinical trials may be valuable to assess motor functioning. Future research will have to determine whether these ADL movements have the other psychometric properties that would make them valuable clinical assessments.
Footnotes
Acknowledgements
C.E.P.V.M. and M.D. contributed equally.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship and/or publication of this article: C.E.P.V.M. has received travel support from Novartis Pharma AG, Sanofi Genzyme and Teva Pharmaceuticals, and honoraria for lecturing and consulting from Biogen-Idec and Merck Serono. M.D. has received travel support from Bayer AG, Teva and Genzyme and research support from the University Hospital Basel. S.S. has received travel support from Bayer and Merck and honoraria for consulting from Bayer, Merck, Roche and Teva. C.P.K. has received honoraria for lectures as well as research support from Biogen, Novartis, Almirall, Bayer Schweiz AG, Teva, Merck, Genzyme, Roche and the Swiss MS Society (SMSG). J.B. has received travel support from Novartis Pharma AG. M.D. has no conflicts of interest. K.K. has no conflicts of interest. J.D. is an employee of Novartis Pharma AG. F.D. is an employee of Novartis Pharma AG. L.W. is an employee of Novartis Business Services. L.K.’s institution (University Hospital Basel) has received in the last 3 years and used exclusively for research support at the Department of Neurology steering committee, advisory board and consultancy fees from Actelion, Alkermes, Almirall, Bayer, Biogen, df-mp, Excemed, GeNeuro SA, Genzyme, Merck, Minoryx, Mitsubishi Pharma, Novartis, Receptos, Roche, sanofi-aventis, Santhera, Teva, Vianex and royalties from Neurostatus products. For educational activities of the Department, the institution received honoraria from Allergan, Almirall, Bayer, Biogen, Excemed, Genzyme, Merck, Novartis, Pfizer, Sanofi-Aventis, Teva and UCB. B.M.J.U. has received consultation fees from Biogen-Idec, Novartis Pharma AG, EMD Serono, Teva Pharmaceuticals, Sanofi Genzyme and Roche. The Multiple Sclerosis Centre Amsterdam has received financial support for research, from Biogen-Idec, Merck Serono, Novartis Pharma AG and Teva Pharmaceuticals.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This research has been funded by Novartis Pharma AG.
