Abstract
Background
Multiple sclerosis (MS) causes pervasive motor, sensory and cognitive dysfunction. The Expanded Disability Status Scale (EDSS) is the gold standard for assessing MS disability. The EDSS is biased towards mobility and may not accurately measure MS-related disabilities in the upper limb or in cognitive functions (e.g. executive function).
Objective
Our objectives were to determine the feasibility of using the Kinarm robotic system to quantify neurological deficits related to arm function and cognition in MS patients, and examine relationships between traditional clinical assessments and Kinarm variables.
Methods
Individuals with MS performed 8 robotic tasks assessing motor, cognitive, and sensory ability. We additionally collected traditional clinical assessments and compared these to the results of the robotic assessment.
Results
Forty-three people with MS were assessed. Most participants could complete the robotic assessment. Twenty-six (60%) were impaired on at least one cognitive task and twenty-six (60%) were impaired on at least one upper-limb motor task. Cognitive domain task performance correlated most strongly with the EDSS.
Conclusions
Kinarm robotic assessment of people with MS is feasible, can identify a broad range of upper-limb motor and sensory, as well as cognitive, impairments, and complements current clinical rating scales in the assessment of MS-related disability.
Introduction
Multiple sclerosis (MS) is a chronic demyelinating central nervous system disease that affects an estimated 2.2 million people worldwide, 1 and prevalence in some countries such as Canada has increased substantially in the past decade. 1 The current standard measurement of disability in MS is the Expanded Disability Status Scale (EDSS), 2 which quantifies the overall effect of MS on an individual. The EDSS measures pyramidal, cerebellar, visual, sensory, bowel and bladder, and cognitive function. However, one limitation is that it is heavily biased towards ambulation, particularly at higher EDSS scores. 3 Furthermore, changes on the EDSS are nonlinear and may not necessarily relate to true changes in patient-reported outcomes. 4–6 Newer scales such as the MS Functional Composite (MSFC) reflect functional system deficits besides mobility in a more balanced way. 7 However, as with all clinical tools presently used to quantify impairments in MS, the MSFC is imprecise and it remains difficult to optimize novel therapeutic strategies using either the EDSS or MSFC. Furthermore, it remains difficult to objectively describe impairments in upper-limb motor function and cognition in people with MS.
Robotic assessment tools are becoming a widely accepted approach for capturing detailed kinematic data in health and disease. 8–10 In this study, we investigate the feasibility of the Kinarm robotic system (Kinarm, Kingston, ON, Canada) to quantify upper limb motor and sensory skill, as well as cognitive function (e.g. executive functions), in people with MS and compare these measurements to standard clinical assessment tools. Kinarm has been used to assess individuals with diverse neurological conditions, including stroke, 11–13 amyotrophic lateral sclerosis, 14 and concussion. 15 Previous work assessing upper-limb function using robotics in people with MS has provided detailed accounts of individual kinematic parameters in some instances, 16 and purely motor skill in others. 17 Here, we investigate a broad range of motor, sensory and cognitive tasks with emphasis on performance across multiple behavioural domains. The objectives of the present work are 1) to investigate the feasibility of using Kinarm robotic assessment in people with MS by determining how well the assessment is tolerated by people with diverse MS endotypes, and 2) to examine the complementary roles of robotic assessment and current standard clinical tests (e.g. EDSS) in characterizing disability in MS.
Methods
Participants and clinical assessment
Adult MS patients (age 18+) were recruited from Kingston Health Sciences Centre MS clinic, Hotel Dieu Hospital (Kingston, ON, Canada). Inclusion criteria for the study were: 1) diagnosis of MS (2010 McDonald criteria 18 ), 2) lack of severe cognitive deficits (<20 points on the MoCA) that would limit the understanding of the robotic task instructions, 3) no previous neurological injury that would confound the study results, and 4) no upper limb injury that would limit the ability to perform the robotic tasks. We allowed participants with visual impairments unless they substantially impacted the ability to complete the tasks (i.e. vision <20/200; most individuals had 20/50 vision or better in their good eye; however, 1 individual had 20/200 vision in their good eye). These cases were considered as they arose. Situations in which individuals were unable to complete a task or noted excessive difficulty because of visual symptoms simply led to an individual not completing that task. Generally, task completion did not require the ability to read specific symbols (see below and Table 1). This study was reviewed and approved by the Queen’s University Research Ethics Board. All participants provided written informed consent prior to taking part in the study.
Table 1. Detailed task descriptions.
Participants in the study underwent a series of standard clinical assessments examining sensory and motor skill, as well as cognitive function. The EDSS 2 was collected by one of three experienced and Neurostatus-certified neurologists (AYJ, SWT, or MB). Lower and upper limb function were assessed using the timed 25-foot walk test (T25W) 19 and 9-hole peg test (9HPT), 20 respectively. The nine-hole peg test was performed independently for each arm (dominant arm: 9HPT-D; non-dominant arm: 9HPT-ND).
We tested cognitive functions using two standard tests: 1) the Montreal cognitive assessment (MoCA), 21 and 2) the symbol-digit modalities test (SDMT). 22 The MoCA was originally developed to screen for mild cognitive impairment in a geriatric outpatient population with an impairment threshold of <26. 21 An alternative threshold of <23 was determined in a larger, more heterogeneous population; 23 we reported results based on both thresholds. Additionally, we quantified overall fatigue using the 21-question variant of the Modified Fatigue Impact Scale (MFIS-21). 24
Robotic assessment
Kinarm is a robotic device that is designed to measure upper limb motor and sensory (proprioception), as well as cognitive (executive function, processing speed and spatial working memory) domains using a suite of tasks called Kinarm Standard Tests™ (KSTs) (Kinarm, Kingston, ON, Canada). See Figure 1 for depiction of the robotic devices and Table 1 for a detailed description of KSTs. Primarily, we used the Kinarm Endpoint lab that required individuals to grasp a handle attached to the end of a robotic linkage. For those with severe weakness, the Kinarm Exoskeleton lab was used, which supported the arms in the horizontal plane and did not require a handle to be gripped. In both cases, the participants completed the tasks by moving their arms in the horizontal plane underneath a semitransparent mirror. Tasks were projected downwards onto this screen from above. When provided, feedback of hand position was also displayed on the screen (usually a white cursor dot). Participants’ hands were obscured by a vision blocker and they relied on any visual feedback provided by the virtual display aligned with the workspace. Visual feedback on the screen corresponded to the centre of the handle (Endpoint robot) or the index fingertip position (Exoskeleton). A total of 8 tasks were collected (Table 1). Tasks in the motor domain were visually guided reaching (VGR), ball on bar (BOB), object hit (OH), and object hit and avoid (OHA). Cognitive tasks were reverse visually guided reaching (RVGR), trail making (TMT), and spatial span (SPS). Note that although RVGR and TMT focus on motor response inhibition and information processing, respectively, they do include a time component that is sensitive to motor ability. Finally, arm position matching (APM) tested the sensory domain (proprioception). All participants were assessed in the same order to standardize potential fatigue effects across all participants.

Figure 1. The Kinarm robot, tasks, and statistical properties of Task Scores. (a) The Kinarm Endpoint lab includes two robotic linkages with handles that are grasped by the participant and permits movement in the horizontal workspace. A virtual reality system projects objects into the horizontal workspace. (b) The Kinarm Exoskeleton lab includes two robotic linkages with troughs to support the arms in the horizontal plane. The robotic system is attached to a wheelchair frame and the subject and robotic linkage are wheeled up to a virtual reality system. (c) Eight behavioural tasks were performed, spanning motor (VGR, BOB, OH, OHA), cognitive (RVGR, SPS, TMT) or sensory (APM) behaviours. (d) The Task Score is a one-sided measure, and the quantiles of its cumulative distribution function (CDF) are similar to those of the standard Normal distribution. E.g. a Task Score of 1 corresponds to 68.3% of the area under the curve, the same as the area under the standard Normal density between Z-values of −1 and +1.
Statistical analysis
Each of the robotic tasks generates approximately 15 variables that describe various spatial and temporal aspects of performance; we standardized these measures and then condensed the information to make the results of entire exams more easily understood. These measures, referred to as ‘parameters’, quantified performance in a variety of units (e.g. m/s, sec., cm). Thus, to facilitate comparison across parameters, raw values were converted to Z-scores that accounted for age, sex, handedness, and type of robotic platform (i.e. Endpoint versus Exoskeleton) based on the performance of large healthy control cohorts (collected prior to this study, see www.kinarm.com). In addition to the Z-scores, an aggregate metric of performance was also derived, called the Task Score (see 31,32 and www.kinarm.com), which provides a summary of overall performance on a given task that accounts for performance on all parameters in each task. The cumulative distribution function of this measure approximates that of the Normal distribution. Thus, a Task Score of 1.96 corresponds to the 5th percentile of expected performance for healthy individuals (i.e. about 5% of healthy individuals are expected to score above it); this is the threshold we used for impairment. Further detail on calculation of the Task Score is available in the Supplementary Material. Out of the 8 robotic tasks, a total of 13 global measures of performance were derived. Two tasks (VGR and RVGR) generated three values each as these tasks were tested in the dominant (VGR-D, RVGR-D) and non-dominant (VGR-ND, RVGR-ND) arms, and the inter-limb score was calculated (VGR-IL, RVGR-IL). 14 One task (APM) generated two values representing dominant (APM-D) and non-dominant (APM-ND) arm. These operations were performed using Matlab R2018a (The Mathworks, Natick, MA).
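The relationship between a Task Score and the proportion of healthy individuals expected at or below it can be illustrated with a short sketch. It assumes that the Task Score behaves like the absolute value of a standard Normal variable, which is our reading of the one-sided description and the 68.3% and 5% figures quoted in this article; it is not the vendor's calculation, and the function name `task_score_cdf` is hypothetical.

```python
import math


def task_score_cdf(task_score: float) -> float:
    """Expected fraction of healthy controls scoring at or below `task_score`.

    Assumption (not taken from the Kinarm documentation): the Task Score is
    distributed like |Z| with Z standard Normal, so its CDF is erf(t / sqrt(2)).
    """
    if task_score < 0:
        raise ValueError("Task Scores are one-sided and cannot be negative")
    return math.erf(task_score / math.sqrt(2.0))


# Fraction of healthy controls expected at or below a Task Score of 1
within_one = task_score_cdf(1.0)

# Fraction of healthy controls expected to exceed the impairment threshold
above_threshold = 1.0 - task_score_cdf(1.96)

print(f"Task Score <= 1.00: {within_one:.3f}")
print(f"Task Score >  1.96: {above_threshold:.3f}")
```

Under this assumption a Task Score of 1 covers roughly 68.3% of healthy performance and roughly 5% of healthy individuals exceed 1.96, which matches the values stated in the text.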
We performed additional analysis of Kinarm and clinical variables using R software (R 3.5.1; R Core team 2018, www.R-project.com). Spearman correlations were used with ordinal variables (EDSS, SDMT, MFIS-21, and MoCA), and Pearson correlation was used with continuous variables such as robotic Task Scores (these have approximately a Normal cumulative distribution function). We corrected for multiple comparisons using either the Bonferroni correction for the family-wise error rate (FWER) when <100 significance tests were compared, or the Benjamini-Hochberg correction for the false discovery rate (FDR) when ≥100 tests were being compared. 33 We used the Benjamini-Hochberg approach for a larger number of comparisons because the Bonferroni adjustment is unnecessarily conservative in this situation. To provide an overview of performance on robotic Task Scores from each domain (i.e. motor, cognitive, and sensory), we calculated the root mean-square (RMS) of the Task Scores in each domain for each individual. For comparisons to the MS cohort, we simulated distributions of expected Task Scores for healthy control individuals. This was achieved by generating 1000 random Normally-distributed values (mean of 0 and a standard deviation of 1) in which 5% of sampled points were outside the expected range of performance values (±1.96), as expected for the Task Scores. We previously demonstrated that some tasks have a higher than 5% impairment rate for a subset of the control group; 34 however, we chose to use a 5% theoretical impairment rate in this study for more convenient comparisons across tasks and because we have reported on a subset of the control groups previously. 34
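The rule for choosing a multiple-comparison correction can be sketched as follows. This is a minimal illustration in Python rather than the R code used for the study, which is not reproduced in this article; the function names and the `switch_at` parameter are hypothetical, and the adjusted p-value formulas are the standard textbook Bonferroni and Benjamini-Hochberg definitions.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values (controls the family-wise error rate)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def adjust_p_values(p_values, switch_at=100):
    """Apply the rule described in the text: Bonferroni for fewer than
    `switch_at` tests, Benjamini-Hochberg otherwise."""
    if len(p_values) < switch_at:
        return bonferroni(p_values)
    return benjamini_hochberg(p_values)


# Illustrative p-values only; these are not results from the study
example = [0.01, 0.04, 0.03, 0.005]
print(adjust_p_values(example))
```

With only four tests the sketch falls back to Bonferroni; passing 100 or more p-values would route the same call through the Benjamini-Hochberg adjustment instead.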
Results
Participants, feasibility, and clinical scores
We assessed 43 individuals diagnosed with relapsing, secondary progressive or primary progressive MS. MS cohort demographic information is summarized in Table 2. Altogether, 4.7% of all upper-limb robotic tasks could not be completed (26 instances out of 559 task assessments across the cohort). In total, 10 participants were unable to complete one or more tasks, most commonly because of fatigue. We proactively tried to mitigate fatigue by offering breaks when necessary, and testing participants at any point in the day at which they were most comfortable/least likely to experience fatigue. Two individuals had to be assessed using the Exoskeleton robot instead of the Endpoint robot because they could not maintain grip on the handles for an extended period of time (EDSS scores were 6.0 and 6.5). One individual mentioned that the room was quite hot. Participants in the MS cohort completed a series of clinical assessments (see Table 2 for a summary). There were 23 individuals with an EDSS ≤ 2.5, and 11 individuals with an EDSS ≥ 4.0. Two individuals did not have an EDSS recorded at the time of assessment. One individual with an EDSS of 2.5 was unable to complete most Kinarm tasks but reported an MFIS-21 score of only 8 (note that the median was 36.5, with higher scores indicating greater fatigue). One individual reported difficulty seeing the letters/numbers in TMT.
Table 2. MS cohort demographics and clinical score summary.
IQR = Interquartile range; SD = Standard deviation; -D/-ND = Dominant or non-dominant arm; EDSS = Expanded disability status scale; T25W = Timed 25-foot walk; MoCA = Montreal cognitive assessment; SDMT = symbol-digit modalities test; 9HPT = 9-hole peg test; MFIS-21 = Modified fatigue impact scale, 21-question variant.
Robotic tasks
In general, individuals in the MS cohort were impaired on a wide range of tasks, although the number of impairments did not necessarily relate to the EDSS score. For example, 10 individuals out of 28 who had an EDSS <3.5 were impaired on 4 or more individual Kinarm tasks from a variety of domains. This can be observed in Figure 2, and it highlights that there are individual differences in performance across domains that may not necessarily be captured by the EDSS. Impairment rates across all tasks are also summarized in Table 3. Figure 3 presents data both above and below the impairment threshold in cumulative distributions of Task Scores. There are clear divergences between MS and expected distributions in some tasks, such as VGR-D and RVGR-D. This suggests that there are not only impairments above the identified threshold of a Task Score of 1.96, but also systematic shifts in the distributions of task performances below the impairment threshold. Six out of twenty-eight (21%) individuals with EDSS < 3.5 were impaired on VGR-D. We additionally noted some individuals that were impaired on APM, which tests proprioception. Interestingly, we did not observe substantial overlap between impairments in APM and those in other tasks testing motor ability, such as VGR. Thus, these deficits are potentially separable.
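The per-participant summaries used here (which tasks exceed the impairment threshold, and the RMS of Task Scores within a domain) can be sketched as below. The participant values are invented for illustration and are not study data. Representing a task that could not be completed as `None` and excluding it from the RMS is an assumption, since the handling of missing values is not described in this article.

```python
import math

IMPAIRMENT_THRESHOLD = 1.96  # Task Score above this value counts as impaired


def impaired_tasks(task_scores):
    """Return the names of tasks whose Task Score exceeds the threshold.

    `task_scores` maps task name -> Task Score, with None for tasks that
    were not completed (assumed representation of missing values).
    """
    return sorted(
        name
        for name, score in task_scores.items()
        if score is not None and score > IMPAIRMENT_THRESHOLD
    )


def rms(values):
    """Root mean-square of the available values; None if nothing is available."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return math.sqrt(sum(v * v for v in present) / len(present))


# Hypothetical motor-domain Task Scores for one participant (not study data)
motor_scores = {"VGR-D": 2.4, "BOB": 0.8, "OH": None, "OHA": 3.0}

print(impaired_tasks(motor_scores))  # tasks above the 1.96 threshold
print(rms(motor_scores.values()))    # domain-level summary (rms_mot)
```

Because squaring weights large Task Scores more heavily, a single markedly impaired task raises the domain RMS more than several mildly elevated ones.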

Figure 2. Task Scores for individuals with MS across all robotic tasks. Individuals are sorted in order of increasing EDSS score (left to right), with missing EDSS values on the rightmost part of the axis (marked with ‘NA’). Squares marked with ‘X’ indicate missing values. Task Scores <1.96 (not impaired) are in lightest grey, whereas Task Scores >1.96 (impaired) are in darkening shades of grey (darker=poorer performance). Note that rms_mot=RMS of motor Task Scores, rms_cog=RMS of cognitive Task Scores, and rms_sen=RMS of sensory Task Scores (i.e. APM-D and APM-ND).
Table 3. Proportions of impaired individuals in the MS cohort for each robotic task performed.
Number of impaired (%) are presented. Impairment is defined as a Task Score > 1.96; -D/-ND/-IL indicate Dominant, non-dominant arm, or interlimb score; RMS=Root mean square.

Figure 3. Cumulative sums of Task Scores in MS and simulated control cohorts. Expected values for control participants are plotted as thin grey lines based on means and standard deviations for healthy individuals. People with MS are plotted with black circles. The vertical dashed lines indicate the impairment threshold of 1.96. Percentages of participants are indicated on the y-axis (along with the task label).
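A simulated control distribution of the kind described in the statistical analysis can be sketched as follows. The sketch draws standard Normal values and reports the share falling outside ±1.96 together with cumulative percentages on a grid of Task Score values. Taking the absolute value to obtain a one-sided score, the fixed random seed, and all function names are assumptions made for illustration; this is not the authors' R code.

```python
import random


def simulate_control_scores(n=1000, seed=0):
    """Draw n standard Normal values (mean 0, SD 1) for a simulated control cohort."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def fraction_outside(scores, threshold=1.96):
    """Share of simulated values outside the expected range of +/- threshold."""
    return sum(abs(s) > threshold for s in scores) / len(scores)


def cumulative_percent(scores, grid):
    """Percentage of one-sided scores (|s|, an assumption) at or below each grid value."""
    n = len(scores)
    return [100.0 * sum(abs(s) <= g for s in scores) / n for g in grid]


controls = simulate_control_scores()
grid = [0.5, 1.0, 1.5, 1.96, 2.5, 3.0]

print(f"Simulated impairment rate: {fraction_outside(controls):.3f}")
for g, pct in zip(grid, cumulative_percent(controls, grid)):
    print(f"Task Score <= {g:>4}: {pct:5.1f}% of simulated controls")
```

With 1000 draws the simulated impairment rate fluctuates around the theoretical 5%, so the reference curve is approximate rather than exact.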
Relationships between robot and clinical measures
We explored correlations between clinical variables and robotic Task Scores, summarized in Figure 4. The EDSS correlated strongly with 4 robotic variables after adjusting for the FDR. The strongest correlation (R = 0.51) was with OHA. EDSS step, in contrast, had virtually no correlation with VGR-D and TMT. The cognitive clinical tests (MoCA and SDMT) showed multiple significant correlations with cognitive-domain Kinarm tasks. The strongest of these was the correlation between SDMT and TMT (R = -0.56). The 9-hole peg test in the non-dominant arm (9HPT-ND) had significant correlations with 4 motor-domain Kinarm variables; specifically, the tasks testing bimanual ability (BOB, OH, and OHA). Importantly, the MFIS-21 did not have any correlations approaching significance with the robotic assessment tasks.
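The two correlation measures used in this analysis differ only in whether the raw values or their ranks are correlated, which a small sketch can make concrete. The implementation below is plain Python rather than the R functions used in the study, the helper names are hypothetical, and the example values are synthetic rather than drawn from the cohort.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Synthetic example: a monotonic but non-linear relationship
ordinal_like = [1, 2, 3, 4]
continuous = [1.0, 4.0, 9.0, 16.0]

print(f"Pearson:  {pearson(ordinal_like, continuous):.3f}")
print(f"Spearman: {spearman(ordinal_like, continuous):.3f}")
```

Because Spearman depends only on rank order, it suits ordinal scales such as the EDSS, where the spacing between steps is not uniform.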

Figure 4. Correlations between all clinical and robotic variables (MS group only). Blue cells indicate negative correlations and red cells indicate positive correlations. Values that were significant after FDR correction are bolded and in larger font. Note that rms_mot = root mean-square (RMS) of motor Task Scores, rms_cog = RMS of cognitive Task Scores, and rms_sen = RMS of sensory Task Scores (i.e. APM-D and APM-ND). ‘-D’ indicates dominant arm, ‘-ND’ indicates non-dominant arm.
We additionally considered the relationship between 9HPT-D and 9HPT-ND, a validated upper limb assessment for MS, and our robotic Task Scores. Out of all individuals with 9HPT-D scores in the normal range, between 4% (APM-D) and 41% (VGR-IL) had impairments. For 9HPT-ND, the range was 7% (APM-D) to 41% (VGR-IL) as well. Notably, 30–31% of individuals had impaired RVGR Task Scores (dominant, non-dominant, or inter-limb conditions) but normal 9HPT scores (dominant- or non-dominant condition). Finally, out of all 43 individuals in the study, 15 (i.e. 35%) had impairments on 9HPT-D and 13 (i.e. 30%) had impairments in 9HPT-ND. Furthermore, 9/43 were unimpaired on all Kinarm tasks. Of this subset, 2/9 individuals were impaired on the 9HPT-D and 2/9 were impaired on the 9HPT-ND (same 2 individuals in both cases).
Discussion
This is the first report that investigates the feasibility of upper-limb robotic assessment across a broad range of motor, sensory and cognitive functions in patients with MS. We demonstrated that robotic assessment with Kinarm was generally feasible for individuals with a wide range of MS-related disability. We found that people with MS had diverse impairments on robotic tasks testing upper-limb sensorimotor and cognitive function. Finally, we found that robotic tasks had correlations with relevant clinical variables (e.g. cognitive tests), demonstrating face validity of using robotic assessment to test these functional domains.
Even though the EDSS-rated disability in our cohort ranged widely from EDSS = 0.0 to EDSS = 7.5, almost all participants were able to complete the robotic assessment; this demonstrates that Kinarm assessment can be used in a broad range of individuals, not just those with mild or moderate impairments. Importantly, we did not observe significant correlation between the MFIS-21 and any of the robotic tasks, indicating that fatigue had at most a modest impact across the entire cohort. Even so, three individuals reported an effect of fatigue during assessment that prevented them from completing some tasks, two of whom had higher EDSS scores (7.0 and 7.5, respectively). One individual (EDSS 2.5) had an MFIS-21 score of 8, indicating little fatigue was present before the assessment, but did not complete several tasks because of fatigue. This highlights the dynamic nature of fatigue in MS. Potentially, future studies could also gather the MFIS-21 after the exam or at different stages throughout the assessment in order to explore ongoing effects of fatigue on task performance in greater depth. Future studies should also identify a reduced set of tasks that best capture impairments associated with MS to ensure a larger proportion of participants can complete the assessment.
We identified that people across the MS cohort had prevalent impairments in a variety of motor tasks despite a relatively low median EDSS (2.5), suggesting that robotic assessment could identify motor deficits that the EDSS could not. The EDSS is very sensitive to ambulation ability, 35 but does not substantively weight upper-limb motor impairments nor cognition. Importantly, we did identify that several individuals had impairments on tasks testing motor skill. For example, 21% of the cohort had impairments in VGR in the dominant arm. This is a straightforward test of motor skill, but nevertheless 6/28 (21%) individuals with EDSS < 3.5 had an impairment on VGR-D, suggesting that performance on upper-limb motor tasks is only partially related to EDSS. A larger sample size will be required to better understand the relationships between multiple Kinarm variables and the EDSS in a predictive manner (e.g. using regression-based approaches). Our correlation analyses identified that the EDSS had significant correlations with multiple Kinarm tasks – mostly those testing bimanual skill. Further work will be required to investigate the robustness of these findings in a cohort with a wider range of MS-related disability. Interestingly, we identified relatively high proportions of individuals who had impaired robotic Task Scores but normal performance on the 9HPT, with bounds as defined by Erasmus et al. 36 This suggests that our assessment of whole-arm movement can capture sensorimotor impairments not captured by the 9HPT.
It is interesting that cognitive task impairments were at least as prevalent as motor task impairments across the cohort. Notably, RVGR identified the greatest number of performance deficits of any task in the study. This could reflect deficits in the ability to perform motor skills with additional cognitive constraints. Importantly, RVGR performance was not directly linked to EDSS step. Thirty-nine percent of individuals with EDSS < 3.5 had impairments in RVGR in the dominant arm. This could be related to an MS-induced reduction in white matter integrity combined with grey matter lesions,37,38 as well as potentially impaired interhemispheric connectivity. 39 Prior work has demonstrated common bimanual upper limb impairment in individuals with high EDSS (>5.5). 40
Our study has some limitations. The most substantive is the small sample size and clinical heterogeneity of the cohort; this prevented us from considering interactions between variables on a higher order than bivariate comparisons. We nevertheless found promising relationships between variables (e.g. cognitive Task Scores and the MoCA). Further work should aim to include participants with EDSS scores more evenly distributed across all possible levels. This is particularly so for individuals with EDSS scores in the range of 4.0–5.5 of which we had almost no representation in this study; patients tend to spend less time in this range, 5 thus making it potentially harder to recruit these individuals. Larger samples will also allow the use of more powerful analytical tools to identify clusters of impairments in MS patients and possible contributions of other factors, such as medications, to Kinarm performance. We allowed individuals with visual impairments in the present study as long as these were not sufficient to prevent task completion, because it gave us an idea of any potential problems that could arise from our robotic assessment approach. Finally, the inclusion of imaging markers will allow us to comment in greater detail on any relationships between Kinarm and lesion burden/locations.
Conclusions
This study is the first to examine the feasibility of Kinarm robotic assessment in patients with MS. We showed that this approach is generally feasible in the MS population, and in our small cohort of individuals with MS demonstrated impairments in various robotic tasks measuring motor and cognitive performance. Additionally, robotic tasks testing bimanual skill, or derived values that considered interlimb differences, typically correlated the best with the EDSS, while cognitive tests did not. While the EDSS has proven to be a valuable measurement tool of MS clinical severity, our study shows that robotic assessment of a broad range of cognitive, motor and sensory capabilities may complement current standard clinical rating scales to characterize MS-related disability.
Supplemental Material
Supplemental material, sj-pdf-1-mso-10.1177_2055217320964940 for The feasibility of assessing cognitive and motor function in multiple sclerosis patients using robotics by Leif ER Simmatis, Albert Y Jin, Sean W Taylor, Etienne J Bisson, Stephen H Scott and Moogeh Baharnoori in Multiple Sclerosis Journal – Experimental, Translational and Clinical
Footnotes
Data availability
Data presented in this manuscript can be obtained from the authors on request.
Conflict of Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: SHS is the cofounder and Chief Scientific Officer of Kinarm (formerly BKIN Technologies Ltd.) that commercializes the robotic technology used in the present study. AYJ, EJB, LERS, MB, and SWT have no conflicts to disclose.
Acknowledgements
We would like to sincerely thank Simone Appaqaq, Helen Bretzke, Ethan Heming, Kimberly Moore, and Justin Peterson for their valuable assistance with participant recruitment/assessment, database management, and technical support. We would also like to thank Dr. Don Brunet for his early role in recruiting participants and conceptualizing this project.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by an Ontario Research Fund grant (Grant No. ORF RE 09-112), a GlaxoSmithKline and Canadian Institutes of Health Chair in neuroscience, and a Canadian Institutes of Health Operating Grant (Grant No. MOP-106662).
Supplemental material
Supplemental material for this article is available online.
