Abstract
Keywords
Introduction
Upper-limb motor dysfunction is a significant contributor to the loss of independence in activities of daily living after stroke. Movement may be impaired in a single aspect of motor control, such as speed, coordination, range of motion, or strength, or in a combination of these components, and this is reflected in the different strategies used to measure motor function. 1 Many tools address a single aspect of function, such as hand-held dynamometry, which measures only handgrip strength, 2 or the modified Ashworth Scale, which categorizes muscle resistance to passive movement. 3 Although such tests provide an indicator of the level of motor dysfunction, the results cannot be generalized to functional movement ability or the capacity for rehabilitation. More comprehensive assessments of upper-limb functional ability can be made using tests that examine multiple aspects of upper-limb motor function, several of which have been specifically developed for use after stroke. This study investigated 3 such multidomain assessments: the upper-limb motor subscale of the Fugl-Meyer Assessment (F-M), the Wolf Motor Function Test timed-tasks (WMFT), and the upper-limb sections of the Motor Assessment Scale (MAS).
The F-M, 4 and particularly the upper-limb motor subscale, is among the most frequently used tools internationally to assess stroke impairment.5,6 International use of the WMFT 7 is growing, and in Australia, the upper-limb sections of the MAS 8 are the most commonly used multidomain tool.9,10 Ceiling and floor effects have been reported for the F-M and WMFT, respectively,11,12 and both floor and ceiling effects have been reported for the MAS.9,13-16 There is some debate about the MAS scoring system and the accuracy of the rank order of items within each section, which was designed to reflect the hierarchical structure of the assessment.9,17-19 More recent Rasch analyses confirmed the rank order of section 6, upper arm function,15,20,21 while that of section 7, hand movements, was confirmed in 1 study 20 but rejected in 2 others.15,21 The rank order of section 8, advanced hand activities, was rejected by all 3 studies.15,20,21 When section 8 of the MAS was assessed without the hierarchical scoring system (ie, patients attempted all items), patient scores were higher than when the hierarchy was employed. 19
There is a growing international consensus regarding the need for a harmonized core set of clinical outcome measures for trials involving stroke survivors.22,23 This will not only enable more direct comparison between trials but will also enhance the utility and generalizability of meta-analyses.24,25 Up to 129 different tools were identified in recent surveys of upper-limb stroke rehabilitation trials emphasizing the absence of a harmonized set of assessments.22,26 While there is little consensus regarding the optimal assessment set, there is agreement that no single assessment tool adequately measures upper-limb function across the spectrum of poststroke dysfunction.1,27 Rather than select assessment tools a priori, it may be more efficient to compare multiple tests in the same patients to identify the most robust tools.28,29
The choice of assessment tool is influenced by many factors that may include the level of impairment; measurement sensitivity, validity, and reliability of a given tool; the time and cost required for implementation; or assessor preferences and familiarity.1,30-33 The nature and extent of motor impairment must be accurately established for each patient not only to enable intertrial comparisons and to prevent treatment arm bias in clinical trials 34 but more importantly to provide a basis on which to design rehabilitation programs and set achievable patient-centered goals.
This study investigated 3 stroke-specific multidomain assessment tools used to quantify different aspects of upper-limb motor ability and improvement with therapy in a heterogeneous cohort of community-dwelling stroke survivors. This study is the first to use an upper-limb stratification scheme as a novel approach to maximizing the utility of these tests. We hypothesized that the WMFT would be the most sensitive across a broad spectrum of stroke patients. Each patient was tested with the F-M, WMFT, and MAS before and after a novel upper-limb therapy program. All items of each of the upper-limb MAS sections were assessed and then analyzed both with and without the hierarchical scoring system. The results of this study confirm that no single tool provides adequate sensitivity to cover a broad spectrum of poststroke dysfunction and that the assessment tool of choice should be based on the level of each patient’s residual upper-limb motor function.
Methods
Participants
Fifty-nine patients were consecutively recruited and screened from St Vincent’s and Prince of Wales’ Hospitals, Sydney, Australia. Of these, 10 were excluded from the study (8 did not meet criteria, 2 declined). The remaining 31 male and 18 female patients were aged 22 to 83 years (62.6 ± 12.8 years, mean and standard deviation) and were between 1 month and 21 years poststroke (24.7 ± 39.2 months). All had suffered a unilateral stroke in the territory of the middle cerebral artery and were hemiparetic with upper-limb involvement. Inclusion criteria were the following: (
Stratification
Classification into upper-limb motor-function groups was achieved using our novel stratification system 37 based on the ability to perform the Box and Block Test (BBT) 38 of gross manual dexterity and the grooved pegboard test 39 of fine manual dexterity. Briefly, patients who could move ≤1 block on the BBT and could not complete the grooved pegboard test were classified with low motor-function, those who could move >1 block on the BBT but were unable to complete the grooved pegboard test were classified with moderate motor-function, and those able to complete both tests were classified with high motor-function. Patients with poor eyesight were encouraged to wear prescription glasses during all assessments, and those with hemianopia and neglect were reminded to scan the test area.
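The classification rule above can be written as a short decision function. This is a minimal sketch rather than the published stratification procedure: the function name and arguments are hypothetical, and it assumes that a patient who completes the grooved pegboard test but moves ≤1 block on the BBT falls outside the rule as described, so that combination is rejected rather than guessed.

```python
def stratify_motor_function(bbt_blocks: int, completed_pegboard: bool) -> str:
    """Return 'low', 'moderate', or 'high' from the two dexterity tests.

    bbt_blocks: number of blocks moved on the Box and Block Test.
    completed_pegboard: whether the grooved pegboard test was completed.
    """
    if bbt_blocks < 0:
        raise ValueError("bbt_blocks cannot be negative")
    if completed_pegboard:
        if bbt_blocks > 1:
            # Able to complete both tests.
            return "high"
        # Assumption: this combination is not covered by the rule in the text.
        raise ValueError("pegboard completed with <=1 block is not covered")
    # Pegboard not completed: the BBT result separates low from moderate.
    return "moderate" if bbt_blocks > 1 else "low"


print(stratify_motor_function(0, False))   # low
print(stratify_motor_function(12, False))  # moderate
print(stratify_motor_function(40, True))   # high
```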
Functional Assessments
Motor ability was assessed using 3 multidomain, hierarchically structured upper-limb tools: the upper-limb motor subscale of the F-M, 40 the WMFT timed-tasks, 12 and the upper-limb MAS.41,42 The 33 items of the F-M are scored on a 3-point scale where 0 represents an inability to complete the test item, 1 represents a partial ability, and 2 represents full completion. Test items assess reflexes; the capacity to move in and out of synergy; to isolate movement to the shoulder, elbow, and wrist; and to grasp different objects. The WMFT consists of 15 timed-tasks and 2 strength-based tasks. The timed-tasks simulate functional activities that range from basic movements such as placing the forearm onto a table, to complex fine motor skills such as stacking checkers. A faster mean time represents better motor performance. 43 Strength-based tasks are tested by a progressive submaximal weight-lifting task and maximal grip-strength measured with hand-held dynamometry. The upper-limb MAS sections contain 6 items scored on a 2-point scale where 1 indicates an ability to complete the task and 0 an inability to complete the task. We examined section 6, upper arm function (for logistical reasons the first 3 items were not included); 7, hand movements; and 8, advanced hand activities. When using the hierarchical scoring system patients must successfully complete each item before proceeding to the next item.41,42 In this study, all items were tested regardless of performance and data were analyzed both with and without the hierarchical structure. Assessments were conducted as part of a larger suite of tests immediately before and after therapy by the second and third authors following a standardized protocol. The order of tests was determined on an individual basis to account for fatigue and available resources.
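The difference between the two MAS scoring methods can be illustrated with a small sketch. It assumes that each item in a section is recorded as 1 (completed) or 0 (not completed) in rank order, that the hierarchical score counts items only up to the first failure, and that the non-hierarchical score is the simple sum. The function names and the example item pattern are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def score_with_hierarchy(items: Sequence[int]) -> int:
    """Count completed items up to, but not including, the first failed item."""
    score = 0
    for completed in items:
        if not completed:
            break  # Later items are not credited once an item is failed.
        score += 1
    return score


def score_without_hierarchy(items: Sequence[int]) -> int:
    """Count every completed item regardless of its position in the section."""
    return sum(1 for completed in items if completed)


# Hypothetical section: item 2 is failed, but items 3 and 4 are completed.
section = [1, 0, 1, 1, 0, 0]
print(score_with_hierarchy(section))     # 1
print(score_without_hierarchy(section))  # 3
```

Under these assumptions the non-hierarchical score can never be lower than the hierarchical score for the same item pattern.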
Intervention
All patients received a standardized 14-day protocol of Wii-based Movement Therapy 35,36,44 predominantly implemented by the first author, who was unaware of assessment results. In brief, this protocol consisted of 60-minute formal therapy sessions with a trained therapist on 10 consecutive weekdays, augmented by progressively increasing home practice. Therapy and home practice sessions used the Wii and Wii Sports (Nintendo, Japan) games of golf, baseball, bowling, tennis, and boxing. The Wii controller was held in the more-affected hand only, and therapy was structured, targeted, and individualized to the needs of each patient. For those with poor grip strength, a self-adhesive wrap was used to secure the Wii controller in the more-affected hand, and the extent of wrapping was reduced over the 14-day therapy period as strength and ability progressed.
Data Analysis
The maximum score for the upper-limb motor F-M is 66, the maximum mean time for WMFT timed-tasks is 120 seconds with 121 seconds denoting an inability to complete any item, and in this study the maximum score for the MAS was 15. Wilcoxon signed rank tests were used to compare pre- and posttherapy assessments. To facilitate comparisons with previous studies data are illustrated as mean, range, and 95% confidence intervals. Differences were considered significant when
Results
Patients were stratified as low (n = 16), moderate (n = 14), or high (n = 19) motor-function (see Methods). Assessment scores are presented first as pooled data (n = 49) and then for each motor-function subgroup. The MAS analysis is presented first with and then without hierarchical scoring.
Fugl-Meyer Assessment
The rank ordered pretherapy F-M scores described an almost continuous distribution with a ceiling effect for 2 patients with high motor-function (Figure 1A). A further 7 patients scored ≥60. A ceiling effect was evident for 4 patients after therapy with another 13 patients scoring ≥60. There was no floor effect although the 3 patients with the lowest scores received points only due to the presence of reflexes, and this did not change after therapy. Although 9 patients (including those scoring maximal points) showed no improvements there was a significant improvement in F-M scores after therapy both for the pooled data and for each of the functional subgroups (Figure 2A).

Figure 1. Rank ordered distribution of assessment scores.

Figure 2. Change in assessment scores with therapy.
Wolf Motor Function Test—Timed Tasks
The distribution of WMFT mean times highlighted the floor effect for 5 patients with low motor-function who were unable to complete any task in the allotted 120 seconds (see Figure 1B). The long “toe” at shorter times includes data for patients with high motor-function; there was no ceiling effect. Posttherapy scores showed a reduction in the floor effect, with 2 of the above 5 patients scoring <121 seconds. Thus, 3 patients showed no change after therapy with the WMFT, and 1 patient got worse, being unable after therapy to complete the single task that had been completed pretherapy. Pre- and posttherapy comparisons revealed significant improvements for the pooled data and for the moderate and high, but not the low, motor-function subgroups (Figure 2B).
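The floor convention described in Data Analysis (a maximum mean time of 120 seconds, with 121 seconds denoting an inability to complete any item) can be sketched as follows. The treatment of individual incomplete tasks is an assumption made for illustration: here an incomplete task is recorded as None and counted as 120 seconds in the mean. The function and constant names are hypothetical.

```python
from typing import Optional, Sequence

TASK_TIME_LIMIT_S = 120.0     # Time allowed for each timed task.
NO_TASK_COMPLETED_S = 121.0   # Code for a patient who completes no task.


def wmft_mean_time(task_times: Sequence[Optional[float]]) -> float:
    """Mean WMFT time in seconds; None marks a task that was not completed."""
    if not task_times:
        raise ValueError("at least one task is required")
    if all(t is None for t in task_times):
        # Floor effect: no task was completed within the time limit.
        return NO_TASK_COMPLETED_S
    # Assumption: an incomplete task contributes the 120-second limit.
    capped = [
        TASK_TIME_LIMIT_S if t is None else min(t, TASK_TIME_LIMIT_S)
        for t in task_times
    ]
    return sum(capped) / len(capped)


print(wmft_mean_time([None] * 15))    # 121.0
print(wmft_mean_time([2.0, 4.0]))     # 3.0
print(wmft_mean_time([60.0, None]))   # 90.0
```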
Motor Assessment Scale
Rank-ordered MAS pretherapy scores illustrated an almost continuous distribution with both floor and ceiling effects regardless of the scoring method (Figure 1C and D). When scored with the hierarchical system 12/16 patients with low motor-function were unable to complete any item pretherapy, and 6/19 patients with high motor-function were able to complete all items pretherapy. After therapy floor and ceiling effects were evident for 10 patients with low motor-function and 14 patients with high motor-function, respectively. When scored without the hierarchical system the distribution appears more even with mean scores higher for all groups and a smaller floor effect, that is, 11 patients pretherapy and 9 patients posttherapy (Figures 1C & D and 2C & D).
There was a significant improvement in MAS scores after therapy for the pooled data and for the high motor-function subgroup, regardless of scoring method. There was no significant improvement after therapy for the low motor-function subgroup with either scoring method, and the moderate motor-function subgroup improved only if the MAS was scored without the hierarchical structure (Figure 2C and D). Half the patients in this study showed no improvement after therapy when assessed with the MAS, regardless of the scoring system: 26 did not improve when scored with the hierarchy and 25 without the hierarchy.
Other Improvements After Therapy
All patients successfully completed all components of therapy. A graded pattern was evident in both the submaximal and maximal WMFT strength-based tasks. Submaximal strength improved from 7.9 ± 2.0 lb to 9.4 ± 2.1 lb for the pooled data (
Discussion
This study investigated the use of 3 multidomain hierarchically structured tests of upper-limb motor-function after stroke. The tests were implemented both before and immediately after a standardized 14-day protocol of upper-limb therapy for a heterogeneous cohort of community-dwelling stroke survivors with functional ability ranging from some residual voluntary movement to minor impairment only. Both before and after therapy ceiling effects were evident with the F-M and floor effects with the WMFT. Both floor and ceiling effects were evident with the MAS regardless of the scoring method. Although all 3 tests showed significant improvements in functional movement after therapy for the pooled data, contrary to our hypothesis only the F-M could do so for each subgroup after stratification by motor function. These results suggest (
Although significant improvements for each motor-function subgroup were demonstrated with the F-M (see Figure 2A), limitations were evident not only as ceiling effects for the 2 patients with high motor-function pretherapy but also when pretherapy scores were ≥60 (2 moderate and 5 high function patients). After therapy, the ceiling effect extended to 4 high function patients, and a further 5 moderate and 8 high function patients scored ≥60. The structure of the F-M may not be sufficient to detect the range of subtle but functionally relevant improvements after therapy for these patients.11,45 Gladstone and colleagues 45 suggested 2 possible strategies to reduce the ceiling effect of the F-M. These were an expansion of the grading scale from 3 points to 6 points and/or the inclusion of more complex fine manual dexterity components. 45 Despite this limitation, the F-M was the most salient measure of improvement for patients with low motor-function and little residual movement, although 3 patients with very low motor-function were unable to complete any movement or task and scored points only due to the presence of reflexes. Each of these patients had some movement ability and capacity for improvement, but this was not captured by the assessment tools used in this study. Although reflexes provide information about the status of spinal circuitry, they may not necessarily be correlated with motor function after stroke.46-48 Therefore, an assessment tool is still required for those with very low motor-function.
In comparison to the F-M, the WMFT was better for detecting improvement in patients with moderate and high motor-function but suffered from floor effects for those with low motor-function.11,12 In this study, 5 patients with low motor-function either could not complete any WMFT timed-task within the allocated 120 seconds or had some movement that was not sufficient to meet the specific criteria for each task. 49 Functional ability was underestimated for these patients. The WMFT has been modified for patients with moderate to severe hemiparesis, 50 but this would still be insufficient to quantify movement and improvement for the 5 patients referred to here.
The use of a time scale is more objective and precludes a ceiling effect for the WMFT. A database of normative times for healthy subjects aged ≥40 has been inaugurated to establish the extent of impairment after stroke. 51 Healthy mean task times range from 1.0 ± 0.1 seconds (right-handed males aged 40-49 years) to 1.5 ± 0.2 seconds (right-handed females aged >70 years). In the current study, the fastest patient times ranged between 1.5 and 5.3 seconds (n = 20; Figure 1B). The best performing patient was a 57-year-old male with left-sided hemiparesis of his nondominant hand (Figure 1B). The age- and sex-matched normative equivalent is 1.2 ± 0.2 seconds, 51 so that despite good residual movement speed his function still demonstrated minor impairment. Although the WMFT submaximal and maximal strength tasks provide additional information about motor function and improvement with therapy, they were not included in this comparison. The 2 tasks are scored separately, and no integrated scale has been developed to incorporate strength and movement speed. An integrated scale might improve the utility of the WMFT for patients with low and very low motor-function and highlight greater improvements for those with moderate and high motor-function.
Of the assessment tools investigated in this study the MAS was the least suitable for a heterogeneous stroke cohort due to both floor and ceiling effects. A recent Australian audit identified that 54% of hospitals use the MAS in the acute period. 10 More importantly, this was the only assessment routinely used in the acute setting that specifically addresses upper-limb motor ability. Additionally, the use of individual MAS sections and single items to assess motor function and improvement is growing due to the time constraints of inpatient care. Fifty percent of patients showed no improvement regardless of the scoring system when assessed with the MAS, whereas no improvement was evident for 18% of patients with the F-M and only 6% with the WMFT. In previous studies 31% 13 and 44.3% to 63.9% 17 of subjects did not improve on MAS testing and 80% were clustered at the extremes of all 3 upper-limb MAS subscales 17 after inpatient rehabilitation. The significant floor effect seen previously in subacute stroke patients with severe disability 13 confirms our conclusions that the MAS is not robust across a heterogeneous cohort of poststroke functional ability.
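The proportions quoted above can be checked against the patient counts reported in the Results. The short calculation below is only an arithmetic sketch: it takes the counts of patients with no improvement (26 and 25 for the MAS with and without the hierarchy, 9 for the F-M, 3 for the WMFT) and the cohort size of 49 from the text. The variable names are hypothetical.

```python
N_PATIENTS = 49  # Pooled cohort size reported in the Results.

# Patients showing no improvement after therapy, as reported in the Results.
no_improvement = {
    "MAS (hierarchical)": 26,
    "MAS (non-hierarchical)": 25,
    "F-M": 9,
    "WMFT": 3,
}

# Convert each count to a whole-number percentage of the cohort.
percentages = {
    tool: round(100 * count / N_PATIENTS)
    for tool, count in no_improvement.items()
}

for tool, pct in percentages.items():
    print(f"{tool}: {pct}%")
```

With these counts the MAS figures come out at roughly half the cohort (53% and 51%), against 18% for the F-M and 6% for the WMFT.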
To investigate the accuracy of the MAS hierarchical order we measured patients on all items and analyzed data with and without the hierarchical scoring system. Our results suggest that hierarchical scoring
Methodological Considerations
The data for this novel analysis were derived from several smaller studies of Wii-based Movement Therapy and combined to increase the statistical power of the analysis. Each of the studies employed the same standardized therapy intervention and assessment tools immediately before and after therapy in community-dwelling stroke survivors. The post hoc analyses by functional status followed the development of the stratification scheme that classifies patients based on their level of upper-limb motor function (see Methods). 37 The results from these analyses warrant further investigation with a larger range of assessment tools including the Action Research Arm Test, 52 WMFT functional ability scale, 49 and graded WMFT. 53 The WMFT timed-tasks, upper-limb motor F-M, and the 3 upper-limb subscales of the MAS were chosen for their widespread implementation and because they were specifically developed for use after stroke. The WMFT as used in this study is an objective tool based on a continuous measure, that is, time. The graded WMFT 53 may provide an alternative, objective measure that is more appropriate for those with low residual motor function.
Minimal detectable changes (MDC) and minimum clinically important differences (MCID) for each assessment are important concepts. However, in many reports these values are calculated for patients with a specific level of motor function and/or a specific time poststroke. Values may be calculated during the acute period when the greatest improvement takes place28,54 or in the chronic phase when improvement is slower,55,56 and as far as we can ascertain, only patients with moderate to high residual motor ability have been included in such studies.28,54-57 In the case of the WMFT and F-M there is no one accepted and standardized MDC or MCID value, and we are unaware of any values for the MAS. Future studies are needed to establish MDC and MCID for a range of elapsed time and residual functional ability poststroke.
Conclusions
We included 3 multidomain tools in this study to develop a streamlined quantitative assessment protocol for future clinical trials to accommodate the heterogeneity and multifaceted nature of upper-limb functional ability after stroke. When therapy successfully shifts baseline function from low to moderate or from moderate to high motor-function the choice of tests becomes more complex. Based on available validated assessments and the results of this study, we suggest that those with low motor-function are best assessed with the F-M, those with high motor-function with the WMFT, and those with moderate motor-function with both tests. There are no good assessments for patients with very low motor-function and no one test that can be used across the spectrum of poststroke dysfunction. Our results suggest that the choice of assessment must be determined by the level of residual functional ability after stroke.
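The suggested matching of assessment tool to motor-function level can be summarized as a simple lookup. This is an illustrative sketch of the recommendation in the paragraph above, not a validated clinical protocol, and the names used are hypothetical.

```python
# Suggested assessments by motor-function group, following the Conclusions.
RECOMMENDED_TOOLS = {
    "low": ("F-M",),
    "moderate": ("F-M", "WMFT"),
    "high": ("WMFT",),
}


def recommended_tools(motor_function_group: str) -> tuple:
    """Return the suggested assessment tools for a motor-function group."""
    try:
        return RECOMMENDED_TOOLS[motor_function_group]
    except KeyError:
        # No assessment is suggested in the text for other groups,
        # including patients with very low motor-function.
        raise ValueError(f"no suggested tool for group {motor_function_group!r}")


print(recommended_tools("moderate"))  # ('F-M', 'WMFT')
```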
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the National Health and Medical Research Council of Australia and the New South Wales Office of Science and Medical Research, Australia.
