Abstract
Background. Efficacy of task-oriented training can be reliably trusted only when the inherent measurement variability is determined. The Actual Amount of Use Test (AAUT) and the Motor Activity Log (MAL) have been used together as measures of spontaneous arm use after an intervention; however, the minimal detectable change (MDC) of the AAUT and the MAL has not been addressed. Objective. To compare the MDC90 of the AAUT and the MAL in the context of a randomized controlled trial of a neurorehabilitation intervention, the Extremity Constraint-Induced Therapy Evaluation trial. Methods. A preplanned secondary analysis was conducted using pre–post test data from the control group. Estimated MDC90 values were normalized to the maximum value of the scale of the AAUT and the MAL for each subscale: amount of use (AAUTa, MALa) and quality of movement (AAUTq, MALq). Results. The MDC90 of the AAUTq and the MALq were 14.4% and 15.3%, respectively. However, the MDC90 was greater for the AAUTa (24.2%) than for the MALa (16.8%). The training-induced change in spontaneous arm use exceeded the MDC90 for the MAL but fell below that for the AAUT immediately after the intervention and at the 1-year follow-up visit. Conclusions. The greater variability and insensitivity to treatment effect for the AAUTa is likely because of the low resolution of its scoring system. As such, there is a considerable need to develop valid and reliable tools that capture purposeful arm use outside the laboratory, perhaps through leveraging new sensing technologies with objective activity monitoring.
Introduction
During the past decade, clinical meaningfulness has increasingly gained in importance for clinicians and consumers, especially in the field of stroke rehabilitation.1-5 However, the determination and application of the concept of clinical meaningfulness remain vague. To determine and apply measures of clinical meaningfulness to the outcomes that matter to people, both the sensitivity of measurements and the efficacy of interventions in promoting meaningful outcomes are very important.2,6 For upper limb recovery after stroke, the most significant outcome is the ability to voluntarily use the paretic arm and hand.7,8 To our knowledge, the accelerometer,9,10 the Motor Activity Log (MAL),11,12 and the Actual Amount of Use Test (AAUT) 13 are the only instruments in current use to capture changes in spontaneous arm use in this population. To regain spontaneous arm use after stroke, Taub et al 11 developed a signature intervention, constraint-induced movement therapy (CIMT), based on evidence from animal models to overcome paretic arm nonuse for individuals poststroke. Although evidence shows the efficacy of CIMT,14,15 an objective scientific method of determining a meaningful change in spontaneous arm use has not been established. 1 Before such a method can be established, there is a need to understand the inherent noise level of the measure used to capture spontaneous arm use. Determining the “noise” level through estimation of the minimal detectable change can be considered a calibration process. Once the measurement error associated with the measure is known, the detected training-induced change can be trusted.
The minimal detectable change (MDC) is used to determine the threshold for whether a pre–post change is true from a measurement perspective.2,6,16 This is fundamentally different from that used to determine if the change is “meaningful” (ie, minimal clinically important difference) to the patient, family, clinicians, and others.3,4 MDC is defined as the smallest amount of change that is detectable and not due to inherent variation or noise in the measure itself.2,5,17-19 In general, to compute MDC, data from 2 repeated tests are used where there is no intervening manipulation. The 90% and 95% confidence level of a reliable difference (MDC90 and MDC95) are 2 common MDC metrics; 20 however, MDC90 is regarded as sufficient for decisions pertaining to the efficacy of clinical interventions. 16 Here, a change in the outcome of interest exceeding the MDC indicates the intervention program is effective.2,6 In circumstances where the consequences of a potentially wrong decision are more severe (eg, the possibility of mortality from surgery), MDC95 would be a better choice given the higher confidence level with respect to test–retest reliability. 16
The first large-scale randomized controlled trial to evaluate the efficacy of CIMT, the Extremity Constraint-Induced Therapy Evaluation (EXCITE) trial, used the AAUT (secondary outcome measure) and the MAL (primary outcome measure) as convergent measures to detect improvement of spontaneous arm use after CIMT. 21 The primary outcomes and their results were presented elsewhere. 14 The AAUT is a covert performance-based assessment with 17 daily activities used to observe actual arm behavior in real-life situations. 13 Spontaneous arm use was filmed covertly in the laboratory using real-world scenarios (eg, grabbing and pulling a chair out from under the table, opening a file folder) staged by an investigator and scored by trained and standardized raters masked to group assignment. In contrast to the AAUT, the MAL is a semistructured interview questionnaire used to retrieve participants’ recall of arm behavior during the past few days. The questionnaire consists of questions regarding 30 daily activities (eg, turning on a light switch, opening a drawer). 11 Spontaneous arm use was evaluated based on participants’ recall and compared with the remembered status before their stroke (eg, “50% of use for that task before my stroke”). Although there was a significant increase in the amount of paretic arm use measured with the AAUT 22 and with the MAL 14 after 2 weeks of CIMT, there has been little study of whether these increases reflected a true change in spontaneous arm use.1,23,24
The purpose of this study was to determine the MDC of the AAUT and the MAL in the context of the EXCITE trial. We had 3 aims: (a) to determine the MDC for the AAUT and the MAL, (b) to compare the MDC for the AAUT and MAL, and (c) to determine if the CIMT-induced changes in spontaneous arm use exceed the estimated MDC for the AAUT and the MAL.
Methods
This secondary data analysis used a subset of data from the EXCITE trial where 222 participants 3 to 9 months after stroke with mild to moderate impairment were randomized to 1 of 2 treatment arms. Inclusion criteria for the participants were as follows: (a) >10° active wrist extension, (b) >10° active thumb abduction and extension, and (c) at least 2 additional active digit extension movements. On enrollment, participants were assigned randomly to either the CIMT group or the control group (not receiving CIMT) for the first year. The AAUT and the MAL were used as proxies of the outcome of interest, real-life spontaneous arm use. There are 2 subscales for both the AAUT and the MAL: amount of use and quality of movement. For clarity, we abbreviated the subscales as AAUTa, AAUTq, MALa, and MALq for the AAUT amount of use, AAUT quality of movement, MAL amount of use, and MAL quality of movement, respectively (Table 1).
Abbreviations of 2 Subscales for the Actual Amount of Use Test and the Motor Activity Log
For the AAUTa subscale, each testing item was scored either “0” (did not attempt to use the paretic arm) or “1” (attempted to use the paretic arm). The AAUTa was calculated by dividing the number of items scored as “1” by the total number of items scored and expressed as a percentage. The AAUTq subscale was scored from “0” to “5,” with 0 indicating that the participant did not attempt to use the paretic arm for that activity and 5 indicating normal performance of the activity. A “1” on the AAUTq rated movement performance as very poor, a “2” as poor, a “3” as fair, and a “4” as nearly normal. The AAUTq was calculated by dividing the total score by the number of items scored and expressed as an average. For the MALa subscale, each item (eg, used the weaker arm for that activity) was scored from “0” to “5.” A “0” indicated that the participant did not use the paretic arm for that activity, a “1” that the participant occasionally tried to use the weaker arm, a “2” that the participant rarely used the weaker arm, a “3” that the participant used the weaker arm for that activity half as much as before the stroke, a “4” 75% as much as before the stroke, and a “5” the same amount of use for that activity as before the stroke. The rating scale of the MALq was the same as that of the AAUTq. For the purposes of comparison, the average ratings of the AAUTq, MALa, and MALq were normalized to the maximal value of the scale and expressed as a percentage, rounded to the first decimal place.
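The scoring rules described above can be illustrated with a minimal sketch. The function names and the example item scores are hypothetical and chosen only for illustration; representing an unscored item as None is also an assumption, not something specified by the AAUT or MAL manuals.

```python
from typing import Optional, Sequence


def aaut_amount_of_use(items: Sequence[Optional[int]]) -> float:
    """AAUTa: percentage of scored items with a score of 1 (paretic arm used)."""
    scored = [s for s in items if s is not None]  # ignore items that were not scored
    if not scored:
        raise ValueError("no scored items")
    return round(100.0 * sum(scored) / len(scored), 1)


def normalized_mean_rating(items: Sequence[Optional[int]], scale_max: int = 5) -> float:
    """Mean 0-to-5 rating (AAUTq, MALa, MALq) as a percentage of the scale maximum."""
    scored = [s for s in items if s is not None]
    if not scored:
        raise ValueError("no scored items")
    mean_rating = sum(scored) / len(scored)
    return round(100.0 * mean_rating / scale_max, 1)


# Hypothetical item scores, for illustration only.
example_aauta = aaut_amount_of_use([1, 0, 1, 1, None, 0])    # 3 of 5 scored items
example_malq = normalized_mean_rating([3, 2, 4, None, 1])    # mean rating 2.5 of 5
```

Expressing all 4 subscales on a 0% to 100% scale in this way is what allows their MDC90 values to be compared directly.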
Minimal detectable change for each of the measures was calculated from the pre–post test data of the control group. As mentioned earlier, the MDC90 was considered the appropriate reliability threshold for intervention studies. 16 It was chosen to determine the measurement variability and then used to verify whether the magnitude of the CIMT-induced improvement was caused by the intervention or by random noise. To calculate MDC90, we first computed the intraclass correlation coefficient, ICC(3,1), and the standard error of measurement (SEM). Then, the MDC90 was obtained using the following formulas, where SDbaseline is the standard deviation at baseline and 1.65 is the z score corresponding to the 90% confidence level: SEM = SDbaseline × √(1 − ICC) and MDC90 = 1.65 × SEM × √2.18,20
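The SEM and MDC90 computation can be sketched as follows. This is a minimal illustration that assumes the conventional forms SEM = SD × √(1 − ICC) and MDC90 = 1.65 × SEM × √2; the function names are hypothetical, and the ICC is taken as a given input rather than estimated from raw test–retest data.

```python
import math

Z_90 = 1.65  # assumed multiplier: z score conventionally used for 90% confidence


def standard_error_of_measurement(sd_baseline: float, icc: float) -> float:
    """SEM from the baseline standard deviation and a test-retest ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd_baseline * math.sqrt(1.0 - icc)


def mdc90(sd_baseline: float, icc: float) -> float:
    """Minimal detectable change at the 90% confidence level."""
    # sqrt(2) accounts for measurement error in both the pre and the post test.
    return Z_90 * standard_error_of_measurement(sd_baseline, icc) * math.sqrt(2.0)


# Rounded values reported for the AAUTq: SD 20.7%, ICC 0.91.
aautq_mdc90 = mdc90(20.7, 0.91)
```

With these rounded inputs the sketch gives roughly 14.5%, close to the 14.4% reported for the AAUTq; the small gap presumably reflects rounding of the SD and ICC.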
To investigate the effect of CIMT, the first step was to determine the significance of the CIMT-induced changes, and the second step was to compare these changes to the calculated MDC values. For the first step, an independent-samples t test was used to test the CIMT-induced changes between groups. For the second step, the amount of CIMT-induced change was compared with the MDC values. Statistical analyses were performed using the SPSS software package (version 15).
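The 2-step logic can be sketched with the Python standard library. This is an illustration only, not the SPSS procedure used in the study: it computes the pooled-variance (Student) t statistic and its degrees of freedom but not the P value, which would require a t distribution (available, for example, through scipy.stats.ttest_ind). The function names and data are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t_statistic(group_a: Sequence[float], group_b: Sequence[float]) -> Tuple[float, int]:
    """Independent-samples t statistic with pooled variance, and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 observations")
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0.0:
        raise ValueError("zero variance in both groups")
    return (mean(group_a) - mean(group_b)) / std_err, df


def mean_change_reaches_mdc(changes: Sequence[float], mdc: float) -> bool:
    """Step 2: does the mean pre-post change reach the MDC threshold?"""
    return mean(changes) >= mdc


# Hypothetical change scores (percentage points), for illustration only.
t_value, dof = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

The 2 steps answer different questions: the t statistic addresses whether the groups differ on average, whereas the MDC comparison addresses whether the size of the change is larger than measurement noise.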
Results
Pre–post test data from 116 participants in the control group were used for this secondary analysis and the data from 106 participants in the CIMT group were used to test the effects of CIMT (Table 2). The ICC, SEM, and MDC90 of the AAUT and the MAL are shown in Table 3. In general, the MDC90 values of both measures were greater in the amount of use than in the quality of movement subscale. The MDC90 in the AAUTa and the MALa were 24.2% and 16.8%, respectively. As for the quality of movement subscale, the MDC90 in the AAUTq and the MALq were similar (14.4% and 15.3%, respectively). Whereas the MDC90 of the AAUTq was very close to that of the MALq, the MDC90 of the MALa was 30.6% lower than that of the AAUTa. In addition, the 95% confidence interval for the MDC AAUTa (24.2%) did not overlap with that for the MDC AAUTq (14.4%), the MDC MALa (16.8%), or the MDC MALq (15.3%). This finding reveals that the difference in measurement variability between the AAUT and the MAL was much greater for the amount of use than for the quality of movement subscale.
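Because a relative difference depends on which value serves as the reference, the comparison between the AAUTa and MALa thresholds can be checked with a short calculation. The helper name below is hypothetical; the 2 input values are the MDC90 figures reported above.

```python
def relative_difference(value: float, reference: float) -> float:
    """Difference between value and reference, as a percentage of reference."""
    return 100.0 * (value - reference) / reference


AAUTA_MDC90 = 24.2  # reported MDC90 of the AAUTa, in %
MALA_MDC90 = 16.8   # reported MDC90 of the MALa, in %

gap_in_points = AAUTA_MDC90 - MALA_MDC90                        # absolute gap, percentage points
mala_vs_aauta = relative_difference(MALA_MDC90, AAUTA_MDC90)    # AAUTa as the reference
aauta_vs_mala = relative_difference(AAUTA_MDC90, MALA_MDC90)    # MALa as the reference
```

With the AAUTa as the reference, the gap of 7.4 percentage points corresponds to about 30.6%; with the MALa as the reference, the same gap corresponds to about 44%.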
Characteristics of Research Participants
Abbreviations: CIMT, constraint-induced movement therapy; FM, Fugl-Meyer Assessment (score range 0-66).
Age to the nearest year.
ICC, SEM, and MDC90 for the AAUT and the MAL (n = 116)
Abbreviations: AAUTa: amount of use of the Actual Amount of Use Test; AAUTq: quality of movement of the Actual Amount of Use Test; MALa, amount of use of the Motor Activity Log; MALq, quality of movement of the Motor Activity Log; SDbaseline, standard deviation at baseline; ICC, intraclass correlation coefficient; SEM, standard error of measurement; MDC90, minimal detectable change at 90% of confidence; CI_ub, upper bound of confidence interval; CI_lb, lower bound of confidence interval.
Values in this column are the MDC90 values before normalization except for AAUTa.
The independent t test showed that the CIMT-induced changes between groups reached a significant difference (P < .001) for AAUTa, AAUTq, MALa, and MALq (Figure 1). Immediately after the intervention for the CIMT group, we found a 24.9% (9.8/39.4) increase in AAUTa, a 34% (0.34/1.00) increase in AAUTq, an 84.6% (1.15/1.36) increase in MALa, and a 72.2% (1.04/1.44) increase in MALq, suggesting noticeable CIMT-induced improvement in spontaneous paretic arm use (Table 4). This effect was maintained at the 1-year follow-up evaluation (Figure 1).

Spontaneous arm use as measured by the AAUT and the MAL for both groups. The measures were taken at 3 different time points: pre-CIMT, post-CIMT, and 1-year follow-up. (A) Amount of Use and (B) Quality of Movement. Abbreviations: CIMT, constraint-induced movement therapy; AAUTa, amount of use of the Actual Amount of Use Test; AAUTq, quality of movement of the Actual Amount of Use Test; MALa, amount of use of the Motor Activity Log; MALq, quality of movement of the Motor Activity Log.
Comparison Between CIMT-Induced Improvement for Spontaneous Arm Use and the MDC90 Values Measured by the AAUT and the MAL (Mean ± Standard Error) Immediately After CIMT
Abbreviations: CIMT, Constraint-Induced Movement Therapy; AAUTa, amount of use of the Actual Amount of Use Test; AAUTq, quality of movement of the Actual Amount of Use Test; MALa, amount of use of the Motor Activity Log; MALq, quality of movement of the Motor Activity Log; Δ, difference between post-CIMT and pre-CIMT; MDC90, minimal detectable change at 90% of confidence.
For the comparison of the CIMT-induced change in spontaneous arm use with the MDC90, we found that only the changes measured by the MAL exceeded the MDC90 (Tables 4 and 5). Surprisingly, these significant training effects, when measured using the AAUT, did not exceed the MDC90 for either subscale immediately after the CIMT intervention or at the 1-year follow-up visit (Tables 4 and 5). Figure 2 shows the CIMT-induced change for each of the 4 subscales as a function of the MDC90 thresholds. We found that the proportions of participants whose MALa and MALq changes exceeded the MDC90 were 67.3% (66/98) and 63.3% (62/98), respectively, immediately after the CIMT intervention (Figure 2, Table 4). These percentages were maintained at approximately the same level at the 1-year follow-up (63.8% [51/80] and 70% [56/80], respectively; Figure 2, Table 5). In contrast, for the AAUT, the proportions of participants whose AAUTa and AAUTq changes exceeded the MDC90 were much lower: 18.9% (18/95) for both subscales immediately after the CIMT intervention (Figure 2, Table 4), and 16.3% (13/80) and 31.3% (25/80), respectively, at the 1-year follow-up (Figure 2, Table 5).
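The participant-level comparison reported above amounts to counting how many individual change scores reach the threshold. A minimal sketch follows; the function name and the change scores are hypothetical, and treating a change equal to the MDC90 as reaching the threshold is an assumption based on the "equal/greater" wording of the Figure 2 caption.

```python
from typing import Sequence, Tuple


def proportion_reaching_mdc(changes: Sequence[float], mdc: float) -> Tuple[int, int, float]:
    """Return (count reaching the MDC, total participants, percentage)."""
    if not changes:
        raise ValueError("no change scores supplied")
    reached = sum(1 for change in changes if change >= mdc)
    total = len(changes)
    return reached, total, round(100.0 * reached / total, 1)


# Hypothetical pre-post changes in AAUTa (percentage points) against the 24.2% threshold.
count, total, percent = proportion_reaching_mdc([30.0, 10.0, 24.2, -5.0], 24.2)
```

Reporting the count and the total alongside the percentage, as done in the text, keeps the proportions verifiable when the number of participants differs between time points.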

Magnitude of change in spontaneous arm use immediately post-CIMT intervention and at 1-year follow-up for individuals poststroke in the CIMT group. Spontaneous arm use change for each participant is plotted for (A) AAUTa, (B) AAUTq, (C) MALa, and (D) MALq as a function of the MDC90 threshold (horizontal line). The total number of participants and the number of participants whose scores were equal/greater or less than the MDC90 threshold are indicated for each evaluation time point. Abbreviations: CIMT, constraint-induced movement therapy; AAUTa, amount of use of the Actual Amount of Use Test; AAUTq, quality of movement of the Actual Amount of Use Test; MALa, amount of use of the Motor Activity Log; MALq, quality of movement of the Motor Activity Log; MDC90, minimal detectable change at 90% of confidence.
Comparison Between CIMT-Induced Improvement for Spontaneous Arm Use and the MDC90 Values Measured by the AAUT and the MAL (Mean ± Standard Error) at 1-Year Follow-Up
Abbreviations: CIMT, Constraint-Induced Movement Therapy; AAUTa, amount of use of the Actual Amount of Use Test; AAUTq, quality of movement of the Actual Amount of Use Test; MALa, amount of use of the Motor Activity Log; MALq, quality of movement of the Motor Activity Log; Δ, difference between 1-year follow-up and pre-CIMT; MDC90, minimal detectable change at 90% of confidence.
Discussion
This is the first study to explore “interpretability” in relation to measurement of spontaneous arm use poststroke. Data from a large multisite longitudinal study were used to compute the MDC90. In turn, MDC90 can be used to determine whether the training-induced effects are due to the intervention itself or to noise inherent to the measurement tool. The MDC90 reflects 90% confidence that the magnitude of measurement variability will be less than the MDC value. 25 For example, the MDC90 for the AAUTa is 24.2% in the EXCITE trial. This suggests that, in the absence of true change, the test–retest difference in the amount of affected arm use measured by the AAUTa will be less than 24.2% for 90% of individuals with mild to moderate stroke. This finding implies that a change of amount of use greater than 24.2% for an individual is necessary to be 90% certain that the change is not because of measurement error in the context of CIMT as applied in the EXCITE trial.
The MDC90 for the AAUTa and MALa were 24.2% and 16.8%, respectively, whereas the MDC90 for the AAUTq and MALq were 14.4% and 15.3%, respectively. Relatively speaking, the MDC90 of quality of movement is less than that of amount of use for both the AAUT and the MAL, suggesting that the measurement of quality of movement has relatively higher sensitivity compared with that of amount of use when capturing spontaneous arm use for people with mild to moderate stroke. This observation is consistent with previous studies that explored the clinimetric properties of the MAL,24,26 where the quality of movement subscale was found to be a more reliable and valid measure of real-world arm use than the amount of use subscale. However, this analysis represents the first time the same comparison is reported for the AAUT.
An absolute difference in the MDC90 between the AAUTq and the MALq of 0.9% indicates a similar level of measurement variation for the quality of movement in both measures. In sharp contrast, an absolute difference in the MDC90 between the AAUTa and the MALa of 7.4% suggests the AAUT amount of use subscale is a relatively unstable measurement. 18 One explanation for the relatively large MDC90 of the AAUTa may be the large standard deviation. Recall that the MDC90 calculation is a function of the SD and ICC. Here, the ICC values for all 4 measures were similar and high, ranging from 0.85 (MALa) to 0.91 (AAUTq). However, a closer look at the SD for the 2 subscales of the AAUT and the MAL reveals a relatively large SD for the AAUTa (31.2%) compared with that for the AAUTq (20.7%), the MALa (18.3%), and the MALq (18.2%). Another possible explanation is that the large MDC90 of the AAUTa may be because of the lower resolution of the scoring system. 27 The AAUTa score is a ratio based on the number of tested items in which the participant “attempted or used the paretic arm.” Each item receives a binary score of 1 or 0, where a “1” is assigned for any action identified in the range of “attempt to use” to “normal use.” Therefore, the measurement of the AAUTa is not as sensitive as that of the MALa, which uses a finer grained scale of 0 to 5.
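The relative contributions of the SD and the ICC can be examined by inverting the MDC90 calculation. The sketch below assumes the conventional relationship MDC90 = 1.65 × √2 × SD × √(1 − ICC) and uses only the rounded SD and MDC90 values quoted in this article; the function name is hypothetical, and the resulting ICC values are back-calculated approximations rather than reported results.

```python
import math

Z_90 = 1.65  # assumed multiplier for the 90% confidence level


def implied_icc(sd_baseline: float, mdc90: float) -> float:
    """Back-calculate the ICC implied by a reported SD and MDC90."""
    sem = mdc90 / (Z_90 * math.sqrt(2.0))   # undo the MDC90 scaling
    return 1.0 - (sem / sd_baseline) ** 2   # undo SEM = SD * sqrt(1 - ICC)


# (SD at baseline, MDC90), both in % of scale maximum, as quoted in the text.
reported = {
    "AAUTa": (31.2, 24.2),
    "AAUTq": (20.7, 14.4),
    "MALa": (18.3, 16.8),
    "MALq": (18.2, 15.3),
}
implied = {name: implied_icc(sd, mdc) for name, (sd, mdc) in reported.items()}
```

Under these assumptions the implied ICC values all fall between roughly 0.85 and 0.91, in line with the range reported, which supports the reading that the larger MDC90 of the AAUTa is driven mainly by its larger SD rather than by lower reliability.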
Although the AAUT includes a relatively insensitive amount of use subscale, there are at least 2 reasons that it can be used to provide some useful information about poststroke spontaneous arm use. First, the AAUT is a real-time observational performance-based measure that provides information closer to actual arm use. Second, most tasks in the AAUT are designed to be bimanual in nature, which may provide a more natural and sensitive means to capture task-based paretic arm use. 28 Evidence from the literature suggests that bimanual tasks are important in daily life. In an observational study, Kilbreath et al 29 investigated the frequency of hand use in older adults during daytime hours and found that 54% of the observations involved bimanual tasks, whereas 29% involved unimanual activities. Recently, Johnson et al 30 reported an experimental result using force-feedback cueing in a robot-assisted stroke study. They found that if nonuse of the affected arm exists, the affected arm will most likely be underused for bimanual steering compared with unimanual steering. Later, Paranjape et al 31 reported findings in part supporting those of Johnson et al. 30 Together, these reports suggest that bimanual activities play an important role in many activities of daily living, and may be more sensitive in the detection of nonuse of the affected arm. An assessment focused on unimanual activities, therefore, may not capture the “amount of use” as accurately as one that includes bimanual activities.
Constraint-induced movement therapy focuses on affected arm training with primarily unimanual tasks; therefore, how much and when the training effect would transfer to bimanual tasks is uncertain. The mean score improvement in spontaneous arm use after CIMT was significant for both the AAUT and the MAL, and this improvement was maintained at the 1-year follow-up. However, the magnitude of the improvement exceeded the MDC90 for the MAL but not for the AAUT. This was the case for both evaluation time points. The inconsistency between the mean score and the MDC results highlights the important differences in the nature of the 2 instruments. In fact, more than half of the MAL tasks pertain only to unimanual performance. 12 The AAUT tasks, however, were designed to be performed primarily bimanually. 28 With this in mind, we might expect the AAUT to be more sensitive to changes in bimanual limb use. If there is a generalization to bimanual limb use after the CIMT intervention, we would further expect the AAUT to capture this generalization at the later time point. However, the AAUT result at 1-year follow-up suggests that a pattern of more normal bimanual arm use did not emerge.
One possible explanation for the minimal generalization to bimanual tasks is that, for some tasks, participants may feel more confident using the less-affected arm alone than using both.32-34 In sum, the fact that significant improvement was observed in the MAL but not in the AAUT (when comparing the MDC90 values) may be influenced by the nature of the tasks in each measure and the participants’ confidence to perform the bimanual tasks with the weak limb.
Regardless of the recall bias for the MAL and the reliability concerns for the AAUT, these 2 measures only provide rough estimates of spontaneous arm use. Uswatte et al 9,10,35 performed a number of studies using accelerometers to directly capture the quantity of arm use in the real world for individuals poststroke and to verify the MAL.12,26 The Bilateral Arm Reaching Test 36 was found to have good reliability and acceptable validity as well. Mobile health innovations, such as wireless sensors that use machine-learning algorithms to determine the type, quantity, and quality of movements in the laboratory and home, will make better measurements feasible and inexpensive.37,38
Conclusions
The determination of measurement variability using MDC is important for uncoupling a meaningful intervention-induced change from random measurement noise. Although the AAUT is arguably an objective measure of spontaneous arm use, here, its sensitivity in capturing “amount of use” is relatively low. One reason for the insensitivity may be the low resolution of its scoring system, particularly for the amount of use subscale. As such, there is considerable need to develop valid and reliable tools to capture purposeful arm use outside the laboratory, perhaps through leveraging new sensing technologies with objective activity monitoring capable of determining the actual amount of purposeful arm use in daily living.
Footnotes
Acknowledgements
The authors thank Dr Eric Wade and Dr Hsiu-Chen Lin for their suggestions and comments on the preparation of the article. Data for the secondary analysis were from the EXCITE trial database, funded by NIH grant R01 HD37606 from the National Center for Medical Rehabilitation Research (National Institute of Child Health and Human Development) and from the National Institute of Neurological Disorders and Stroke.
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
The author(s) received no financial support for the research, authorship, and/or publication of this article.
