Abstract
Purpose
This study aimed to (1) examine the construct validity of the Zorowitz spasticity patient-reported outcome (PRO) scale in pediatric populations and (2) examine the scale's responsiveness to change in children to determine its clinical utility in guiding treatment of pediatric spasticity.
Methods
A retrospective analysis was performed on data collected at a large academic pediatric hospital system, including 505 injection records from 317 patients who received injections for spasticity from pediatric physiatrists. Zorowitz scores, Modified Ashworth Scale (MAS) spasticity scores, and Gross Motor Function Classification System (GMFCS) levels were extracted.
Results
Baseline Zorowitz score (median 19, interquartile range 13–25) was not related to functional level (r = −0.088, p = 0.20) nor muscle tone (r = 0.006, p = 0.95), but patients with follow-up data reported reduced impact of spasticity post-injection (p < 0.0001). Higher baseline Zorowitz score was related to a greater decrease in Zorowitz score after injection (r = −0.39, p < 0.00001). Injection location, sex, number of muscles injected, and botulinum toxin dose were not related to Zorowitz change score.
Conclusion
The Zorowitz scale may be responsive to spasticity treatment in children. However, construct validity to existing clinical measures was not observed, suggesting either that a clinical gold standard does not exist, that the scale measures a construct not otherwise captured clinically, or that it has limited validity in children.
Introduction
Spasticity is present in approximately 80% of children with cerebral palsy, the most common cause of physical disability in childhood, 1 and is also clinically manifested in pediatric stroke, acquired brain injury, spinal cord injury, hereditary spastic paraplegias, and numerous metabolic diseases. 2 Spasticity is velocity-dependent hypertonicity that can lead to muscular contracture and deformity, which in turn can cause pain, limit movement, and increase difficulty in activities of daily living. 3 Untreated spasticity can severely affect quality of life. Yet there are no existing tools for evaluating the impact of spasticity on quality of life or function in children. Existing scales, such as the Modified Ashworth Scale (MAS) 4 and Tardieu 5 scales, measure the degree of muscle or joint impairment but not the impact of such impairment on function or quality of life. 6
Given the inconsistent relationships between objective physical impairment and experienced disability, patient-reported outcomes (PROs) can help to determine patients’ perceived impact of impairments on function and life participation. Children's PROs were previously considered unreliable because children's feedback was thought to vary with immaturity, especially in the setting of complex developmental issues. 7 Many existing pediatric PROs are directed toward assessing developmental milestone achievement and testing the validity of PRO psychometric evaluation against traditionally administered evaluations. 7 Validated pediatric PROs include KIDSCREEN for evaluation of health-related quality of life and PedsQL for evaluation of quality of life, both available as self-report or caregiver proxy. 7,8 Pediatric-specific scales aim to account for the response shift relative to developmental stage in the pediatric population that can alter interpretation of PROs. In pediatric PROs, parents and/or caregivers are often used as a proxy. 8 PROs may be particularly useful in the field of rehabilitation, where large heterogeneity in physical impairment and disability is observed. PROs focus on the individual's experience, rather than a diagnosis, to inform treatment, thus allowing for heterogeneous experiences of the same condition. Despite their growing role in health care, there are currently no validated PRO scales for assessing the impact of pediatric spasticity. There is a need for reliable assessment of the impact of spasticity, and of injection therapy for spasticity, on patient-reported function in pediatric populations.
Standard clinical measures of muscle tone and motor function reflect clinician assessment of impairment rather than PROs of function. PROs such as the Questionnaire on Pain caused by Spasticity (QPS) account for pediatric patient-reported pain but not functional outcome. 9 A 13-item spasticity screening tool was developed in adult populations by Zorowitz et al., 10 which this manuscript refers to as the Zorowitz scale. The Zorowitz scale was developed to screen for spasticity requiring treatment in adult populations using PROs and was not designed specifically for pediatric populations. 10 There have been no studies validating the scale in adult or pediatric populations. It is unknown if the Zorowitz scale has utility in pediatric populations, where respondents are often a parent proxy, or if the measure is responsive to change in this population. This would be meaningful information to guide procedure and prescribing practices and to evaluate the effectiveness of spasticity treatments in children.
The study's first objective was to examine the construct validity of the Zorowitz scale in pediatric populations by investigating the strength of the relationships between the proxy report of Zorowitz scale scores and clinical measures of spasticity and gross motor function (i.e., MAS and Gross Motor Function Classification System [GMFCS]). The second objective was to examine the scale's responsiveness to change in children by analyzing the strength of the relationships between Zorowitz change scores before and after injection therapy compared to change scores in MAS and GMFCS over the same time period.
Methods
Study design
This was a retrospective analysis of data collected in an electronic health record (EHR) at a large academic pediatric hospital system. The study cohort included patients who received injections for spasticity from rehabilitation medicine physicians at two locations in the health system. Study procedures were granted an exemption from review by the hospital system's Institutional Review Board and consent was therefore waived.
Data collection
Data were extracted from the EHR with the assistance of the hospital system's bioinformatics department. The extraction was for information from May 2017, when the Zorowitz scale was implemented at this institution as a PRO, through December 2019. Permission to implement the Zorowitz scale at the hospital system was obtained from Dr. Richard Zorowitz.
The surveys were available for completion by caregivers during rehabilitation medicine office visits before injection therapy for spasticity, at the injection visit, and at follow-up visits after injection.
Retrospective cohort sample
All patients who received injection therapy for spasticity in the time window were included. Accompanying demographic and clinical data that were automatically extracted included sex; age; weight; diagnoses; injection sites and doses of onabotulinum toxin A; Zorowitz scores, respondent, and dates completed; and MAS 4 scores reported by physical therapists and dates completed. The number of physical therapists rating MAS was not extracted. Additional data that were not documented in flowsheets and therefore could not be automatically extracted were gathered from manual record review after the dataset was filtered for analysis as described in the following section. These included GMFCS levels from physician notes.
Selection of records for analysis
The extracted dataset included 1644 surveys completed by 701 patients. Several exclusions were applied to the initial dataset before data analysis. Injections performed for the diagnoses of focal dystonia, alternating hemiplegia, drooling/sialorrhea, talipes varus, idiopathic toe walking, cervical dystonia, isolated hamstring contracture, facial nerve palsy, lateral femoral cutaneous neuropathy, neck pain secondary to Poland syndrome, and torticollis were excluded because the Zorowitz scale items primarily address limb function (72 records). Injections that were performed only to the face, salivary glands, neck, and/or trunk muscles were similarly excluded after the diagnosis filter was applied (seven records).
Final exclusions were made if the accompanying clinical data were not reported within a reasonable time frame prior to the date of injection or after the expected window of injection effect had ended. Injections that did not have an associated Zorowitz score within 120 days prior to injection and/or 120 days following injection were also not included in the analysis (676 records) in order to capture the expected peak effect of injection.11,12 This allowable window provided flexibility, as visits could not always be scheduled to precisely capture peak effect. After these exclusions, the final dataset included 505 injection procedures with Zorowitz scores within 120 days prior to injection and/or 120 days following injection.
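The 120-day inclusion rule described above can be illustrated with a short sketch. This is not the extraction code used in the study; the function names are hypothetical, and the treatment of a survey completed on the injection day as a pre-injection survey is an assumption made for illustration.

```python
from datetime import date
from typing import Optional, Sequence

WINDOW_DAYS = 120  # allowable window on either side of the injection, per the text


def classify_survey(injection_date: date, survey_date: date) -> Optional[str]:
    """Return 'pre', 'post', or None if the survey falls outside the window.

    Assumption: a survey completed on the injection day counts as 'pre'.
    """
    delta = (survey_date - injection_date).days
    if -WINDOW_DAYS <= delta <= 0:
        return "pre"
    if 0 < delta <= WINDOW_DAYS:
        return "post"
    return None


def has_usable_score(injection_date: date, survey_dates: Sequence[date]) -> bool:
    """An injection is retained if at least one survey falls inside the window."""
    return any(classify_survey(injection_date, d) is not None for d in survey_dates)
```

Under this sketch, an injection with a survey on only one side of the window is still retained, which matches the "and/or" wording of the exclusion rule.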
Due to insufficient data, the study did not include muscle tone information from the upper extremities. Muscle tone information for the lower extremities included MAS scores for the bilateral hip adductors, hamstrings, and plantarflexors. Additionally, GMFCS levels were excluded for any patient aged four or younger whose GMFCS level was not reported within one year of their injection. MAS scores collected outside of 120 days from injection were also excluded from analysis (187 records). Injections with associated post-Zorowitz surveys completed by different caregivers than pre-Zorowitz surveys were excluded (23 records). Patient weights recorded more than six months prior to onabotulinum toxin A injection were excluded from dose response analysis. In sum, of the 505 records included in analysis, 355 also had post-Zorowitz data available for analysis, 407 had GMFCS level, 208 had pre-injection MAS scores, and 98 had both pre-injection and post-injection MAS scores for analysis. In cases of multiple observations from the same patient, the observation with the most complete data was used for analysis (baseline Zorowitz score, post-injection Zorowitz score, GMFCS level, MAS scores). If all observations had full datasets, the first observation was used for analysis.
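The rule for choosing one observation per patient can be sketched as follows. The field names are hypothetical, records are assumed to be supplied in chronological order, and resolving ties among equally incomplete records in favor of the earliest one is an assumption, since the text specifies the tie-break only for fully complete records.

```python
from typing import Dict, List, Optional

# Fields counted toward completeness, following the list given in the text.
# The key names themselves are hypothetical.
FIELDS = ("zorowitz_pre", "zorowitz_post", "gmfcs", "mas")

Record = Dict[str, Optional[float]]


def completeness(record: Record) -> int:
    """Number of completeness fields that are present (not None)."""
    return sum(record.get(field) is not None for field in FIELDS)


def select_record(records: List[Record]) -> Record:
    """Pick one record for a patient: the most complete, earliest on ties.

    max() returns the first maximal element it encounters, so with records
    in chronological order a tie resolves to the earliest observation.
    """
    return max(records, key=completeness)
```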
Measures for analysis
Zorowitz scores and MAS scores taken after injection were used in relation to those taken prior to injection in order to calculate change scores. MAS scores (0/1/1+/2/3/4) for each muscle group were converted to ordinal data (1/2/3/4/5/6). Ordinal values for bilateral hip adductors, hamstrings, and plantarflexors were then combined to generate a summative lower extremity MAS score (maximum possible score: 36).
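As an illustration of the scoring described above, the conversion and summation can be written as a small function. The dictionary keys are hypothetical, and the sketch raises an error on a missing muscle group because the handling of partially rated records is not stated in the text.

```python
# Mapping from MAS grade to the ordinal value described in the text.
MAS_TO_ORDINAL = {"0": 1, "1": 2, "1+": 3, "2": 4, "3": 5, "4": 6}

# Six lower-extremity muscle groups, three per side (key names are hypothetical).
MUSCLE_GROUPS = (
    "left_hip_adductors", "right_hip_adductors",
    "left_hamstrings", "right_hamstrings",
    "left_plantarflexors", "right_plantarflexors",
)


def summative_mas(grades):
    """Sum the ordinal MAS values over all six muscle groups.

    Raises KeyError if a muscle group is missing or a grade is unrecognised.
    """
    return sum(MAS_TO_ORDINAL[grades[muscle]] for muscle in MUSCLE_GROUPS)
```

Because a MAS grade of 0 maps to an ordinal value of 1, the summative score under this mapping ranges from 6 (no increase in tone in any group) to 36.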
Muscles receiving onabotulinum toxin A injections were organized by action into 30 groups on the left and right, such as “knee extension” and “hip flexion.” Records were then categorized into one of four bins (1–2, 3–4, 5–6, or 7+ muscle groups injected) in order to evaluate whether Zorowitz change scores were influenced by the number of sites injected.
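The four-bin categorization can be expressed directly. The function name and the string labels below are illustrative choices, not taken from the study dataset.

```python
def muscle_group_bin(n_groups: int) -> str:
    """Assign a record to one of the four bins by number of muscle groups injected."""
    if n_groups < 1:
        raise ValueError("at least one muscle group must be injected")
    if n_groups <= 2:
        return "1-2"
    if n_groups <= 4:
        return "3-4"
    if n_groups <= 6:
        return "5-6"
    return "7+"
```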
To study the influence of onabotulinum toxin A dose on Zorowitz change score, total onabotulinum toxin A dose was divided by patient weight within six months of injection. In order to examine if the Zorowitz score responded selectively to upper or lower limb injections, an additional dichotomous variable (upper versus lower limb) was created to indicate if the injection was only to the upper limb(s) (40 records) or only to the lower limb(s) (299 records).
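A minimal sketch of the weight-normalized dose follows. The function name is hypothetical, and "six months" is approximated as 183 days, which is an assumption. Following the exclusion rule stated earlier, only weights recorded more than six months before the injection are rejected.

```python
from datetime import date
from typing import Optional

MAX_WEIGHT_AGE_DAYS = 183  # assumption: "six months" approximated as 183 days


def dose_per_kg(total_units: float, weight_kg: float,
                weight_date: date, injection_date: date) -> Optional[float]:
    """Total dose in units per kilogram, or None if the weight is too old."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    weight_age_days = (injection_date - weight_date).days
    if weight_age_days > MAX_WEIGHT_AGE_DAYS:
        return None  # record is left out of the dose-response analysis
    return total_units / weight_kg
```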
Data analysis
The Zorowitz scores (baseline and change) were tested for normality with the Shapiro-Wilk test, which indicated that the distributions were not normal. Therefore, the non-parametric Spearman correlation coefficient was calculated for all analyses of continuous data. The point-biserial correlation coefficient was used when the analysis included a two-level categorical variable, namely sex and upper versus lower limb.
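To make the two coefficients concrete, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks, and the point-biserial coefficient as the Pearson correlation with a 0/1 group indicator. It uses only the Python standard library, omits p-values, and is not the statistical software used in the study, which the text does not name.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def point_biserial(group, scores):
    """Point-biserial r: Pearson correlation with a 0/1 group indicator."""
    return pearson(group, scores)
```

In practice a library routine such as `scipy.stats.spearmanr` would typically be used instead, since it also returns the associated p-value.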
To address the primary objective of investigating the construct validity of the Zorowitz score in children, the strength of the relationships between the Zorowitz score and GMFCS level, MAS scores, and Zorowitz change score (defined as the post-injection Zorowitz score minus the pre-injection Zorowitz score) was investigated.
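The sign convention of the change score matters for reading the results: because the pre-injection score is subtracted from the post-injection score, a negative value indicates reduced impact of spasticity. A brief sketch, with hypothetical function names and made-up example values:

```python
def zorowitz_change(pre: float, post: float) -> float:
    """Change score: post-injection score minus pre-injection score."""
    return post - pre


def proportion_improved(pairs) -> float:
    """Fraction of (pre, post) pairs whose change score is negative."""
    changes = [zorowitz_change(pre, post) for pre, post in pairs]
    return sum(change < 0 for change in changes) / len(changes)
```

For example, a record with a baseline score of 19 and a follow-up score of 16 has a change score of −3 and counts as improved, whereas an unchanged score does not.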
Also, to address the secondary objective of investigating the responsiveness of the Zorowitz score in children, the strength of the relationship between the Zorowitz change score and MAS change score was investigated. Additional evaluations were performed to investigate whether Zorowitz change scores were influenced by sex, age, onabotulinum toxin A location (upper versus lower limb), number of muscle groups injected, or onabotulinum toxin A dose with univariate analyses and a multivariate linear regression model. Finally, to determine if the Zorowitz scale has potential to detect functional change in patients with low gross motor function, in whom injections are often administered to ease caregiver burden or increase comfort rather than improve function, change scores were compared in the subset of those who functioned at GMFCS level V compared to those who functioned at GMFCS levels I-IV.
Results/analysis
Participant sample
The final dataset included 317 records from unique patients, with median age 6.0 years (interquartile range [IQR] 4.0–11.0 years) and 55.2% male (Table 1). All GMFCS levels were represented, with 46.2% functioning at levels I-III and 53.8% functioning at levels IV-V. Of the 317 records included in analyses, 47.9% had both pre-injection and post-injection Zorowitz scores, with a median Zorowitz baseline score of 19 (IQR 13–25) and median Zorowitz change score of −3.0 (IQR −7 to 0). Post-treatment scores differed significantly from baseline (p < 0.0001). All respondents were caregivers; there were no patient self-reports. Injections to the lower limb only made up 59.9% of records, to the upper limb only 6.3%, and to both upper and lower limbs 33.8%. A median of five muscle groups (IQR 3–6) were injected with onabotulinum toxin A in a single episode, with a median of 2.2 (IQR 1.5–3) units of onabotulinum toxin A per kilogram per muscle group and a median total dose of 11 (IQR 6.5–13.0) units per kilogram of body weight. Apart from two adolescent outliers, who received 640 and 720 units, respectively (see Editorial Note), the maximum dose administered was 600 units. The baseline Zorowitz scores and Zorowitz change scores were examined to identify outliers at maximum desired false discovery rates of 1% and 5%. No outliers were identified at either threshold.
Table 1. Participant characteristics and number of records included in analyses.
GMFCS: Gross Motor Function Classification System
Construct validity
There was no relationship between GMFCS level and baseline Zorowitz score (r = −0.088, p = 0.20, Figure 1). Similarly, there was no relationship between lower limb MAS score and baseline Zorowitz score (r = 0.006, p = 0.95, Figure 2).

Figure 1. There was no meaningful relationship between Gross Motor Function Classification System (GMFCS) level and baseline Zorowitz scores, r = −0.088, p = 0.20. Gray boxes indicate the interquartile range (25th–75th percentiles) of Zorowitz scores for each GMFCS level. Group medians are represented by the horizontal black line in the center of each box. Vertical bars above and below each box represent the range of scores extending from minimum to maximum. Individual data points are overlaid in light gray.

Figure 2. There was no meaningful relationship between lower limb (LL) muscle tone (Modified Ashworth Scale) and baseline Zorowitz scores, r = 0.006, p = 0.95. Each point represents one record (n = 97). The black line represents the line of best fit. The slope of the line (m), strength of the relationship (r), and the significance of the relationship (p) are shown.
Responsiveness to change
Higher baseline Zorowitz scores were related to greater decreases in Zorowitz score after injection (r = −0.39, p < 0.00001, Figure 3). There was no difference in Zorowitz change scores between male and female patients (p = 0.39, Mann-Whitney test, Figure 4A), nor between upper and lower limb injections (p = 0.95, Mann-Whitney test, Figure 4B). This suggests that neither sex nor limb injected influenced Zorowitz change scores. There was no significant relationship between the number of muscles injected and the Zorowitz change score (r = −0.086, p = 0.31). Similarly, there was no significant relationship between Zorowitz change score and total onabotulinum toxin A dose per kilogram of body weight (r = 0.028, p = 0.76), nor between Zorowitz change score and onabotulinum toxin A dose per kilogram per muscle group (r = 0.092, p = 0.32).

Figure 3. Relationship between baseline Zorowitz score and Zorowitz change score, r = −0.39, p < 0.0001. Each point represents one record (n = 140). Negative change scores represent reduced impact of spasticity on daily life. The black line represents the line of best fit. The slope of the line (m), strength of the relationship (r), and the significance of the relationship (p) are shown.

Figure 4. There were no meaningful differences in Zorowitz change score by (A) sex, p = 0.39, or (B) injection location (upper limb only versus lower limb only), p = 0.95. Gray boxes indicate the interquartile range (25th–75th percentiles) of the scores by group. Group medians are represented by the horizontal black line in the center of each box. Vertical bars above and below each box represent the range of scores extending from minimum to maximum. Individual data points are overlaid in light gray.
There were no significant differences in baseline Zorowitz scores between the cohorts with and without follow-up Zorowitz scores (p = 0.31), nor in sex (p = 0.34). Age differed significantly between the two cohorts (p = 0.0047); mean age was 6.8 (standard deviation [SD] 4.2) years in the cohort with follow-up Zorowitz scores and 8.5 (SD 4.9) years in the cohort with baseline Zorowitz scores only. The multivariate linear regression including the continuous variables of age, dose per kilogram of body weight, and number of muscles injected explained only 3.8% of the variance in Zorowitz change scores (F(3, 119) = 1.56, p = 0.20). There were no significant differences in change scores between the subset of patients who functioned at GMFCS level V (mean = −3.4, SD = 5.6) and those who functioned at GMFCS levels I-IV (mean = −3.5, SD = 5.9, p = 0.83).
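The regression F statistic reported above can be cross-checked against the proportion of variance explained. The sketch below uses the standard identity F = (R²/k) / ((1 − R²)/df_resid), taking the degrees of freedom as 3 (model) and 119 (residual); the function name is hypothetical.

```python
def f_from_r2(r2: float, df_model: int, df_resid: int) -> float:
    """Overall regression F statistic implied by R-squared and its degrees of freedom."""
    return (r2 / df_model) / ((1.0 - r2) / df_resid)


# R-squared of 0.038 with 3 predictors and 119 residual degrees of freedom
f_value = f_from_r2(0.038, 3, 119)
```

With the rounded R² of 0.038, this gives an F of roughly 1.57, in line with the reported value of 1.56 once rounding of R² is taken into account.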
Discussion
With regard to the first objective, there was no observed evidence of construct validity between Zorowitz scores and existing clinical measures of gross motor function and spasticity. Regarding the second objective, improvement in Zorowitz scores was observed in 73% of the sample after injection, suggesting that the Zorowitz scale is responsive to change after spasticity treatment. Further, a higher baseline Zorowitz score was related to a greater decrease in Zorowitz score after injection, suggesting that the scale captured larger improvements in patients whose spasticity had a greater impact on their life. The Zorowitz score appeared to respond equally to injections in upper and lower limbs, and in males and females.
Taken together, these observations suggest that patient-reported impact of spasticity on daily life (the intended focus of the Zorowitz scale) is a separate construct from, and cannot be predicted by, GMFCS level or severity of muscle tone as measured by MAS. Classification scales such as the GMFCS categorize function at a global level but are not designed to reflect the specific impact of spasticity on level of function. 13 It is important to collect PRO data because it measures information not otherwise captured in clinical data, as has been suggested of PROs in other contexts as well. 7 The Zorowitz score may act as a proxy for how patients are affected by their spasticity in daily life, which has value in understanding of the impact of a given intervention, therapy, or service. 14 If the Zorowitz scale did align directly with GMFCS or MAS, it would not offer additional useful information for decision-making. Rather, the Zorowitz scale may offer a different type of information to guide decision-making. This hypothesis was not addressed in the current study and requires further investigation to define the clinical utility and psychometric properties of the tool.
As a screening tool, the Zorowitz may be useful in decision-making on whether to use chemodenervation. For patients who have established spasticity, this tool could assess whether they are actually impacted by their spasticity, which is a more important part of decision-making than the severity of spasticity alone. Low scores would warrant a more focal injection approach. As a tracking tool, the Zorowitz scores changed more than MAS scores did after injection. If patients report extremely bothersome spasticity, then providers may expect injections to make a large impact and to see a change in quality of life. This change is not always reflected by MAS, and rarely by GMFCS measures, due to poor sensitivity to change. 15 The Zorowitz score appeared to respond to changes after injection to a greater degree than the stable GMFCS classification and relatively static MAS scores. This is clinically important because patients with severe spasticity may receive injections regularly without assessment for improvements in quality of life. Because the Zorowitz was responsive to the impact on function after injection, even in the cohort of patients who functioned at GMFCS V, it could be used clinically to help plan injections and evaluate change. If survey data show otherwise, physicians can then adjust dosing or the muscles injected, or make other modifications. If its utility is confirmed in future work, the Zorowitz scale is readily available, easy to administer, able to be integrated into electronic medical record systems, and not too burdensome on patients. There are no financial barriers to implementation and it does not require special equipment, nor a patented scoring system.
Limitations and future directions
There was considerable variability in the Zorowitz scores, which may suggest that the Zorowitz scale is not precise enough to reflect small differences between patients or small changes in individual patients. Although statistically significant, the correlation between baseline scores and change scores (r = −0.39) was weak to moderate, and therefore the degree of change in the scale after injection for individual patients remains difficult to predict. The study was also limited because a proxy, rather than the patient, completed the questionnaire. Completion of a PRO by a parent or caregiver may become less reliable as the child becomes more independent and engages in activities not observed by the parent. Sensitivity of the scale for detecting change would differ if completed by patient versus proxy; younger patients may not have the required insight, but as they become older, they become able to more accurately portray their own limitations due to spasticity. Based on this, patient and proxy PRO scores would be expected to diverge with age. However, in those with cognitive impairments it is not ideal to omit any PRO from their care, even if they are unable to complete it themselves. Parent/caregiver proxy is preferable to no information at all, especially if language is created in the future to include dependent care. Additionally, an aggregate measure of muscle tone was analyzed (and only in the lower limbs) due to lack of statistical power to separate the cohort into subgroups based on individual muscle group MAS scores. In future studies, statistical power could be increased via addition of specific evaluation charts to document each muscle's scores along with obtaining pre-/post-injection Zorowitz scores. Other institutions could also be included in the study population. More sophisticated biomechanics methods could be considered to measure muscle tone, although many methods are not feasible in the pediatric population.
Use of the Tardieu scale could be considered. Preserving individual muscle scores, as previously noted, would be helpful but would limit sample size. The cohort of patients with follow-up scores was, on average, younger in age than the population with baseline scores only, which may have influenced the outcomes; however, this influence is estimated to be small as the pattern of associations was similar in the follow-up only group to the whole sample. Finally, due to the nature of this retrospective study, there were several potential confounders that were unable to be controlled, such as adjunctive therapies and socioeconomic factors. Potential confounders could be analyzed by filtering records by factors such as age, GMFCS level, and muscle injected to generate more homogenous populations for analysis. Future efforts should also be directed at confirming the validity of the scale in children and, if favorable, at developing a statistical prediction tool for Zorowitz change after injections. The utility of the Zorowitz scale in children might be improved by adjusting outcome data to be specific to pediatric populations. For instance, capturing PRO data such as whether patients are better able to handle toys, hold crayons/markers for art, be held comfortably by their parents, and play with their siblings may capture more meaningful data in a pediatric population. Age-specific PRO questionnaires could also be developed to reflect ability to achieve milestones such as the ability to hold and stack blocks. Furthermore, the wording of questions would need to be adjusted to children's cognitive ability and vocabulary. Questions can also be added to include dependent care, such as “dressing my child” or “positioning my child.” Survey data could be added to corroborate validation of the scale and qualitative data obtained via interviews. If available, other PROs could be added to compare against the Zorowitz scale. 
There may also be benefit in evaluating the use of the Zorowitz scale in patients with spasticity managed via other modalities such as intrathecal baclofen pumps.
Footnotes
Acknowledgments
The work was funded by the Children's Hospital of Philadelphia and the University of Pennsylvania Perelman School of Medicine.
Ethical considerations
As a retrospective review of existing clinical records, this study was exempt from Institutional Review Board approval.
Author disclosures
The authors have no competing interests or other financial interests connected with this work. No part of the manuscript or the results have been published elsewhere, or have been submitted for publication elsewhere. All authors are responsible for reported research and have participated in the concept and design, analysis and interpretation of data, drafting or revising, and have approved this manuscript as submitted.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Editorial note
Although the editors recognize that higher doses are sometimes used in clinical practice, please note that the US and Canadian guidelines recommend not injecting more than 400 units of botulinum toxin within a 3-month period.
