Abstract
A collateral intervention effect refers to changes in behaviors which were not directly targeted during intervention. Using predetermined search and inclusion procedures, this systematic review identified 46 studies involving children with autism spectrum disorder and 14 desirable collateral effects across multiple domains of functioning. Collateral effects were associated with: (a) interventions involving naturalistic behavioral strategies; (b) participants with limited communication and/or cognitive deficits; (c) performance deficits (i.e. there was some evidence of the collateral behavior in baseline); and (d) interventions directly targeting play, communication, joint attention, and/or stereotypy. Overall, this systematic review indicates that collateral effects arising from focused interventions warrant consideration by practitioners during intervention planning and require additional research to identify mechanisms responsible for the observed changes.
The diagnostic criteria for autism spectrum disorder (ASD) consist of impairment in social communication and restricted interests and patterns of behavior (American Psychiatric Association [APA], 2013). Although not part of the ASD diagnostic criteria, challenging behavior (e.g. aggression, self-injury) and comorbid diagnoses (e.g. anxiety disorder, intellectual disability) are more prevalent in samples of children with ASD (Mannion & Leader, 2013; Matson & Nebel-Schwalm, 2007). These characteristics often have deleterious effects across a variety of domains (e.g. language, play, daily-living skills) which, in the absence of intervention, present obstacles to forming social relationships, educational attainment, employment, and autonomy throughout life (Chamak & Bonniau, 2016; Henninger & Taylor, 2012). Given the pervasiveness of skill deficits and behavioral excesses that may warrant intervention in children with ASD, interventions that occasion concomitant improvements in behaviors not directly targeted during intervention (collateral effects) may offer desirable intervention efficiency (Koegel, Koegel, & McNerney, 2001; McConnell, 2002; Pauwels, Ahearn, & Cohen, 2015).
Currently, intervention approaches most commonly associated with improvements across skill domains for children with ASD tend to be intensive (e.g. 20 to 40 hours per week), initiated early in life, and involve multiple intervention components that directly target a comprehensive set of behaviors (e.g. Lang, Hancock, & Singh, 2016; Virués-Ortega, 2010). Comprehensive and intensive intervention has been demonstrated to improve areas directly related to ASD diagnostic criteria (e.g. social communication), ameliorate common comorbidities (e.g. challenging behavior), and may even result in more typical neurological functioning (e.g. Dawson et al., 2012; Reichow, Barton, Boyd, & Hume, 2014; Ryberg, 2015; Vismara & Rogers, 2010). Unfortunately, many children with ASD and their families are confronted with a lack of available service providers with expertise in comprehensive intervention, prohibitive intervention costs, and comorbid health conditions that interrupt or preclude intensive intervention (Jacobson & Mulick, 2000; Thomas, Ellis, McLaurin, Daniels, & Morrissey, 2007; Vohra, Madhavan, Sambamoorthi, & Peter, 2014). These factors necessitate consideration of other less comprehensive intervention options (Pickard & Ingersoll, 2016).
As opposed to targeting a broad range of behaviors across multiple domains via comprehensive intervention, another option is a focused approach to intervention. Focused interventions involve the selection of a specific target behavior (e.g. initiating play with a peer) or a small set of related target behaviors (e.g. social initiations and responses) and then the development of an intervention that focuses on the selected behaviors (O'Reilly, Falcomata, Kang, & Fragale, 2014). Selection of target behaviors and focused intervention components is based on a multitude of considerations including developmental appropriateness, ecological validity, assessment of the child's existing skills and preferences, and family input regarding treatment priorities and preferences (Baer, Wolf, & Risley, 1987; Lifter, Ellis, Cannon, & Anderson, 2005; Lifter, Sulzer-Azaroff, Anderson, & Cowdery, 1993). An additional consideration which may increase the efficacy of focused interventions is the selection of behaviors and intervention procedures that have been shown to produce collateral, or untargeted, skill improvements (McConnell, 2002). A number of terms used in the ASD intervention literature are related to collateral effects including response generalization, behavioral cusp, and pivotal response.
Response generalization refers to a generalized behavior change wherein a change in a targeted behavior results in a change in a nontargeted behavior that shares the same operant function, similar discriminative stimuli, or related topographies (Cooper, Heron, & Heward, 2007; Kazdin, 1994; Stewart, McElwee, & Ming, 2013). For example, an intervention designed to teach a specific form of play behavior that results in the child acquiring the target play skill as well as a play skill not directly targeted during intervention could be described as having resulted in response generalization (e.g. Lang et al., 2014).
The term behavioral cusp references a wider spread of collateral effects than is typically associated with the term response generalization. The concept of behavioral cusp refers to cases where change in a target behavior profoundly influences many nontargeted behaviors across multiple domains (Smith, McDougall, & Edelen-Smith, 2006). Bosch and Fuqua (2001) described behavioral cusps in the context of interventions that result in: (a) acquisition of a target behavior that enables exposure to new contingencies of reinforcement and novel environments; (b) socially valid behavior change; (c) increased ability to originate, produce, or create (i.e. generativeness); and (d) displacement of inappropriate behavior. Interventions targeting joint attention that also demonstrate collateral improvements in language which in turn lead to a reduction in challenging behavior are putative examples of this concept (White et al., 2011).
Pivotal responses are a group of specific intervention targets with the potential to occasion a broad range of concomitant behavior changes similar to a behavioral cusp. Acquisition of a pivotal response is theorized to reduce learned helplessness and increase motivation to respond to social and instructional stimuli (Koegel, Ashbaugh, & Koegel, 2016). Target behaviors considered pivotal responses include social initiations, attending to multiple features of a stimulus, and self-management skills (e.g. Koegel & Koegel, 2006; Koegel & Wilhelm, 1973; Koegel & Mentis, 1985; Schreibman & Stahmer, 2014). For example, a child who is taught to initiate a social interaction with peers may acquire other social skills and novel play behaviors as a result of increased peer interaction. The extent to which collateral effects associated with the acquisition of pivotal responses are a product of the specific pivotal responses targeted, the intervention components utilized, or a combination is not yet clear (Cadogan & McCrimmon, 2015).
Several previous reviews have addressed collateral effects but only in the context of a specific intervention package or a specific target behavior. For example, Verschuur, Didden, Lang, Sigafoos, and Huskens (2014) reviewed 43 studies investigating pivotal response treatment (PRT) and reported that targeted increases in social initiations were associated with collateral improvements in language, play skills, and challenging behavior. Similarly, a meta-analysis of the Picture Exchange Communication System (PECS) reported collateral improvements in spoken language, socialization, and challenging behavior (Ganz, Davis, Lund, Goodwyn, & Simpson, 2012). With regard to focusing on a specific target behavior, Lanovaz, Robertson, Soerono, and Watkins (2013) reviewed 60 studies targeting a reduction in stereotypy and reported that a decrease in the targeted form of stereotypy occasioned a desirable increase in adaptive behavior in most cases and an undesirable increase in other forms of stereotypy in a few cases. Finally, White et al. (2011) reviewed 27 studies that measured joint attention. When intervention targeting joint attention was effective, collateral improvements in social initiations, imitation, play, and speech were often reported.
This systematic review extends previous reviews by focusing on collateral intervention effects without restricting included studies to a specific focused intervention package or class of target behaviors. Given the importance of beginning intervention early in life (Pickles et al., 2016), the current systematic review aims to identify collateral effects demonstrated in early childhood intervention studies. Further, with the exception of the meta-analysis of PECS (Ganz et al., 2012), effect size estimates for collateral changes have not been calculated in previous reviews. The goals of the present systematic review are to: (a) identify collateral effects that have been reported in intervention research involving children with ASD; (b) evaluate the methodological rigor and calculate effect size estimates of the targeted and collateral behavior changes; (c) identify characteristics of participants, interventions, and target behaviors that may influence collateral effects; and (d) discuss implications for practice and research.
Methods
Protocol registration and PRISMA guidelines
The protocol for this systematic review was registered with the PROSPERO International prospective register of systematic reviews and was prepared in accordance with PRISMA guidelines (Ledbetter-Cho, Lang, Watkins, & O'Reilly, 2015; Moher, Liberati, Tetzlaff, & Altman, 2009).
Search strategy
A systematic search of four electronic databases was conducted, including Educational Resources Information Center (ERIC), Medline, Psychology and Behavioral Sciences Collection, and PsycINFO. Searches consisted of combinations of terms referring to collateral intervention effects (i.e. non-targeted or nontargeted or untargeted or unanticipated or collateral or concomitant or behavioral cusp or pivotal response or ancillary or response generalization); terms related to diagnosis (i.e. autis* or ASD or Asperger* or pervasive developmental disorder*); and terms suggesting an intervention study (i.e. intervention or treatment or program or train*). Because terms related to collateral behaviors may not appear in an article's title, abstract, or keyword list, we set search parameters to “open field.” An open field search identifies articles containing the search term anywhere in the text (not limited to title, abstract, or key terms). Publication date was also unrestricted but studies were limited to those written in English and published in peer-reviewed journals. This database search procedure yielded 710 studies. Next, secondary searches of included articles and of previous literature reviews were conducted. Finally, hand searches of journals that often publish intervention research with children with ASD (e.g. Journal of Autism and Developmental Disorders) were conducted. The first and second authors initially applied the inclusion criteria to the corpus of studies resulting from the search procedures. The third author then independently screened articles identified for inclusion and interrater agreement reached 98%. Based on recommendations from the Cochrane Collaboration, the disagreement was resolved by discussion among the authors (Higgins & Green, 2011). Figure 1 depicts the search and screening process.
Figure 1. Flowchart of included studies.
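The structure of the database search described above can be illustrated with a short sketch. The three term lists are taken from the text; joining terms within a group with OR follows the text, whereas joining the three groups with AND, quoting multi-word phrases, and the function names `or_group` and `build_query` are assumptions made for illustration. Each database has its own query syntax, so this is a generic outline rather than the exact string submitted by the authors.

```python
# Term groups as listed in the search strategy.
COLLATERAL_TERMS = [
    "non-targeted", "nontargeted", "untargeted", "unanticipated",
    "collateral", "concomitant", "behavioral cusp", "pivotal response",
    "ancillary", "response generalization",
]
DIAGNOSIS_TERMS = ["autis*", "ASD", "Asperger*", "pervasive developmental disorder*"]
INTERVENTION_TERMS = ["intervention", "treatment", "program", "train*"]


def or_group(terms):
    """Join the terms of one group with OR; multi-word phrases are quoted (assumed syntax)."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Combine the groups with AND (assumed), so a hit must match every group."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(COLLATERAL_TERMS, DIAGNOSIS_TERMS, INTERVENTION_TERMS)
print(query)
```

Keeping the term groups as separate lists makes it straightforward to adapt the same search to the field codes and wildcard rules of a particular database.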
Study selection
An intervention study was required to meet predetermined criteria to be included. First, the intervention had to be delivered to at least one child (birth through 8 years old) diagnosed with ASD, Autistic Disorder, Asperger's Syndrome, or Pervasive Developmental Disorder-Not Otherwise Specified (PDD-NOS). If a study included some participants who met criteria and others who did not, only data pertaining to participants meeting the criteria were considered (e.g. Lanovaz et al., 2014; Lee & Odom, 1996). The study was excluded if data from participants meeting criteria could not be disaggregated from other participants' data (e.g. Karaaslan, Diken, & Mahoney, 2013). Second, studies involving comprehensive early intensive intervention packages (e.g. the Early Start Denver Model) and those involving biomedical or physiological procedures (e.g. exercise, dietary manipulations, and chelation) were excluded because it was not possible to disaggregate collateral effects from the targeted behavior changes (e.g. Celiberti, Bobo, Kelly, Harris, & Handleman, 1997; Lovaas, Koegel, Simmons, & Long, 1973; Rogers et al., 2012). For example, studies involving sensory integration therapy were excluded because the purported mechanism of action involves changes in sensory processing via neuroplasticity and such a change (if it occurs) would be expected to have a wide spread of effects (e.g. Reichow, Barton, Sewell, Good, & Wolery, 2010). Third, a study had to clearly identify at least one target behavior (e.g. a specific communication, play, or social skill) measured by direct observation and describe intervention procedures focused directly on that target behavior (cf. McEvoy et al., 1988). Fourth, the study had to include data indicating a change in a behavior that was not directly targeted by an intervention component or procedure (i.e. collateral effect).
For example, studies involving Functional Communication Training (FCT) that reported a decrease in challenging behavior and an increase in an alternative targeted communication behavior would be excluded because the alternative communication behavior is prompted and differentially reinforced and challenging behavior is put on extinction: as such, there is an intervention component directly aimed at both dependent variables (Carr & Durand, 1985). Similarly, if engagement in the target behavior displaced another behavior because the two were physically incompatible (e.g. the study's operational definition of on-task behavior required the child to stop engaging in stereotypy), the study was excluded.
Finally, included studies had to utilize an experimental group design or demonstrate experimental control for at least one target or collateral behavior in a single-case design (SCD). In some cases, experimental control in an SCD was demonstrated with either the target behavior or the collateral behavior but not both. For example, some studies targeted a specific behavior for improvement (e.g. communication) but only reported data on the collateral variable (e.g. social engagement; Koegel, Vernon, & Koegel, 2009). If experimental control was demonstrated for the collateral variable, the study was included. Other SCD studies demonstrated experimental control over the target behavior but reported collateral effects as averages across phases or participants, precluding the visual analysis of trend and variability necessary to evidence experimental control (e.g. Goldstein & Cisar, 1992). In those cases, at a minimum, the study had to measure the collateral behavior pre- and post-intervention in addition to demonstrating experimental control with the target behavior to be included. Differences in experimental control for target and collateral behaviors were accounted for when coding research rigor (see Data Extraction and Coding). If experimental control was compromised for both targeted and collateral effects by the exclusion of participants older than 8 years or without ASD (e.g. exclusion of a participant in a multiple baseline across participants design), the study was excluded (e.g. Thorp, Stahmer, & Schreibman, 1995). These exclusions ensure a minimum degree of rigor among included studies.
Data extraction and coding
Table 1. Summary of included studies.
Note. Ages are reported as years;months. f: female; AOC: abolishing operation component; Exp.: experiment; IRD: improvement rate difference; NAP: nonoverlap of all pairs; NR: not reported or not ratable; PECS: picture exchange communication system; PND: percent nonoverlapping data; RIRD: response interruption and redirection.
The first author developed a coding manual for data collection and analysis. After data collection was complete, the third author independently verified 30% of included studies. Agreements were defined as a match between the two coders, with effect size estimates required to match to the hundredths place in order to be scored as an agreement. Interrater agreement was 95.6% and was calculated by dividing the number of agreements by the total number of agreements plus disagreements and multiplying by 100.
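The agreement calculation described above can be written out as a brief sketch. The function names and the example counts (19 agreements, 1 disagreement) are hypothetical and are not taken from the review's coding data.

```python
def interrater_agreement(agreements: int, disagreements: int) -> float:
    """Percentage agreement: agreements / (agreements + disagreements) x 100."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("at least one coded item is required")
    return 100 * agreements / total


def effect_sizes_agree(coder_a: float, coder_b: float) -> bool:
    """Effect size estimates count as an agreement only if they match to the hundredths place."""
    return round(coder_a, 2) == round(coder_b, 2)


# Hypothetical example: 19 agreements and 1 disagreement.
print(interrater_agreement(19, 1))     # 95.0
print(effect_sizes_agree(0.79, 0.81))  # False
```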
Participant functioning level was categorized as lower, medium, or higher functioning according to the framework provided by Reichow and Volkmar (2010). Lower functioning refers to participants with very limited vocal communication skills or an IQ below 55. Participants classified as medium functioning had emerging vocal communication or an IQ between 55 and 85. Those classified as higher functioning displayed age-appropriate vocal communication and an average or above-average IQ.
Research rigor was coded according to criteria outlined by Reichow, Volkmar, and Cicchetti (2008), which have precedent in reviews of ASD intervention research (e.g. Siegel & Beaulieu, 2012; Whalon, Conroy, Martinez, & Werch, 2015). Specifically, methodological strength was coded as strong, adequate, or weak dependent upon the number of primary and secondary quality indicators met. Primary quality indicators include clear descriptions of participant characteristics, operational definitions of independent and dependent variables, and demonstration of experimental control in SCD or appropriate statistical analyses and power in group designs. Secondary quality indicators include adequate interobserver agreement (IOA), blind raters, treatment fidelity, generalization, maintenance, and social validity. Dependent variables classified as having strong methodological rigor met all primary quality indicators and at least three secondary quality indicators in SCD or four in group designs. Adequate rigor was assigned to variables that evidenced at least four primary quality indicators and two secondary quality indicators. Variables considered to have weak rigor met fewer than four primary quality indicators or fewer than two secondary quality indicators.
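The decision rule for assigning a rigor rating can be summarized in a minimal sketch that follows the thresholds stated above. The function name `classify_rigor` and its parameters are hypothetical; because the total number of primary indicators is not stated in this section, it is passed in as an argument (the value 6 in the example is an assumption used only for illustration).

```python
def classify_rigor(primary_met: int, primary_total: int,
                   secondary_met: int, design: str) -> str:
    """Return 'strong', 'adequate', or 'weak' for one dependent variable.

    design is 'scd' (single-case design) or 'group'.
    primary_total is supplied by the caller; its value is an assumption here.
    """
    if design not in ("scd", "group"):
        raise ValueError("design must be 'scd' or 'group'")
    # Strong: all primary indicators plus at least 3 (SCD) or 4 (group) secondary indicators.
    strong_secondary_minimum = 3 if design == "scd" else 4
    if primary_met == primary_total and secondary_met >= strong_secondary_minimum:
        return "strong"
    # Adequate: at least four primary and at least two secondary indicators.
    if primary_met >= 4 and secondary_met >= 2:
        return "adequate"
    # Weak: fewer than four primary or fewer than two secondary indicators.
    return "weak"


# Hypothetical example assuming six primary indicators in total.
print(classify_rigor(6, 6, 3, "scd"))    # strong
print(classify_rigor(6, 6, 3, "group"))  # adequate
print(classify_rigor(3, 6, 5, "scd"))    # weak
```

Applying the same function separately to the target and the collateral dependent variables of a study reproduces the per-variable ratings described in the text.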
Several of the quality indicators described above focus on the rigor of data collection and analysis procedures. Because those procedures sometimes differed for target behaviors and collateral behaviors within the same study (e.g. sufficient baseline data collected for target behavior but not collateral behavior), multiple quality indicator scores were calculated per study. Scores were calculated first by applying Reichow et al.'s (2008) quality indicators to the target dependent variable. However, given the differences in procedures used for target and collateral behaviors, a rigor classification based on target behaviors should not be conflated with the certainty of evidence for the collateral behavior variables. Therefore, separate rigor scores were also calculated for collateral behavior dependent variables by applying the same Reichow et al. criteria to those data. This approach allows a more nuanced consideration of the certainty of evidence specific to collateral behavior changes and seems consistent with the intent of Reichow et al.'s recommendations.
Consistent with recommendations regarding effect size estimates for synthesis of SCD studies, three different nonparametric effect size estimates were calculated (Kratochwill et al., 2013; Maggin & Odom, 2014). Specifically, the percentage of nonoverlapping data (PND), improvement rate difference (IRD), and nonoverlap of all pairs (NAP) were calculated for all targeted and collateral behaviors (Parker & Vannest, 2009; Parker, Vannest, & Brown, 2009; Scruggs, Mastropieri, & Casto, 1987). SCD graphs were prepared for analysis by manually extracting the data from each study and saving the raw data into an Excel file. For multiple baseline, multiple probe, and reversal designs, all adjacent AB series (i.e. the intervention phase and the preceding baseline) were contrasted (Maggin, O'Keefe, & Johnson, 2011). For multi-element designs, data between the treatment and comparison condition were contrasted to determine an effect size estimate (Maggin et al., 2011). No alternating treatment designs in the included studies utilized more than two conditions.
PND was selected because it has been in use longer than other options which enables comparison to a larger portion of previous research (Campbell, 2013). Further, PND is the most commonly utilized effect size estimate in synthesis of SCDs (Maggin et al., 2011). To calculate PND, the number of data points in the intervention phase that exceed the highest baseline point is divided by the total number of data points in the intervention phase (Scruggs et al., 1987). PND effect size estimates range from 0% to 100% and were interpreted using the criteria outlined by Scruggs and Mastropieri (1998) wherein a PND value greater than 90% suggests a highly effective intervention, values between 70.1% and 90% a moderate effect, and values below 70% a low effect.
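As an illustration of the PND calculation and its interpretation, consider the following sketch. The data are hypothetical. Two details are assumptions rather than statements from the text: the `increase_expected` flag, which handles behaviors targeted for reduction by counting intervention points below the lowest baseline point, and the treatment of a value of exactly 70% as a low effect.

```python
def pnd(baseline, intervention, increase_expected=True):
    """Percentage of intervention points that do not overlap with the baseline phase."""
    if not baseline or not intervention:
        raise ValueError("both phases need at least one data point")
    if increase_expected:
        highest_baseline = max(baseline)
        nonoverlapping = sum(1 for point in intervention if point > highest_baseline)
    else:
        # Assumption: for behaviors targeted for decrease, compare with the lowest baseline point.
        lowest_baseline = min(baseline)
        nonoverlapping = sum(1 for point in intervention if point < lowest_baseline)
    return 100 * nonoverlapping / len(intervention)


def interpret_pnd(value):
    """Interpretation bands as described in the text (exactly 70% treated as low, by assumption)."""
    if value > 90:
        return "high"
    if value > 70:
        return "moderate"
    return "low"


# Hypothetical AB series: four baseline sessions, five intervention sessions.
baseline = [2, 3, 1, 4]
intervention = [5, 6, 4, 7, 8]
score = pnd(baseline, intervention)
print(score, interpret_pnd(score))  # 80.0 moderate
```

Four of the five intervention points exceed the highest baseline point (4), giving a PND of 80%. Because a single extreme baseline point determines the comparison value, PND is sensitive to baseline outliers.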
IRD represents the difference in improvement rate between baseline and intervention phases (Parker et al., 2009). It is highly correlated with Phi and mathematically equivalent to risk difference, a widely used effect size estimate in medical research (Parker et al., 2009). Data points in intervention phases which exceed all baseline points are considered improved. IRD values range from 0 to 1 and those above .70 suggest a large effect, .50 to .70 a moderate effect, and values below .50 a small or questionable effect (Parker et al., 2009). NAP compares each data point from baseline and intervention in a pairwise fashion to determine a complete nonoverlap index and is conceptualized as the percentage of data that improves across adjacent phases (Parker & Vannest, 2009). NAP values range from .50 to 1, with values of at least .93 suggesting a large effect, .66 to .92 a moderate effect, and values at or below .65 a small effect. Both IRD and NAP were calculated using the online calculator developed by Vannest, Parker, and Gonen (2011).
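The pairwise logic of NAP can be sketched as follows. This is not the online calculator used in the review; it is a minimal illustration for a behavior expected to increase, with ties scored as half an improvement, which is the standard NAP convention. The data are hypothetical, and the cut points between the published bands (e.g. a value of .925) are resolved by assumption.

```python
from itertools import product


def nap(baseline, intervention):
    """Nonoverlap of all pairs for a behavior expected to increase.

    Every baseline point is compared with every intervention point:
    an improvement scores 1, a tie scores 0.5, and an overlap scores 0.
    """
    if not baseline or not intervention:
        raise ValueError("both phases need at least one data point")
    pairs = list(product(baseline, intervention))
    score = sum(1.0 if treated > base else 0.5 if treated == base else 0.0
                for base, treated in pairs)
    return score / len(pairs)


def interpret_nap(value):
    """Bands as described in the text; values falling between bands are assigned by assumption."""
    if value >= 0.93:
        return "large"
    if value >= 0.66:
        return "moderate"
    return "small"


# Hypothetical AB series (same layout as a single baseline-intervention contrast).
baseline = [2, 3, 1, 4]
intervention = [5, 6, 4, 7, 8]
value = nap(baseline, intervention)
print(round(value, 3), interpret_nap(value))  # 0.975 large
```

Of the 20 pairs, 19 show improvement and one is a tie, giving (19 + 0.5) / 20 = .975. A value of .50 corresponds to chance-level overlap.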
For studies utilizing group designs, Cohen's d was calculated for each reported variable using means and standard deviations (Cohen, 1988). Cohen's d is defined as the standardized difference between group means and is common in synthesis of group design studies (Warner, 2013). Effect sizes of .20 and lower are considered small, values from .21 to .79 moderate, and values at or above .80 large (Warner, 2013).
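A minimal sketch of this calculation is shown below. It assumes the pooled-standard-deviation form of Cohen's d, since the text specifies only that means and standard deviations were used; the group statistics in the example are hypothetical.

```python
from math import sqrt


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference using the pooled standard deviation (assumed variant)."""
    pooled_variance = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / sqrt(pooled_variance)


def interpret_d(d):
    """Bands as described in the text, applied to the magnitude of d."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size > 0.20:
        return "moderate"
    return "small"


# Hypothetical groups: intervention M = 12, SD = 4, n = 20; control M = 9, SD = 4, n = 20.
d = cohens_d(12, 4, 20, 9, 4, 20)
print(d, interpret_d(d))  # 0.75 moderate
```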
Results
Table 2. Summary of collateral effects by intervention and targeted skill.
Effect Size Estimate Averages (PND, IRD, and NAP).
Participant characteristics
A total of 206 children (166 male) with ASD, ranging in age from 2;0 to 8;7 years (M = 4;2), participated in the included studies. The majority of participants had characteristics consistent with Reichow and Volkmar's (2010) description of lower functioning (n = 94 across 29 studies), followed by medium functioning (n = 40 across 23 studies), and higher functioning (n = 13 across seven studies). Seventeen studies included participants from different functioning levels (e.g. Charlop & Trasowech, 1991). These totals do not include participants from studies that did not provide enough detail to determine specific functioning level of included participants.
Intervention characteristics
A number of different intervention packages involving a variety of components were identified. Table 1 provides an exhaustive list but the most common examples include: naturalistic behavioral strategies (e.g. following the child's lead, use of natural reinforcers), prompting and reinforcement, script training, and motivating operation manipulation. Interventions were delivered in clinical settings (n = 18), school settings (n = 10), homes (n = 10), and distraction-free locations in applied settings (e.g. empty rooms at a school; n = 10). One study did not report the location of the intervention. Interventionists included researchers or graduate-level trained therapists (n = 32), teachers (n = 6), parents (n = 7), and peers (n = 2). Four studies utilized multiple intervention agents and five implemented the intervention across multiple settings. The total duration of intervention ranged from one to 400 hours (M = 21 hrs.) with the majority involving fewer than ten hours (median = 4 hrs.). Ten studies did not report intervention duration.
Target behaviors
Target behaviors in SCD studies included vocal utterances and requests (n = 8), social language (e.g. initiations, social question-asking, bids for joint attention; n = 7), stereotypy (n = 7), joint attention (n = 4), functional play (n = 3), social play (n = 3), expressive and receptive identification of stimuli (n = 3), academic skills (n = 3), daily-living tasks (n = 2), challenging behavior, compliance, observation of conditioned reinforcers, imitation, and social interaction (n = 1 each; see Table 1). Interventions investigated in group design studies targeted joint attention and symbolic play (Kasari et al., 2006) and prelinguistic joint attention acts and functional communication (Yoder & Stone, 2006b). Table 1 reports target behavior effect size estimates.
Collateral outcomes
Fourteen different collateral effects were identified across studies. Table 2 organizes studies in three groups according to how collateral effects align with the DSM-5's diagnostic criteria for ASD with social communication skills in group one and restricted/repetitive behaviors in group two. The third group included studies with collateral effects in domains not directly related to the DSM-5's diagnostic criteria (e.g. challenging behavior). For each specific collateral effect, Table 2 also provides the mean PND, IRD, and NAP scores across SCD studies or Cohen's d for group studies involving each collateral effect (column 1) and lists all the combinations of interventions and target behaviors associated with each collateral effect (column 2). For example, the first entry in Table 2 indicates that a total of four studies reported improved identification of stimuli (a skill related to receptive language) as a collateral effect and .79, .81, .88 are the mean PND, IRD, and NAP scores (respectively) for that collateral effect across the four studies. The intervention procedures and target behaviors involved in those four studies were Discrete Trial Teaching (DTT) targeting expressive and receptive language (n = 2), matrix training targeting spelling words (n = 1), and naturalistic behavioral strategies targeting question-asking (n = 1).
Collateral effects related to communication and social interaction skills were reported in 34 cases. In terms of communication, average effect size estimates for collateral behavior changes in SCD studies ranged from low to moderate and included increased verbal utterances (n = 8); language variability or vocabulary (n = 7); and expressive and receptive identification of stimuli (n = 4). Cohen's d ranged from .50 to .71 for the three group design studies reporting a collateral increase in verbal utterances, indicating a moderate effect. A variety of intervention packages and components occasioned those collateral effects. Although not specifically listed in Table 2, all of the intervention packages involved some form of systematic prompting and reinforcement. The next most common intervention characteristic associated with collateral gains in communication involved naturalistic reinforcement contingencies delivered in developmentally appropriate natural contexts (n = 6); for example, naturalistic reinforcement contingencies embedded in play or daily routines (e.g. Koegel et al., 2014). Script training to target scripted spoken language resulted in collateral improvements in unscripted spoken language in five studies (e.g. Ledbetter-Cho et al., 2015). Two studies used Response Interruption and Redirection (RIRD) and one used a stimulus control procedure to reduce a targeted form of stereotypy. Collateral improvements in verbal utterances were reported in seven out of seven children in those studies (e.g. Ahearn et al., 2007).
With regard to social skills, average collateral effect size estimates for SCD studies ranged from low to moderate (see Table 2) and included improved joint attention (n = 4); eye contact or orientation toward a social partner (n = 4); and social interaction (n = 2). The two group design studies reported a moderate increase in joint attention, with Cohen's d equaling .66. In terms of commonalities across interventions that reported collateral effects in social skills, prompting and reinforcement were components of the intervention packages in all 12 cases. Seven targeted language skills (e.g. Vismara & Lyons, 2007) and six studies specifically described naturalistic behavioral strategies (e.g. Kasari et al., 2006; Koegel et al., 2009). Two studies incorporated participants' perseverative interests into intervention procedures targeting play or language skills and reported collateral improvement in joint attention (Baker, 2000; Vismara & Lyons, 2007). Finally, one study used peer-mediated instruction to target reading comprehension and reported an improvement in social interaction (Kamps et al., 1994).
Regarding restrictive and repetitive patterns of behavior and interests, a collateral decrease in some form of stereotypy (i.e. motor or vocal) was reported in five studies with a low mean effect size estimate. In all five studies, intervention involved some form of systematic prompting and reinforcement. Of these, one involved peer-mediated instruction, one incorporated participants' perseverative interests into games, two implemented prompting and reinforcement, and one utilized visual work systems. The most common target behavior associated with a collateral decrease in stereotypy was some form of play (n = 3; e.g. Lang et al., 2014). One study targeted social interaction (Lee & Odom, 1996) and one targeted on-task behavior during academics (Bennett et al., 2011).
Collateral effects in domains and behaviors not explicitly required in the DSM-5's diagnostic criteria for ASD included challenging behavior (n = 9), play (n = 6), attending, compliance, imitation, matching, and on-task behavior (n = 1 each). Average effect size estimates ranged from low to high, with one study not reporting enough information for calculation (Whalen et al., 2006). All but three studies included prompting and/or reinforcement; specifically: (a) Koegel et al. (1974) used punishment to decrease stereotypy and found an increase in appropriate play; (b) MacDonald et al. (2009) implemented video modeling to teach play and observed an increase in verbalizations that were not modeled in the video; and (c) Lanovaz et al. (2014) provided noncontingent access to music in an effort to decrease vocal stereotypy and reported improved on-task behavior. A decrease in stereotypy was the most common target behavior associated with collateral improvements in this group of studies (n = 6) followed by requesting (n = 3) and play (n = 2). Naturalistic behavioral strategies were associated with four of these collateral effects (e.g. Gianoumis et al., 2012; Ingersoll & Schreibman, 2006).
Research designs and rigor
Interventions were evaluated in randomized controlled trials (RCT) in four studies (Kasari et al., 2006, 2008; Yoder & Stone, 2006a, 2006b) and the remainder were SCD. With regard to the target behaviors across studies, 18 (32%) were rated as having strong methodological rigor. Twenty-five target behaviors (44%) were rated as adequate. Of these variables, most received a rating of adequate due to lower scores on visual analysis (i.e. stability of the data, overlap between adjacent phases, and a lack of shift between conditions; n = 20). Four of these variables received an adequate rating due to a lack of secondary quality indicators and one did not provide a sufficient description of participant characteristics. Fourteen target behaviors (24%) received ratings of weak methodological rigor due to an insufficient number of data points in baseline and/or intervention phases (n = 10), inadequate stability in the data (n = 3), or an absence of secondary quality indicators (n = 1).
Each collateral effect across studies was also coded for rigor, with some studies receiving different ratings on different collateral behaviors (e.g. Baker, 2000). Ten collateral behaviors (14%) received strong ratings of research quality. Thirty-five collateral behaviors (48%) were rated as adequate due to overlap and stability of data (n = 31), insufficient number of baseline data points (n = 2), or lack of secondary quality indicators (n = 2). The twenty-eight remaining collateral effects (38%) received ratings of weak as a result of excessive overlap or variability in data (n = 15), reporting averages across phases precluding visual analysis (n = 7), insufficient number of baseline data points (n = 3), or lack of secondary quality indicators (n = 3).
Discussion
This systematic review of 46 intervention studies resulted in the identification of 14 general collateral effects (Table 2). The most common collateral effects involved behaviors directly related to ASD diagnostic criteria. Specifically, in terms of social communication skills, the following collateral increases were reported: (a) spoken utterances; (b) novel and more varied language; (c) joint attention, eye contact, and orienting toward a communication partner; (d) social interactions; and (e) receptive and expressive identification of stimuli. With regard to the amelioration of restricted and repetitive behaviors, collateral decreases were found in motor and vocal stereotypic behavior. Improvements in skills not directly aligned with ASD diagnostic criteria were also reported (e.g. decreased challenging behavior). Overall, this systematic review identified a wide range of collateral effects across multiple domains of functioning and supports conclusions of previous reviews focused on specific intervention packages or target behaviors (e.g. Ganz et al., 2012; Lanovaz et al., 2013; Verschuur et al., 2014; White et al., 2011).
The finding that 206 children across 46 studies evidenced a collateral change in behavior suggests that these effects may not be uncommon. Further, given that only one participant's collateral behavior change was undesirable (i.e. Cook et al., 2014), collateral benefits appear to be more common than undesirable collateral side effects. However, the commonality of beneficial collateral effects should be considered cautiously because the potential for a collateral effect is not always considered when planning intervention research and may often go unmeasured. Similarly, researchers may be less likely to report efforts aimed at measuring potential collateral effects when no collateral changes are detected. Finally, the absence of consistent terminology to describe collateral effects complicates database searches: for example, we counted thirty-six different terms referring to collateral effects across included studies (list of terms available on request). Considered in tandem with differences in participant characteristics and intervention procedures across studies, these factors preclude calculating the probability of collateral effects for a given scenario with sufficient certainty. Although the exact factors contributing to the probability of a collateral effect cannot be determined, notable trends across the included studies did emerge that suggest directions for future research and considerations for practitioners.
Within-study effect size estimates for target and collateral behaviors can be compared in 26 SCD studies that provided session-by-session data for all target and collateral behaviors (Table 1). In 20 of those studies (77%), every target behavior effect size estimate was larger than every collateral behavior effect size estimate in the same study and, in one of the remaining six studies, the effect size estimates for target and collateral behavior changes were equivalent (i.e. Krantz et al., 1993). The finding that target behavior effect size estimates tended to be larger suggests that interventions should include components that directly target the highest treatment priorities when possible. However, if intervention for a target behavior is unavailable, ineffective, or inefficient, it may be beneficial to initiate an intervention that targets a different behavior and/or include components that have been demonstrated to produce a collateral behavior change consistent with the goals of the initial focused intervention. For example, peer-mediated instruction targeting academics has produced collateral improvements in social skills and could be used both to improve academics and to supplement a concurrent intervention targeting social skill deficits in cases where a child does not have access to a quality social skills intervention or the acquisition of targeted social skills has been slow (e.g. Kamps et al., 1994).
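The within-study comparison described above reduces to a simple rule: a study counts among the 20 only if its smallest target effect size estimate exceeds its largest collateral effect size estimate. The following Python sketch illustrates that rule under stated assumptions; the function name and all effect size values are hypothetical and are not data from Table 1.

```python
def targets_exceed_collaterals(target_es, collateral_es):
    """Return True when every target effect size estimate is larger than
    every collateral effect size estimate from the same study."""
    if not target_es or not collateral_es:
        raise ValueError("need at least one target and one collateral estimate")
    # "Every target > every collateral" is equivalent to min(target) > max(collateral).
    return min(target_es) > max(collateral_es)


# Hypothetical studies; values are invented for illustration only.
studies = {
    "study_a": ([0.90, 0.85], [0.60]),        # all targets larger
    "study_b": ([0.70], [0.70]),              # equivalent, so not strictly larger
    "study_c": ([0.80, 0.50], [0.65, 0.40]),  # mixed ordering
}

results = {name: targets_exceed_collaterals(t, c) for name, (t, c) in studies.items()}

# Share of studies reported in the text: 20 of 26.
reported_share = round(20 / 26 * 100)
```

A strict inequality is used so that a study with equivalent target and collateral estimates (as reported for Krantz et al., 1993) is not counted as one in which target estimates were larger.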
Many studies reporting collateral skill increases involved behaviors that were occasionally emitted prior to intervention. For example, children who produced at least a few vocal utterances prior to intervention appear to be more likely to experience collateral increases in utterances following intervention targeting joint attention than children who did not (e.g. Ingersoll & Schreibman, 2006). There were 129 demonstrations of collateral increases across studies that measured collateral behaviors in baseline sessions (e.g. a study with three participants that measured two potential collateral behaviors per participant could have up to six demonstrations of a collateral effect). Of the 129 collateral increases demonstrated across studies, 97 (75%) had two or more baseline sessions in which the collateral behavior occurred. This suggests that collateral increases may be more likely when there is a performance deficit as opposed to a skill deficit. Specifically, when a child has acquired a skill but does not demonstrate the skill at desired levels because stimulus control or motivation is insufficient (performance deficit), collateral increases in that skill may be more probable than in cases where the skill has not yet been acquired and is therefore absent in baseline (skill deficit). It is important to note that the nonoccurrence of a skill across baseline sessions does not necessarily indicate that there is a skill deficit and not a performance deficit. However, the observation that the majority of collateral skills were demonstrated, at least to some degree, by participants prior to intervention suggests future research considering the potential influence of preexisting skill levels on collateral skill increases may be worthwhile.
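The counting logic in this paragraph can be sketched in a few lines of Python. The sketch assumes per-session baseline counts are available for each collateral behavior; the function names and labels are hypothetical, and the two-session threshold simply mirrors the criterion reported above. The labels are deliberately hedged ("possible") because nonoccurrence in baseline does not by itself establish a skill deficit.

```python
def max_demonstrations(n_participants, n_collateral_behaviors):
    """Upper bound on demonstrations of a collateral effect in one study."""
    return n_participants * n_collateral_behaviors


def classify_baseline(baseline_counts, min_sessions=2):
    """Label a collateral behavior by how often it occurred in baseline.

    baseline_counts: per-session counts of the collateral behavior.
    min_sessions: sessions with at least one occurrence needed to treat
    the behavior as already present before intervention.
    """
    sessions_with_behavior = sum(1 for count in baseline_counts if count > 0)
    if sessions_with_behavior >= min_sessions:
        return "possible performance deficit"
    return "possible skill deficit"


# Three participants with two collateral behaviors each: up to six demonstrations.
upper_bound = max_demonstrations(3, 2)

# Hypothetical baseline data, invented for illustration.
label_present = classify_baseline([0, 2, 1, 0])
label_absent = classify_baseline([0, 0, 0, 0])

# Proportion reported in the text: 97 of 129 demonstrations.
reported_percent = round(97 / 129 * 100)
```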
In three studies, a collateral increase in a behavior was demonstrated despite an absence of evidence of the skill in baseline (i.e. potential skill deficit). The collateral behaviors in those studies were notably similar to the target behaviors. Specifically, Wichnick, Vener, Pyrtek, et al. (2010) targeted scripted responses to social initiations using script training and reported a collateral increase in novel (unscripted) social responses. Koegel et al. (2014) used a naturalistic behavioral intervention package to teach children to ask specific target questions and reported a collateral increase in question forms not targeted by intervention. Pollard et al. (2012) targeted scripted bids for joint attention using script training and reported a collateral increase in types of joint attention bids that were not directly scripted during intervention. In all three of these studies, the collateral behaviors and target behaviors likely shared the same operant function and involved similar discriminative stimuli (i.e. stimuli that precede the responses and signal potential reinforcement), suggesting that response generalization was likely the mechanism responsible for the collateral gains (Cooper et al., 2007; Kazdin, 1994). Response generalization may be facilitated by reinforcing variability in the topography of a target behavior or novel combinations of previously acquired behaviors (Kinney et al., 2003; Lee, Sturmey, & Fields, 2007; Pauwels et al., 2015). Studies utilizing research designs specifically arranged to test whether specific strategies (e.g. matrix training, multiple exemplars) are responsible for collateral effects may be especially useful (e.g. Kinney et al., 2003; Lang et al., 2014).
Collateral effects occurring in different domains and/or involving different operant functions or discriminative stimuli than the target behaviors are more consistent with the concepts of pivotal response and behavioral cusp than response generalization. Target behaviors involving play skills, communication/language, joint attention, and stereotypy were the most common among studies reporting collateral effects that do not meet definitions of response generalization (Stewart et al., 2013). With regard to joint attention, language, and play, this finding is consistent with a large corpus of previous research and highlights the potential bidirectional nature of interactions between these variables. Specifically, interventions that target joint attention and/or play have reported collateral increases in language while interventions targeting language have occasioned collateral increases in play and joint attention (e.g. Baker, 2000; Kasari et al., 2006, 2008; Vismara & Lyons, 2007; Whalen et al., 2006; Yoder & Stone, 2006a). These collateral effects buttress conclusions of previous research linking joint attention and play to the emergence of language in children of typical development (e.g. Charman et al., 2000; Kuhn, Willoughby, Wilbourn, Vernon-Feagans, & Blair, 2014) and research suggesting that targeting developmentally appropriate behaviors (e.g. play in early childhood) may facilitate more efficient skill acquisition (e.g. Lifter et al., 1993, 2005). Further, because language ability predicts academic achievement, socialization, and executive functioning (e.g. Bono, Daley, & Sigman, 2004; Charman et al., 2003; Hart & Risley, 1995; Mundy, Sigman, & Kasari, 1990), it is not surprising that improved language can positively influence a wide range of additional variables including social interaction and challenging behavior (e.g. Charlop-Christy et al., 2002; Gianoumis et al., 2012; Koegel et al., 2009).
Targeted decreases in stereotypy were also associated with collateral improvements across a range of behaviors involving play, language, challenging behavior, and on-task behavior. Lanovaz et al.'s (2013) review noted that interventions aimed at reducing stereotypy should provide access to alternative activities or directly prompt appropriate replacement behaviors (e.g. play) to reduce the likelihood of undesirable collateral increases in another form of stereotypy or challenging behavior. Consistent with Lanovaz et al.'s recommendation, we found that a collateral increase in play was often reported following interventions targeting stereotypy and, conversely, a collateral decrease in stereotypy was often found following interventions targeting play (e.g. Baker, 2000; Koegel et al., 1974; Lang et al., 2009, 2010, 2014; Nuzzolo-Gomez et al., 2002). Those studies hypothesized that play and stereotypy may, in some cases, be maintained by similar operant functions (e.g. automatic reinforcement) which could facilitate collateral effects. However, the operant functions of the specific play and stereotypic behaviors were not directly assessed, and the extent to which a shared operant function between play and stereotypy contributes to the emergence of collateral effects warrants additional research.
Behavioral intervention components (e.g. prompting and reinforcement) embedded in naturalistic routines and activities constituted the most common intervention packages associated with collateral behavior change. Naturalistic behavioral intervention packages (e.g. PRT, Incidental Teaching) often involve parents or teachers as interventionists and are implemented in applied settings, which may facilitate collateral effects by helping to ensure intervention is delivered for more hours per day and across multiple environments (e.g. Vernon et al., 2012). Alternatively, it is possible that measuring and reporting collateral effects is simply more common in studies using these procedures and that similar collateral effects arise from other intervention approaches but simply go unmeasured. Future research comparing collateral effects resulting from naturalistic behavioral intervention packages to behavioral interventions involving more contrived stimuli (e.g. DTT) could help elucidate intervention characteristics that contribute to collateral effects.
In terms of participant characteristics that corresponded with collateral effects, participant summaries provided in Table 1 reveal that the majority of children across studies (64%) had very limited vocal communication skills and/or an IQ below 55 (i.e. lower functioning per Reichow & Volkmar, 2010) and only 9% had age-appropriate vocal communication and an average or above-average IQ (higher functioning). Although it is possible that the potential for collateral effects decreases as participant functioning level increases, the studies included in the current review consisted of younger participants who may have been more likely to exhibit severe symptoms. Future research involving controls for level of functioning could identify interactions between participant characteristics, intervention procedures, and target skills that contribute to collateral effects. For example, it is possible that the intervention procedures used more often with participants who are lower functioning facilitate collateral gains (e.g. establishing a context for joint attention through naturalistic strategies while targeting play; Kasari et al., 2006) and/or that developmentally appropriate target skills are more likely to produce collateral effects than target skills that are not properly aligned with participant functioning level (e.g. Lifter et al., 2005).
Limitations and future research
In order to consider a larger sample of studies, we chose not to exclude studies based on number of participants, specific research designs, target behaviors, or intervention characteristics. Although this allowed for a broad-based consideration of the literature, it also precluded use of the fine-grained meta-analytic procedures potentially capable of determining the extent to which specific factors influenced collateral effects (e.g. Shadish, Hedges, & Pustejovsky, 2014). Additionally, the effect size estimates calculated from SCDs quantify the degree of overlap between measurements of dependent variables across baseline and intervention phases but do not necessarily reflect the magnitude of behavior change. Future research reviews should attempt to identify potential moderators of collateral behavior change; however, that endeavor will require development of a novel or refined approach to calculating standardized effect sizes that can be utilized across a wider range of SCD variants (Pustejovsky & Ferron, 2017).
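The distinction between overlap and magnitude can be illustrated with a small Python sketch. This section does not name the overlap-based metric used in the review, so the sketch assumes one common index, nonoverlap of all pairs (NAP), purely for illustration; the data values are invented. Two hypothetical intervention phases that differ greatly in magnitude yield the same NAP because neither overlaps with baseline.

```python
from statistics import mean


def nap(baseline, intervention):
    """Nonoverlap of all pairs for a behavior expected to increase.

    Each baseline point is compared with each intervention point:
    an improvement scores 1, a tie scores 0.5, and a decline scores 0.
    """
    pairs = [(a, b) for a in baseline for b in intervention]
    if not pairs:
        raise ValueError("both phases need at least one data point")
    score = sum(1.0 if b > a else 0.5 if b == a else 0.0 for a, b in pairs)
    return score / len(pairs)


# Hypothetical session data, invented for illustration.
baseline = [1, 2, 1]
small_change = [3, 4, 3]
large_change = [30, 40, 30]

nap_small = nap(baseline, small_change)
nap_large = nap(baseline, large_change)

# Raw mean change differs widely even though nonoverlap is identical.
gain_small = mean(small_change) - mean(baseline)
gain_large = mean(large_change) - mean(baseline)
```

In this example both phases produce complete nonoverlap, yet the mean gain in the second is many times larger, which is why overlap-based estimates should not be read as measures of how much behavior changed.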
Regardless, in most cases, the included studies were rated as having strong or adequate methodological rigor, providing some certainty that effects reported in the included studies were not the result of maturation, concomitant intervention, measurement error, or other similar confounds. It is likely that additional focused intervention studies have produced collateral behavior changes that have not been measured or reported in the literature. Future research that measures collateral behavior changes throughout all phases of the study would provide additional insight into the nature of such behavior change. It is important to note that the majority of studies focused experimental controls on target behaviors and not collateral effects, resulting in higher ratings of research rigor for targeted behaviors (Reichow et al., 2008). Future research could specifically tailor controls to ensure a higher degree of certainty regarding collateral effects and should consider addressing research questions regarding the mechanism of action for collateral effects directly. For example, the majority of reviewed studies did not involve design features or controls directed at testing hypothesized mechanisms of action for collateral behavior changes (e.g. recombinative generalization). Research illuminating the cause of specific untargeted behavior changes would better inform intervention creation and delivery.
Footnotes
Acknowledgements
The authors would like to thank Dr. James Pustejovsky for sharing his suggestions and expertise regarding effect sizes for single-case design studies.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research described in this article was supported in part by Grant H325H140001 from the Office of Special Education Programs, U.S. Department of Education. Nothing in the article necessarily reflects the positions or policies of the federal government, and no official endorsement by it should be inferred.
