Abstract
Researchers and prevention scientists often develop interventions to target intermediate variables (known as mediators) that are thought to be related to an outcome. When researchers target a mediating construct measured by self-report, the meaning of the self-report measure could change from pretest to posttest for the individuals who received the intervention, a phenomenon referred to as response shift. As a result, any observed changes on the mediator measure across groups or across time might reflect a combination of true change on the construct and response shift. Although previous studies have focused on identifying the source and type of response shift in measures after an intervention, there has been limited research on how using sum scores in the presence of response shift affects the estimation of mediated effects via statistical mediation analysis, which is critical for explaining how the intervention worked. In this article, we focus on recalibration response shift, which is a change in internal standards of measurement and affects how respondents interpret the response scale. We provide background on the theory of response shift and the methodology used to detect response shift (i.e., tests of measurement invariance). In addition, we use simulated data sets to illustrate how recalibration in the mediator can bias estimates of the mediated effect and affect Type I error rates and statistical power.
A key aspect of intervention research is to determine how the intervention works. Statistical mediation analysis is an analytic technique that introduces intermediate variables, known as mediators (M), to explain how an intervention (X) transmits its effect to an outcome (Y; Baron & Kenny, 1986; MacKinnon, 2008; VanderWeele, 2015). Statistical mediation plays a critical role in the advancement of intervention research by identifying the program components that are beneficial or iatrogenic or that need to be reinforced (MacKinnon & Dwyer, 1993). Beyond testing whether an intervention changed the mediator (e.g., a manipulation check), if a particular mediator is identified in one intervention, that knowledge could be extended to other types of treatment. For example, if one found that a program component reduced cravings in a sample of smokers, which then led to reduced smoking, that program component could be used in other preventive interventions for addictive behaviors.
Mediators are often assessed by self-report measures. An inherent assumption made when using self-report measures is that all respondents interpret and respond to the measure in the same way such that a particular response has the same meaning for all individuals. Consider a hypothetical example that we refer to throughout the article featuring a randomized intervention to reduce cravings of alcohol and drugs (inspired by Hsiao et al., 2019). Participants in the treatment group received a mindfulness-based relapse prevention intervention, and the control participants received a 12-step abstinence-based program. The intervention targets self-awareness, a facet of mindfulness, which is thought to mediate the relation between the mindfulness intervention and reduced cravings. Suppose that as a result of undergoing the intervention, respondents in the treatment group develop a different interpretation of the mediator, self-awareness, than what they had at baseline. When this occurs, observed changes in the self-awareness measure could reflect both true change in the construct of self-awareness and differences in interpretation across respondents. When the meaning of the responses on the self-report measure changes as a result of the treatment, this is called a response shift.
Although the term response shift and similar concepts were initially discussed in educational training interventions (Howard, 1980) and organizational research (Golembiewski et al., 1976), response shift has primarily been discussed with respect to measures of health-related quality of life (QoL; Oort et al., 2009). Response shift provided an explanation for counterintuitive findings in which individuals with severe life-threatening illnesses reported QoL that was equal or superior to what was reported before their diagnosis or relative to healthy individuals (Sprangers & Schwartz, 1999). Although response shift has been widely studied in QoL research, there has not yet been an investigation of how response shift in mediators may affect the understanding of how an intervention works. Therefore, the goal of this article is to illustrate how response shift might be manifested in a mediator and affect the estimation of the mediated effect. The structure of the article is the following. First, we provide background on statistical mediation and response-shift theory. Then, we discuss how potential response shift can be detected using latent-variable models. Next, we use simulated data sets to illustrate how recalibration, a type of response shift, in the mediator affects the estimation of the mediated effect when sum scores are used. Finally, we summarize the implications of our illustration and discuss limitations and future directions.
Statistical Mediation Analysis
In pretest-posttest intervention studies, an appropriate model to test for mediation is the two-wave mediation model (Cole & Maxwell, 2003; MacKinnon, 2008; Valente & MacKinnon, 2017). Examples of studies that used the two-wave mediation model have recently appeared in various fields, such as mental health and clinical psychology (e.g., Behrendet et al., 2020), developmental psychology (e.g., Luengo Kanacri et al., 2019), physical health or sports science (e.g., Plow et al., 2020), and sociology (e.g., Bruneau et al., 2020). The two-wave mediation model is represented by the following equations (also see Fig. 1):
M2 = i1 + aX + s1M1 + b2Y1 + e1,  (1)

Y2 = i2 + c′X + b3M2 + b1M1 + s2Y1 + e2,  (2)

where X is our binary treatment or control group indicator, M is the mediator (in our case, self-awareness) measured at pretest and posttest (M1 and M2), and Y is the outcome (in our case, craving), also measured at pretest and posttest (Y1 and Y2); all variables are observed. See Figure 1 for an explanation of the coefficients of Equations 1 and 2. In the self-awareness example, we posit that the intervention changed the self-awareness mediator and that self-awareness then changed the cravings outcome. The mediated effect is defined by the product of the a and b3 paths (i.e., ab3).

Two-wave mediation model with observed mediator. Boxes refer to observed variables; X is a binary variable indicating treatment group. a is the effect of X on M2, s1 is the stability of M, s2 is the stability of Y, b1 is the cross-lag path between M1 and Y2, b2 is the cross-lag path between Y1 and M2, b3 is the effect of M2 on Y2, and c′ is the direct effect of X on Y2. In addition, from Equations 1 and 2, i1 and i2 are regression intercepts, and e1 and e2 are regression residuals (not shown in diagram).
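Although two-wave mediation models are typically estimated with SEM software, the mechanics of Equations 1 and 2 can be sketched with two ordinary least-squares regressions. The following Python sketch (the data-generating values are ours, chosen only for illustration) estimates the a and b3 paths and forms the mediated effect as their product:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulate data consistent with the two-wave model (true a = 0.4, b3 = 0.3)
X = rng.integers(0, 2, n)                 # binary treatment indicator
M1 = rng.normal(size=n)                   # pretest mediator
Y1 = 0.3 * M1 + rng.normal(size=n)        # pretest outcome
M2 = 0.4 * X + 0.5 * M1 + 0.1 * Y1 + rng.normal(size=n)               # Equation 1
Y2 = 0.1 * X + 0.3 * M2 + 0.1 * M1 + 0.5 * Y1 + rng.normal(size=n)    # Equation 2

def ols(y, *predictors):
    """Least-squares coefficients for y regressed on an intercept plus predictors."""
    Z = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

# Equation 1: M2 on X, M1, Y1 -> the coefficient on X is the a-path estimate
a_hat = ols(M2, X, M1, Y1)[1]
# Equation 2: Y2 on X, M2, M1, Y1 -> the coefficient on M2 is the b3-path estimate
b3_hat = ols(Y2, X, M2, M1, Y1)[2]

mediated_effect = a_hat * b3_hat          # ab3, the mediated effect
print(round(a_hat, 2), round(b3_hat, 2), round(mediated_effect, 2))
```

In practice, inference on ab3 would use a method suited to the product of coefficients, such as the distribution-of-the-product approach or bootstrapping, rather than these point estimates alone.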
Several assumptions need to be met so that the mediated effect is given a causal interpretation (MacKinnon, 2008). We assume that the functional form and temporal precedence between the variables have been correctly specified and that there is no unmeasured confounding among the X-M2 and X-Y2 relations conditional on pretest measures M1 and Y1, no unmeasured confounding of the M2-Y2 relation conditional on X and the pretest measures, and no posttreatment confounders of the M2-Y2 relation affected by X conditional on the pretest measures (Mayer et al., 2014; Pearl, 2014; Valente et al., 2019; Valeri & VanderWeele, 2013). In this article, we focus on the assumption that the mediator has been accurately assessed (Gonzalez & MacKinnon, 2021), specifically that the mediator was measured consistently across respondents and time (i.e., no response shift). Below, we provide more detail on response-shift theory and how to detect response shift.
Response-Shift Theory
Sprangers and Schwartz (1999) defined response shift as a change in the meaning of an individual’s self-evaluation (i.e., responses to the self-report measure) and described three ways in which it occurs, which we refer to as types of response shift: (a) recalibration, which is a change in internal standards of measurement; (b) reprioritization, which is a change in “values,” or a reevaluation of the importance of various domains that are relevant to the target construct; or (c) reconceptualization, which is a redefinition of the target construct (for examples, see Table 1 and Oort, 2005b). Oort (2005b) specified recalibration as a change in the meaning of the values on the item response scale, reprioritization as a change in the importance of the item to the measurement of the target construct, and reconceptualization as a change in the meaning of the item content. Furthermore, Sprangers and Schwartz (1999) defined catalysts as changes in health status (e.g., an intervention, elapsed time, a diagnosis, or medical procedure) and mechanisms as behavioral, cognitive, and affective processes that accommodate the catalyst (e.g., coping or social comparison) but are unrelated to true change in construct. Their theoretical model of response shift proposes that a catalyst triggers a mechanism that then changes the meaning of responses to the measure via recalibration, reprioritization, or reconceptualization. In general, response shift is a concern because when it occurs, observed changes on a self-report measure may not reflect true change in the target construct (Oort et al., 2009). 1 Given this theoretical model, response shift could potentially occur in the context of an intervention whenever self-report measures are used.
Summary of Levels of Invariance, Response Shift Terminology, and Examples From the Literature
In a review of response shift in QoL measures, Sajobi et al. (2018) reported that recalibration response shift occurred in 85% of the studies reviewed, making it the most common type. For this reason, we focus primarily on recalibration in the main text and discuss examples of reconceptualization and reprioritization in Supplement 5 in the Supplemental Material available online. To illustrate recalibration, suppose that our self-awareness mediator is measured by the eight-item acting with awareness subscale of the Five Facet Mindfulness Questionnaire (Baer et al., 2006). At pretest, a respondent in the treatment group interprets the item “I am easily distracted” as referring to becoming distracted by checking smartphone notifications and endorses the response of 5 (very often or always true), which the respondent interprets as meaning that they become distracted by smartphone notifications at a frequency of once per day. Suppose that during the intervention, the individual learns that for some people, distractions can actually occur so frequently that they affect the ability to complete tasks. At posttest, the respondent does not increase on self-awareness (target construct) and is still distracted by smartphone notifications daily. However, because of what was learned during the intervention, the respondent engages in social comparison and now interprets the response 5 (very often or always true) as becoming distracted by notifications approximately once per hour. Therefore, the respondent now endorses a 2 (rarely true) because they consider a daily distraction to be a lower frequency given this new information.
Relating this example to Sprangers and Schwartz’s (1999) theoretical model, as a result of the intervention (i.e., the catalyst), the respondent engaged in social comparison (i.e., the mechanism) that led to a shift in internal standards (i.e., recalibration), and as a result, the meaning of the respondent’s responses at posttest has changed—the response options now refer to different levels of being distracted than they did before. In contrast, an individual in the control group did not experience response shift because this individual has not learned new information from the intervention. If this shift were consistent for all individuals in the treatment group, the raw scores would show improvement in self-awareness from pretest to posttest, but this improvement is occurring because of changes in internal standards of self-awareness, not a true increase in self-awareness. The next section describes the detection of response shift using latent-variable models.
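To make the example concrete, the sketch below (all frequencies and cut points are hypothetical) encodes the respondent's internal standards as thresholds that map distraction frequency onto the 1 to 5 response scale; recalibration changes the thresholds, so the same true frequency yields a different observed response:

```python
def rate(frequency_per_day, thresholds):
    """Map a distraction frequency onto a 1-5 response using internal thresholds."""
    response = 1
    for cut in thresholds:
        if frequency_per_day >= cut:
            response += 1
    return response

true_frequency = 1.0  # one distraction per day, unchanged by the intervention

# Pretest internal standards: modest frequencies already count as "very often" (5)
pre_thresholds = [0.1, 0.25, 0.5, 1.0]
# Posttest standards after social comparison: the scale now spans up to ~hourly
post_thresholds = [0.5, 2.0, 8.0, 24.0]

pre = rate(true_frequency, pre_thresholds)    # endorses 5 at pretest
post = rate(true_frequency, post_thresholds)  # endorses 2 at posttest
print(pre, post)
```

The construct itself (daily distraction) never changes; only the mapping from experience to response scale does, which is exactly what recalibration means.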
Detecting Response Shift Using Tests of Measurement Invariance
As outlined in Oort et al. (2009), response shift can be understood from either a conceptual perspective or a measurement perspective depending on whether one views response shift as leading to true change in the observed variable of interest or as measurement bias (i.e., a systematic difference in how the variable is measured). Either perspective has implications for the methodology used to identify response shift. Throughout this article, we embrace the measurement perspective and posit that response shift results in measurement bias—meaning that observed changes in the outcome variable do not necessarily reflect true change. 2 Moreover, under a so-called broad view of response shift (Oort, 2005b; Oort et al., 2009), there is less focus on identifying the precise mechanism causing response shift, whereas a narrow definition would argue that measurement bias must be caused by a particular mechanism (e.g., adaptation, coping, social comparison) to qualify as response shift. In this article, we adopt a broad view of response shift and therefore do not comment further on specific mechanisms and whether they lead to response shift because we believe this would be dependent on the application. Under this perspective and view, response shift can be detected by using tests for measurement invariance (Meredith, 1993) with confirmatory factor analysis (CFA; Oort, 2005b). Measurement invariance (i.e., a lack of measurement bias) is a very technical subject, and interested readers are referred to Millsap (2011) for a full discussion of the topic (for longitudinal invariance, see Millsap & Cham, 2011; Chapter 14 of Grimm et al., 2016). Here, we offer an overview of measurement invariance insofar as it relates to response shift.
First, we define a CFA model. Drawing from our example, suppose that researchers want to measure the participants’ level of self-awareness using a self-report measure consisting of eight items. We assume that the responses to the items are imperfect representations of each participant’s true or latent level of self-awareness and that differences in item responses are due to differences in the latent level of self-awareness. We use a one-factor CFA model to map the theoretical relation between the eight item responses we observe for individual i (vector yi) and the underlying latent variable:

yi = τ + Ληi + εi,  (3)

where τ is a vector of item intercepts, Λ is a vector of factor loadings, ηi is individual i’s latent level of self-awareness, and εi is a vector of item residuals.

Conceptually, measurement invariance means that the measurement parameters (i.e., the intercepts in τ and the loadings in Λ) are equal across groups and across time, so that the item responses relate to the latent variable in the same way for every respondent on every measurement occasion. Formally, measurement invariance holds when

f(y | η, v) = f(y | η),

where f denotes the conditional distribution of the item responses and v denotes group membership or measurement occasion (Meredith, 1993; Millsap, 2011); that is, conditional on the latent variable, knowing the group or occasion provides no additional information about the responses.

To detect measurement noninvariance, Equation 3 would be expanded to allow the intercepts (τ) and loadings (Λ) to differ across groups or across time, and a series of increasingly constrained models would be compared. Invariance is typically tested in steps: configural invariance (same factor structure), metric invariance (equal loadings), scalar invariance (equal loadings and intercepts), and strict invariance (equal loadings, intercepts, and residual variances).
Recall that in this article we focus on recalibration response shift, which would appear as a violation of scalar invariance (i.e., nonequivalent intercepts). Returning to the example provided for recalibration with the item “I am easily distracted,” where a response of 2 at posttest had the same meaning as a response of 5 at pretest, assume that the true level of self-awareness is constant for all individuals in the treatment group and all responses are shifted down by three points. 4 Recalibration would shift the observed means downward for this item and result in a violation of scalar invariance because this change occurred despite self-awareness remaining constant. See Supplement 5 in the Supplemental Material for the connection between violations of invariance and the other types of response shift.
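In model terms, this recalibration can be mimicked by lowering an item's intercept at posttest while holding the latent variable constant. A minimal simulation sketch (the loading and intercept values are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
eta = rng.normal(size=n)          # latent self-awareness, identical at both waves

loading = 0.8
intercept_pre = 3.0               # item intercept at pretest
intercept_post = 3.0 - 1.5        # recalibration lowers the intercept at posttest

item_pre = intercept_pre + loading * eta + rng.normal(scale=0.5, size=n)
item_post = intercept_post + loading * eta + rng.normal(scale=0.5, size=n)

# The observed item mean drops even though the latent variable did not change
print(round(item_pre.mean(), 2), round(item_post.mean(), 2))
```

The observed item mean drops by the size of the intercept shift even though the latent self-awareness scores are identical at both waves, which is exactly the pattern a scalar invariance test is designed to flag.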
Up to this point, we have discussed invariance with respect to groups, but in the two-wave model, one would need to test for invariance across groups and across time. Noninvariance in the mediator could arise in four main patterns in the two-wave model. First, the mediator could be noninvariant across groups at pretest (and hold at posttest), but this is unlikely because random assignment should ensure the groups are approximately equal before treatment. Second, if one ignores group assignment, noninvariance could occur across time (i.e., from pretest to posttest), suggesting that some other influence, such as development, resulted in a change in the intercepts that was consistent across groups. This is commonly referred to as maturation. 5 Third, the intercepts could be noninvariant across time for the control group and invariant across time for the treatment group, but this would be a surprising result, and the explanation would require specific knowledge about the intervention and research design. Finally, the intercepts could be invariant across time for the control group but noninvariant for the treatment group, which would suggest that recalibration response shift due to the intervention has occurred because the treatment group had received the intervention and the control group had not. See Figure 2 for a flowchart for making modeling decisions based on measurement invariance tests and Supplement 2 in the Supplemental Material for a tutorial on how to test these models.

Flowchart for testing for invariance and response shift. This flowchart focuses on scalar invariance (i.e., invariant intercepts) but could be used similarly for metric invariance (i.e., invariant loadings).
In sum, the relationship between response shift and measurement invariance is reciprocal. Measurement invariance provides a statistical definition for response shift as well as a methodological tool for assessing whether response shift may have occurred in an intervention (summarized in Box 1). On the other hand, the theory of response shift provides an explanation for measurement noninvariance occurring specifically in an intervention context. Most of the literature on measurement noninvariance focuses on research scenarios in which groups are compared or growth across development is studied. However, there is a lack of theoretical work on causes of measurement noninvariance in psychology, which makes the theory of response shift an important consideration and impetus for incorporating measurement invariance tests into intervention work. Moreover, it is unclear what the specific consequences of recalibration response shift would be in the two-wave mediation model. Below, we demonstrate the consequences of recalibration response shift in the two-wave model using a simulated illustration.
Overview of Steps of Measurement Invariance Testing
When testing for measurement invariance, a series of nested structural equation models is fit using SEM software such as Mplus or an SEM package for R.

The steps are as follows: (a) fit a configural model in which the same factor structure holds in each group and at each occasion but the measurement parameters are freely estimated; (b) constrain the factor loadings to be equal (metric invariance) and compare fit with the configural model; (c) additionally constrain the item intercepts to be equal (scalar invariance) and compare fit with the metric model; and (d) additionally constrain the residual variances to be equal (strict invariance) and compare fit with the scalar model. A significant decrement in fit at any step indicates noninvariance of the parameters constrained at that step.
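These nested-model comparisons are usually evaluated with a chi-square (likelihood-ratio) difference test. A generic sketch, assuming the log-likelihoods and the difference in free parameters have already been extracted from the SEM software (the numeric values below are hypothetical):

```python
import math

def chi2_sf_even_df(x, df):
    """Survival function of a chi-square distribution, closed form for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

def lr_test(loglik_free, loglik_constrained, df_diff):
    """Chi-square difference test comparing a constrained (more invariant) model
    with the freer model it is nested in. A small p value means the invariance
    constraints significantly worsen model fit (i.e., noninvariance)."""
    stat = 2.0 * (loglik_free - loglik_constrained)
    return stat, chi2_sf_even_df(stat, df_diff)

# Hypothetical log-likelihoods: constraining six item intercepts across groups
stat, p = lr_test(loglik_free=-4200.0, loglik_constrained=-4209.2, df_diff=6)
print(round(stat, 1), round(p, 4))
```

Here the chi-square difference of about 18.4 on 6 degrees of freedom would lead one to reject scalar invariance at the .05 level.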
Illustration
When researchers design an intervention to target a mediator and there is random assignment, response shift could occur because of either the intervention (i.e., response shift due to treatment) or time (i.e., maturation). Previous research on cross-sectional mediation models suggests that when there is measurement noninvariance in the mediator that is not accounted for in the model, the mediated effect could be biased, and Type I error rates could be higher than .05 (Guenole & Brown, 2014; Olivera-Aguilar et al., 2018; Williams et al., 2010). However, these prior studies are limited because they did not use a longitudinal model and used latent variables to represent the mediator rather than sum scores, which is the most common way to represent mediators in pretest-posttest studies (MacKinnon, 2008; Valente & MacKinnon, 2017).
In our illustration, we expand previous methodological work by examining how response shift would affect the estimation of the two-wave mediation model (Gonzalez et al., 2017) when sum scores are used. These examples feature recalibration only because it appears to be the most common type of response shift in interventions (Sajobi et al., 2018). In the illustration, we show how the power, Type I error rates, and bias related to the mediated effect are affected when recalibration (due to the intervention or maturation) in the mediator is ignored.
Data generation
The illustrations below are inspired by the mindfulness intervention example that we have discussed throughout, in which a randomized treatment group and control group intervention targets self-awareness to reduce alcohol and drug cravings. Data sets were simulated in the R statistical environment.

Path diagram of the conceptual model used for data generation. Data were generated from a two-group model that represented the groups indexed by X. The paths emanating from X were derived by specifying different intercepts per group for variables M2 and Y2. The intercept values for M2 and Y2 were the true values for a and c′, respectively. Circles are latent variables, and squares are observed variables. The different indicator intercepts across groups on M2 and the residual correlations across time points for the indicators are not included to reduce clutter.
The binary variable X represents treatment group; Y1 and Y2 were continuous, normally distributed variables; and M1 and M2 were latent variables, each defined by six continuous items. The factor loadings for the items were invariant across groups and time.
Data-Generating Item Intercepts and True Paths
Note: Intercepts for both groups at Time 1 are the same as those in Models 1, 4, and 7. Intercepts that are bold are noninvariant. The a path had either a zero or a small effect size (Cohen’s f2 = .02), the b3 path had a small effect size (f2 = .02), and the c′ path had either a zero or a medium effect size (f2 = .15). For the correspondence between our true values on the paths and Cohen’s f2 effect size, see Supplement 4 in the Supplemental Material available online. The b3 path is the relation between M2 and Y2, and both are endogenous variables, so this path is fully standardized. For a demonstration of how the mediation paths were obtained, see Supplement 4.
Data analysis
The two-wave mediation model in Figure 1, which treats all of the variables as observed by using sum scores (thus not accounting for response shift), was used to analyze the generated data sets. Analyzing data sets in which we know that there is response shift provides insight into how the mediated effect is affected when response shift is ignored and sum scores are used. Observed scores for M1 and M2 were computed by summing the indicators for M1 and M2. Data sets from Models 1, 4, and 7, which feature no response shift in the mediator, are provided as a baseline for power and Type I error rates. True parameter values for the model with observed variables were verified using the population-generating covariance matrix. Relative bias for the parameter estimates with a nonzero true value was calculated by taking the difference between each sample’s parameter estimates and the true values and then dividing by the true values. These estimates were then averaged over all the samples. Relative bias estimates below 0.05 were deemed acceptable. For conditions with a zero true value, standardized bias was estimated by dividing the difference between the parameter estimate and the true value by the empirical standard deviation of the estimate. Across all data sets, the significance of the mediated effect ab3 was examined with the distribution-of-the-product method (e.g., MacKinnon et al., 2002).
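The bias metrics described above are straightforward to compute once the per-sample estimates are collected. A sketch (the four a-path estimates are invented for illustration):

```python
import statistics

def relative_bias(estimates, true_value):
    """Mean of (estimate - true) / true across simulated samples (true != 0)."""
    return statistics.mean((est - true_value) / true_value for est in estimates)

def standardized_bias(estimates, true_value=0.0):
    """(Mean estimate - true) / empirical SD of the estimates (used when true == 0)."""
    return (statistics.mean(estimates) - true_value) / statistics.stdev(estimates)

# Hypothetical a-path estimates from four simulated samples; true a = 0.4
a_hats = [0.52, 0.58, 0.55, 0.59]
print(round(relative_bias(a_hats, 0.4), 3))   # well above the 0.05 cutoff
```

In a real simulation these functions would be applied to the estimates from all replications of a condition.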
Simulated Models and Results
Note: Relative bias below a level of 0.05 is considered acceptable. When response shift was present, it had a medium effect size for two of six items. ES = effect size.
Models with a nonzero mediated effect
Model 1 represents a situation in which respondents in the treatment group increased on the self-awareness construct and the mediator is free of response shift. The power to detect the mediated effect in data simulated from Model 1 was .787. Similar power estimates were found in Model 7, which includes a nonzero c′ path (i.e., partial mediation).
Furthermore, Model 2 represents a situation in which respondents in the treatment group increased on the self-awareness construct (i.e., the a path was significant) but also showed response shift in the same direction (i.e., the intercepts for the treatment group were higher at posttest). In this case, we would expect positively biased estimates for both the a path and the mediated effect, which in turn would yield inflated power. In our results, the power to detect the mediated effect was .938, which is higher than the power to detect the mediated effect when the mediator is free of response shift (.787, as in Model 1). The relative bias for the a path was 0.394 and was below 0.05 for the b3 path. Thus, the bias in the a path resulted in greater power to detect a mediated effect. This is a concern because inflated mediated effects may pose a problem for planning future studies: the mediated effect may be overestimated, causing researchers to overstate the effect that the intervention had on the mediator. Similar a-path bias and power estimates were found in Model 8, which differs from Model 2 only by the inclusion of a nonzero c′ path. 7
Finally, Model 3 represents a situation in which respondents in the treatment group increased on the self-awareness construct (i.e., the a path was significant) but showed response shift in the opposite direction (i.e., the intercepts for the treatment group were lower at posttest). Therefore, we would expect negatively biased estimates for both the a path and the mediated effect. In our results, the power to detect the mediated effect was .407, which is nearly a 50% reduction in power to detect the mediated effect compared with when the mediator is free of response shift (.787, as in Model 1). The relative bias in the a path was −0.393 and was below 0.05 for the b3 path. This example underscores that recalibration in the opposite direction of the a path can result in enough bias to produce Type II errors (i.e., failures to detect a true effect). This is a concern because incorrect conclusions that the intervention did not work through the mediator could lead future intervention studies to no longer consider that mediator or, conversely, to enhance program components to produce a larger effect (i.e., increasing the number of hours and/or duration of the intervention), potentially wasting valuable resources.
Models with no mediated effect
Model 4 represents a situation in which the intervention did not increase self-awareness (i.e., the a path is zero, so there is no mediated effect) and there is no response shift in the mediator. In this case, the Type I error rate was .037, which provides a comparison for subsequent models. Model 5 and Model 6 represent situations in which respondents in the treatment group did not change on the self-awareness construct (the a path is zero) but there is a response shift in a positive (Model 5) or in a negative (Model 6) direction for the treatment group. Consequently, the magnitude of the standardized bias for the a-path estimates was 1.19 in Models 5 and 6, positive in Model 5 and negative in Model 6. Both models had Type I error rates of around .20, which is 4 times larger than the Type I error rate for the invariant model (Model 4). An inflated Type I error rate is a concern because incorrect conclusions that the intervention affected the mediator could motivate similar future studies rather than providing evidence that this particular mediator was not affected.
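The mechanism behind this Type I error inflation can be reproduced in miniature: simulate data in which the true a path is zero, add a constant recalibration shift to the treatment group's observed posttest mediator score, and count how often the a path is declared significant. All values below are ours and do not reproduce the article's simulation design:

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 200, 1000

def a_path_significant(shift):
    """One replication: is the a path significant at alpha = .05?"""
    X = rng.integers(0, 2, n)
    M1 = rng.normal(size=n)
    M2_true = 0.5 * M1 + rng.normal(size=n)   # true a path = 0
    M2_obs = M2_true + shift * X              # recalibration biases observed scores
    # Regress observed M2 on intercept, X, M1; z test on the X coefficient
    Z = np.column_stack([np.ones(n), X, M1])
    beta = np.linalg.lstsq(Z, M2_obs, rcond=None)[0]
    resid = M2_obs - Z @ beta
    sigma2 = resid @ resid / (n - Z.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(Z.T @ Z)[1, 1])
    return abs(beta[1] / se) > 1.96

type1_no_shift = np.mean([a_path_significant(0.0) for _ in range(reps)])
type1_with_shift = np.mean([a_path_significant(0.3) for _ in range(reps)])
print(round(type1_no_shift, 3), round(type1_with_shift, 3))
```

With the recalibration shift present, the nominally 5% test rejects far more often, mirroring the inflated Type I error rates of Models 5 and 6.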
Summary
The power and Type I error rates for the mediated effect were impacted when there was response shift due to a recalibration in the mediator. When there was a nonzero mediated effect and the response shift was in the same direction as the a path, the mediated effect estimate was larger than it should be. On the other hand, response shift in the opposite direction of the a path resulted in a mediated effect estimate that was smaller than it should be. Finally, when there was no effect of the intervention on self-awareness (the a path was zero, and thus the mediated effect was zero) but there was recalibration response shift in the positive or negative direction for the treatment group, Type I error rates for the mediated effect were higher than .05. Results extend to situations in which there is full or partial mediation.
General Discussion
When researchers target a mediator assessed via self-report, there is a possibility that the responses at posttest have a different meaning than they did at pretest because of changes experienced as a result of the intervention, a phenomenon referred to as response shift. The goals of this article were to provide background on response shift as it could occur in an intervention study and demonstrate how ignoring response shift in the mediator could affect the detection of the mediated effect. Our simulated examples demonstrate that ignoring response shift can lead to drastically different conclusions about statistical mediation. These conclusions are important because the most common model to analyze intervention data uses sum scores, which do not allow for tests of measurement invariance. Therefore, we encourage researchers to test for response shift using measurement invariance tests and to understand the nature of the response shift by identifying its source (due to the treatment, maturation, or both) and its type (reconceptualization, reprioritization, or recalibration). If response shift in the mediator is assessed and detected, it could be accommodated by using a latent variable for the mediator and allowing some of the factor loadings and item intercepts to vary across groups or across time (for a tutorial, see Supplement 2 in the Supplemental Material). A more general recommendation is that researchers testing intervention-based mediation models use latent-variable models. Latent-variable models not only allow for tests of measurement invariance but also can address violations to measurement invariance in ways not possible with sum scores.
Although we focused on randomized interventions, the conclusions from the simulation regarding bias, Type I error, and power could potentially apply to nonrandomized interventions, longitudinal studies, or other models with mediators that violate measurement invariance (not necessarily due to response shift). In a randomized study, we expect the mediator measure to be invariant across groups at pretest, but we do not expect invariance at pretest in nonrandomized studies, nor can we expect this to hold for all measurement occasions in a longitudinal study. Therefore, we urge researchers to use measurement invariance tests to assess whether they are assessing the same construct at all measurement occasions and, if not, to understand the nature of the noninvariance. Additional technical and theoretical work is needed to determine how violations of invariance at pretest affect the estimation of the mediated effect. In addition, whereas response shift is defined specifically for self-report measures (e.g., Howard, 1980), similar effects could occur in other instruments, such as in parent-report measures on child behavior gathered before and after a parenting intervention. Finally, additional evidence, such as qualitative data, additional measures, or extensive subject-matter expertise would be required to determine the specific mechanisms responsible for response shift in a given study.
Limitations and future directions
Although we provided definitions and explanations for response shift and the theoretical model that is in line with Sprangers and Schwartz (1999), the concept of response shift is challenging to capture, and the theory continues to be refined within QoL research. We consider this article to offer a light introduction to response shift and think that adopting a measurement perspective allowed for greater clarity in describing response shift. However, a limitation of this article is that we have not provided a full discussion of the nuances of response shift or presented alternative perspectives from the literature. For example, Ubel et al. (2010) proposed abandoning the term response shift altogether, arguing that this term created conceptual confusion by conflating measurement bias and true change, whereas Donaldson (2005) critiqued the use of measurement invariance methodology for investigating response shift. 8 Future work should focus on fully translating response shift into a psychological context.
One shortcoming that affects any study of measurement invariance is that certain constraints must be placed on the model for identification and scaling, and these constraints themselves embed assumptions about invariance. For example, when a scaling indicator is used, the loading and intercept of the first item are constrained to be equal across groups or across time points, which assumes that this item is invariant. If recalibration affected all items, the effects of noninvariance and true change could not be differentiated. Therefore, it is important to have at least one item that is invariant across groups and time points and to correctly identify it in the examined model.
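The identification problem above can be made concrete with a small numeric sketch (hypothetical loadings and intercepts, chosen for illustration). In the simplest case, intercept shifts proportional to the loadings produce exactly the item means implied by true change in the latent mean, and intercept shifts leave the item covariance matrix untouched, so the observed data alone cannot distinguish the two:

```python
import numpy as np

# Hypothetical measurement parameters (illustrative values)
loadings = np.array([1.0, 0.8, 1.2])
intercepts = np.array([0.0, 0.3, -0.2])
delta = 0.5  # magnitude of change

# Model 1: true change -- the latent mean moves from 0 to delta,
# and the measurement parameters stay invariant.
means_true_change = intercepts + loadings * delta

# Model 2: recalibration of every item -- the latent mean stays at 0,
# but each intercept shifts (here by loadings * delta).
means_recalibration = (intercepts + loadings * delta) + loadings * 0.0

# The implied item means coincide, so the two models are
# observationally equivalent for the first-order moments.
print(np.allclose(means_true_change, means_recalibration))  # True
```

An invariant anchor item breaks this equivalence: if at least one intercept is known not to shift, any remaining mean difference on that item must reflect true change on the latent variable.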
We assumed throughout the article that the intervention changes the mediating construct, which means that the observed indicators are affected through changes in the latent construct rather than directly by the intervention. A question for further research is how best to accommodate a situation in which the intervention causes change in specific behaviors while not affecting others (Gonzalez & MacKinnon, 2021). In that situation, the theory describing the impact of the intervention on the mediator is wrong, and so is our assumed model; as a result, we may detect response shift when the actual problem is that the model is misspecified.
Finally, the framework of measurement invariance assumes reflective indicators whereby the correct model is one in which the latent variable causes the observed indicators (i.e., as the latent variable increases, the scores on the indicators increase). An alternate conceptualization of this relationship is one in which the latent variables are caused by the indicators. In this case, the indicators could be causal indicators that assess the latent variable. If a causal indicator is incorrectly modeled as a reflective indicator, the model would be misspecified, and the results would not be meaningful. Although a full discussion of the implications of this type of misspecification is beyond the scope of this article, we recommend Bollen and Bauldry (2011) and Rhemtulla et al. (2019) for a thorough discussion of these issues.
Overall, we encourage researchers to probe for response shift when they are testing for mediation in an intervention setting. Response shift could affect the likelihood of finding statistically significant mediated effects, which in turn could affect conclusions about how the intervention worked. We hope that researchers incorporate the methodology presented here into their toolbox to draw the most accurate conclusions from statistical mediation analyses.
Supplemental Material
Supplemental material for Evaluating Response Shift in Statistical Mediation Analysis by A. R. Georgeson, Matthew J. Valente, and Oscar Gonzalez in Advances in Methods and Practices in Psychological Science:
sj-docx-5-amp-10.1177_25152459211012271
sj-html-1-amp-10.1177_25152459211012271
sj-html-2-amp-10.1177_25152459211012271
sj-html-3-amp-10.1177_25152459211012271
sj-html-4-amp-10.1177_25152459211012271
Footnotes
Transparency
Action Editor: Mijke Rhemtulla
Editor: Daniel J. Simons
Author Contributions
A. R. Georgeson, O. Gonzalez, and M. J. Valente collaboratively generated the idea for the study. A. R. Georgeson wrote the first draft of the manuscript and first version of the simulation code. O. Gonzalez verified the accuracy of the analyses. A. R. Georgeson, O. Gonzalez, and M. J. Valente critically edited the manuscript. All authors approved the final manuscript for submission.
