Abstract
In randomized controlled trials, intention-to-treat analysis is customarily used to estimate the effect of the trial. However, in the presence of noncompliance, this can often lead to biased estimates because intention-to-treat analysis completely ignores varying levels of actual treatment received. This is a known issue that can be overcome by adopting the complier average causal effect approach, which estimates the effect the trial had on the individuals who complied with the protocol. When compliance is unobserved in the control group, the complier average causal effect estimate can be obtained via a latent class specification using the
Keywords
1 Introduction
In a standard randomized controlled trial (
However, in real-world scenarios, participants assigned to treatment may not receive the treatment in full or engage in all the activities required by the
In this article, we show how to fit a complier average causal effect (
To the best of our knowledge, there is no worked example available of how to fit a
The next section discusses the
2 Addressing noncompliance in RCTs
The
The overarching principle of
The main difficulty of estimating the causal effect of compliance is that, while it is straightforward to determine compliance with the intervention in the treatment group, it remains unknown or unobservable in the control group. It is necessary, therefore, to deploy a method that allows distinguishing, within the control group, those who would have complied had they been assigned to treatment from those who would not.
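One common way to formalize this is as a two-class mixture likelihood, sketched here in our own notation rather than taken from the article. Let $Z_i$ denote random assignment (1 = treatment), $C_i$ the complier indicator (observed when $Z_i = 1$, latent when $Z_i = 0$), $\pi_i$ the probability that individual $i$ is a complier, and $\phi(\cdot)$ a normal density. The common residual standard deviation $\sigma$ is a simplifying assumption. The shared noncomplier mean $\mu_n$ encodes the assumption that assignment does not affect the outcomes of noncompliers.

```latex
\begin{align*}
L_i &= \bigl[\pi_i\,\phi(y_i;\mu_{c1},\sigma)\bigr]^{C_i}
       \bigl[(1-\pi_i)\,\phi(y_i;\mu_{n},\sigma)\bigr]^{1-C_i}
       && \text{if } Z_i = 1,\\
L_i &= \pi_i\,\phi(y_i;\mu_{c0},\sigma) + (1-\pi_i)\,\phi(y_i;\mu_{n},\sigma)
       && \text{if } Z_i = 0,\\
\text{CACE} &= \mu_{c1} - \mu_{c0}.
\end{align*}
```

In the treatment arm each individual contributes the density of the class they are observed to belong to, whereas in the control arm each individual contributes a weighted average over both classes.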
Skrondal and Rabe-Hesketh’s (2004) thorough description of the methods to obtain the
Consider the example of a school-based intervention where some classrooms are randomized to perform an activity for a prescribed length of time while others are randomized to continue with “business as usual”. During the intervention period, researchers record the times at which the activity is conducted in the intervention group and find wide variability. Given that some classrooms performed the activity for longer than others, we might expect the effect of the intervention to be “diluted” by those classrooms that conducted the activity for shorter periods. Dosage would then be key to understanding the effect of the trial on the outcome of interest, and this measure can be used to determine compliance status in the intervention group.
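The dilution described above can be illustrated with a toy calculation. The sketch below is not the latent class model discussed in this article. It uses the simpler moment-based estimator (the intention-to-treat difference divided by the share of compliers), which assumes that the complier share observed in the treatment arm also holds in the control arm and that assignment has no effect on noncompliers. The data and the function name `itt_and_cace` are made up for illustration.

```python
from statistics import mean


def itt_and_cace(outcome, assigned, complied):
    """Return (ITT, moment-based CACE) for a two-arm trial.

    outcome  : outcome value for each unit
    assigned : 1 if randomized to treatment, 0 if control
    complied : 1/0 in the treatment arm, None in the control arm
               (compliance is unobserved there)
    """
    treated = [y for y, z in zip(outcome, assigned) if z == 1]
    control = [y for y, z in zip(outcome, assigned) if z == 0]
    itt = mean(treated) - mean(control)

    # Complier share is only observable in the treatment arm; by
    # randomization it is ASSUMED to be the same in the control arm.
    p_comply = mean(c for c, z in zip(complied, assigned) if z == 1)

    # Assumes noncompliers are unaffected by assignment, so the whole
    # ITT difference is attributed to the compliers.
    return itt, itt / p_comply


# Hypothetical data: 4 treated units (2 compliers), 4 control units.
outcome = [6.0, 6.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
complied = [1, 1, 0, 0, None, None, None, None]

itt, cace = itt_and_cace(outcome, assigned, complied)
print(f"ITT = {itt:.1f}, CACE = {cace:.1f}")  # ITT = -2.0, CACE = -4.0
```

In this made-up example only half of the treatment arm complied, so the intention-to-treat difference is half the size of the effect among compliers.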
Naturally, classrooms under the “business as usual” regime have no records for said activity, but within this group some would have conducted the activity had they had the chance to do so; hence, “true” compliance in the control group is unknown. Additionally, given that there is heterogeneity in the intervention group (that is, there are compliers and noncompliers), assuming homogeneity in the control group would not necessarily be warranted. By virtue of the randomization itself, the characteristics that make children in the intervention group more likely to comply would also make the children in the control group more likely to comply; that is, there is equivalence of groups prior to intervention (Skrondal and Rabe-Hesketh 2004). This is the rationale behind fitting a
Humphrey et al. (2022) used this approach when analyzing the effect of the “good behavior game” on health- and education-related outcomes in children attending primary schools in England. This
The following section illustrates how the latent class approach can be implemented in Stata using
2.1 Specifying a CACE model with gsem
Below, we can see the basic specification of a
The most basic specification of a
depvar is the outcome of interest, which is by default assumed to have Gaussian distribution.
The following are optional:
indepvars are the predictors of the outcome of interest.
varlist is the predictor or set of predictors for compliance.
The first line of the syntax is simply the call to
The fourth line,
This is not strictly essential in the estimation, but it is preferable to have a set of covariates that are reasonably good predictors of compliance (Jo 2002). If no variables are specified or the whole line is omitted, an intercept-only compliance model is fit.
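To make the role of the compliance predictors concrete, recall that with two latent classes a logit class-membership model reduces to a logistic regression of complier status on the chosen predictors. The sketch below is our own illustration in Python, not Stata code and not the article's specification. The function name `complier_probability` and all coefficient values are hypothetical.

```python
import math


def complier_probability(intercept, coefs=(), covariates=()):
    """Probability of belonging to the complier class under a logit model.

    With no coefficients this is the intercept-only model: every
    individual gets the same complier probability.
    """
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


# Intercept-only model: one common probability for everyone.
p_common = complier_probability(0.0)

# With a (hypothetical) predictor, probabilities vary across individuals.
p_low = complier_probability(0.0, coefs=(0.8,), covariates=(-1.0,))
p_high = complier_probability(0.0, coefs=(0.8,), covariates=(1.0,))

print(p_common, p_low < p_common < p_high)
```

Good predictors of compliance spread these probabilities apart, which helps separate the would-be compliers from the noncompliers in the control group.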
The fifth and sixth lines compose the latent class model for compliance in the treatment group with special constraints that treat compliance in the treatment arm as known:
Finally, the last line states the name of the latent class (
Other available options are discussed in section 3.
2.2 An example CACE application
This example uses data from the
Variables in the
First, we read the data from the
The variable
Then, we specify and run the
The
With
Next we illustrate how we can use
Second, we run the
We omitted the plot because this is an intermediate step for the final plot presented in figure 1. We also requested the predicted values for the noncompliers class overall. There is no need to do this by treatment group: the predicted values for noncompliers are constrained to be equal across arms because we specified the exclusion restriction assumption.
Third, we run the
Afterward, we call the
Finally, we call

Figure 1. Predicted scores by complier status and trial arm
In figure 1, we can see that compliers (right-hand side) in the control group have the highest predicted depression scores at all values of baseline depression. Noncompliers (regardless of trial arm) fall in between: their predicted depression scores are lower than those of compliers in the control group but not as low as those of compliers in the treatment group. The same steps can be repeated for “risk” scores at baseline to visualize the effect of compliance at different levels of risk.
Some additional options are discussed in the next section.
3 Conclusions
In this article, we presented the main features of a
In addition, equality constraints can potentially be used to test other hypotheses of interest, such as common or separate coefficients across compliers and noncompliers. For more details, see [
With
Finally, clustering around groups can be accounted for with the
To sum up,
Supplemental Material
Supplemental Material, sj-zip-1-stj-10.1177_1536867X221106416 for Estimating the complier average causal effect via a latent class approach using gsem by Patricio Troncoso and Ana Morales-Gómez in The Stata Journal
4 Acknowledgments
This work was supported by the Economic and Social Research Council through the following grants: 1) Understanding Children’s Lives and Outcomes and 2) Scottish Centre for Administrative Data Research (grant numbers ES/V011243/1 and ES/S007407/1, respectively). This work was also supported in its early stages by the National Institute for Health Research (grant number 14/52/38). The authors are affiliated with the Scottish Centre for Administrative Data Research (
