After fitting a Poisson regression model to evaluate the effect of an intervention in a cohort study, one might be interested in estimating the number of events prevented by the intervention (assuming the observed associations are causal). This can be derived as the difference in the intervention group between the predicted number of events under the counterfactual (no intervention) and the factual (intervention) scenarios. One could use the predict command to obtain the predicted number of events under the two scenarios and then sum up the differences, but this approach is inconvenient for two reasons. First, one would need to change the intervention variable to get the counterfactual predicted values. Second, the confidence intervals would not be readily available (bootstrap or jackknife could be used, but either could be particularly time-consuming if the dataset is large).
We here suggest using the margins command. Its use, however, is not straightforward for our specific problem because margins computes predictions for each observation (like predict) and then takes the average of these predicted values. For example, if our data are aggregated in years, margins will provide an average of the year-specific predictions. When margins is applied over N records and $\hat{y}_i$ is the predicted value for the ith observation (i = 1,…, N), the result is simply the average of these predicted values, that is, $\frac{1}{N}\sum_{i=1}^{N}\hat{y}_i$. If we want margins to calculate the sum of the predictions instead of the mean, we can multiply each observation-specific prediction by the number of observations (that is, N), and the result of margins will be $\frac{1}{N}\sum_{i=1}^{N}N\hat{y}_i=\sum_{i=1}^{N}\hat{y}_i$.
Let’s consider a simple example using simulated data. Specifically, we use a Poisson distribution to generate a variable, cases, containing the number of events of interest (for example, the number of cancer cases) as a function of an intervention indicator (trt = 1 if treated, 0 otherwise); two covariates (x1 and x2); and an offset (pyar = person-years at risk).
We then fit a Poisson regression model:
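A minimal sketch of such a simulation and model fit might look like the following. The seed, sample size, covariate distributions, and coefficient values are all assumptions chosen for illustration; none of them come from the text, so the output will not reproduce the figures quoted later in this tip.

```stata
* Hypothetical simulation: seed, sample size, and coefficients are assumptions
clear
set seed 12345
set obs 1000

generate byte   trt  = runiform() < 0.5        // intervention indicator
generate double x1   = rnormal()               // continuous covariate
generate byte   x2   = runiform() < 0.3        // binary covariate
generate double pyar = 100 + 900*runiform()    // person-years at risk

* Expected count: person-years times a log-linear rate
generate double mu   = pyar*exp(-4 - 0.2*trt + 0.3*x1 + 0.5*x2)
generate long  cases = rpoisson(mu)

* exposure(pyar) includes ln(pyar) with its coefficient constrained to 1
poisson cases i.trt x1 x2, exposure(pyar)
```

Entering trt as a factor variable (i.trt) lets margins treat it as categorical when at(trt=0) or at(trt=1) is specified.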
To obtain an estimate of the number of events prevented by the intervention and its 95% confidence interval, the margins command will need to include the following (see [R] margins for more details):
an if qualifier (that is, if trt==1) or the corresponding subpop() option (the latter must be used if the vce(unconditional) option is specified too);
two at() options: one for the factual scenario (that is, at((asobserved) _all)) and one for the counterfactual scenario (that is, at(trt=0));
expression(predict(n)*r), where r is the size of the group of observations over which the margins command averages the predictions (it is here retrieved from the two command lines count if trt==1 & e(sample)==1 and scalar r=r(N)); and
the pwcompare option.
Hence, the commands and results are as follows:
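The ingredients listed above can be combined as in the sketch below. It assumes the Poisson model has just been fit, so that e(sample) is defined, and it passes the scalar to expression() through the macro expansion `=scalar(r)', which is one safe way to substitute its numeric value rather than necessarily the exact form used by the authors.

```stata
* Assumes the Poisson model is the active estimation result
count if trt==1 & e(sample)==1
scalar r = r(N)                 // number of treated observations in the sample

* Scenario 1: everything as observed (factual)
* Scenario 2: trt set to 0 for the treated (counterfactual)
* Multiplying predict(n) by r turns the average prediction into a total
margins if trt==1, at((asobserved) _all) at(trt=0) ///
    expression(predict(n)*`=scalar(r)') pwcompare
```

The pairwise comparison reported is scenario 2 versus scenario 1, that is, counterfactual minus factual, so a positive contrast corresponds to events prevented.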
The above results show that the intervention is estimated to have prevented 1,082 (95% CI: 875 to 1,289) cancer cases in our sample. Had we used the above margins command without the expression() option, we would have obtained the average of the observation-specific predicted number of events:
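As a sketch, and again assuming the Poisson model is the active estimation result, the call without expression() would be the following. It relies on the predicted number of events being the default prediction after poisson.

```stata
* Same two scenarios, but margins now averages the default prediction
* (the predicted number of events) instead of summing it
margins if trt==1, at((asobserved) _all) at(trt=0) pwcompare
```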
To better understand the above output, one can generate the variables (here called pred1 and pred2) containing the observation-specific predictions for the two scenarios and then look at their means. The pwcompare option will be omitted because it is not allowed when the generate() option is specified too.
If we calculate the difference between the means of pred2 (counterfactual scenario) and pred1 (factual scenario), we obtain the value reported in the above margins command, where we omitted both the expression() and pwcompare options (12.26016 − 10.1131 = 2.14706). If we now generate the difference between pred2 and pred1 (that is, generate diff=pred2-pred1) and use the total command, we will obtain the point estimate reported by margins with the expression() and pwcompare options.
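These checks could be sketched as follows. The sketch assumes that the Poisson model is the active estimation result and that generate(pred) names the new variables pred1 (scenario 1, factual) and pred2 (scenario 2, counterfactual), as the text describes.

```stata
* Store the observation-specific predictions for the two scenarios
margins if trt==1, at((asobserved) _all) at(trt=0) generate(pred)

* Their means should match the two margins reported without expression()
summarize pred1 pred2 if trt==1

* Summing the differences should match the point estimate obtained with
* expression() and pwcompare
generate double diff = pred2 - pred1
total diff if trt==1
```

The standard error reported by total treats diff as fixed data, so only the point estimate is comparable with the margins output.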
Extensions to interventions with two or more levels (for example, 0 = no treatment, 1 = low-dosage treatment, 2 = high-dosage treatment) or other counterfactual scenarios would be straightforward. For example, if we want to estimate how many fewer cases we would have observed in the nonintervention group (that is, trt = 0) if everybody had received the treatment, then we would specify the following:
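A sketch of this reversed comparison is shown below. It assumes the Poisson model is the active estimation result; r0 is a hypothetical scalar name introduced here for the size of the untreated group.

```stata
* r0 (hypothetical name): number of untreated observations in the sample
count if trt==0 & e(sample)==1
scalar r0 = r(N)

* Scenario 1: untreated as observed; scenario 2: untreated set to trt=1
margins if trt==0, at((asobserved) _all) at(trt=1) ///
    expression(predict(n)*`=scalar(r0)') pwcompare
```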
Thus, our model estimates that if everyone in the nonintervention group had been administered the treatment, there would have been 1,106 (95% CI: 895 to 1,318) fewer cancer cases. Note that the contrast is negative, corresponding to fewer cases had everyone been treated. This is because we are comparing the counterfactual scenario represented by at(trt=1) (that is, scenario 2 = untreated patients are treated) versus the factual scenario specified by at((asobs) _all) (that is, scenario 1 = untreated patients are untreated).
What is discussed in this Stata tip could also be extended to case–control studies by using inverse-probability-of-sampling weights to estimate absolute rates.
Supplemental Material
Supplemental Material, sj-zip-1-stj-10.1177_1536867X221106437 for Stata tip 146: Using margins after a Poisson regression model to estimate the number of events prevented by an intervention by Milena Falcaro, Roger B. Newson and Peter Sasieni in The Stata Journal