Abstract
Introduction
Developing alternative approaches to evaluating absolute efficacy of new HIV prevention interventions is a priority, as active-controlled designs, whereby individuals without HIV are randomized to the experimental intervention or an active control known to be effective, are increasing. With this design, however, the efficacy of the experimental intervention to prevent HIV acquisition relative to placebo cannot be evaluated directly.
Methods
One proposed approach to estimate absolute prevention efficacy is to use an HIV exposure marker, such as incident rectal gonorrhea, to infer counterfactual placebo HIV incidence. We formalize a statistical framework for this approach, specify working regression and likelihood-based estimation approaches, lay out three assumptions under which valid inference can be achieved, evaluate finite-sample performance, and illustrate the approach using a recent active-controlled HIV prevention trial.
Results
We find that, in finite samples and when the assumptions hold, the approach produces accurate and precise estimates of counterfactual placebo incidence and prevention efficacy. Based on data from the DISCOVER trial in men and transgender women who have sex with men, and provided the assumptions hold, the estimated prevention efficacy for tenofovir alafenamide plus emtricitabine is 98.1% (95% confidence interval: 96.4%–99.4%) using the working model approach and 98.1% (95% confidence interval: 96.4%–99.7%) using the likelihood-based approach.
Conclusion
Careful assessment of the underlying assumptions, study of their violation, evaluation of the approach in trials with placebo arms, and advancement of improved exposure markers are needed before the HIV exposure marker approach can be relied upon in practice.
Introduction
The last decade has seen dramatic success in HIV prevention 1 with effective pre-exposure prophylaxis (PrEP) products.2–8 Despite these successes, HIV remains a major threat to global health. 9 As considerable challenges to implementing existing prevention interventions exist,10,11 additional biomedical prevention interventions are needed.
A variety of new preventive interventions (e.g. alternative PrEP agents and vaccines) are in development. 12 Placebo-controlled randomized trials that enroll individuals without HIV and follow them for incident HIV acquisition have historically been required for regulatory approval of new interventions. For new interventions in the same “class” as an intervention already proven effective, future trials will likely be “active-controlled”; 13 participants without HIV are randomized to the experimental intervention or an existing “active-control” intervention already proven effective. Even for new interventions in as-yet-unproven classes, for example, vaccines, an active-controlled design may be necessary.
The fundamental challenge of an active-controlled trial is that absolute prevention efficacy, that is, the reduction in HIV incidence for the intervention relative to placebo, cannot be evaluated based on the trial data alone. Instead, relative efficacy of the experimental and active-control interventions is assessed. Yet absolute efficacy is arguably the parameter of most interest.14,15 A traditional approach to estimating efficacy is using data from a historical placebo-controlled trial of the active control to set a “margin” for establishing non-inferiority or superiority of the experimental intervention, based on the assumption that efficacy established in the historical trial can be carried over to the new trial.16,17 This approach is challenging in HIV prevention, since many interventions are highly user-dependent,18–20 and efficacy of vaccines and monoclonal antibodies depends on properties of the exposing virus;21–23 thus, efficacy in the historical trial may not apply to the current trial. In addition, non-inferiority trials generally require larger sample sizes than placebo-controlled trials, especially if the active control is highly effective. Therefore, developing alternative approaches to evaluating absolute efficacy of new HIV prevention interventions is a priority.
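The distinction between absolute and relative efficacy can be written compactly. The notation below is an illustrative assumption and is not taken from the original: \lambda_E, \lambda_A, and \lambda_P denote HIV incidence in the trial population under the experimental intervention, the active control, and placebo, respectively.

```latex
\mathrm{PE}_{\text{absolute}} = 1 - \frac{\lambda_E}{\lambda_P},
\qquad
\mathrm{PE}_{\text{relative}} = 1 - \frac{\lambda_E}{\lambda_A}.
```

An active-controlled trial provides data on \lambda_E and \lambda_A only, so the relative quantity is estimable from trial data while the absolute quantity requires an external source of information about \lambda_P.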
One approach proposed in concept24,25 and widely discussed in the HIV prevention field14,15,25–30 is to use a marker of HIV exposure as a proxy to infer “counterfactual placebo” HIV incidence, that is, the incidence observed had a placebo arm been included in the active-controlled trial. This requires establishing the association between incidence of HIV and an HIV exposure marker in the absence of intervention, estimated based on historical data. Provided the intervention does not affect the HIV exposure marker, incidence of the marker in the active-controlled trial can be used to estimate counterfactual placebo HIV incidence. Figure 1 illustrates this concept. Incident rectal gonorrhea has been proposed as the HIV exposure marker for men who have sex with men, based on observational data suggesting that the incidence rates of these two sexually transmitted infections are highly correlated. 24 U.S. Food and Drug Administration (FDA) 31 advisory committees reviewing new PrEP agents support this approach, and the FDA endorsed the approach in guidance to industry. Yet, a formal statistical framework is lacking.

Figure 1. Estimation of counterfactual HIV incidence based on an HIV exposure marker. Green solid and dashed curves correspond to the fitted model associating HIV and exposure marker incidences with an associated pointwise 95% confidence interval (CI), based on a set of external cohorts reporting HIV and exposure marker incidence rates (dark blue dots). Given the exposure marker incidence in the active-controlled trial (yellow dot), counterfactual placebo HIV incidence is estimated with use of the fitted model (red dot). The 95% CI for the counterfactual placebo incidence captures uncertainty due to the model fit and uncertainty in the exposure marker incidence.
Here, we (1) articulate a statistical framework for inferring counterfactual placebo HIV incidence for an active-controlled trial using a marker of HIV exposure; (2) describe two estimation approaches and articulate the assumptions under which they produce unbiased estimates; (3) conduct a simulation study designed to closely mimic data on HIV and rectal gonorrhea and evaluate the performance of the methods under idealized conditions, that is, when all assumptions are satisfied; and (4) apply the methodology to data from a recently conducted active-controlled HIV prevention trial 6 and highlight the limitations of the approach and implications for its use in future HIV prevention trials.
Methods
Setting and notation
Let
For an active-controlled trial,
We formulate a general approach for evaluating
Remark 1
Prevention efficacy is evaluated against a backdrop standard of HIV prevention for the target population, consisting of proven and available HIV prevention products.13
Therefore,
In a randomized placebo-controlled trial for intervention
Assumptions
Let
We parameterize the relationship between HIV and the exposure marker incidences as
for
We state the following assumptions.
Assumption 1
Model equation (2) describes a general relationship between placebo HIV and exposure marker incidence rates that holds across external cohorts and the active-controlled trial population.
While HIV and exposure marker incidences may vary, the association between the incidence rates is assumed constant. To evaluate Assumption 1, one must consider carefully the background standard of HIV prevention for the active-controlled trial population, and whether any element of this prevention package influences the relationship between HIV and the exposure marker. For example, oral PrEP is known to reduce HIV but does not have a biological effect on rectal gonorrhea or other non-HIV sexually transmitted infections, 6 even though it may have an effect in terms of behavioral “risk disinhibition.” 32 Therefore, if the trial standard of HIV prevention does not include oral PrEP, the external cohorts should be drawn from populations without access to oral PrEP. Other elements of the standard of HIV prevention (e.g. condoms and risk reduction counseling) may influence HIV and rectal gonorrhea incidences but are not expected to modify their association, and therefore may not be critical to consider in evaluating external cohorts. Effective biomedical prevention of non-HIV sexually transmitted infections is another potential effect modifier. Other potential effect modifiers include subject demographics, behaviors, and features of the local HIV epidemic, that is, population prevalence of HIV and level of viral suppression for those living with HIV. Blinding may also influence the relationship between HIV and the exposure marker: while the counterfactual placebo arm is (conceptually) blinded, the external cohorts may not be. Although Assumption 1 can be evaluated for the external cohorts, whether it holds for the trial population cannot be tested, given the absence of a placebo arm for the trial population.
Assumption 2
An unbiased estimate of the parameters in
Assumption 2 indicates that the relationship between HIV and exposure marker incidences can be consistently estimated using the observed incidence rates from the external cohorts. This assumption is specific to the estimation approach and will be discussed below.
Assumption 3
The exposure marker incidence is not modified by randomization to active intervention
Assumption 3 stipulates that the incidence of the HIV exposure marker under
Under Assumptions 1–3, counterfactual placebo HIV incidence can be consistently estimated by
where
where
Remark 2
If the exposure marker incidence is not modified by either intervention in the active-controlled trial, the exposure marker incidence among all trial participants may be used to estimate counterfactual placebo HIV incidence. This provides a more precise estimate, relative to the estimate based on the exposure marker incidence among participants that received intervention
Bivariate linkage model
To estimate
for
We assume the estimated incidence rates from the external cohorts,
where
Under the bivariate linkage model, the parameters
and prevention efficacy can be estimated by
See Supplemental Materials for details.
While maximum likelihood estimation yields consistent and efficient parameter estimates under correct model specification, it may not be stable when the number of external cohorts is small, for example,
In general, the working model is mis-specified. However, the estimated regression function based on working model estimates
and prevention efficacy is estimated by
See Supplemental Materials for details.
In summary, assuming the bivariate linkage model equation (3), the procedure for estimating counterfactual placebo HIV incidence and prevention efficacy is as follows:
Step 1. Given estimated incidences
Step 2. Given the estimated incidence rate of the exposure marker
Step 3. Given the estimated HIV incidence rate
R code for implementation is available on Github (https://github.com/feigao1/CF_Exposuremarker).
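The three steps above can be sketched in a few lines of Python. This is a simplified stand-in, not the authors' R implementation: it assumes a log link and fits the association by unweighted least squares on the log scale, it returns point estimates only (the propagation of uncertainty into confidence intervals is omitted), and the cohort values and trial inputs are hypothetical numbers chosen purely for illustration.

```python
import math

# Hypothetical external cohorts (NOT real data):
# (exposure marker incidence, HIV incidence), both per 100 person-years.
EXTERNAL_COHORTS = [(5.0, 1.5), (10.0, 3.0), (15.0, 4.2), (25.0, 7.5), (30.0, 9.1)]


def fit_log_log(cohorts):
    """Step 1: least-squares fit of log(HIV incidence) on log(marker incidence)."""
    xs = [math.log(marker) for marker, _ in cohorts]
    ys = [math.log(hiv) for _, hiv in cohorts]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def counterfactual_hiv_incidence(intercept, slope, marker_incidence):
    """Step 2: predict placebo HIV incidence at the trial's marker incidence."""
    return math.exp(intercept + slope * math.log(marker_incidence))


def prevention_efficacy(active_hiv_incidence, placebo_hiv_incidence):
    """Step 3: one minus the ratio of active-arm to counterfactual placebo incidence."""
    return 1.0 - active_hiv_incidence / placebo_hiv_incidence


intercept, slope = fit_log_log(EXTERNAL_COHORTS)
# Hypothetical trial inputs: marker incidence 20 and active-arm HIV incidence 0.5,
# both per 100 person-years.
placebo = counterfactual_hiv_incidence(intercept, slope, marker_incidence=20.0)
pe = prevention_efficacy(active_hiv_incidence=0.5, placebo_hiv_incidence=placebo)
print(f"counterfactual placebo incidence: {placebo:.2f} per 100 person-years")
print(f"prevention efficacy: {pe:.3f}")
```

An unweighted fit treats the cohort-level incidence estimates as equally precise and error-free, which is a simplification made here for brevity.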
Simulation studies
To evaluate the numerical performance of the counterfactual placebo incidence and prevention efficacy estimates, we examine the ideal scenario when all assumptions hold (with maximum likelihood estimation) and when Assumption 2 holds approximately (with working model estimation).
External cohorts
Incidences in the external cohort
Active-controlled trial
We consider a single arm trial for conciseness, with a follow-up time of
Estimation methods and performance measures
We apply maximum likelihood and working model estimation approaches, following the procedure listed at the end of the Methods section. We evaluate the average bias, empirical standard deviation, and coverage probability of nominal 95% confidence intervals (CIs) for counterfactual placebo incidence and prevention efficacy estimates across 5000 simulations.
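The three performance measures can be computed from the replicate-level estimates as in the following sketch. The data-generating mechanism here is a toy (a normal sample mean with a Wald interval) chosen only to exercise the summary function; it is an assumption for illustration and is not the simulation design described in this section.

```python
import math
import random
import statistics


def summarize(estimates, std_errors, truth, z=1.96):
    """Return (average bias, empirical SD, coverage of nominal 95% Wald CIs)."""
    bias = statistics.fmean(estimates) - truth
    emp_sd = statistics.stdev(estimates)
    covered = [abs(est - truth) <= z * se for est, se in zip(estimates, std_errors)]
    coverage = sum(covered) / len(covered)
    return bias, emp_sd, coverage


def simulate_once(rng, truth, sd, n):
    """Toy replicate: estimate a mean and its standard error from n normal draws."""
    sample = [rng.gauss(truth, sd) for _ in range(n)]
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return estimate, std_error


rng = random.Random(2024)  # fixed seed so the run is reproducible
TRUTH = 3.0
results = [simulate_once(rng, TRUTH, sd=1.0, n=50) for _ in range(5000)]
estimates = [est for est, _ in results]
std_errors = [se for _, se in results]

bias, emp_sd, coverage = summarize(estimates, std_errors, TRUTH)
print(f"bias={bias:.4f}  empirical SD={emp_sd:.4f}  coverage={coverage:.3f}")
```

Comparing the empirical standard deviation with the average model-based standard error is a useful additional check, since coverage can look acceptable even when the two disagree.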
Results
Simulation
Table 1 summarizes the performance of counterfactual placebo HIV incidence estimates across simulation scenarios. We show results with
Bias, standard deviation, and empirical coverage for estimated counterfactual placebo HIV incidence, based on
The performance of estimates of prevention efficacy based on an active-controlled trial with
Bias, standard deviation, and empirical coverage for estimates of prevention efficacy (PE) based on
We evaluate power for testing prevention efficacy with this approach compared to a placebo-controlled trial (see Supplemental Figures S1 and S2 for simulation results). Surprisingly, we find power for the counterfactual approach may exceed that obtained from a placebo-controlled trial with the same active arm size. For example, 74% power to detect prevention efficacy of 0.6 can be obtained with 3 cases per 100 person-years placebo HIV incidence, active arm size of 2000 person-years, a highly correlated marker
We evaluate scenarios where the conditional independence assumption in equation (4) is violated; the estimated incidences are correlated conditional on the true incidences. Performance is similar to that under the conditional independence model equation (4) (see Supplemental Materials). Furthermore, we assess performance with external cohort data analyzed at the sub-cohort level, reflecting that site-level data may be available for multi-center studies (Supplemental Table S7). Given a fixed total sample size across external cohorts, more cohorts of smaller sizes are preferred to fewer cohorts of larger size.
Application
We apply the estimation to the DISCOVER trial, a randomized, double-blinded, double-dummy, active-controlled trial that compared the efficacy of coformulated tenofovir alafenamide plus emtricitabine and tenofovir disoproxil fumarate plus emtricitabine for preventing HIV in men and transgender women who have sex with men. 6 The US FDA approved tenofovir alafenamide plus emtricitabine for men and transgender women who have sex with men based on the trial results. 35 Rectal gonorrhea infections were captured in both arms. 5 Collectively, 1313 rectal gonorrhea cases were observed over 6243 person-years, implying a rectal gonorrhea incidence of 21.0 cases per 100 person-years. Historical data suggest that oral anti-retrovirals do not have biological effects on rectal gonorrhea incidence. 36
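The pooled rectal gonorrhea incidence follows directly from the reported counts, as the short sketch below shows. The interval it prints is an addition for illustration, assuming Poisson-distributed event counts and a Wald interval on the log scale; it is not a figure reported in the trial or in this article.

```python
import math


def incidence_per_100py(events, person_years):
    """Crude incidence rate expressed per 100 person-years."""
    return 100.0 * events / person_years


def log_wald_ci_per_100py(events, person_years, z=1.96):
    """Approximate CI assuming Poisson counts: SE of log(rate) is 1/sqrt(events)."""
    rate = events / person_years
    se_log = 1.0 / math.sqrt(events)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return 100.0 * lower, 100.0 * upper


# Counts reported for both DISCOVER arms combined.
rate = incidence_per_100py(1313, 6243)
lower, upper = log_wald_ci_per_100py(1313, 6243)
print(f"{rate:.1f} cases per 100 person-years")
print(f"approximate 95% CI under the Poisson assumption: {lower:.1f} to {upper:.1f}")
```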
Table 3 contains point estimates and 95% CIs for counterfactual placebo HIV incidence using likelihood-based and working model estimation, assuming log and logit link functions in the bivariate linkage model equation (2), based on previously reported cohorts reporting both HIV and rectal gonorrhea incidence for men who have sex with men 24 (see Supplemental Table S1). The estimated counterfactual placebo HIV incidences are approximately 7 cases per 100 person-years for both estimation approaches and link functions. A naive analysis that assumes an identity link and treats estimated HIV and rectal gonorrhea incidence rates as fixed and known, similar to what is done in the applied literature, gives a lower counterfactual HIV incidence estimate of 6.6 cases per 100 person-years.
Estimated counterfactual placebo HIV incidence (cases per 100 person-years), and corresponding 95% confidence intervals (CIs) for the DISCOVER study. Uncertainty is quantified by 95% confidence intervals except for Bayesian estimates where * 95% and + 80% credible intervals (CrIs) are reported.
We compare the results with those from Glidden et al.,15,37 who applied Bayesian approaches with Gamma-Copula models and case-cohort sampling adjustment to the DISCOVER study. Posterior estimates of counterfactual placebo HIV incidence from the two Bayesian approaches are much lower at 4.51 and 3.4 cases per 100 person-years. We conjecture that the lower estimates are due in part to the chosen prior HIV incidence rate (mean of 2.9 cases per 100 person-years) in Glidden et al., 15 which was lower than the average incidence across the external cohort studies.
Another difference in the latter estimate is its reliance on an additional data source, namely historical estimates of efficacy.
Given the estimated 0.16 HIV diagnosis cases per 100 person-years from the tenofovir alafenamide plus emtricitabine arm, 6 the estimated prevention efficacy for tenofovir alafenamide plus emtricitabine versus counterfactual placebo is 98.1% (95% CI: 96.4%–99.4%), based on the working model and log link, and 98.1% with the likelihood-based estimation (95% CI: 96.4%–99.7%). This prevention efficacy inference is simple to interpret and supports tenofovir alafenamide plus emtricitabine effectiveness.
Conclusion
Advancing HIV prevention, and ultimately stemming the HIV pandemic, requires additional biomedical interventions. While active-controlled designs will likely be used in future trials evaluating candidate interventions, absolute efficacy of the experimental intervention cannot be evaluated based on the trial data alone. If a marker of HIV exposure is measured in the trial, and external data are leveraged to model the association between HIV and the exposure marker, then under Assumptions 1–3, HIV incidence in a counterfactual placebo arm, and prevention efficacy of the experimental intervention relative to the counterfactual placebo, can be estimated reliably and precisely.
Importantly, we considered performance of the approach when Assumptions 1 and 3 hold, and Assumption 2 either holds or is slightly violated. These are strong and not fully testable assumptions that deserve careful attention. For one, correct specification of the model linking HIV incidence with the exposure marker is challenging. Mis-specification may be due to omission of covariates that modify the association, incorrect model form, or measurement error of variables. Recent work demonstrates that the rectal gonorrhea and HIV incidence association may differ across populations 38 and is difficult to model accurately across cohorts.14,28–30
While standard statistical methods may check for specific types of model mis-specification, with few external cohorts the power to detect model mis-specification is low. Given some of the assumptions are not fully testable, further research is needed into methods for incorporating uncertainty due to violation of these assumptions.
Our findings suggest that more cohorts of smaller size provide more precise inference than fewer cohorts of larger size. Accuracy and precision may be further improved with individual-level data. In addition, with only study-level data from external cohorts, the correlation between reported HIV and exposure marker incidences in the external cohorts is rarely available. Accordingly, our estimation approaches assume conditional independence of HIV incidence and the exposure marker. Our simulation study suggests a degree of robustness to violation of this assumption, mainly because between-study variation dominated within-study variation. Similar results were found for bivariate meta-analysis. 39 However, as discussed by Riley, 39 ignoring within-study correlation is expected to yield estimates with inferior statistical properties. Given individual-level data from external cohorts, estimation of the conditional dependence parameter would be feasible and performance improved.
We call for additional research, with application held until such research is conducted. Evaluation of the approach’s performance in HIV prevention trials that included placebo arms is needed to gauge the “real-world” accuracy of the counterfactual placebo estimation. Individual- or trial-site-level data from recent HIV prevention trials, with incidence of other sexually transmitted infections captured, should be made public to enable further evaluation of the correlation between HIV and other sexually transmitted infections as potential exposure markers. Finally, HIV exposure markers that more readily satisfy the assumptions we detail should be pursued; markers more fundamentally linked to HIV exposure may be needed to realize the potential.
We did not find existing statistical frameworks that provided a good fit for our problem.40–42 The exposure marker we considered is different from a surrogate marker for which the effect of the intervention on the surrogate reflects the effect of the intervention on the primary endpoint.40,43 The framework we developed may have application to other clinical contexts where a proxy outcome is associated with the clinical outcome under the control condition but is not impacted by the intervention, and a body of data is available for estimating the association between proxy and clinical outcome under the control condition.
Supplemental Material
Supplemental material, sj-pdf-1-ctj-10.1177_17407745231203327 for Estimating counterfactual placebo HIV incidence in HIV prevention trials without placebo arms based on markers of HIV exposure by Yifan Zhu, Fei Gao, David V Glidden, Deborah Donnell and Holly Janes in Clinical Trials
Footnotes
Declaration of conflicting interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: D.V.G. has accepted fees from Gilead Sciences. The remaining authors had nothing to disclose.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Institutes of Health/National Institute of Allergy and Infectious Diseases (NIH/NIAID) through grants R01CA152089, R56AI143418, and UM1AI068635 to H.J., and R01AI143357 to D.V.G.
Supplemental material
Supplemental material for this article is available online.
