Abstract
Background/Aims
Cluster randomized trials with multiple endpoints have complex correlation structures. Estimating treatment effects in a meaningful way that respects differences in scale and clinical importance among endpoints is challenging. Pairwise comparison methods address these challenges by constructing all pairs of one treatment and one control participant, then evaluating endpoints in hierarchical order. For cluster randomized trials with such a “hierarchical composite endpoint,” we develop large-sample confidence interval estimators and hypothesis tests for the nonparametric treatment effect referred to here as the “win probability.”
Methods
For each pair of participants (one treated and one control), responses on each endpoint are compared in order of descending clinical importance until it can be determined which participant responded better (“won”) or all endpoints are exhausted. Dividing the number of wins attributed to the treatment arm by the total number of “pairwise comparisons” yields a point estimate of the win probability. The win probability can be transformed into alternative effect measures, including the “win difference” and “win odds.” A two-stage procedure, or “win fraction” approach, is used to obtain variance estimators for the win probability. In the first stage, each participant’s multivariate response is transformed into a univariate “win fraction,” which quantifies the proportion of times they won when compared to all participants in the comparison arm. In the second stage, a working linear mixed model is applied to the win fractions to obtain cluster-adjusted point estimates of the win probability and its variance. Inference proceeds by the central limit theorem. Simulation is used to assess the performance of the proposed estimators for a hierarchical composite endpoint composed of one binary component (more important) and one continuous component (less important) across a range of cluster trial designs. Performance of an empirical bootstrap estimator is also investigated. A case study using data from the REACT cluster trial demonstrates application of the methods, and corresponding SAS and R code is provided.
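The first stage described above (hierarchical pairwise comparison, win fractions, and the resulting point estimate) can be sketched as follows. This is a minimal illustration, not the authors' SAS or R code. It rests on assumptions not stated in the abstract: larger values are better on every endpoint, a pair tied on all endpoints contributes half a win, and the conversions to the win difference (2 × WP − 1) and win odds (WP / (1 − WP)) hold under that tie convention. The function names and the toy data are hypothetical, and the second-stage mixed model for cluster adjustment is not shown.

```python
def compare(a, b):
    """Compare two participants on endpoints ordered by descending importance.

    Returns 1.0 if `a` wins, 0.0 if `b` wins, and 0.5 if the pair is tied on
    every endpoint (assumed tie convention; larger values assumed better).
    """
    for x, y in zip(a, b):
        if x > y:
            return 1.0
        if x < y:
            return 0.0
    return 0.5


def win_fractions(arm, other):
    """Win fraction of each participant in `arm` against everyone in `other`."""
    return [sum(compare(p, q) for q in other) / len(other) for p in arm]


def win_probability(treated, control):
    """Unadjusted win probability: mean win fraction in the treatment arm."""
    wf = win_fractions(treated, control)
    return sum(wf) / len(wf)


# Toy data (hypothetical): (binary endpoint, continuous endpoint) per person.
treated = [(1, 5.0), (1, 3.0), (0, 4.0)]
control = [(1, 4.0), (0, 6.0), (0, 4.0)]

wp = win_probability(treated, control)   # 11/18 for the toy data
win_difference = 2 * wp - 1              # assumes ties are split evenly
win_odds = wp / (1 - wp)                 # assumes ties are split evenly

print(f"win probability = {wp:.4f}")
print(f"win difference  = {win_difference:.4f}")
print(f"win odds        = {win_odds:.4f}")
```

In the two-stage approach, the win fractions from both arms would then serve as the response in a working linear mixed model with a treatment indicator and a random cluster effect. With ties split evenly, the mean win fraction in the control arm equals one minus the win probability.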
Results
Simulation suggests that the nominal 95% coverage probability is well maintained and type I error is controlled. Because our method relies on large-sample theory, confidence intervals may be conservative (overcoverage) when fewer than 30 clusters are randomized. In comparison, the empirical bootstrap estimator is liberal (undercoverage) for all numbers of randomized clusters considered (up to 50).
Conclusion
Our win fraction method uses a working linear mixed model to obtain confidence intervals and hypothesis tests that maintain nominal coverage and control type I error. It is faster than the bootstrap, applies to multiple components on different scales, bypasses specification of complex correlation matrices, permits adjustment, and can be implemented in existing software.
