Abstract

Estimating the precision of a single proportion via a 100(1−α)% confidence interval in the presence of clustered data is an important statistical problem. It is necessary to account for possible over-dispersion, for instance, in animal-based teratology studies with within-litter correlation, epidemiological studies that involve clustered sampling, and clinical trial designs with multiple measurements per subject. Several asymptotic confidence interval methods have been developed, but these have been found to have inadequate coverage of the true proportion for small-to-moderate sample sizes. In addition, many of the best-performing of these intervals have not been directly compared with regard to the operational characteristics of coverage probability and empirical length. This study uses Monte Carlo simulations to calculate coverage probabilities and empirical lengths of five existing confidence intervals for clustered data across various true correlations, true probabilities of interest, and sample sizes. In addition, we introduce a new score-based confidence interval method, which we find to have better coverage than existing intervals for small sample sizes under a wide range of scenarios.
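
The kind of Monte Carlo study described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and it does not implement the new score-based interval. As assumptions made here for illustration only, clustered binary outcomes are drawn from a beta-binomial model with equal cluster sizes, and the interval evaluated is a simple Wald interval built on the ratio-estimator (between-cluster) variance. The function names, default settings, and seed are hypothetical.

```python
import math
import random


def simulate_cluster_counts(rng, n_clusters, cluster_size, p, rho):
    """Draw successes per cluster from a beta-binomial model.

    p is the true proportion and rho the intra-cluster correlation.
    For a Beta(a, b) mixing distribution, rho = 1 / (a + b + 1).
    """
    if rho <= 0:
        probs = [p] * n_clusters  # no over-dispersion: plain binomial
    else:
        scale = (1 - rho) / rho  # equals a + b
        a, b = p * scale, (1 - p) * scale
        probs = [rng.betavariate(a, b) for _ in range(n_clusters)]
    return [sum(rng.random() < q for _ in range(cluster_size)) for q in probs]


def ratio_wald_interval(successes, sizes, z=1.959964):
    """Wald interval using the ratio-estimator variance (illustrative choice).

    Requires at least two clusters; z defaults to the 97.5% normal quantile.
    """
    k = len(sizes)
    total = sum(sizes)
    p_hat = sum(successes) / total
    resid_ss = sum((x - m * p_hat) ** 2 for x, m in zip(successes, sizes))
    var = k / (k - 1) * resid_ss / total ** 2
    half = z * math.sqrt(var)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)


def coverage_and_length(p, rho, n_clusters, cluster_size, n_sims=2000, seed=1):
    """Estimate coverage probability and mean interval length by simulation."""
    rng = random.Random(seed)
    sizes = [cluster_size] * n_clusters
    hits = 0
    total_length = 0.0
    for _ in range(n_sims):
        x = simulate_cluster_counts(rng, n_clusters, cluster_size, p, rho)
        lo, hi = ratio_wald_interval(x, sizes)
        hits += lo <= p <= hi  # True counts as 1
        total_length += hi - lo
    return hits / n_sims, total_length / n_sims


if __name__ == "__main__":
    cov, length = coverage_and_length(p=0.3, rho=0.2, n_clusters=10, cluster_size=5)
    print(f"coverage={cov:.3f}, mean length={length:.3f}")
```

Repeating the call over a grid of true proportions, correlations, and numbers of clusters gives the type of comparison the abstract refers to. Any other interval can be evaluated by swapping it in for `ratio_wald_interval`.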

Keywords

Clustered binary data, beta-binomial distribution, confidence interval, small sample, coverage

A novel confidence interval for a single proportion in the presence of clustered binary outcome data
