Abstract
The posterior distribution of the bivariate correlation (ρxy) is analytically derived given a data set consisting of N1 cases measured on both x and y, N2 cases measured only on x, and N3 cases measured only on y. The posterior distribution is shown to be a function of the subsample sizes, the sample correlation (rxy) computed from the N1 complete cases, a set of four statistics that measure the extent to which the missing data are not missing completely at random, and the specified prior distribution for ρxy. A sampling study suggests that in small (N = 20) and moderate (N = 50) samples, Bayesian posterior interval estimates will dominate maximum-likelihood-based estimates in terms of coverage probability and expected interval width when the prior distribution for ρxy is simply uniform on (0, 1). The advantage of the Bayesian method is less consistent when more informative priors based on beta densities are employed.
