Abstract
Researchers often impute continuous variables under an assumption of normality, yet many incomplete variables are skewed. We find that imputing skewed continuous variables under a normal model can lead to bias. The bias is usually mild for popular estimands such as means, standard deviations, and linear regression coefficients, but it can be severe for more shape-dependent estimands such as percentiles or the coefficient of skewness. We test several methods for adapting a normal imputation model to accommodate skewness, including methods that transform, truncate, or censor (round) normally imputed values, as well as methods that impute values from a quadratic or truncated regression. None of these modifications reliably reduces the biases of the normal model, and some modifications can make the biases much worse. We conclude that, if one has to impute a skewed variable under a normal model, it is usually safest to do so without modifications, unless one is more interested in estimating percentiles and shape than in estimating means, variances, and regressions. In the conclusion, we briefly discuss promising developments in the area of continuous imputation models that do not assume normality.
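The contrast between moment estimands and shape-dependent estimands can be illustrated with a small simulation. The following is a minimal sketch and not the simulation design used in the article: it assumes an exponentially distributed variable, values missing completely at random with probability 0.5, and a single imputation drawn from a normal distribution fitted to the observed values (ignoring parameter uncertainty). The function names `skewness`, `impute_normal`, and `summarize` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def skewness(x):
    """Moment-based coefficient of skewness: m3 / m2**1.5."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d**3) / np.mean(d**2) ** 1.5


def impute_normal(x, missing, rng):
    """Fill missing entries with draws from a normal fitted to observed values.

    Simplification (assumption): the fitted mean and SD are treated as known,
    so this is a single naive imputation, not proper multiple imputation.
    """
    observed = x[~missing]
    filled = x.copy()
    filled[missing] = rng.normal(
        observed.mean(), observed.std(ddof=1), size=missing.sum()
    )
    return filled


def summarize(x):
    """Return the estimands compared in this sketch."""
    return {
        "mean": float(np.mean(x)),
        "sd": float(np.std(x, ddof=1)),
        "p5": float(np.percentile(x, 5)),
        "skew": float(skewness(x)),
    }


rng = np.random.default_rng(0)
n = 100_000

# Assumed skewed variable: exponential with mean 1 (theoretical skewness 2).
x_true = rng.exponential(1.0, size=n)

# Assumed missingness mechanism: missing completely at random, 50 percent.
missing = rng.random(n) < 0.5

x_imputed = impute_normal(x_true, missing, rng)

complete = summarize(x_true)
imputed = summarize(x_imputed)

for key in complete:
    print(f"{key:>5}: complete = {complete[key]: .3f}   imputed = {imputed[key]: .3f}")
```

Under these assumptions, the mean and standard deviation of the imputed data should stay close to those of the complete data, while the 5th percentile and the skewness should be visibly distorted, because the normal draws are symmetric and can fall below zero.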
