Abstract
In survey data, some questions are asked only of a subset of applicable participants. This frequently occurs together with floor effects in the provided responses. For example, in the longitudinal Population Assessment of Tobacco and Health (PATH) survey, nicotine dependence is assessed only for a subsample of individuals at each occasion and, when assessed, often takes a value at the lower end of the scale. To capture trends over time in an unbiased and efficient way, it is important to jointly model the probability of being asked the questions of interest, the probability of giving a response at the lower end of the scale, and the mean response when it is above the lower end of the scale. We propose a three-part model for such data, which consists of two logistic submodels and a truncated normal model. Correlations among repeated observations on the same individual are induced by random effects. Maximum likelihood estimation and inference are performed in SAS PROC NLMIXED. The PATH data on young adults are used for illustration. A simulation study investigates the bias and efficiency of the three-part model compared to simpler models. The three-part model has much lower bias and better coverage probabilities for the regression coefficients than simpler models.
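To make the three-part structure concrete, the following is a minimal Python sketch of the log-likelihood contribution of a single observation. It is an illustration under stated assumptions, not the authors' SAS PROC NLMIXED implementation: the random effects are omitted, the normal part is assumed to be truncated from below at the floor value, and the names `eta_asked`, `eta_floor`, `mu`, `sigma`, and `floor` are hypothetical.

```python
import math


def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def three_part_loglik(asked, y, eta_asked, eta_floor, mu, sigma, floor=0.0):
    """Log-likelihood of one observation under an assumed three-part model.

    Part 1: logistic model for being asked the question.
    Part 2: logistic model for responding at the floor, given asked.
    Part 3: normal model truncated below at `floor` (an assumption),
            for responses above the floor.
    Random effects are left out of this sketch.
    """
    p_asked = expit(eta_asked)
    if not asked:
        return math.log(1.0 - p_asked)

    p_floor = expit(eta_floor)
    if y <= floor:
        return math.log(p_asked) + math.log(p_floor)

    # Log density of a normal(mu, sigma) truncated to (floor, infinity).
    z = (y - mu) / sigma
    log_pdf = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)
    log_tail = math.log(1.0 - normal_cdf((floor - mu) / sigma))
    return math.log(p_asked) + math.log(1.0 - p_floor) + log_pdf - log_tail
```

In a full model, each linear predictor and the mean `mu` would also include subject-level random effects, and the marginal likelihood would integrate over them; that step is what a routine such as PROC NLMIXED handles numerically.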
