Consider the conventional multilevel model
where
represents fixed effects and
are multivariate normal random effects. The continuous outcomes
and covariates
are fully observed with a subset
of
. The parameters are
. Dempster, Rubin and Tsutakawa framed the estimation as a missing data problem, where
are the complete data and the random effects
are conceived as missing data. Viewed in this way, the Expectation-Maximization (EM) algorithm has proven to be a natural and popular approach to estimation. However, when
is partially observed or subject to measurement error, it is natural to formulate a multilevel model for
that includes random effects,
. In this article, we extend this thinking to allow estimation of the joint distribution of data
and random effects
from observed data
and to generate multiple imputations of missing data
based on the estimated distribution under the assumption that the data
are missing at random. This approach contributes to the literature on multiple imputation in three ways: (a) it allows random effects
to be conceived as latent covariates, thus addressing measurement errors of
; (b) it allows non-linearities, including random coefficients, interaction effects, and other polynomial effects involving partially observed covariates; (c) it imputes
using two-step importance sampling. In these cases, the joint distribution of
is not analytically tractable even if the analytic multilevel model of interest to the analyst follows a multivariate normal distribution. We prove that our method of maximizing the likelihood and imputing missing data ensures compatibility of the nonnormal joint distribution with the analytic normal theory multilevel model via provisionally known random effects. We present and evaluate a sufficient condition under which the produced imputations are compatible with the analytic model.