Abstract
Aim:
Partial-mouth recording protocols often result in underestimation of population prevalence and extent of periodontitis. We posit that multiple imputation of measures such as clinical attachment loss for nonselected tooth sites in partial-mouth samples can reduce bias in periodontitis estimates.
Methods:
Multiple imputation for correlated site-level dichotomous outcomes in a generalized estimating equations framework is used to impute site-level binary indicators for clinical attachment loss exceeding a fixed threshold in partial-mouth samples. Periodontitis case definitions are applied to the imputed “complete” dentitions, enabling estimation of prevalence and other summaries of periodontitis for partial-mouth samples as if for full-mouth examinations. A multiple imputation-bootstrap procedure is described and applied for point and variance estimation of these periodontitis measures. The procedure is evaluated with pseudo-partial-mouth samples based on random site selection protocols of 28 to 84 periodontal sites repeatedly generated from full-mouth periodontal examinations of 3,621 participants in the 2013 to 2014 National Health and Nutrition Examination Survey (NHANES).
Results:
Multiple imputation applied to partial-mouth samples overestimated periodontitis mean extent, defined as the number of sites with clinical attachment loss 3 mm or greater, by 9.5% in random site selection protocols with 84 sites and overestimated prevalence by 5% to 10% in all the evaluated protocols.
Conclusions:
In the 2013 to 2014 NHANES data, multiple imputation of site-level periodontal indicators provides less biased estimates of periodontitis prevalence and extent than those reported for the direct application of full-mouth case definitions to partial-mouth samples. Multiple imputation provides a promising solution to the longstanding, vexing problem of estimation bias in partial-mouth recording, with potential application to a wide array of case definitions, periodontitis measures, and partial recording protocols.
Knowledge Transfer Statement:
Partial-mouth sampling, while a resource-efficient strategy for obtaining oral disease estimates, often results in underestimation of periodontitis metrics. Multiple imputation for nonselected periodontal sites produces pseudo-full-mouth data sets that may be analyzed and combined to produce estimates with small bias.
Introduction
Periodontal disease surveillance is impeded by changing case definitions and the use of a full-mouth periodontal examination (FMPE), which may be burdensome, lengthy, and costly. In studies of periodontitis in adults, the gold standard FMPE records clinical variables at 6 sites per tooth, for up to 28 teeth (excluding third molars), which can take 40 min to complete. Partial-mouth recording protocols (PRPs) provide substantial time- and cost-savings. In a partial-mouth exam, a subset of tooth sites is examined using either random site selection methods (RSSMs; Beck et al. 2006; Preisser et al. 2017) or fixed site selection methods (FSSMs; Alexander 1970; Mills et al. 1975; Fleiss et al. 1987). A major obstacle to the use of PRPs is that the standard method that applies a full-mouth periodontitis case definition directly to partial-mouth samples systematically underestimates periodontitis prevalence (Kingman and Albandar 2002; Susin et al. 2005; Tran et al. 2014).
Under the standard method, PRPs that select teeth (or sites) with the most disease tend to have higher sensitivity and less underestimation bias than PRPs that identify a “representative” selection of sites such as RSSMs (Beck et al. 2006) or Ramfjord teeth (Ramfjord 1959). The choice of PRPs that select teeth most susceptible to periodontal disease is routinized in a population ranking method (Alshihayb et al. 2022) based on a single cardinal measure: clinical attachment loss (CAL) or probing depth (PD). Similarly, case definitions proposed by the Group C Consensus Report of the Fifth European Workshop on Periodontology (Tonetti et al. 2018) and the Centers for Disease Control and Prevention in conjunction with the American Academy for Periodontology (CDC/AAP) restrict consideration to interproximal (IP) sites (Eke et al. 2012), known to have relatively high levels of disease. While there has been progress in the clinical classification of periodontitis using PRPs (Alshihayb et al. 2022; Botelho et al. 2020), less progress has been made on the reliable use of PRPs in estimating periodontitis prevalence and extent.
Problematically, the underestimation of mean extent and prevalence of periodontitis by the standard method is frequently severe. Extent is the number of sites in an individual mouth meeting an established threshold, for example, CAL ≥ 3 mm (i.e., CAL3+) or PD ≥ 4 mm (i.e., PD4+) (Heaton, Sharma, et al. 2018), whereas prevalence is the proportion of individuals in a population who fulfill a certain criterion (i.e., case definition) defined a priori. In a study of 10,680 participants from the National Health and Nutrition Examination Survey (NHANES) cycles 2009 to 2014 (Alshihayb et al. 2022), PRPs examining all 6 sites from half-mouth protocols resulted in mean extent estimates with 45% to 55% underestimation relative to full-mouth extent, and the population ranking method that selects the 14 most diseased teeth had bias ranging from 25% to 38%. Eke et al. (2010) found that the NHANES III and NHANES 2001 to 2004 protocols underestimated the prevalence of moderate or severe periodontitis by the CDC/AAP case definitions by more than 50%; case definitions based on 1 or more sites with CAL meeting 3-mm or 6-mm thresholds resulted in underestimation of prevalence between 30% and 40%. This substantial bias has limited the use of PRPs, which ceased to be used by NHANES in 2009.
The excessive bias in these studies results from the direct application of full-mouth case definitions for periodontitis to partial-mouth samples. Assuming no measurement error at the site level, this approach systematically underestimates prevalence and extent because subjects with no diseased sites in the full mouth are always classified correctly as having no disease (i.e., specificity is 100%), whereas subjects with diseased sites could have zero such sites selected and thus be classified as without periodontitis (i.e., sensitivity is less than 100%). A new approach is needed to reduce estimation bias when using PRPs for periodontitis surveillance.
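The asymmetry between specificity and sensitivity can be illustrated with a short calculation. The following is a minimal sketch, assuming simple random selection of m sites without replacement, no measurement error, and a case definition of 1 or more diseased sites; the function name and the example inputs are illustrative and do not come from the NHANES data.

```python
from math import comb

def detection_probability(n_sites: int, n_diseased: int, m_selected: int) -> float:
    """Probability that at least 1 diseased site falls in a random sample
    of m_selected sites drawn without replacement from n_sites."""
    if not 0 <= n_diseased <= n_sites or not 0 <= m_selected <= n_sites:
        raise ValueError("counts must satisfy 0 <= d <= n and 0 <= m <= n")
    # Hypergeometric probability that every selected site is healthy.
    p_miss = comb(n_sites - n_diseased, m_selected) / comb(n_sites, m_selected)
    return 1.0 - p_miss

# Hypothetical mouths with 168 sites (28 teeth x 6 sites) and m = 28.
for d in (0, 1, 2, 5):
    print(d, round(detection_probability(168, d, 28), 3))
```

Under these assumptions, a mouth with no diseased sites can never be classified as a case, whereas a mouth with exactly 1 diseased site is detected with probability m/n, which is 1/6 when 28 of 168 sites are examined.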
Within a general framework for well-defined estimands (i.e., the target parameter in the population of interest based on a FMPE), this article aims to show that multiple imputation (MI) of nonselected tooth sites in PRPs can provide population periodontitis estimates of extent and prevalence with small bias. Imputation fills in missing data for nonselected sites to create pseudo-“complete” clusters (individual dentitions) mimicking FMPEs. Single imputation (i.e., “filling in” missing data with 1 set of plausible values) fails to account for the uncertainty of the imputation model. MI overcomes this limitation by producing multiple “complete” data sets, each consisting of pseudo-full dentitions for all individuals in the sample, that differ with respect to their imputed values. When the model generating the imputations is correct, the distributions across these data sets of the imputed values for each missing datum implicitly reflect appropriate estimates of both the missing values and the underlying random variability. MI has rules for combining the multiple estimates of a quantity, such as prevalence or extent from the “complete” data sets, into a single overall estimate of that quantity and for pooling variability of the individual estimates between and within imputations into an overall variance of the overall estimate (Schafer and Graham 2002). In this article, the proposed MI methods are evaluated using repeated RSSMs generated from the FMPE data in the NHANES 2013 to 2014 study population.
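The standard MI combining rules (commonly called Rubin's rules) can be sketched as follows. This is a generic illustration and is not necessarily the exact MI-bootstrap variance estimator evaluated in this article; the function name and the input values are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, within_variances):
    """Combine M completed-data estimates with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need M >= 2 estimates with matching variances")
    q_bar = mean(estimates)            # overall point estimate
    w_bar = mean(within_variances)     # average within-imputation variance
    b = variance(estimates)            # between-imputation variance (M - 1 divisor)
    total = w_bar + (1 + 1 / m) * b    # total variance of the overall estimate
    return q_bar, total

# Hypothetical estimates of mean extent from M = 3 imputed data sets.
q_bar, total_var = pool_rubin([20.0, 21.0, 22.0], [0.2, 0.2, 0.2])
print(q_bar, round(total_var, 4))
```

The between-imputation term is what single imputation omits: with only 1 filled-in data set, the variability attributable to the imputation model cannot be estimated.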
Methods
A Multiple Imputation Estimator for the Mean Number of Diseased Sites in PRPs
In a population, the goal of periodontal disease surveillance is to estimate the mean of some individual-level characteristic derived from site-level periodontal information. While the quantification or identification of disease often incorporates multiple types of periodontal measures (e.g., CAL and PD), for clarity, we consider a single cardinal measure of periodontal disease, CAL, that allows direct comparison to published results for extent (Alshihayb et al. 2022). Using CAL, 3 types of disease metrics are common. The first quantifies disease severity as the mean CAL measurement for sites in the mouth. Severity is not considered further since use of PRPs, especially RSSMs, typically estimates it with small bias (Brown and Löe 1993; Kingman et al. 2008). The second and third metrics, extent and prevalence, depend upon site-level threshold indicators (e.g., CAL3+). Note that prevalence, which is the proportion of individuals with disease according to case classification, is a mean of 0s (disease absent) and 1s (disease present). Thus, the presentation of the MI procedure described herein for population means encompasses both prevalence and extent.
Periodontal Disease Estimands
It is important to define the population quantity that is being measured—the estimand. In a population of K individuals, the ith individual’s disease is quantified by a summary statistic
Full-Mouth Estimators
As MI creates “complete” full-mouth data, estimators of
with sampling weights wi equal to the inverse probability of selection and
Partial-Mouth Estimators
Consider a random sample of K individuals who undergo a PRP of m < 168 randomly selected tooth sites; when mouths have fewer than m sites,
where
Multiple Imputation Method
The MI method consists of an imputation and a calculation stage. First, the imputation model specifying the first and second moments (i.e., site-level probabilities and pairwise correlations) must be identified and estimated using partial-mouth data. During the imputation stage, site-level binary variables are imputed recursively as
where
where
and VAW is the average within-imputation variance of the estimates
where
where
Evaluation
NHANES 2013 to 2014 periodontal examination data were used to illustrate the utility of MI for RSSMs in the evaluation of periodontitis estimators relative to gold standard FMPE estimators. We considered the mean number of periodontal sites with CAL3+ (extent) and 2 site-threshold periodontitis case definitions for prevalence: 1 or more sites with CAL3+ and 2 or more sites with CAL3+.
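The 3 quantities evaluated here can be computed directly from site-level indicators once a dentition is complete (observed or imputed). The sketch below assumes each mouth is represented as a list of 0/1 CAL3+ indicators and allows optional sampling weights; the function names and the toy data are hypothetical.

```python
def extent(site_indicators):
    """Number of sites with the threshold indicator equal to 1 (e.g., CAL3+)."""
    return sum(site_indicators)

def is_case(site_indicators, min_sites):
    """1 if at least min_sites sites are affected, else 0."""
    return int(extent(site_indicators) >= min_sites)

def summarize(mouths, weights=None):
    """Weighted mean extent and prevalence for the 2 site-threshold definitions."""
    if weights is None:
        weights = [1.0] * len(mouths)  # unweighted when no weights are supplied
    total = sum(weights)

    def wmean(values):
        return sum(w * v for w, v in zip(weights, values)) / total

    return {
        "mean_extent": wmean([extent(m) for m in mouths]),
        "prevalence_1plus": wmean([is_case(m, 1) for m in mouths]),
        "prevalence_2plus": wmean([is_case(m, 2) for m in mouths]),
    }

# Hypothetical toy data: 4 mouths, each a list of 0/1 CAL3+ indicators.
toy = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1], [0, 1, 1, 0]]
print(summarize(toy))
```

Because each case indicator is 0 or 1, prevalence is computed with the same weighted-mean routine as extent, which is why a single estimator for population means covers both.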
RSSM Sampling of NHANES 2013–2014 Full-Mouth Periodontal Exam
Partial-mouth extent and prevalence estimators were evaluated by inducing 4 RSSMs from the NHANES 2013 to 2014 FMPE data. For each RSSM, participants (clusters) were resampled with replacement to generate 500 samples of 3,621 clusters. Then, 28, 36, 42, or 84 tooth sites were randomly selected per RSSM protocol (excluding third molars). Sites from missing teeth (i.e., nonexistent tooth sites) were excluded; however, unmeasurable sites were eligible for selection. For comparison, FMPE estimators were also evaluated by generating 500 with-replacement samples of 3,621 clusters (the number of participants in the study), by using data from all tooth sites.
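The generation of one pseudo-partial-mouth evaluation sample can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each mouth is a dictionary keyed by existing sites only, and it assumes that a mouth with fewer than m existing sites contributes all of its sites. The function name and the toy data are hypothetical.

```python
import random

def draw_rssm_sample(mouths, m, rng):
    """One evaluation sample: resample mouths with replacement, then keep
    m randomly selected existing sites per mouth."""
    k = len(mouths)
    resampled = [mouths[rng.randrange(k)] for _ in range(k)]
    sample = []
    for mouth in resampled:
        site_ids = sorted(mouth)  # existing sites only; missing teeth are absent
        # Assumption: a mouth with fewer than m sites contributes all of them.
        chosen = set(rng.sample(site_ids, min(m, len(site_ids))))
        # Selected sites keep their value; nonselected sites become None.
        sample.append({s: (mouth[s] if s in chosen else None) for s in site_ids})
    return sample

# Hypothetical toy data: 3 mouths keyed by (tooth, site) with 0/1 indicators.
rng = random.Random(2013)
toy = [{(t, s): (t + s) % 2 for t in range(1, n_teeth + 1) for s in range(6)}
       for n_teeth in (28, 10, 4)]
sample = draw_rssm_sample(toy, 28, rng)
print([sum(v is not None for v in mouth.values()) for mouth in sample])
```

Repeating this draw 500 times per protocol, with m set to 28, 36, 42, or 84, would mirror the structure of the evaluation described above; the values marked None are the ones an imputation model would fill in.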
Imputation Model for NHANES 2013–2014 RSSM Samples
The MI procedure was applied to each RSSM evaluation sample. The SDM used for imputation of missing dichotomous CAL3+ values consisted of a logistic regression for the marginal mean (probability of a tooth site being affected) and a linear model for within-mouth correlations among site pairs, estimated jointly by generalized estimating equations (GEE; Prentice 1988). Predictors in the mean model consisted of indicators for each of the 6 tooth sites (distal buccal, buccal, mesiobuccal, mesiolingual, lingual, and distal lingual), sextant tooth location (maxillary right posterior, maxillary anterior, maxillary left posterior, mandibular left posterior, mandibular anterior, and mandibular right posterior), the log number of teeth, and categorical age (30–39, 40–49, 50–59, 60–69, 70+ y). The correlation model consisted of indicators for site pairs located on the same tooth, pairs of sites located on different teeth, pairs of adjacent sites that share the same IP space, pairs of adjacent sites that do not share the same IP space, and pairs of sites on teeth that are directly above and below each other with 1 tooth on the maxillary jaw and 1 tooth on the mandibular jaw.
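The marginal mean part of such a model can be sketched as a logistic function of the listed predictors. The sketch below assumes reference-cell coding (an omitted category contributes 0 to the linear predictor); the coefficient values are invented purely for illustration and are not estimates from this article. It covers neither the GEE fitting nor the correlation model.

```python
from math import exp, log

def cal3_probability(site, sextant, n_teeth, age_group, coef):
    """Marginal probability that a site has CAL3+ under a logistic mean model."""
    eta = (coef["intercept"]
           + coef["site"].get(site, 0.0)        # 0.0 for the reference site
           + coef["sextant"].get(sextant, 0.0)  # 0.0 for the reference sextant
           + coef["age"].get(age_group, 0.0)    # 0.0 for the reference age group
           + coef["log_teeth"] * log(n_teeth))
    return 1.0 / (1.0 + exp(-eta))

# Hypothetical coefficients, chosen only to make the example run.
coef = {
    "intercept": -1.0,
    "site": {"mesiobuccal": 0.3},
    "sextant": {"mandibular anterior": 0.2},
    "age": {"60-69": 0.8},
    "log_teeth": -0.5,
}
p_young = cal3_probability("mesiobuccal", "mandibular anterior", 28, "30-39", coef)
p_old = cal3_probability("mesiobuccal", "mandibular anterior", 28, "60-69", coef)
print(round(p_young, 3), round(p_old, 3))
```

In an imputation step, probabilities of this kind would be combined with the fitted pairwise correlations so that imputed indicators within a mouth are appropriately dependent rather than drawn independently.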
For each RSSM evaluation sample, the mean and correlation models were fitted to produce
Metrics for the Comparison of MI and Full-Mouth Estimates
The MI-boot estimators were compared to their respective FMPE estimators using evaluation metrics similar to bias and efficiency; for convenience, these are referred to as such. Since the true value of
Table 1. Evaluation Metrics for the MI-boot Method for the Mean Extent of Periodontitis Using 500 Simulations of Random Partial-Mouth Periodontal Examinations from NHANES 2013–2014.
Extent is defined as the number of tooth sites with CAL3+.
CAL, clinical attachment loss; MI, multiple imputation.
Results
Evaluation Results for the Mean Number of Sites with CAL3+
The analytic data consisted of 3,621 participants with FMPEs from the NHANES 2013 to 2014 study population (Fig. 1). The estimated mean number of periodontal sites with CAL3+ based on the analytic NHANES 2013 to 2014 FMPE population is 19.20 sites (95% confidence interval, 18.35–20.05). The relative biases of the Monte Carlo estimates from the FMPE evaluations were less than 1%, as expected (Table 2). Meanwhile, the imputation-based estimators overestimated the FMPE gold standard mean estimates for all RSSMs with percent relative biases increasing as m, the number of sampled sites, decreased. Relative bias ranged from 9.5% for RSSM 84 to 16.2% for RSSM 28, which is less than the 25% to 38% bias reported for the population ranking method of selecting PRPs with 14 teeth (i.e., 84 sites; Appendix). Next, the MI-boot variance estimator for the mean number of sites with CAL3+ underestimated the gold standard Monte Carlo variance estimator for RSSMs by 12% to 25% (Fig. 2). Finally, the percent relative efficiency of the MI-boot procedures decreased (with greater information loss) as m decreased (Table 2). Specifically, the information loss of RSSM 84 is 12% relative to the FMPE estimator, whereas the loss of RSSM 28 is 55%.
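The summary metrics reported in this section can be sketched as follows, using the definitions given in the table notes: relative efficiency as the ratio of the full-mouth variance estimate to the average variance estimate, and variance bias relative to the Monte Carlo variance of the estimates. The function names and input values are hypothetical, and treating information loss as 100 minus percent relative efficiency is an assumption.

```python
from statistics import mean, variance

def percent_relative_bias(estimates, gold):
    """Bias of the average estimate relative to the gold standard value."""
    return 100.0 * (mean(estimates) - gold) / gold

def percent_variance_bias(variance_estimates, estimates):
    """Bias of the average variance estimate relative to the Monte Carlo variance."""
    mc_var = variance(estimates)  # variance of the estimates across samples
    return 100.0 * (mean(variance_estimates) - mc_var) / mc_var

def percent_relative_efficiency(full_mouth_variance, variance_estimates):
    """Full-mouth variance estimate divided by the average variance estimate."""
    return 100.0 * full_mouth_variance / mean(variance_estimates)

# Hypothetical inputs from 3 evaluation samples (not values from Table 2).
est = [20.0, 21.0, 22.0]
var_est = [0.8, 0.8, 0.8]
print(percent_relative_bias(est, 20.0))              # positive: overestimation
print(percent_variance_bias(var_est, est))           # negative: underestimation
print(100.0 - percent_relative_efficiency(0.4, var_est))  # assumed information loss
```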

Figure 1. The selection of National Health and Nutrition Examination Survey (NHANES) 2013 to 2014 study participants.
Table 2. Evaluation Results for Multiple Imputation–Based Estimators of the Mean Number of Sites with CAL3+ Using 500 Simulations of Random Partial-Mouth Periodontal Examinations from NHANES 2013–2014 (N = 3,621 Participants).
CAL, clinical attachment loss; FMPE, full-mouth periodontal examination.
The average of the estimates over 500 evaluations.
The gold standard mean based on the National Health and Nutrition Examination Survey 2013 to 2014 FMPE estimate is 19.201 sites (variance = 0.1909).
The ratio of the full-mouth variance estimate to the average variance estimate.
Bias of the average variance estimate relative to the Monte Carlo variance of the estimates of the mean number of sites with CAL3+ (i.e., extent).

Figure 2. Average MI-boot and gold standard Monte Carlo variance estimates for the mean number of tooth sites with CAL3+ according to number of sites in 4 random site selection methods. CAL, clinical attachment loss; MI, multiple imputation.
Bias Results for Estimating Prevalence with PRPs
Prevalence estimates per 100 persons from the NHANES 2013 to 2014 FMPE were 90.19 and 83.22 for case definitions of 1 or more and 2 or more sites with CAL3+, respectively. The relative biases of the estimates from the FMPE evaluations were less than 0.1% for both prevalence estimators, confirming good performance of the evaluation methods. All imputation-based RSSM prevalence estimates overestimated the FMPE estimate (Table 3). In particular, relative bias was about 5% for 1 or more sites affected, which compares favorably to the literature (Appendix), and 10% for 2 or more sites.
Table 3. Percent Relative Bias of Multiple Imputation–Based Prevalence Estimators Using 500 Simulations of Random Partial-Mouth Periodontal Examinations from NHANES 2013–2014 (N = 3,621 Participants).
The gold standard prevalence based on the National Health and Nutrition Examination Survey (NHANES) 2013 to 2014 FMPE is 90.2% and 83.2% of participants with 1 and 2 or more sites with CAL3+, respectively.
CAL, clinical attachment loss; FMPE, full-mouth periodontal examination.
Discussion
This study suggests that MI may produce valid estimates for population periodontitis measures from epidemiological surveys using PRPs. Because MI produces complete data sets of full dentitions, MI can surmount the problem of shifting case definitions and sampling protocols over time by allowing reanalyses of periodontitis surveillance data for selected periodontal metrics and sampling protocols (Rozier et al. 2017). Moreover, MI may reduce bias in the association between epidemiological exposures such as socioeconomic status or systemic disease indicators and periodontitis in data generated by PRPs (Akinkugbe et al. 2015; Heaton, Garcia, et al. 2018; Alawaji et al. 2022).
In this article, the MI approach gave less biased estimates of periodontitis extent than the standard partial-mouth classification method with established PRPs. While the MI method resulted in approximately 10% relative bias for mean extent measured by CAL3+ for an RSSM with 84 sites, the bias of population rank-based PRPs based on the same number of sites exceeded 25% (Alshihayb et al. 2022). While the ranking method was used to explore and characterize periodontitis misclassification patterns under PRPs, its practicality for periodontitis surveillance is unclear. First, the selection of PRPs is based on the ranking of tooth measures from FMPEs, which may not be available for the population of interest when a PRP is employed in practice. Second, the application of rank-based PRPs to the same data set used to select them may give more optimistic results in terms of bias and sensitivity than would be obtained had the selected PRPs been applied to new or “left-out” data. Finally, these authors note that rank-based PRPs tend to overestimate mean severity because they select the most diseased sites.
The MI approach proposed in this article also has limitations. First, misspecification of the imputation model, which is based on a pair of regression models for distributional features of binary outcomes that are clustered in mouths, may lead to invalid results. Because our evaluation was based on a single data set, the true imputation model is unknown. Hence, the source of underestimation of the MI-boot variances in this study is not clear. Future research will employ simulation studies to better illuminate the sources of bias in both estimates and their variances. Second, this article limited consideration to simple periodontal case definitions. Because MI is a general approach, its downstream application to more complex case definitions of periodontal disease such as the 2012 CDC/AAP is possible; however, to obtain estimates with minimal bias, the imputation model would need to account for both CAL and PD. Third, the particular implementation of MI methods was limited to RSSMs. While RSSMs have been extensively studied, FSSMs are the norm in practice.
The use of MI methods in FSSMs or random half-mouth protocols (RHMs) would require that the PRP selects the particular sites and/or teeth needed to estimate the chosen imputation model. For example, the RHM (Drury et al. 1996) that selects opposing contralateral quadrants (i.e., upper right/lower left or upper left/lower right) would not be able to estimate the correlation in our model for “a pair of sites on teeth that are directly above and below each other with 1 tooth on the maxillary jaw and 1 tooth on the mandibular jaw.” On the other hand, a half-mouth PRP that randomly selects 1 upper jaw and 1 lower jaw quadrant without the restriction that they be contralateral would provide the data needed to perform the imputations.
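The quadrant argument can be checked by enumeration. The sketch below lists every half-mouth design that pairs 1 upper quadrant with 1 lower quadrant and flags which designs place sampled maxillary teeth directly above sampled mandibular teeth; the quadrant labels and the helper name are illustrative.

```python
from itertools import product

UPPER = ("upper right", "upper left")
LOWER = ("lower right", "lower left")

def has_vertical_pairs(upper, lower):
    """True if the 2 quadrants lie on the same side, so that sampled maxillary
    teeth sit directly above sampled mandibular teeth."""
    return upper.split()[1] == lower.split()[1]

all_designs = list(product(UPPER, LOWER))
contralateral = [d for d in all_designs if not has_vertical_pairs(*d)]
same_side = [d for d in all_designs if has_vertical_pairs(*d)]
print("contralateral:", contralateral)
print("same side:", same_side)
```

Of the 4 possible upper-lower quadrant pairings, the 2 contralateral ones never yield a vertically opposed tooth pair, so a protocol restricted to them provides no data for that correlation parameter, whereas unrestricted random selection includes the 2 same-side pairings for some participants.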
The closest available method to MI for estimating periodontitis prevalence using PRPs is a formula based on the case definition that 1 or more diseased sites meet a threshold for a cardinal measure (e.g., CAL3+; Preisser et al. 2017). The formula, which circumvents the case classification of study participants, is based on a working model for the intensity and pattern of disease in the mouth that assumes disease risk is the same across all tooth sites, and the within-mouth correlation of disease among any 2 sites is constant. Despite the inaccuracy of these assumptions, the formula gave prevalence estimates with bias from 1% to 23% across a range of RSSMs and CAL/PD thresholds. The SDM underlying the formula could be generalized to account for the symmetry of CAL or PD across sites and quadrants (Alshihayb et al. 2022). On the other hand, developing prevalence formulae for more complex case definitions may be challenging.
Given the well-known underestimation bias of the standard method that classifies study participants based only on their partial-mouth data, a statistical model-based approach to estimating periodontitis extent and prevalence using PRPs is advocated. While the formulaic approach of Preisser et al. (2017) and the proposed MI method both require an SDM for each study participant, MI is particularly promising since it can be applied to any downstream case definition once full-mouth data are imputed. NHANES and other surveys have all but abandoned use of PRPs in the past decade. However, development and refinement of the MI approach could conceivably improve PRP accuracy enough to justify, on grounds of logistical efficiency, a return to the regular use of PRPs in large oral epidemiological studies.
Author Contributions
J.S. Preisser, B.F. Qaqish, contributed to conception and design, data analysis and interpretation, drafted and critically revised manuscript; T. Shing, contributed to conception and design, data acquisition, analysis, and interpretation, drafted and critically revised manuscript; K. Divaris, contributed to conception and design, data interpretation, drafted and critically revised manuscript; J. Beck, contributed to conception and data interpretation, drafted and critically revised manuscript. All authors gave final approval and agree to be accountable for all aspects of the work.
Supplemental Material
Supplemental material, sj-docx-1-jct-10.1177_23800844221143683 for Multiple Imputation for Partial Recording Periodontal Examination Protocols by J.S. Preisser, T. Shing, B.F. Qaqish, K. Divaris and J. Beck in JDR Clinical & Translational Research
Footnotes
A supplemental appendix to this article is available online.
Declaration of Conflicting Interests
The authors have no potential conflicts of interest with respect to this research, authorship and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
