Periodontitis Prediction Model Using Linked Electronic Health and Dental Records

Abstract

Introduction:

Periodontal disease (PD) is closely linked to systemic health, with established associations with chronic conditions (eg, diabetes, cardiovascular disease). However, most predictive models rely solely on dental data, limiting the consideration of systemic factors such as medical conditions.

Objectives:

This study aimed to enhance PD risk prediction by using linked electronic dental records (EDRs) with electronic health records (EHRs) and machine learning (ML).

Methods:

We used EDR data from 20,946 adult patients in the axiUm® database of Temple University School of Dentistry (2022–2023), linked with physician-documented medical data from the Pennsylvania Health Share Exchange. The dataset included demographics, dental diagnoses, medical history, medications, procedures, and social determinants of health. The target variable was PD. Because EHR data are not research ready, extensive preprocessing was required (eg, 1 patient may have 400+ medical codes, which ML/statistical models cannot process directly). To prepare the data for artificial intelligence/ML, we developed 5 automated feature reduction approaches that retain rich information while reducing the number of variables. After preprocessing, 106 features were retained as independent variables. ML models (Gaussian Naive Bayes, Random Forest, LightGBM, XGBoost) were trained using cross-validation across 5 experimental strategies: (1) features selected via chi-square test, (2) raw data (without extensive processing), (3) aggregated data, (4) systemic disease complexity system, and (5) EHR-only data. Model performance was assessed using sensitivity, specificity, and area under the curve (AUC).

Results:

The chi-square–selected features yielded the best performance: 85% specificity, 67% sensitivity, and 84% AUC. Although adding medical conditions did not significantly improve overall performance, key conditions (eg, cardiovascular diseases, endocrine/metabolic disorders, renal diseases, respiratory conditions, hematologic disorders, etc) contributed meaningfully to PD risk prediction. EDR factors (oral hygiene, periodontal treatment, brushing, flossing, smoking, and American Society of Anesthesiologists classification) dominated prediction.

Conclusion:

Although dental factors remained dominant predictors, strong systemic–oral health associations were observed. Future studies should validate these findings by integrating medical and dental records.

Knowledge Transfer Statement:

The results of this study can guide clinicians and policymakers in identifying patients at increased risk of periodontitis by integrating medical and dental records. This approach supports earlier interventions and highlights the importance of systemic health in oral disease management. It also demonstrates the potential of artificial intelligence–based prediction models to improve personalized care and promote interdisciplinary collaboration for better overall health outcomes.

Keywords

Oral and systemic health connection, periodontal disease, risk assessment, dental informatics/bioinformatics, deep learning/machine learning, artificial intelligence

Background

Oral and systemic health are closely interconnected, with strong evidence linking periodontal disease (PD) to conditions such as cardiovascular disease, diabetes, arthritis, adverse pregnancy outcomes, and respiratory illnesses (Huang et al 2024; Zhao et al 2025). Some studies suggest a bidirectional relationship, in which PD may not only be associated with systemic diseases but could also contribute to their progression (Centers for Disease Control and Prevention 2024). Given these connections, early identification and prevention of PD are essential for improving both oral and overall health (Isola et al 2023). Predictive models offer a promising approach to identifying high-risk individuals before disease onset or progression (Toma and Wei 2023; Khalifa and Albadawy 2024). While electronic dental record (EDR) data provide valuable insights, existing prediction models primarily focus on dental factors, overlooking critical medical information recorded at patients' physician offices, such as diagnoses, medications, and procedures in electronic health records (EHRs) (Teza et al 2023; Ferrara et al 2024; Tokede et al 2024; Swinckels et al 2025). Integrating EHR with EDR data may offer a more comprehensive patient profile, yet medical and dental records remain siloed due to system incompatibilities, data standardization issues, and lack of interoperability (Beserra et al 2022; Gurupur et al 2024). As a result, to the best of our knowledge, no existing prediction model effectively uses linked EHR and EDR data to improve PD risk assessment (Farina et al 2023; Tokede et al 2024; Swinckels et al 2025).

Various PD prediction models have been developed over the past 2 decades using different methodologies, including rule-based systems and artificial intelligence (AI)–driven machine learning (ML) (Farina et al 2023; Ferrara et al 2024; Swinckels et al 2025). Rule-based systems rely on predefined expert-driven conditions that lack adaptability, whereas ML models learn from large datasets to identify complex patterns and interactions (Tokede et al 2024). Unlike rigid rule-based approaches, ML continuously refines predictions based on new data, leading to more personalized and accurate risk assessments (Farina et al 2023; Swinckels et al 2025). However, ML models are only as effective as the quality and completeness of the training data (Sanyal et al 2021). Ensuring data accuracy, standardization, and integration is crucial for developing robust predictive models (Budach et al 2022). EHR data offer significant advantages over surveys and registry datasets, including real-time availability, longitudinal tracking, and integration of medical, behavioral, and demographic factors (Rudin et al 2020; Ghildayal et al 2024). By incorporating these additional variables, ML-based predictive models have the potential to enhance detection, prevention, and early intervention strategies for PD (Bashir et al 2022; Zhang et al 2025).

Despite promising advances, existing prediction models lack access to up-to-date medical histories from linked EHRs, relying instead on self-reported histories, which studies have shown to be unreliable (Sulieman et al 2022). In our previous work, we extracted free-text EDR data into structured formats, phenotyped PD diagnoses across multiple record sections, and improved data quality before developing a PD prediction model (Patel et al 2022b; Patel et al 2022c). Using only EDR data, our model demonstrated moderate–high classification performance based on PD severity rather than a binary outcome (Patel et al 2022b). The present study aimed to improve PD prediction by integrating linked EHR and EDR data, incorporating medical histories, medications, and procedures. We hypothesize that incorporating comprehensive medical data will enhance model performance and provide deeper insights into systemic contributors to PD risk.

Methods

Datasets

We obtained data from 20,946 patients using linked EHR and EDR. The EDR dataset was retrieved from Temple University Kornberg School of Dentistry’s axiUm® database. We included adult patients who underwent at least 1 comprehensive oral examination or periodic oral evaluation with complete documentation of periodontal charting findings and periodontal diagnoses between January 2021 and December 2022. Medical information for these dental patients was obtained from their EHR through the Health Share Exchange. The final dataset comprised cross-sectional data, including patient demographics, dental diagnoses, periodontal charting findings, and dental treatments from the EDR, as well as medical history, procedures, lab values, and medication records from the EHR documented by physicians (for additional details, see Appendix: Linkage Process). See Figure 1 for the overall workflow.

Figure 1.

Overall workflow and study design.

Prediction versus Association

We clarify the distinction between prediction and association, as terminology may vary across clinical dentistry, dental research, and data science domains. Our study does not use longitudinal data suited for causal inference or prognostic modeling. Instead, we apply ML in a diagnostic framework to estimate the presence and severity of PD based on cross-sectional EHR/EDR data. This aligns with the diagnostic prediction framework outlined by van Smeden et al (2021), in which the goal is to infer a current unknown state using observable features, not to forecast future outcomes. Because the ML method uses distinct training and testing datasets, and the model classifies PD status and severity in test data that it has not seen during training, the process qualifies as prediction rather than simple association. This approach is common in data science, especially with cross-sectional data.

We recognize concerns about using “prediction” in nonlongitudinal contexts. Breiman (2020) described this as a divide between explanatory modeling, which assumes data-generating mechanisms, and algorithmic modeling, which focuses on predictive performance. ML falls into the latter category, data driven and flexible, often surfacing meaningful associations without mechanistic explanations (Breiman 2020). The approach proposed in this study aligns with this predictive tradition. As Kass (2021) noted, explanation and prediction can coexist, depending on the modeling goal. Statistical modeling, in this context, involves both quantifying uncertainty and assessing utility, even in the absence of causal claims (Kass 2021).

Therefore, our objective is not causal inference or prognosis but identifying variables associated with PD severity and assessing their predictive value using real-world data. Many studies in medical informatics and data science domains have used cross-sectional data to develop prediction models (Daines et al 2019; Antwi et al 2020). While “association” or “correlates” may better describe the relationships, the use of ML to estimate unobserved outcomes from observed features justifies a careful use of the term “prediction” in a diagnostic sense.

Data Processing for an AI/ML-Ready Dataset

Although EDR and EHR data contain valuable clinical information, they are primarily collected for patient care rather than research or AI/ML applications (Tsai et al 2020). As a result, significant preprocessing is required before they can be used for statistical analysis or ML. PD diagnoses, for example, may appear in free-text clinical notes and structured fields or can be obtained from periodontal charting data when structured diagnosis codes are missing (Patel et al 2022c). This inconsistency arises because dentists are reimbursed based on procedure codes rather than diagnoses, leading to variable completeness in the EDR. In addition, critical patient information, such as social habits, is often documented in free text, further complicating data extraction (Patel et al 2022b).

To optimize EDR data for research and AI applications, our previous studies developed, tested, and validated automated tools to

convert free-text data into structured categories or codes for statistical and ML analysis (Patel et al 2022a; Patel et al 2022c; Patel et al 2023),

assess and enhance the completeness and depth of recorded information (Patel et al 2022b),

evaluate the concordance between diagnoses documented in different sections of the EDR (eg, structured codes vs clinical notes) (Patel et al 2022a; Patel et al 2022c), and

perform data imputation to manage missing values (Patel et al 2022b).

These preprocessing steps produced high-quality, structured EDR data, ensuring its suitability for PD research and AI-driven predictive modeling.

EHR Data Processing

As with the EDR data, we cleaned and preprocessed the EHR data to ensure its suitability for ML and statistical modeling. Many patients had more than 400 diagnosis codes (International Classification of Diseases [ICD]), procedures, and medications, making the dataset highly dimensional and impractical for direct ML applications. Excessive features can lead to overfitting, computational inefficiencies, and reduced interpretability, necessitating feature reduction techniques to retain essential clinical information while improving model performance (Ying 2019).

To address this concern, we developed feature-reduction methods to streamline EHR data while preserving key clinical insights. For instance, medical diagnoses are often recorded under multiple ICD codes for the same condition. Diabetes mellitus, for example, may appear as follows:

E11.9: Type 2 diabetes mellitus without complications

E11.65: Type 2 diabetes mellitus with hyperglycemia

E13.9: Other specified diabetes mellitus without complications

To enhance interpretability and consistency, we collapsed these variations into a single feature labeled “diabetes” to ensure accurate modeling without redundancy. Implementing feature reduction techniques improved computational efficiency, reduced noise, and allowed the model to focus on clinically meaningful variables, ultimately enhancing predictive accuracy (for additional details, see Appendix: Feature Reduction for EHR Data).
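As an illustration, the collapsing step described above can be sketched as a prefix lookup. This is a minimal sketch, not the study's implementation: the table `ICD_PREFIX_TO_FEATURE` and the function `collapse_codes` are hypothetical names, and the prefixes listed are only a small example mapping.

```python
# Hypothetical prefix table for illustration; the study's actual grouping
# rules are described in its Appendix and are not reproduced here.
ICD_PREFIX_TO_FEATURE = {
    "E10": "diabetes",      # type 1 diabetes mellitus
    "E11": "diabetes",      # type 2 diabetes mellitus
    "E13": "diabetes",      # other specified diabetes mellitus
    "I10": "hypertension",  # essential (primary) hypertension
}


def collapse_codes(icd_codes):
    """Map a patient's ICD-10 codes to a sorted list of distinct grouped features."""
    features = set()
    for code in icd_codes:
        # The ICD-10 category is the first 3 characters of the code.
        prefix = code.strip().upper()[:3]
        feature = ICD_PREFIX_TO_FEATURE.get(prefix)
        if feature is not None:
            features.add(feature)
    return sorted(features)


patient_codes = ["E11.9", "E11.65", "E13.9", "I10"]
print(collapse_codes(patient_codes))  # ['diabetes', 'hypertension']
```

Codes that are not in the lookup table are ignored in this sketch; a production mapping would need an explicit policy for unmapped codes.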

Target Variable/Dependent and Independent Variables for ML

Our target variable was periodontitis, classified into healthy, mild (stage I), moderate (stage II), and severe (stages III and IV). We did not consider grading of the disease in the analysis (for data quality and the distribution of PD across various stages, see Appendix Table 1; for aggregated PD classification for ML and methods for improved data quality and calibration practices at Temple University Kornberg School of Dentistry, see Appendix section page 2). We obtained the target variable from the EDR using our phenotyping approach as described in our previous study (Patel et al 2022c), while independent variables were retrieved from linked EHR/EDR datasets. These included demographics (age, gender, race, ethnicity), insurance status, medical history, medications, dental history, treatments, oral hygiene habits (brushing, flossing), and behavioral factors (smoking, alcohol, drug use). We also included biomarkers such as HbA1c levels and systolic and diastolic blood pressure values for each patient. We did not include any periodontal findings or any markers that identify disease in the independent variable list. To minimize confounding, we removed highly correlated features such as diagnosis and medications using the correlation coefficient model (see Appendix Fig 1). The final dataset included 109 independent variables for ML analysis.
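The removal of highly correlated features can be illustrated with a simple pairwise filter. The sketch below is an assumption-laden example rather than the procedure used in the study: the 0.9 threshold, the rule of keeping the first feature encountered, and the names `pearson` and `drop_correlated` are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is below threshold.

    columns: dict mapping feature name to a list of values.
    The threshold and the keep-first rule are illustrative assumptions.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


toy = {
    "diabetes_dx": [1, 0, 1, 0, 1, 0],
    "metformin_rx": [1, 0, 1, 0, 1, 0],  # duplicates the diagnosis signal
    "smoker": [0, 0, 1, 1, 0, 1],
}
print(drop_correlated(toy))  # ['diabetes_dx', 'smoker']
```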

Data Imputation

To address missing values in our data, we used multiple imputation by chained equations (MICE) to estimate missing values while preserving variable relationships, minimizing information loss, and creating a complete dataset for ML (van Buuren and Groothuis-Oudshoorn 2011). Because we evaluated XGBoost alongside other key ML models, imputation was necessary, particularly since models such as Gaussian Naive Bayes (GNB) and standard Random Forest (RF) do not handle missing data effectively (Bashir et al 2022; Bates et al 2024). Imputation also allowed us to retain a large portion of our dataset (for additional details, see Appendix: Data Imputation & Missingness and Appendix Table 2), which helped us avoid introducing bias that could skew model performance.

ML/AI Training, Testing, Debiasing, and Interpretation

We encountered imbalanced data among different periodontitis categories. Therefore, we addressed data imbalance using undersampling and the synthetic minority oversampling technique (SMOTE) to prevent bias toward the majority class and improve generalizability (Junsomboon and Phienthrakul 2017). We trained and tested GNB, RF, LightGBM, and XGBoost models (see Appendix: Machine Learning and Model Explainability). To improve reliability and reduce bias, we applied 10-fold cross-validation and optimized hyperparameters (see Appendix: Handling Overfitting and Appendix Table 3) (Bashir et al 2022; Bates et al 2024). We evaluated risk of bias using the Prediction model Risk of Bias Assessment Tool (PROBAST), which assesses participants, predictors, outcomes, and analysis, and followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) checklist (see Appendix: PROBAST Checklist & TRIPOD Checklist) (Collins et al 2021). The PROBAST checklist is designed to assess the risk of bias and applicability of diagnostic and prognostic prediction model studies. The TRIPOD checklist guides the transparent reporting of studies that develop, validate, or update prediction models. Furthermore, we did not consider ensemble techniques because SHapley Additive exPlanations (SHAP) cannot be reliably applied to ensemble models: the SHAP values for each component model would need to be interpreted independently, further reducing the interpretability of already complex black-box models (Lundberg and Lee 2017).
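The core idea of SMOTE mentioned above is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The following is a minimal sketch of that idea only, not the implementation used in the study; the function name `smote`, the choice of k = 2, and the fixed seed are assumptions made for the example.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples from a list of minority-class points.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least 2 minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


minority_points = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
new_points = smote(minority_points, n_new=5)
print(len(new_points))  # 5
```

In practice, oversampling of this kind is applied to the training portion only, so that synthetic samples never appear in the data used for evaluation.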

Feature Selection Experiments and Model Validation

We conducted 5 feature selection experiments to validate the model. Although ML models can process high-dimensional datasets, they are prone to overfitting if irrelevant or redundant features are not removed (Bashir et al 2022; Bates et al 2024). The experiments were designed to check consistency among the ML models, determine the most informative features, and compare the influence of different feature selection strategies on model outcomes. By testing various preprocessing and feature aggregation methods, we reduced noise, enhanced model generalizability, and determined the contribution of EHR-derived variables to PD prediction. These experiments also helped us explore which features consistently emerged as strong predictors and evaluate how model performance varied when including or excluding medical versus dental information (Bashir et al 2022; Bates et al 2024).

In experiment 1, we used chi-square tests with a significance threshold of P < 0.05 to identify medical conditions that were significantly associated with periodontitis. We then included only those medical diagnoses and medications that showed significant associations in the predictive model. Chi-square testing is often used as a method of feature reduction in ML, where significant variables are kept and nonsignificant variables are excluded (Dissanayake and Johar 2021).
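Chi-square feature selection of this kind can be sketched as follows. The example is a simplification under stated assumptions: it treats both the medical condition and PD as binary (a 2 × 2 table with 1 degree of freedom), whereas the study's outcome has several severity levels, and the names `chi_square_2x2` and `select_features` as well as the counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence of association
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def select_features(tables, alpha=0.05):
    """Keep features whose 2 x 2 table gives p < alpha.

    tables: dict mapping feature name to (a, b, c, d), where a = condition and PD,
    b = condition and no PD, c = no condition and PD, d = neither.
    """
    return [name for name, counts in tables.items()
            if chi_square_2x2(*counts)[1] < alpha]


# Toy counts invented for illustration; they are not study data.
toy_tables = {
    "condition_a": (60, 40, 40, 60),
    "condition_b": (50, 50, 50, 50),
}
print(select_features(toy_tables))  # ['condition_a']
```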

In experiment 2, we imported semiprocessed EHR data with ICD-10 codes without applying feature reduction methods. This helped us determine which medical conditions were highly correlated with PD outcomes before performing any data simplification.

In experiment 3, we used feature reduction methods to aggregate 10,671 ICD codes, 60,570 medications, and 4,316 procedures to 28 unique diagnosis categories, 58 medication categories, and 150 procedure categories. This was performed to minimize noise and eliminate confounders in the dataset, which can often mislead the model and degrade its performance (Ying 2019).

In experiment 4, we ranked patients based on the number of grouped medical conditions and medications in their medical records to measure systemic health complexity. For example, a patient with ICD codes in 5 diagnosis categories received a complexity score of 5.
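The complexity score in experiment 4 amounts to counting the distinct grouped categories in each patient's record. A minimal sketch under that reading follows; the function names, patient identifiers, and category labels are hypothetical.

```python
def complexity_scores(patient_categories):
    """Return the number of distinct grouped categories per patient.

    patient_categories: dict mapping patient id to an iterable of category
    names (repeated categories count once).
    """
    return {pid: len(set(cats)) for pid, cats in patient_categories.items()}


def rank_by_complexity(scores):
    """Order patient ids from most to least complex, breaking ties by id."""
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Toy records invented for illustration.
records = {
    "p1": ["cardiovascular", "endocrine", "renal", "respiratory", "hematologic"],
    "p2": ["endocrine", "endocrine"],
    "p3": [],
}
scores = complexity_scores(records)
print(scores)                      # {'p1': 5, 'p2': 1, 'p3': 0}
print(rank_by_complexity(scores))  # ['p1', 'p2', 'p3']
```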

In experiment 5, we removed all dental factors from the model and used only EHR data to predict PD. This allowed us to assess how well medical history alone could predict PD outcomes.

For each experiment, we split the data into 70% training and 30% testing sets, using stratified sampling to maintain the target variable’s distribution. To compare model performance across these experiments, we measured sensitivity (ability to correctly identify true positives), specificity (ability to correctly identify true negatives), recall (proportion of actual positives correctly identified), precision (proportion of predicted positives that are true positives), F1 score (harmonic mean of precision and recall), accuracy (overall proportion of correct predictions), and area under the curve (AUC) (how well the model distinguishes positive cases from negative cases) (Rainio et al 2024).
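For a multiclass outcome such as PD severity, per-class sensitivity and specificity are commonly computed one-vs-rest, treating each class in turn as "positive" and all others as "negative". The sketch below illustrates that calculation; it is an assumption about how per-class values of this kind can be derived, not a description of the study's code, and the labels and predictions are toy values.

```python
def one_vs_rest_metrics(y_true, y_pred, labels):
    """Compute per-class sensitivity and specificity, one class versus the rest."""
    results = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        tn = sum(1 for t, p in pairs if t != label and p != label)
        results[label] = {
            "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
            "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        }
    return results


# Toy labels invented for illustration.
y_true = ["healthy", "healthy", "mild", "mild", "severe", "severe"]
y_pred = ["healthy", "mild", "mild", "mild", "severe", "healthy"]
metrics = one_vs_rest_metrics(y_true, y_pred, ["healthy", "mild", "severe"])
for label, values in metrics.items():
    print(label, values)
```

Averaging the per-class values gives a single summary figure per metric, comparable in spirit to an "Average" column in a per-class results table.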

Results

Patient Characteristics

The study included 20,946 patients; the largest groups were those aged 50 to 70 y (45%), Black/African American patients (45%), and females (57%). Medicaid covered 42% of patients, while 33% were self-pay, and 23% had private insurance. Smoking was reported by 23% of patients, 20% consumed alcohol, and 13% used recreational drugs. Oral health findings showed a mean DMFT index of 5.07 (standard deviation [SD]: 4.6, confidence interval [CI]: [5.01, 5.14]), a DMFS index of 14.15 (SD: 15.08, CI: [13.94, 14.35]), and an average of 24 teeth (SD: 6.59, CI: [24.06, 24.24]) per patient. The mean plaque index was 77 (SD: 29.05, CI: [73.70, 77.72]), and the bleeding on probing score averaged 20 (SD: 21.49, CI: [19.79, 20.48]). Regarding oral hygiene habits, 54% of patients brushed twice daily and 24% flossed daily, whereas 20% never flossed. In addition, 25% had tooth mobility, 21% had defective restorations, 25% had bruxism, and 27% had tooth crowding (Table 1).

Table 1.

Patient Demographics and Behavioral and Oral Health Factors in the Study Population.

Category	Frequency (%)
Patient demographics
Age, y
18, 20	96 (0)
20, 30	28 (0)
30, 40	2,570 (12)
40, 50	4,063 (19)
50, 60	4,151 (20)
60, 70	5,271 (25)
70+	4,767 (23)
Total	20,946 (100)
Race distribution
Unknown	6,038 (29)
Black	9,410 (45)
White	3,694 (18)
Asian	766 (4)
Native	86 (0)
Other	952 (5)
Total	20,946 (100)
Gender distribution
Female	11,914 (57)
Male	9,020 (43)
Other	12 (0)
Total	20,946 (100)
Insurance status
Medicaid	8,884 (42)
Cash	6,840 (33)
Private insurance	4,714 (23)
Ryan White	367 (2)
Medicare	141 (1)
Total	20,946 (100)
Patients’ behavioral status
Smoking
No	15,605 (75)
Yes	4,866 (23)
Unknown	475 (2)
Total	20,946 (100)
Alcohol
No	15,213 (72)
Yes	4,189 (20)
Unknown	1,544 (7)
Total	20,946 (100)
Recreational drugs
No	17,802 (85)
Yes	2,664 (13)
Unknown	480 (2)
Total	20,946 (100)
Patient’s oral health status
DMFT	Mean	5.07
	SD	4.60
	95% CI	(5.01, 5.14)
DMFS	Mean	14.15
	SD	15.08
	95% CI	(13.94, 14.35)
Number of teeth	Mean	24 Teeth
	SD	6.59
	95% CI	(24.06, 24.24)
Plaque index	Mean	77.21
	SD	29.05
	95% CI	(73.70, 77.72)
Bleeding on probing	Mean	20.13
	SD	21.49
	95% CI	(19.79, 20.48)
Brushing frequency
More than twice a day	1,500 (7)
Twice a day	11,238 (54)
Once a day	4,689 (22)
Sometimes	673 (3)
Never	161 (1)
Unknown	2,685 (13)
Total	20,946 (100)
Flossing frequency
More than twice a day	2,163 (10)
Once a day	5,011 (24)
Once a week	1,300 (6)
Sometimes	5,506 (26)
Never	4,230 (20)
Unknown	2,736 (13)
Total	20,946 (100)
Teeth mobility
No	8,395 (40)
Yes	5,285 (25)
Unknown	7,266 (35)
Total	20,946 (100)
Defective restorations
No	9,243 (44)
Yes	4,437 (21)
Unknown	7,266 (35)
Total	20,946 (100)
Bruxism
No	8,443 (40)
Yes	5,237 (25)
Unknown	7,266 (35)
Total	20,946 (100)
Tooth crowding
No	7,931 (38)
Yes	5,749 (27)
Unknown	7,266 (35)
Total	20,946 (100)

DMFS, decayed, missing, and filled surfaces; DMFT, decayed, missing, and filled teeth; CI, confidence interval; SD, standard deviation.

Medical Histories, Medications, and Procedures of Dental Patients

The most prevalent patient diagnoses were hypertension (4,431 cases), hyperlipidemia (2,171 cases), gastroesophageal reflux disease (1,805 cases), and type 2 diabetes (1,603 cases), highlighting common comorbidities among dental patients. Frequently prescribed medications included fluticasone propionate (2,018 cases), ibuprofen (1,553 cases), amlodipine (1,517 cases), and atorvastatin (1,483 cases), reflecting treatments for inflammatory, cardiovascular, and metabolic conditions. The most common procedures were venipuncture and transfusion (1,462 cases), lipid panel tests (1,219 cases), and cardiography (1,095 cases), emphasizing routine monitoring of systemic health in this patient population (see Fig 2 and Appendix Table 4 for medical diagnosis distribution).

Figure 2.

Medical diagnosis, medications, and medical procedures of dental patients.

From the chi-square analysis, we found that PD was associated with cardiovascular diseases (hypertension, ischemic heart disease, myocardial infarction), endocrine and metabolic disorders (type 2 diabetes, hyperlipidemia, hyperglycemia, obesity), renal diseases (chronic kidney disease, kidney failure), respiratory conditions (chronic obstructive pulmonary disease, asthma, respiratory failure), hematologic disorders (hyperkalemia, anemia), gastrointestinal conditions (gastroesophageal reflux disease, hepatitis C, colon polyps), neurologic and psychiatric disorders (anxiety, depression, migraines, epilepsy, posttraumatic stress disorder), and autoimmune conditions (see Appendix Table 5: chi-square results for all factors).

Predictors for Periodontitis Obtained from Linked EHR–EDR Data

As demonstrated in Figure 3, key predictors from the EDR included periodontal findings, American Society of Anesthesiologists (ASA) physical status classification (which describes a patient’s systemic health), smoking, government insurance, self-pay status, fewer prophylaxis procedures, fewer dental visits, older age, high caries risk index, overhanging restorations, and tooth mobility. From the EHR, predictors included the medical conditions described above. These findings highlight periodontitis as a systemic inflammatory condition rather than an isolated oral disease. Integrating medical records into periodontal risk assessment enables a more comprehensive, personalized approach to early detection and intervention in both dental and medical settings.

Figure 3.

Beeswarm plot for machine learning interpretability and factors predicting periodontitis.

ML Performance and Evaluation

XGBoost outperformed all other ML models (see Appendix Table 6 for the comparison matrix). As shown in Table 2, the model performance metrics were relatively consistent across experiments 1 through 4. Experiment 1 (chi-square test for EHR feature selection + ML) yielded the best performance. Using only EDR data, the model achieved an overall AUC of 84% and an F1 score of 82%, with a sensitivity of 66% and a specificity of 86%, indicating strong overall predictive performance but moderate sensitivity in detecting periodontitis severity levels (Table 2). When both EDR and EHR data were incorporated, the overall accuracy and sensitivity remained at 81% and 66%, respectively, while the AUC improved to 86%. Although this improvement was small, the combined model brought medical features (such as HbA1c levels and blood pressure) into the prediction, especially for severe periodontitis cases. Therefore, adding EHR data provided valuable insights into the contribution of medical factors to the prediction.

Table 2.

Scores Summarizing the Model Performance Metrics across the Various Experiments Using XGBoost.

	Healthy	Slight	Moderate	Severe	Average
Experiment 1: Chi-square test for EHR feature selection + ML
Dental records only
Accuracy (weighted)	89%	77%	76%	84%	81%
F1 score (weighted)	90%	77%	76%	84%	82%
Precision (weighted)	90%	78%	77%	84%	82%
Recall (weighted)	89%	77%	76%	84%	81%
Sensitivity	64%	68%	57%	76%	66%
Specificity	93%	81%	82%	87%	86%
AUC	90%	81%	74%	90%	84%
Brier score	0.0810	0.1537	0.1609	0.1188	0.1286
Dental records with medical records
Accuracy (weighted)	89%	77%	74%	84%	81%
F1 score (weighted)	90%	77%	75%	84%	82%
Precision (weighted)	90%	78%	76%	84%	82%
Recall (weighted)	89%	77%	74%	84%	81%
Sensitivity	62%	69%	58%	76%	66%
Specificity	93%	80%	80%	88%	85%
AUC	90%	85%	78%	90%	86%
Brier score	0.0784	0.1556	0.1668	0.1224	0.13
Experiment 2: No feature reduction and use of raw EHR data
Dental records only
Accuracy (weighted)	89%	77%	76%	84%	81%
F1 score (weighted)	90%	77%	76%	84%	82%
Precision (weighted)	90%	78%	77%	84%	82%
Recall (weighted)	89%	77%	76%	84%	81%
Sensitivity	64%	68%	57%	76%	66%
Specificity	93%	81%	82%	87%	86%
AUC	90%	85%	79%	90%	86%
Brier score	0.0810	0.1537	0.1609	0.1188	0.1286
Dental records with medical records
Accuracy (weighted)	89%	76%	74%	84%	81%
F1 score (weighted)	89%	77%	74%	84%	81%
Precision (weighted)	90%	77%	76%	84%	82%
Recall (weighted)	89%	76%	74%	84%	81%
Sensitivity	63%	69%	58%	76%	67%
Specificity	92%	80%	79%	87%	85%
AUC	90%	84%	77%	90%	85%
Brier score	0.0871	0.1549	0.1680	0.1220	0.13
Experiment 3: Reduced EHR Features
Dental records only
Accuracy (weighted)	89%	77%	76%	84%	81%
F1 score (weighted)	90%	77%	76%	84%	82%
Precision (weighted)	90%	78%	77%	84%	82%
Recall (weighted)	89%	77%	76%	84%	81%
Sensitivity	64%	68%	57%	76%	66%
Specificity	93%	81%	82%	87%	86%
AUC	90%	85%	79%	90%	86%
Brier score	0.0810	0.1537	0.1609	0.1188	0.1286
Dental records with medical records
Accuracy (weighted)	89%	76%	73%	82%	80%
F1 score (weighted)	89%	77%	74%	83%	81%
Precision (weighted)	90%	77%	75%	83%	81%
Recall (weighted)	89%	76%	73%	82%	80%
Sensitivity	65%	70%	57%	78%	68%
Specificity	92%	79%	78%	84%	83%
AUC	90%	84%	77%	90%	85%
Brier score	0.0822	0.1588	0.1713	0.1234	0.1339
Experiment 4: Aggregated Medical Records Results
Dental records only
Accuracy (weighted)	89%	77%	75%	84%	81%
F1 score (weighted)	90%	78%	76%	84%	82%
Precision (weighted)	90%	78%	77%	84%	82%
Recall (weighted)	89%	77%	75%	84%	81%
Sensitivity	65%	69%	57%	76%	67%
Specificity	93%	81%	81%	87%	86%
AUC	89%	85%	79%	90%	86%
Brier score	0.0827	0.1545	0.1595	0.1189	0.1289
Dental records with medical records
Accuracy (weighted)	89%	77%	75%	83%	81%
F1 score (weighted)	90%	77%	75%	84%	81%
Precision (weighted)	90%	78%	76%	84%	82%
Recall (weighted)	89%	77%	75%	83%	81%
Sensitivity	65%	68%	57%	76%	67%
Specificity	93%	81%	81%	87%	85%
AUC	89%	84%	78%	89%	85%
Brier score	0.0818	0.1543	0.1660	0.1232	0.1313
Experiment 5: EHR records only
Accuracy (weighted)	75%	60%	57%	61%	63%
F1 score (weighted)	77%	61%	60%	62%	65%
Precision (weighted)	81%	63%	64%	64%	68%
Recall (weighted)	75%	60%	58%	61%	64%
Sensitivity	33%	49%	40%	50%	43%
Specificity	81%	65%	67%	66%	70%
AUC	58%	59%	52%	60%	57%
Brier score	0.1757	0.2392	0.2379	0.2379	0.22

AUC, area under the curve; EHR, electronic health record; ML, machine learning.

Overall, we observed that while sensitivity generally improved or remained consistent, this often came at the expense of other key performance indicators, including specificity, Brier score, and F1 score.

Experiment 5 (only EHR features) revealed that many medical conditions were significantly associated with PD; some emerged as key predictors, and some appeared in the top 20 features. This model achieved an F1 score of 65% and a specificity of 70%, demonstrating that even with only EHR data, it can deliver satisfactory performance in predicting PD. SHAP analysis further revealed that the model successfully identified meaningful associations between diabetes, heart disease, and periodontitis, reflecting its ability to detect key known correlations within the EHR data (Huang et al 2024). However, when EDR data are introduced, the model tends to prioritize dental features, which, apart from biomarkers such as HbA1c levels or blood pressure, are stronger indicators of the disease. This suggests that although EHR data hold valuable information, they may require more nuanced integration with EDR data (for more details, see Appendix Figs 2 and 3 and the Beeswarm plots) (Lundberg and Lee 2017).

Across experiments 1 to 4, we observed a mean feature similarity of 95%, with experiment 1 showing the highest overlap with the other experiments, indicating that its feature set captured most of the key predictors of PD (for more details, see Appendix: Experiments Feature Similarities).
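The similarity measure behind these figures is detailed in the Appendix and is not reproduced here. One common way to quantify overlap between feature sets is the Jaccard index, and the sketch below uses it purely as an assumed stand-in; the function names and feature lists are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two feature collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_similarity_to_others(feature_sets):
    """For each experiment, average its Jaccard similarity with all others."""
    scores = {}
    for name, feats in feature_sets.items():
        sims = [jaccard(feats, other_feats)
                for other, other_feats in feature_sets.items() if other != name]
        scores[name] = sum(sims) / len(sims) if sims else 1.0
    return scores


# Toy top-feature lists invented for illustration.
top_features = {
    "exp1": ["age", "smoking", "asa", "flossing"],
    "exp2": ["age", "smoking", "asa", "brushing"],
    "exp3": ["age", "smoking", "asa", "flossing"],
}
for name, score in mean_similarity_to_others(top_features).items():
    print(name, round(score, 2))
```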

Discussion

This study highlights that PD extends beyond the oral cavity, underscoring its connection to overall systemic health. While dental factors remained dominant in prediction, integrating medical history slightly improved the AUC and provided additional insights. Our models indicate that men are more likely to have severe PD than women are. They also suggest that the White/Caucasian population is less susceptible to periodontitis than the Black/African American population. These results align with our previous studies (Patel et al 2022b). Given the well-established connections between periodontitis and systemic diseases such as cardiovascular disease, diabetes, and respiratory illnesses, a siloed health care approach is no longer adequate. Patients with underlying conditions may benefit from proactive periodontal screenings to prevent disease progression. Medical professionals should also incorporate periodontal health into the management of chronic diseases, particularly for high-risk populations (Huang et al 2024). However, implementing integrated care faces significant barriers, particularly the lack of interoperability between EHR and EDR (Beserra et al 2022). Most health care systems operate in separate domains, limiting real-time, data-driven risk assessment that accounts for both oral and systemic health (Rudman et al 2010). While some institutions have adopted integrated medical–dental EHR platforms, these systems are expensive and inaccessible to many providers. In addition, historical patient data stored in standalone dental records remain difficult to incorporate into newer platforms (Tsai et al 2020).

Finally, despite the structured nature of EHR data, we encountered challenges in adapting these datasets for AI/ML. Patients often had more than 400 separate codes for various medical conditions, medications, medical procedures, and laboratory tests, resulting in high-dimensional data with redundancy and noise. We addressed this issue using feature reduction methods to retain key medical predictors while enhancing model efficiency and interpretability. Our feature reduction method addresses a critical gap in transforming high-dimensional data into clinically interpretable, low-dimensional representations suitable for ML and statistical modeling. Unlike existing methods that prioritize harmonization or black-box dimensionality reduction (Bashir et al 2022), our method integrates ontology-driven mappings and expert-informed heuristics to preserve clinical meaning across diagnoses, medications, and procedures. By automating a task that is usually manual and error prone, this method makes it easier to use real-world clinical data for building prediction models, supports collaboration across disciplines, and helps make AI-ready data available for research.
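To make the idea of ontology-driven feature reduction concrete, the sketch below collapses a patient's ICD-10 diagnosis codes into a handful of chapter-level binary indicators. It is an illustrative simplification, not the feature reduction tool developed in this study: the group labels, the `GROUPS` table (a small subset of ICD-10-CM chapter ranges), and the function names are assumptions made for this example.

```python
# Each entry: (chapter letter, lowest 2-digit category, highest 2-digit category, label).
# Assumption: a small, simplified subset of ICD-10-CM chapter ranges.
GROUPS = [
    ("E", 0, 89, "endocrine_metabolic"),
    ("I", 0, 99, "circulatory"),
    ("J", 0, 99, "respiratory"),
    ("N", 0, 99, "genitourinary"),
    ("D", 50, 89, "blood_immune"),
]


def icd10_group(code: str) -> str:
    """Map one ICD-10 code (e.g., 'E11.9') to a coarse group label, or 'other'."""
    code = code.strip().upper()
    # Expect a letter followed by two digits; anything else is left ungrouped.
    if len(code) < 3 or not code[1:3].isdecimal():
        return "other"
    letter, number = code[0], int(code[1:3])
    for grp_letter, low, high, label in GROUPS:
        if letter == grp_letter and low <= number <= high:
            return label
    return "other"


def reduce_patient_codes(codes: list[str]) -> dict[str, int]:
    """Turn a patient's code list into one binary indicator per group."""
    present = {icd10_group(c) for c in codes}
    return {label: int(label in present) for _, _, _, label in GROUPS}


# Hypothetical patient record (illustration only).
features = reduce_patient_codes(["E11.9", "E11.65", "I10", "D64.9", "K05.30"])
print(features)
```

However many raw codes a patient has, the output has a fixed, small number of columns, which is the property that makes such representations usable by standard ML and statistical models.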

To advance integrated care, efforts should focus on strengthening data linkage frameworks between EHR and EDR (Rudman et al 2010; Huang et al 2024). Health information exchanges centralize medical data across health care providers; however, similar infrastructures are also needed for dental records (Rudman et al 2010; Huang et al 2024). Standardizing data formats and adopting common terminologies for diagnoses could facilitate seamless integration (Gurupur et al 2024). In addition, AI-driven tools could help extract and harmonize unstructured data from both domains, improving predictive modeling and patient risk stratification.

Our results demonstrate that interdisciplinary collaboration is another crucial step in providing patient care. Shared risk assessments, referral systems, and co-management strategies could improve patient outcomes by addressing periodontal and systemic health in tandem (Mohd Norwir et al 2025). Policy-level changes, such as insurance reimbursement models that incentivize preventive periodontal care for high-risk medical patients, could further drive integration efforts (Huang et al 2024).

Our findings align with prior research and demonstrate the predictive power of ML models for assessing periodontitis risk. However, most existing models rely solely on dental data, limiting their ability to incorporate systemic health influences. While linked EHR data provided valuable insights, dental factors remained the primary predictors. Compared with other studies that incorporated limited risk factor information, our study provided new knowledge about medical conditions and medications associated with periodontitis (Farina et al 2023; Ferrara et al 2024; Swinckels et al 2025).

This study has limitations. First, it used only cross-sectional data; longitudinal analysis would be better suited to determining the relationships between oral and systemic health. Second, the accuracy of ICD-coded diagnoses in EHRs and of procedure and diagnosis codes within EDRs can vary, and misclassification or missing data could influence feature importance rankings. Finally, the study results may not be generalizable because the dataset came from a single institution. Our future studies will (1) combine natural language processing–extracted clinical diagnoses with ICD codes to improve reliability, (2) include multiple sites to improve generalizability, and (3) use longitudinal data to determine correlations between systemic and oral health.

Conclusion

This study highlights the potential of linking EHR and EDR for PD prediction. While dental factors remain the most influential predictors, systemic conditions contribute to disease risk, reinforcing the need for interdisciplinary collaboration in managing periodontitis. Efforts should focus on improving interoperability between EHR and EDR, optimizing AI/ML models for integrated datasets, and expanding large-scale data linkages to study research questions that connect systemic health with oral health.

Supplemental Material

Supplemental material for Periodontitis Prediction Model Using Linked Electronic Health and Dental Records by J.S. Patel, M. Tellez, R. Katiyar, N.N. Al-Hebshi, R. Santana, R.M. Yucel and A. Ismail in JDR Clinical & Translational Research:

sj-docx-3-jct-10.1177_23800844251408849

sj-docx-4-jct-10.1177_23800844251408849

sj-pdf-1-jct-10.1177_23800844251408849

sj-pdf-2-jct-10.1177_23800844251408849

sj-png-5-jct-10.1177_23800844251408849

sj-png-6-jct-10.1177_23800844251408849

sj-png-7-jct-10.1177_23800844251408849

Footnotes

Acknowledgements

We would like to express our sincere gratitude to Dr Margaret Grisius and Dr Lorena Baccaglini, program directors at the National Institute of Dental and Craniofacial Research (NIDCR), for their thoughtful review and valuable feedback throughout this project. We gratefully acknowledge Mr Manan Patel for his programming support and Mr Ryan Brandon for his invaluable assistance in extracting data from the axiUm system. We also thank the Temple University Institutional Review Board (IRB) office for reviewing and approving this study under the exempt category (IRB No. 28321-0006). Finally, we thank Health Share Exchange for providing us with the dental patients’ linked medical records.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This publication is supported by the NIH-NIDCR’s U01DE033259 award.

ORCID iDs

J.S. Patel

N.N. Al-Hebshi

Data Availability Statement

This study uses real patient records from EHR and EDR, which contain patient identifiers. Because of patient privacy and institutional security requirements, these data cannot be shared publicly. Researchers interested in accessing the dataset should contact the first and senior author, Dr Jay Patel, to explore potential collaboration. Access will be granted only after completion of the required documentation, IRB approval, a data use agreement, and a business associate agreement.

A supplemental appendix to this article is available online.

References

Antwi, et al. 2020. Systematic review of prediction models for gestational hypertension and preeclampsia. PLoS One. 15(4):e0230955. https://doi.org/10.1371/journal.pone.0230955

Bashir, Rahman, Chen SL-S. 2022. Systematic comparison of machine learning algorithms to develop and validate predictive models for periodontitis. J Clin Periodontol. 49(10):958–969. https://doi.org/10.1111/jcpe.13692

Bates, Hastie, Tibshirani. 2024. Cross-validation: what does it estimate and how well does it do it? J Am Stat Assoc. 119(546):1434–1445. https://doi.org/10.1080/01621459.2023.2197686

Beserra, Freire, Moisés, Piagge, Melo. 2022. Impacts and challenges of using electronic records in dental practice—a scoping review. Rev Aten Saúde. 19(70):111–121. https://doi.org/10.13037/ras.vol19n70.8197

Breiman. 2020. Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat Sci. 16(3):199–231. https://doi.org/10.1214/ss/1009213726

Budach, et al. 2022. The effects of data quality on machine learning performance. arXiv [preprint]. https://doi.org/10.48550/arXiv.2207.14529

Centers for Disease Control and Prevention. 2024 May 15. About periodontal (gum) disease. Oral health. CDC; [accessed 2025 Mar 3]. https://www.cdc.gov/oral-health/about/gum-periodontal-disease.html

Collins, et al. 2021. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open. 11(7):e048008. https://doi.org/10.1136/bmjopen-2020-048008

Daines, et al. 2019. Systematic review of clinical prediction models to support the diagnosis of asthma in primary care. NPJ Prim Care Respir Med. 29(1):19. https://doi.org/10.1038/s41533-019-0132-z

Dissanayake, Md Johar. 2021. Comparative study on heart disease prediction using feature selection techniques on classification algorithms. Liao, editor. Applied computational intelligence and soft computing. 2021(1):1–17. https://doi.org/10.1155/2021/5581806

Farina, Lopez, Simonelli, Trombelli. 2023. Accuracy and applicability of periodontitis risk assessment tools: a critical appraisal. Periodontol 2000. [published online ahead of print]. https://doi.org/10.1111/prd.12498

Ferrara, Rapone, D’Albenzio. 2024. Applications of deep learning in periodontal disease diagnosis and management: a systematic review and critical appraisal. J Med Artif Intell. 8:23. https://dx.doi.org/10.21037/jmai

Ghildayal, et al. 2024. Public health surveillance in electronic health records: lessons from PCORnet. Prev Chronic Dis. 21:E51. https://doi.org/10.5888/pcd21.230417

Gurupur, Mayya, King. 2024. The need for standards in evaluating the quality of electronic health records and dental records: a narrative review. Big Data Cogn Comput. 8(12):168. https://doi.org/10.3390/bdcc8120168

Huang, et al. 2024. Association between periodontal disease and systemic diseases: a cross-sectional analysis of current evidence. Mil Med Res. 11(1):74. https://doi.org/10.1186/s40779-024-00583-y

Isola, et al. 2023. Periodontal health and disease in the context of systemic diseases. Mediators Inflamm. 2023:9720947. https://doi.org/10.1155/2023/9720947

Junsomboon, Phienthrakul. 2017. Combining over-sampling and under-sampling techniques for imbalance dataset. In: Proceedings of the 9th International Conference on Machine Learning and Computing; 2017 Feb 24–26; Singapore. Association for Computing Machinery. p 243–247.

Kass. 2021. The two cultures: statistics and machine learning in science. Obs Stud. 7(1):135–144.

Khalifa, Albadawy. 2024. Artificial intelligence for clinical prediction: exploring key domains and essential functions. Comput Methods Programs Biomed Update. 5:100148. https://doi.org/10.1016/j.cmpbup.2024.100148

Lundberg, Lee S-I. 2017. A unified approach to interpreting model predictions. In: von Luxburg, Guyon, Bengio, Wallach, Fergus, editors. Proceedings of the 31st International Conference on Neural Information Processing Systems; 2017 Dec 4–9; Long Beach, CA, United States. Curran Associates Inc. p 4768–4777.

Mohd Norwir, Mohd-Said, Abdul Aziz, Mohd-Dom. 2025. Leveraging dental visits for systemic health: diabetes screening and referral compliance in periodontitis patients in Malaysia. J Clin Med. 14(3):739. https://doi.org/10.3390/jcm14030739

Patel, et al. 2022a. Develop a natural language processing pipeline to automate extraction of periodontal disease information from electronic dental clinical notes. In: Proceedings of the 6th International Conference on Medical and Health Informatics; 2022 May 13–15; Virtual Event, Japan. Association for Computing Machinery. p 61–68.

Patel, et al. 2022b. Developing and testing a prediction model for periodontal disease using machine learning and big electronic dental record data. Front Artif Intell. 5:979525. https://doi.org/10.3389/frai.2022.979525

Patel, et al. 2022c. Developing automated computer algorithms to phenotype periodontal disease diagnoses in electronic dental records. Methods Inf Med. 61(S 02):e125–e133. https://doi.org/10.1055/s-0042-1757880

Patel, et al. 2023. Comparing gingivitis diagnoses by bleeding on probing (BOP) exclusively versus BOP combined with visual signs using large electronic dental records. Sci Rep. 13(1):17065. https://doi.org/10.1038/s41598-023-44307-z

Rainio, Teuho, Klén. 2024. Evaluation metrics and statistical tests for machine learning. Sci Rep. 14(1):6086. https://doi.org/10.1038/s41598-024-56706-x

Rudin, Friedberg, Shekelle, Shah, Bates. 2020. Getting value from electronic health records: research needed to improve practice. Ann Intern Med. 172(11 Suppl):S130–S136. https://doi.org/10.7326/m19-0878

Rudman, Hart-Hester, Jones, Caputo, Madison. 2010. Integrating medical and dental records: a new frontier in health information management. J AHIMA. 81(10):36–39.

Sanyal, et al. 2021. Fix your models by fixing your datasets. arXiv [preprint]. https://doi.org/10.48550/arXiv.2112.07844

Sulieman, et al. 2022. Comparing medical history data derived from electronic health records and survey answers in the All of Us Research Program. J Am Med Inform Assoc. 29(7):1131–1141. https://doi.org/10.1093/jamia/ocac046

Swinckels, et al. 2025. A personalized periodontitis risk based on non-image electronic dental records by machine learning. J Dent. 153:105469. https://doi.org/10.1016/j.jdent.2024.105469

Teza, et al. 2023. Development of risk prediction models for severe periodontitis in a Thai population: statistical and machine learning approaches. JMIR Form Res. 7:e48351. https://doi.org/10.2196/48351

Tokede, et al. 2024. Development and validation of a rule-based algorithm to identify periodontal diagnosis using structured electronic health record data. J Clin Periodontol. 51(5):547–557. https://doi.org/10.1111/jcpe.13938

Toma, Wei. 2023. Predictive modeling in medicine. Encyclopedia. 3(2):590–601. https://doi.org/10.3390/encyclopedia3020042

Tsai, et al. 2020. Effects of electronic health record implementation and barriers to adoption and use: a scoping review and qualitative analysis of the content. Life. 10(12):327. https://doi.org/10.3390/life10120327

van Buuren, Groothuis-Oudshoorn. 2011. MICE: multivariate imputation by chained equations in R. J Stat Softw. 45(3):1–67. https://doi.org/10.18637/jss.v045.i03

van Smeden, Reitsma, Riley, Collins, Moons KG. 2021. Clinical prediction models: diagnosis versus prognosis. J Clin Epidemiol. 132:142–145. https://doi.org/10.1016/j.jclinepi.2021.01.009

Ying. 2019. An overview of overfitting and its solutions. J Phys Conf Ser. 1168:022022. https://doi.org/10.1088/1742-6596/1168/2/022022

Zhang, Deng, Zou, Jin, Jiang. 2025. Artificial intelligence models for periodontitis classification: a systematic review. J Dent. 156:105690. https://doi.org/10.1016/j.jdent.2025.105690

Zhao, et al. 2025. The association between periodontal disease and adverse pregnancy outcomes: a bibliometric analysis from 2000 to 2023. Front Med (Lausanne). 12:1526406. https://doi.org/10.3389/fmed.2025.1526406