Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes

Abstract

Background

Atherogenic dyslipidemia significantly influences diabetes development, yet the longitudinal prognostic value of the atherogenic index of plasma (AIP) in prediabetic populations remains understudied. This study investigated whether dynamic AIP measures predict diabetes conversion in middle-aged and older Chinese adults with prediabetes while exploring interactions with other metabolic risk factors through interpretable machine learning.

Methods

We analyzed data from 1965 prediabetic adults (≥ 45 years) from the China Health and Retirement Longitudinal Study (2012–2015), assessing baseline AIP, cumulative AIP exposure (AIPcum), and change patterns via K-means clustering. Associations with incident diabetes were evaluated using logistic regression with progressive confounder adjustment. Five machine learning models were developed using predictors identified through the Boruta algorithm and recursive feature elimination, with the best model interpreted via SHAP (SHapley Additive exPlanations) analysis.

Results

During follow-up, 15.0% of participants progressed to diabetes. Elevated AIPcum showed the strongest association with diabetes risk (OR for highest vs. lowest quartile = 2.37; 95% CI: 1.60–3.57; P < 0.001), while “Persistently High” and “Increasing” AIP change patterns exhibited ∼2.5-fold higher risk versus “Persistently Low” patterns. Random forest achieved the best predictive performance (AUROC = 0.760), with glucose, hand grip strength, AIPcum, and waist circumference as key predictors. Interaction analyses revealed significant synergistic effects between AIPcum and other metabolic factors.

Conclusion

Cumulative AIP exposure and unfavorable AIP change patterns independently predict prediabetes-to-diabetes progression. Incorporating these measures into risk assessment may enhance early identification of high-risk individuals and inform targeted interventions.

Keywords

Atherogenic index of plasma prediabetes diabetes cumulative exposure machine learning SHAP analysis

Introduction

Diabetes mellitus represents a global health challenge of increasing magnitude, with ∼537 million adults currently affected worldwide and projections suggesting this number will rise to 783 million by 2045.¹ Of particular concern is the large population with prediabetes—characterized by impaired fasting glucose or impaired glucose tolerance—who face an annual conversion rate to diabetes of 5%–10%.² With China hosting one of the world's largest populations of individuals with prediabetes, estimated at 35.7% of adults, identifying reliable predictors of diabetes development is of paramount public health importance.³

Atherogenic dyslipidemia, characterized by elevated triglycerides (TGs) and decreased high-density lipoprotein cholesterol (HDL-C), plays a critical role in the pathogenesis of both cardiovascular disease and diabetes.^4,5 The atherogenic index of plasma (AIP), calculated as the logarithmic ratio of TG to HDL-C, has emerged as a comprehensive marker that reflects the balance between atherogenic and protective lipoproteins.⁶ While AIP has demonstrated strong associations with insulin resistance, metabolic syndrome, cardiovascular outcomes, and obesity,^7,8 its value in predicting progression from prediabetes to diabetes remains inadequately explored.

Previous studies investigating lipid parameters and diabetes risk have predominantly used baseline measurements from large cohort studies,^9,10 with limited attention to the dynamic and cumulative nature of lipid exposure over time.¹¹ Moreover, the complex interplay between AIP and other metabolic risk factors in determining diabetes risk has not been systematically evaluated using advanced analytical approaches that can account for non-linear relationships and interactions.

Machine learning (ML) techniques offer promising tools to address these limitations by identifying complex patterns and interactions within multivariate data. Several studies have demonstrated successful applications of ML approaches to diabetes prediction and diagnosis, with Nguyen et al. achieving exceptional performance using random forest (RF) algorithms (cross-validation scores reaching 0.998), highlighting the potential of tree-based models in clinical applications.¹² However, their application of ML in diabetes prediction has been hampered by concerns about interpretability.¹³ The recent development of explainable artificial intelligence methods, such as SHapley Additive exPlanations (SHAP), presents an opportunity to leverage the predictive power of ML while maintaining clinical interpretability.¹⁴ This breakthrough enables clinicians to understand not only what the model predicts but why specific predictions are made, thereby making ML technology practically applicable in clinical settings while maintaining the sophisticated analytical capabilities that distinguish ML from traditional statistical approaches.

In this study, we aimed to investigate the association between dynamic measures of AIP and the risk of progression from prediabetes to diabetes in middle-aged and older Chinese adults. Specifically, we evaluated three complementary AIP indices: baseline AIP, cumulative AIP exposure (AIPcum), and longitudinal AIP change patterns identified through unsupervised clustering. Furthermore, we employed interpretable ML approaches to identify key predictors of diabetes development and explore potential interaction effects between AIP and other metabolic risk factors. Our findings may provide novel insights into lipid-related risk stratification for individuals with prediabetes and inform targeted preventive strategies to mitigate the growing burden of diabetes in aging populations.

Methods

Data source and study population

This study analyzed data from the China Health and Retirement Longitudinal Study (CHARLS), a nationally representative cohort of Chinese adults aged ≥ 45 years. Participants were selected through stratified multistage probability sampling across 150 county-level units in 28 provinces. Data collection included demographic characteristics, health status, behavioral factors, and biomarkers, obtained via standardized face-to-face interviews. The study design and methodology of CHARLS have been previously described in detail.¹⁵ All participants provided written informed consent, and the study was approved by the Institutional Review Board of Peking University (IRB00001052-11015).

We designated wave 1 (2011–2012) as the baseline and utilized data from this survey and wave 3 (2015) to evaluate dynamic changes in the AIP and assess diabetes outcomes. As shown in the study flowchart (Figure 1), we applied specific inclusion and exclusion criteria to define our analytical sample. From the initial CHARLS cohort, we identified participants with prediabetes at baseline (n = 3065), defined according to American Diabetes Association criteria as having fasting plasma glucose levels between 5.6 and 6.9 mmol/L (100–125 mg/dL) or hemoglobin A1c (HbA1c) values between 5.7% and 6.4%.¹⁶ We excluded participants who were younger than 45 years (n = 104), had incomplete diabetes follow-up data at wave 3 (n = 990), or lacked laboratory values necessary for AIP calculation (n = 6). After applying these criteria, 1965 participants with prediabetes remained for the final analysis.

Figure 1.

The overall flowchart of the study.

Measurement and calculation of AIP

AIP was selected as the primary study variable to evaluate atherogenic dyslipidemia. Blood samples were collected after overnight fasting from participants during the baseline survey (2011–2012) and follow-up (2015), with over 92% of respondents who gave blood reporting that they were fasting. Serum TG and HDL-C were measured using enzymatic colorimetric tests by trained medical personnel following standardized protocols.¹⁵ All blood samples were processed at the Youanmen Center for Clinical Laboratory of Capital Medical University using an automated chemistry analyzer.¹⁷ Laboratory standardization included rigorous quality control measures with the following assay specifications: TGs were measured using an enzymatic colorimetric test with a within-assay coefficient of variation (CV) of 1.5%, between-assay CV of 1.8%, and detection limits of 4–1000 mg/dL; HDL-C was measured using an enzymatic colorimetric test with a within-assay CV of 1.0%, between-assay CV of 1.3%, and detection limits of 3–120 mg/dL. This laboratory maintains regular external quality assessment organized by the Chinese Ministry of Health and conducts daily quality control samples. The center has been accredited by the Beijing Health Bureau and demonstrates excellent performance during annual evaluation by the External Quality Assurance (EQA) Program organized by the National Center for Clinical Laboratories, China Ministry of Health. All test results from quality control samples were within the target range (within two standard deviations of mean QC control concentrations).

The AIP was calculated using the following formula⁶:

AIP = \log_{10} [TG (mmol / L) / HDL - C (mmol / L)]

We calculated AIP values for both baseline (2011–2012) and follow-up (2015) measurements. To comprehensively assess the impact of AIP dynamics on diabetes development, we employed three analytical approaches:

Baseline AIP: Participants were stratified into quartiles (Q1–Q4) based on their baseline AIP values.¹⁸

Cumulative AIP exposure (AIPcum): To capture long-term AIP effects, we calculated cumulative exposure using the formula: AIPcum = (AIP₂₀₁₂ + AIP₂₀₁₅)/2 × (2015–2012), as previously described by Zheng et al.¹⁹

AIP change patterns: Using K-means cluster analysis, we identified distinct patterns of AIP change over time, categorizing participants into four change groups: persistently low, decreasing, increasing, and persistently high.²⁰

Assessment of diabetes

The primary outcome of this study was the development of diabetes at the 2015 follow-up (wave 3). In the CHARLS cohort, diabetes status was assessed through both self-reported information and objective laboratory measurements.²¹ Participants were classified as having diabetes if they met any of the following criteria: self-reported physician diagnosis of diabetes, self-reported current use of anti-diabetic medications, or laboratory evidence of diabetes. Laboratory diagnosis was based on American Diabetes Association criteria,¹⁶ defined as fasting plasma glucose (FPG) ≥ 126 mg/dL (7.0 mmol/L), random plasma glucose (RPG) ≥ 200 mg/dL (11.1 mmol/L), or HbA1c ≥ 6.5%.

Data collection

Variable selection was based on clinical expertise and supported by previous literature on risk factors for diabetes. Variables were categorized into six major domains for subsequent analysis. Sociodemographic variables included age, gender, residence (urban or rural), marital status (married or single/divorced/widowed), and educational level (elementary school or below vs. middle school or above). Lifestyle variables comprised smoking status (non-smoker or smoker) and alcohol consumption (non-drinker or drinker). Physical activity was assessed using the International Physical Activity Questionnaire (IPAQ), and Metabolic equivalent (MET) scores were calculated as follows: vigorous activity × 8.0 + moderate activity × 4.0 + walking × 3.3.^22,23 Based on the total MET score, participants were classified into low and high activity groups. Anthropometric indicators included body mass index (BMI), waist circumference, and hand grip strength, which was measured using a calibrated dynamometer. Blood pressure was measured three times using an Omron HEM-7200 electronic monitor after participants had abstained from smoking, alcohol, exercise, and food for at least 30 minutes. The final blood pressure values were derived as the average of the three readings. Cognitive function was evaluated using the Mini-Mental State Examination (MMSE), a validated tool for assessing global cognitive status.²⁴ Laboratory measurements included complete blood count (white blood cell [WBC] count, hemoglobin, and platelet [PLT] count); glycemic markers such as fasting plasma glucose and HbA1c; lipid profile including TG, HDL-C, low-density lipoprotein cholesterol (LDL-C), and calculated remnant cholesterol (RC); renal function indices including blood urea nitrogen (BUN), serum creatinine (Cr), estimated glomerular filtration rate (eGFR), and cystatin C; inflammatory marker C-reactive protein (CRP); and other biochemical indicators such as serum uric acid (UA). Psychological status was assessed using the 10-item Center for Epidemiologic Studies Depression Scale (CESD-10), with scores ≥ 10 indicating clinically significant depressive symptoms.²⁵ Additionally, we included self-reported comorbidities covering a broad range of conditions, such as cardiovascular disease, stroke, chronic liver and kidney diseases, digestive disorders, arthritis or rheumatism, cancer, lung disease, asthma, psychiatric disorders, and sensory impairments (vision and hearing conditions).

AIP change patterns identification and regression modeling

To identify distinct longitudinal patterns in AIP, we applied K-means clustering, a distance-based unsupervised learning algorithm valued for its simplicity, scalability, and interpretability.²⁰ The optimal number of clusters (k) was determined using the elbow method, which evaluates the within-cluster sum of squares (WCSS) across different values of k, and further validated through silhouette analysis to assess clustering quality. The final number of clusters was selected by comprehensively considering multiple criteria: (1) the inflection point in the elbow curve where additional clusters yielded diminishing returns in WCSS reduction, (2) acceptable silhouette scores (> 0.3) indicating reasonable clustering quality, (3) clinical interpretability of the resulting change patterns, and (4) clear inter-cluster separation as evidenced by density distribution plots.

Clustering was based on individual AIP measurements at baseline (2011–2012) and follow-up (2015), allowing for the classification of participants into distinct change groups reflecting patterns of AIP change over time. Each participant was assigned to one of the derived change groups, and mean AIP values at both time points were calculated to characterize each cluster's temporal trend.

Subsequently, we examined associations between baseline AIP, AIPcum, AIP change patterns, and diabetes risk using both univariate and multivariate logistic regression models. To ensure the robustness of these analyses, multicollinearity among covariates was assessed using variance inflation factors (VIFs), and no significant collinearity was detected.

Data preprocessing and feature selection

To comprehensively evaluate the predictive role of AIP in diabetes development, we incorporated AIPcum into our ML framework as a key variable. Prior to model development, we conducted comprehensive data preprocessing to ensure quality and consistency. Variables with more than 30% missing data were excluded from consideration. For continuous variables with partial missingness, we applied the K-nearest neighbors (KNNs) imputation method (n_neighbors = 5), which estimates missing entries based on similar observations in the feature space.²⁶ For categorical variables, mode imputation was employed.²⁷ Outliers were identified using the interquartile range (IQR) method, defined as values below Q1−1.5 × IQR or above Q3 + 1.5 × IQR. To mitigate their influence while preserving sample size, we applied winsorization to cap these extreme values rather than removing them.

To identify the most relevant predictors of diabetes onset, we employed a comprehensive feature selection strategy that integrated two complementary methods. First, the Boruta algorithm—a RF-based wrapper technique—was applied over 20 iterations to ensure result stability. This approach assesses feature importance by statistically comparing each variable against randomly permuted “shadow features.”²⁸ Concurrently, we conducted recursive feature elimination (RFE) with 5-fold cross-validation, using the area under the receiver operating characteristic curve (AUROC) and its 95% confidence intervals (CIs) to evaluate model performance at each elimination step.²⁹ Features consistently identified as important by both methods were selected for final model development. By synthesizing the results of both methods, we identified an optimal subset of seven predictors for diabetes risk, with AIPcum ranking among the top features.

ML algorithms

A stratified random sampling approach was employed to partition the dataset into a training set (70%) and a test set (30%), ensuring preservation of the original class distribution. The training set was used to develop ML models, while the test set was reserved exclusively for performance evaluation. Based on the seven most predictive features identified through feature selection, five ML algorithms were constructed: logistic regression (LR), RF, Extreme Gradient Boosting (XGBoost), light gradient boosting machine (LightGBM), and multilayer perceptron (MLP).

To address class imbalance, class weighting strategies were employed, with weights calculated as the ratio of negative to positive samples.³⁰ Hyperparameter tuning was conducted via grid search with 5-fold stratified cross-validation to explore predefined parameter spaces while minimizing the risk of overfitting. The F1 score was selected as the primary optimization criterion due to its balanced representation of precision and recall, which is particularly important in clinical prediction contexts.³¹

The optimal configurations for each model were as follows: LR with elastic net regularization; RF with increased tree depth and class weighting; XGBoost with depth control and regularization to reduce overfitting; LightGBM with tuned leaf count and regularization parameters; and MLP with an adaptive learning rate and early stopping. Model performance was evaluated using multiple metrics, including accuracy, precision, recall, specificity, F1 score, and AUROC. Additionally, confusion matrices were calculated for all models to provide an intuitive understanding of classification performance, including true positives, true negatives, false positives, and false negatives.

SHAP interpretability analysis

To improve the interpretability of the ML models, we conducted a SHAP analysis on the best-performing algorithm. SHAP provides a consistent and model-agnostic framework for explaining individual predictions by estimating the marginal contribution of each feature to the model's output. For this analysis, we employed the TreeExplainer method, which is specifically optimized for tree-based algorithms. In this context, positive SHAP values indicate that a given feature contributes to an increased predicted probability of diabetes, whereas negative values reflect a protective effect.

To visualize and interpret the results, we used three complementary techniques. First, SHAP summary plots were generated to display the distribution and relative importance of all features across the test set. Second, partial dependence plots were constructed using locally weighted scatterplot smoothing (LOWESS) to explore potential non-linear relationships between individual features and their SHAP values. Third, SHAP force plots were utilized to decompose individual-level predictions, visually demonstrating how each feature contributed to deviations from the model's baseline prediction.

Additionally, to investigate non-linear interactions between AIPcum and other key diabetes risk factors, we generated two-dimensional SHAP interaction heatmaps.

Comparator indices and incremental value

In addition to AIPcum, we derived four commonly used indices (TyG, LAP, VAI, and METS-IR; formulas in the Supplementary Methods). Using the same evaluation dataset, we fitted six logistic models: base, base + AIPcum, and Base + each competing index (TyG/LAP/VAI/METS-IR). We compared discrimination as ΔAUROC versus base using the DeLong test, calibration (intercept/slope with reliability plots), and decision-curve analysis (DCA) across 5%–20% risk thresholds. Because models and decision curves were assessed on the same dataset, DCA estimates may be slightly optimistic; we therefore emphasize relative net-benefit comparisons.

Statistical analysis

Continuous variables were expressed as mean ± standard deviation (SD) or median with IQR, depending on their distribution. Group comparisons were conducted using Student's t-test for normally distributed data and the Wilcoxon rank-sum test for non-normally distributed data. Categorical variables were presented as frequencies and percentages, and differences between groups were assessed using the chi-square test or Fisher's exact test when expected cell counts were ≤5. A two-sided P-value <0.05 was considered statistically significant. All analyses were conducted using R (version 4.4.2; R Foundation for Statistical Computing, Vienna, Austria) and Python (version 3.12.9), with ML and statistical procedures implemented via the scikit-learn, pandas, and statsmodels libraries.

Sensitivity analyses

We conducted a baseline-only analysis using AIP measured in 2011–2012 (excluding any 2015 information) to minimize simultaneity/reverse causation. We also computed E-values for each key odds ratio and for the lower bound of its 95% CI to quantify the minimum strength of association an unmeasured confounder would need with both the exposure and the outcome (on the risk-ratio scale) to explain away the observed effect.³²

Results

Patient characteristics

Among 1965 participants with prediabetes at baseline, 295 (15.0%) developed diabetes during follow-up. The mean age was 59.0 ± 8.8 years, and 54.1% were female. While baseline characteristics showed expected patterns consistent with established diabetes risk factors (older age, lower education, and adverse metabolic profiles among converters), our primary focus was on evaluating the added prognostic value of dynamic AIP measures beyond these conventional predictors.

As shown in Table 1, baseline characteristics demonstrated anticipated differences between diabetes converters and non-converters across established risk factors, including sociodemographic, anthropometric, and metabolic parameters.

Table 1.

Baseline characteristics of included participants stratified by diabetes status.

Characteristics	Total	Non-diabetic	Diabetic	P-value*
Characteristics	(n = 1965)	(n = 1670)	(n = 295)	P-value*
Age (years), mean (SD)	59.0 (8.8)	58.8 (8.7)	60.4 (9.1)	0.004
Gender, n (%)				0.209
Female	1063 (54.1)	893 (53.5)	170 (57.6)
Male	902 (45.9)	777 (46.5)	125 (42.4)
Residence, n (%)				0.291
Rural	1269 (64.6)	1070 (64.1)	199 (67.5)
Urban	696 (35.4)	600 (35.9)	96 (32.5)
Marital status, n (%)				0.393
Married	1750 (89.1)	1492 (89.3)	258 (87.5)
Single, divorced, and widowed	215 (10.9)	178 (10.7)	37 (12.5)
Education, n (%)				0.017
Elementary school or below	1345 (68.4)	1125 (67.4)	220 (74.6)
Middle school or above	620 (31.6)	545 (32.6)	75 (25.4)
Smoking status, n (%)				0.187
Non-smoker	1210 (61.6)	1039 (62.2)	171 (58.0)
Smoker	755 (38.4)	631 (37.8)	124 (42.0)
Drinking status, n (%)				0.327
Non-drinker	1311 (70.9)	1105 (70.5)	206 (73.6)
Drinker	537 (29.1)	463 (29.5)	74 (26.4)
MET group, n (%)				0.068
Low	409 (48.1)	341 (46.8)	68 (56.2)
High	441 (51.9)	388 (53.2)	53 (43.8)
BMI (kg/m²), mean (SD)	25.2 (58.7)	25.3 (63.9)	24.7 (4.0)	0.872
Waist circumference (cm), mean (SD)	85.0 (12.5)	84.6 (12.2)	87.2 (13.5)	0.002
SBP (mmHg), mean (SD)	131.2 (21.0)	130.7 (21.1)	133.9 (19.8)	0.020
DBP (mmHg), mean (SD)	76.0 (11.8)	75.8 (11.7)	76.8 (12.4)	0.183
Hand grip strength (kg), mean (SD)	32.0 (10.4)	32.4 (10.4)	30.1 (10.5)	0.001
MMSE cognitive score, (mean (SD))	12.2 (3.3)	12.2 (3.3)	12.1 (3.2)	0.760
WBC count (× 10⁹/L), mean (SD)	6.2 (1.8)	6.2 (1.8)	6.3 (1.9)	0.143
Hemoglobin (g/dL), mean (SD)	14.6 (2.0)	14.6 (2.0)	14.6 (1.9)	0.834
Platelet count (× 10⁹/L), mean (SD)	215.1 (72.8)	214.9 (72.8)	215.8 (72.7)	0.861
HbA1c (%), mean (SD)	5.1 (0.4)	5.1 (0.4)	5.2 (0.3)	<0.001
Glucose (mg/dL), mean (SD)	107.9 (6.7)	107.6 (6.6)	109.4 (6.9)	<0.001
Triglycerides (mg/dL), mean (SD)	132.8 (84.3)	131.6 (85.2)	139.4 (78.4)	0.140
HDL-C (mg/dL), mean (SD)	51.1 (15.4)	51.4 (15.4)	49.0 (15.3)	0.014
LDL-C (mg/dL), mean (SD)	120.0 (35.2)	119.3 (35.6)	123.9 (32.3)	0.038
RC (mg/dL), mean (SD)	0.7 (0.6)	0.7 (0.6)	0.7 (0.5)	0.282
BUN (mg/dL), mean (SD)	15.8 (4.3)	15.9 (4.4)	15.2 (3.9)	0.011
Creatinine (mg/dL), mean (SD)	0.8 (0.2)	0.8 (0.2)	0.8 (0.2)	0.673
eGFR (mL/min/1.73 m²), mean (SD)	92.7 (13.8)	92.9 (13.8)	91.6 (13.8)	0.125
Cystatin C (mg/L), mean (SD)	1.0 (0.2)	1.0 (0.2)	1.0 (0.2)	0.329
CRP (mg/L), mean (SD)	2.3 (5.4)	2.2 (5.2)	3.1 (6.3)	0.008
Uric acid (mg/dL), mean (SD)	4.4 (1.2)	4.4 (1.2)	4.6 (1.3)	0.045
Depression, n (%)				1.000
No	429 (23.2)	364 (23.2)	65 (23.1)
Yes	1424 (76.8)	1208 (76.8)	216 (76.9)
Heart disease, n (%)				0.008
No	1688 (86.1)	1449 (87.0)	239 (81.0)
Yes	272 (13.9)	216 (13.0)	56 (19.0)
Stroke, n (%)				0.011
No	1921 (98.0)	1638 (98.4)	283 (95.9)
Yes	39 (2.0)	27 (1.6)	12 (4.1)
Liver disease, n (%)				0.901
No	1897 (96.9)	1613 (96.9)	284 (96.6)
Yes	61 (3.1)	51 (3.1)	10 (3.4)
Kidney disease, n (%)				0.033
No	1858 (94.9)	1586 (95.4)	272 (92.2)
Yes	100 (5.1)	77 (4.6)	23 (7.8)
Digestive disease, n (%)				1.000
No	1537 (78.4)	1306 (78.4)	231 (78.3)
Yes	424 (21.6)	360 (21.6)	64 (21.7)
Arthritis or rheumatism, n (%)				0.023
No	1370 (69.8)	1181 (70.8)	189 (64.1)
Yes	592 (30.2)	486 (29.2)	106 (35.9)
Cancer, n (%)				0.330
No	1950 (99.4)	1656 (99.3)	294 (100.0)
Yes	11 (0.6)	11 (0.7)	0 (0.0)
Lung disease, n (%)				0.687
No	1781 (90.7)	1511 (90.6)	270 (91.5)
Yes	182 (9.3)	157 (9.4)	25 (8.5)
Asthma, n (%)				0.739
No	1870 (95.5)	1590 (95.6)	280 (94.9)
Yes	89 (4.5)	74 (4.4)	15 (5.1)
Psychiatric problems, n (%)				1.000
No	1937 (98.8)	1646 (98.8)	291 (98.6)
Yes	24 (1.2)	20 (1.2)	4 (1.4)
Vision problem, n (%)				1.000
No	1858 (94.6)	1579 (94.6)	279 (94.6)
Yes	106 (5.4)	90 (5.4)	16 (5.4)
Hearing problem, n (%)				1.000
No	1809 (92.1)	1537 (92.0)	272 (92.2)
Yes	156 (7.9)	133 (8.0)	23 (7.8)
AIP₂₀₁₂ (mean (SD))	0.4 (0.3)	0.4 (0.3)	0.4 (0.3)	0.006
AIP₂₀₁₅ (mean (SD))	0.4 (0.3)	0.4 (0.3)	0.5 (0.3)	<0.001
AIPcum (mean (SD))	1.2 (0.8)	1.1 (0.8)	1.4 (0.8)	<0.001

SD: standard deviation; MET: metabolic equivalent; BMI: body mass index; SBP: systolic blood pressure; DBP: diastolic blood pressure; MMSE: Mini-Mental State Examination; WBC: white blood cell; HbA1c: glycated hemoglobin; HDL-C: high-density lipoprotein cholesterol; LDL-C: low-density lipoprotein cholesterol; RC: remnant cholesterol; BUN: blood urea nitrogen; eGFR: estimated glomerular filtration rate; CRP: C-reactive protein; AIP: atherogenic index of plasma; AIPcum: cumulative exposure to the atherogenic index of plasma.

* P-values were calculated using Student's t-test for continuous variables and chi-square test for categorical variables.

Comorbidities were more prevalent in the diabetes group, particularly heart disease (19.0% vs. 13.0%; P = 0.008), stroke (4.1% vs. 1.6%; P = 0.011), kidney disease (7.8% vs. 4.6%; P = 0.033), and arthritis (35.9% vs. 29.2%; P = 0.023).

Most notably, all AIP measures were significantly elevated in diabetes converters, including baseline AIP (2012; P = 0.006), follow-up AIP (2015; P < 0.001), and AIPcum (1.4 vs. 1.1; P < 0.001), suggesting potential added prognostic value beyond conventional risk factors.

Associations between AIP measures and risk of diabetes development

To identify distinct longitudinal patterns of AIP, we applied K-means clustering analysis. As shown in Figure 2(a) and (b), we evaluated clustering solutions from k = 2 to k = 8 using both the elbow method and silhouette analysis. While the elbow method showed the steepest WCSS reduction at k = 3, and silhouette analysis indicated a slightly higher average silhouette width for k = 3 (0.37) compared to k = 4 (0.36), we selected k = 4 as the optimal solution based on comprehensive evaluation criteria. The k = 4 solution provided the best balance between statistical clustering quality (silhouette score >0.3, indicating acceptable cluster separation) and clinical interpretability, yielding four distinct and biologically meaningful change patterns that capture the heterogeneous nature of lipid dysregulation in prediabetic individuals. The resulting four AIP change patterns (Figure 2(c) and (d)) were characterized as follows: Cluster 1 (persistently low) maintained consistently low AIP values; Cluster 2 (decreasing) exhibited a marked decline in AIP over time; Cluster 3 (increasing) showed a progressive increase in AIP; and Cluster 4 (persistently high) maintained consistently elevated AIP levels. Density plots of AIP values in 2012 and 2015 (Figure 2(e) and (f)) further demonstrated clear separation between clusters and stability of the clustering solution, supporting the validity of our k = 4 selection.

Figure 2.

Identification of longitudinal atherogenic index of plasma (AIP) trajectory patterns in adults with prediabetes: (a) elbow method for determining optimal cluster number; (b) silhouette analysis evaluating clustering quality; (c) K-means clustering results (k = 4) showing four distinct AIP trajectory patterns; (d) cluster mean trajectories from 2012 to 2015: persistently low (Cluster 1), decreasing (Cluster 2), increasing (Cluster 3), and persistently high (Cluster 4); and (e and f) density distribution plots of AIP scores in 2012 (e) and 2015 (f).

Table 2 presents univariate and multivariate logistic regression analyses examining associations between AIP measurements and diabetes risk. For baseline AIP, each SD increase was associated with 77% higher diabetes risk in the unadjusted model (OR = 1.77, 95% CI: 1.20–2.61, P = 0.004). This association persisted after adjustment for sociodemographic and lifestyle factors in Model 2 (OR = 1.83, 95% CI: 1.21–2.76, P = 0.004) and clinical factors in Model 3 (OR = 1.79, 95% CI: 1.16–2.77, P = 0.009). Quartile analysis revealed participants in the highest baseline AIP quartile had 52% higher diabetes risk compared to the lowest quartile in the fully adjusted model (OR = 1.52, 95% CI: 1.03–2.26, P = 0.037), with a significant dose–response relationship (P for trend = 0.018).

Table 2.

Associations between baseline AIP, AIPcum, AIP change patterns, and risk of diabetes development.

	Model 1		Model 2		Model 3
	OR (95% CI)	P-value	OR (95% CI)	P-value	OR (95% CI)	P-value
AIP₂₀₁₂ (per SD increase)	1.77 (1.20–2.61)	0.004	1.83 (1.21–2.76)	0.004	1.79 (1.16–2.77)	0.009
AIP₂₀₁₂ IQR
Q1	Reference		Reference		Reference
Q2	0.93 (0.64–1.36)	0.710	0.92 (0.62–1.37)	0.687	0.98 (0.65–1.48)	0.921
Q3	1.30 (0.91–1.86)	0.146	1.28 (0.88–1.86)	0.200	1.23 (0.83–1.82)	0.315
Q4	1.46 (1.03–2.07)	0.035	1.52 (1.06–2.21)	0.025	1.52 (1.03–2.26)	0.037
P for trend	0.009		0.007		0.018
AIPcum (per SD increase)	1.47 (1.26–1.72)	<0.001	1.51 (1.28–1.78)	<0.001	1.48 (1.24–1.77)	<0.001
AIPcum IQR
Q1	Reference		Reference		Reference
Q2	1.41 (0.96–2.09)	0.078	1.41 (0.95–2.13)	0.092	1.34 (0.88–2.05)	0.173
Q3	1.39 (0.95–2.06)	0.095	1.40 (0.94–2.11)	0.104	1.22 (0.80–1.88)	0.354
Q4	2.32 (1.62–3.35)	<0.001	2.43 (1.66–3.58)	<0.001	2.37 (1.60–3.57)	<0.001
P for trend	<0.001		<0.001		<0.001
AIP change patterns
Cluster 1 (persistently low)	Reference		Reference		Reference
Cluster 2 (decreasing)	1.27 (0.88–1.85)	0.203	1.26 (0.86–1.87)	0.240	1.13 (0.75–1.71)	0.560
Cluster 3 (increasing)	2.49 (1.75–3.56)	<0.001	2.61 (1.81–3.80)	<0.001	2.50 (1.70–3.69)	<0.001
Cluster 4 (persistently high)	2.46 (1.71–3.55)	<0.001	2.66 (1.81–3.92)	<0.001	2.57 (1.71–3.87)	<0.001

AIP: atherogenic index of plasma; AIPcum: cumulative exposure to atherogenic index of plasma; OR: odds ratio; CI: confidence interval; SD: standard deviation; IQR: interquartile range; BMI: body mass index; SBP: systolic blood pressure; DBP: diastolic blood pressure; HbA1c: glycated hemoglobin.

Model 1 is not adjusted for any of the variables.

Model 2 is adjusted for age, gender, residence, marital status, education, smoking status, and drinking status.

Model 3 is adjusted for age, gender, residence, marital status, education, smoking status, drinking status, BMI, SBP, DBP, and HbA1c.

Cumulative AIP exposure demonstrated stronger associations with diabetes risk. Each SD increase in AIPcum was associated with 48% higher diabetes risk after full adjustment (OR = 1.48, 95% CI: 1.24–1.77, P < 0.001). Participants in the highest AIPcum quartile had 2.37-fold increased diabetes risk compared to the lowest quartile (OR = 2.37, 95% CI: 1.60–3.57, P < 0.001), with a highly significant dose-response relationship (P for trend<0.001).

For AIP change patterns, participants in the “Increasing” and “Persistently High” clusters had 2.50-fold (OR = 2.50, 95% CI: 1.70–3.69, P < 0.001) and 2.57-fold (OR = 2.57, 95% CI: 1.71–3.87, P < 0.001) higher diabetes risk, respectively, compared to the “Persistently Low” cluster. The “Decreasing” cluster showed no significant association with diabetes risk (OR = 1.13, 95% CI: 0.75–1.71, P = 0.560).

Feature selection and performance of ML models for diabetes risk prediction

Feature selection was performed using two separate and complementary methods operating independently: the Boruta algorithm and RFE. As shown in Figure 3, this combined approach identified seven key predictors of diabetes development among individuals with prediabetes. AIPcum emerged as the most important feature, followed by glucose, waist circumference, BMI, hand grip strength, uric acid, and LDL-C. These seven variables were consistently identified as important predictors across multiple feature selection methods and were subsequently used for all model development and analyses.

Figure 3.

Feature selection results for diabetes conversion prediction in adults with prediabetes. (a) Sorted feature ranking distribution by the Boruta algorithm. Displays features ranked by their importance scores, with classification into selected, tentative, or rejected categories based on statistical comparison with shadow features. (b) Feature contribution and area under the curve (AUC) performance with recursive feature elimination. Shows feature contribution (importance) on the left y-axis and mean AUC performance on the right y-axis as features are sequentially eliminated.

Five ML models were evaluated for predicting diabetes development among individuals with prediabetes. As shown in Figure 4 and Supplemental Table S1, the MLP model achieved the highest accuracy in both training (85.2%) and test (79.7%) sets, with superior specificity (89.7% and 86.0%, respectively). However, its relatively low sensitivity (59.7% in training and 48.5% in test sets) limited its ability to correctly identify individuals who developed diabetes.

Figure 4.

ROC curves for diabetes conversion prediction in adults with prediabetes using five machine learning models. ROC curves with AUC and CIs for LR, RF, MLP, XGBoost, and LightGBM models. (a) Training set and (b) test set.

The RF model demonstrated the most balanced overall performance, with the highest AUROC of 0.877 in the training set and 0.760 in the test set. This model maintained good sensitivity (82.0% and 72.7%) while preserving reasonable specificity (76.1% and 68.0%) across both datasets.

XGBoost and LightGBM performed similarly but slightly below RF, with test set AUROCs of 0.736 and 0.721, respectively. These models showed moderate but balanced sensitivity and specificity values. In contrast, the LR model exhibited the poorest overall performance, with the lowest AUROC (0.647 in the test set) and a substantial imbalance between high sensitivity (83.8%) and low specificity (26.5%), suggesting a tendency to over-classify individuals as high-risk.

The MLP model achieved the highest F1 scores (0.547 in training and 0.444 in test sets), while the RF model achieved F1 scores of 0.516 in training and 0.439 in test sets. Despite the MLP model's superior F1 performance, the RF model demonstrated the most balanced overall performance, achieving the highest AUROC values (0.877 in training and 0.760 in test sets) while maintaining good sensitivity (82.0% and 72.7%) and preserving reasonable specificity (76.1% and 68.0%) across both datasets. This superior balance between discrimination ability and clinical utility made the RF model the preferred choice for subsequent SHAP analyses.

Confusion matrix analysis provided additional insights into the classification performance of each model (Supplemental Table S2). The RF model demonstrated balanced performance with 64 true positives, 24 false negatives, 341 true negatives, and 160 false positives in the test set, correctly identifying 72.7% of participants who developed diabetes while maintaining reasonable specificity. In contrast, the LR model exhibited a problematic pattern with excessive false positives (368 out of 501 non-diabetes participants in the test set), explaining its poor specificity and limited clinical utility. The MLP model showed the opposite pattern with high specificity but lower sensitivity, resulting in 45 false negatives out of 88 diabetes cases in the test set. XGBoost and LightGBM demonstrated moderate performance with similar false positive rates (160 and 176, respectively) but higher false negative rates (28 each) compared to the RF model. These results underscore the importance of considering both sensitivity and specificity trade-offs when evaluating model performance for clinical applications.

Interpretability analysis of the RF model based on SHAP

The SHAP summary plot (Figure 5(a)) identified primary contributors to diabetes risk prediction: elevated glucose, reduced hand grip strength, increased AIPcum, and enlarged waist circumference. Quantitative feature importance ranking (Figure 5(b)) confirmed glucose as the most influential predictor, followed by hand grip strength and AIPcum.

Figure 5.

SHAP interpretation of the random forest model for diabetes conversion prediction in adults with prediabetes. (a) Summary plot of SHAP values. Each row represents a feature, and each point denotes an individual sample. The x-axis indicates SHAP values (impact on model output), with positive values increasing and negative values decreasing diabetes conversion risk. Red and blue points represent high and low feature values, respectively. (b) Feature importance ranked by mean absolute SHAP values, with glucose, hand grip strength, and AIPcum as the top three contributors. (c) Force plot for individualized prediction in a representative patient (predicted diabetes conversion risk = 0.72). Red bars indicate the risk-enhancing features and blue bars indicate the protective features. Bar length reflects the magnitude of the contribution. (d) SHAP dependence plots showing non-linear associations between selected features (x-axis) and their SHAP values (y-axis), with vertical green dashed lines indicating critical threshold values.

For a representative participant with a predicted diabetes probability of 0.72 (Figure 5(c)), the force plot demonstrated how specific risk factors influenced this prediction. Elevated AIPcum (1.37), high uric acid (4.69 mg/dL), reduced hand grip strength (25.0 kg), increased BMI (29.23 kg/m²), and enlarged waist circumference (110.9 cm) collectively elevated diabetes risk, while lower LDL-C (86.6 mg/dL) and moderately elevated glucose (103.14 mg/dL) exerted protective effects.

The dependence plots (Figure 5(d)) revealed clinically relevant threshold effects across multiple predictors. BMI demonstrated a threshold at ∼25.85 kg/m², below which SHAP values became increasingly negative, indicating protection. Glucose exhibited a positive relationship with diabetes risk that accelerated above 107 mg/dL. AIPcum showed a biphasic pattern, with values below 0.89 being protective and values above being detrimental. Hand grip strength conferred protection most substantially above 30 kg, while waist circumference exhibited increased risk above 86 cm. UA demonstrated a threshold effect near 4.2 mg/dL, and LDL-C showed a protective effect at values below 110 mg/dL.

To investigate interactions between key predictors, we generated two-dimensional heatmaps (Figure 6) revealing how combinations of AIPcum with other risk factors synergistically influenced diabetes risk. The interaction between BMI and AIPcum (Figure 6(a)) demonstrated that the combination of high BMI and elevated AIPcum substantially increased diabetes risk (indicated by dark red regions), while low values of both parameters conferred protection (dark blue regions). Notably, even with normal BMI, increasing AIPcum raised diabetes risk, suggesting that unfavorable lipid profiles could counteract the benefits of normal weight. For glucose and AIPcum (Figure 6(b)), the highest risk occurred with elevated levels of both parameters, particularly when glucose exceeded 107 mg/dL and AIPcum exceeded 0.7. Interestingly, at intermediate AIPcum levels (1.0–1.5), glucose showed a stronger influence on risk, while at higher AIPcum values, the effect became more pronounced. The interaction between hand grip strength and AIPcum (Figure 6(c)) revealed that higher hand grip strength (>30 kg) provided protection against diabetes risk even at moderately elevated AIPcum levels, while the combination of reduced hand grip strength (<30 kg) and elevated AIPcum (>0.7) was associated with substantially increased diabetes risk. Waist circumference and AIPcum showed a strong synergistic effect (Figure 6(d)), with the highest risk observed when both parameters were elevated (waist >90 cm and AIPcum >1.0). Similarly, the UA and AIPcum interaction (Figure 6(e)) demonstrated increased risk with elevated levels of both markers, particularly when uric acid exceeded 4 mg/dL and AIPcum exceeded 0.7. The LDL-C and AIPcum interaction (Figure 6(f)) showed a more complex pattern, with the highest risk observed at moderate LDL-C levels (125–170 mg/dL) combined with elevated AIPcum (0.7–1.5), while both very low and very high LDL-C levels showed different interaction patterns with AIPcum.

Figure 6.

Interaction heatmaps between AIPcum and other key risk factors for diabetes conversion in adults with prediabetes. Two-dimensional heatmaps showing diabetes conversion probability (blue: low risk and red: high risk) across combinations of AIPcum (y-axis) and other predictors (x-axis). (a) BMI and AIPcum interaction; (b) glucose and AIPcum interaction; (c) hand grip strength and AIPcum interaction; (d) waist circumference and AIPcum interaction; (e) uric acid and AIPcum interaction; and (f) LDL-C and AIPcum interaction.

Incremental value

Adding AIPcum to the base model increased AUROC from 0.695 to 0.708 (ΔAUROC = 0.013, DeLong P = 0.337), representing the largest (though modest) gain among the indices examined (TyG Δ0.009, P = 0.298; LAP Δ0.004, P = 0.546; VAI Δ0.006, P = 0.451; METS-IR Δ0.011, P = 0.334). Calibration was acceptable by reliability plots (intercept ∼0 and slope ∼1). Decision-curve analysis indicated slightly higher net benefit for base + AIPcum compared with base and with models including TyG/LAP/VAI/METS-IR across 5%–20% risk thresholds (Supplemental Table S3 and Figure S1). Together, these results support AIPcum as a parsimonious and interpretable enhancer to standard covariates, performing at least as well as competing indices.

Sensitivity analyses

In the baseline-only analysis (2011–2012 AIP only), higher baseline AIP remained associated with conversion to diabetes: per-SD OR = 1.79 (95% CI: 1.16–2.77) and Q4 versus Q1 OR = 1.52 (95% CI: 1.03–2.26), consistent with the main findings (Table 2). For the primary dynamic contrast (AIPcum Q4 vs. Q1), the E-values were 4.17 for the point estimate and 2.58 for the lower 95% CI bound; for baseline AIP, E-values were 2.98 (per-SD point), 1.59 (per-SD lower CI), 2.41 (Q4 vs. Q1 point), and 1.21 (Q4 vs. Q1 lower CI) (Supplemental Table S4).

Discussion

In this large, nationally representative cohort of middle-aged and older Chinese adults with prediabetes, we investigated the relationship between dynamic measures of AIP and progression to diabetes. AIPcum was independently associated with diabetes development, with individuals in the highest quartile exhibiting a 2.37-fold increased risk compared to those in the lowest quartile after adjusting for established risk factors. We identified four distinct AIP change patterns, with “Persistently High” and “Increasing” patterns conferring ∼2.5-fold higher diabetes risk compared to the “Persistently Low” pattern. ML analysis identified glucose as the primary predictor of diabetes development, with hand grip strength and AIPcum as secondary yet critical contributors.

Our findings extend previous cross-sectional studies linking elevated AIP with prevalent diabetes^8,33 and prospective studies identifying associations between baseline AIP and incident diabetes.¹⁸ However, our study uniquely assesses AIP dynamics and their relationship with diabetes progression, specifically in prediabetic individuals. The strong independent associations observed with AIPcum and change patterns suggest that dynamic AIP measures provide additional prognostic information beyond conventional assessments.

The clinical utility of dynamic AIP measures extends beyond traditional risk factors through several mechanisms. While conventional diabetes risk factors such as age, BMI, and fasting glucose are well-established predictors, our findings demonstrate that AIP dynamics capture distinct pathophysiological processes related to atherogenic dyslipidemia that are not fully captured by these traditional markers. The stronger association observed with cumulative AIP exposure (OR = 2.37 for the highest vs. the lowest quartile) compared to single-point measurements underscores the importance of sustained lipid dysregulation in diabetes pathogenesis. This finding is particularly clinically relevant because AIP can be easily calculated from routine lipid panels without additional cost, yet provides prognostic information that complements rather than merely duplicates existing risk assessment tools.

The stronger association between AIPcum and diabetes risk, compared to single-time-point measurements, underscores the role of sustained lipid dysregulation in diabetes pathogenesis. AIP, calculated as the log-transformed ratio of TGs to HDL-C, serves as a surrogate marker for an atherogenic lipid profile characterized by elevated TGs, decreased HDL-C, and a higher proportion of small, dense LDL particles.^6,34 Persistent elevation in AIP indicates prolonged exposure to an atherogenic lipid environment, potentially contributing to insulin resistance through multiple biological pathways.³⁵

Elevated TGs and their metabolites can accumulate in non-adipose tissues, inducing lipotoxicity and impairing insulin signaling,^36,37 while reduced HDL-C levels compromise anti-inflammatory and antioxidant functions that normally protect against insulin resistance.³⁸ When sustained, these effects can lead to progressive β-cell dysfunction and impaired glucose homeostasis.³⁹ Chronic lipid dysregulation also contributes to low-grade inflammation and endothelial dysfunction, further exacerbating insulin resistance.⁴⁰ The cumulative nature of these processes may explain why AIPcum demonstrates stronger associations with diabetes development than single-point measurements.

The identification of distinct AIP change patterns provides insights into the heterogeneous nature of lipid dysregulation over time. The “Persistently High” pattern conferred significant risk increase, supporting the notion that chronic exposure to atherogenic lipid profiles promotes diabetes development. Interestingly, the “Increasing” pattern demonstrated similarly elevated risk, suggesting that deteriorating lipid profiles may rapidly accelerate diabetes pathogenesis. Conversely, the “Decreasing” pattern showed no significant association with diabetes risk compared to the “Persistently Low” pattern, suggesting that improvements in AIP may attenuate diabetes risk even among individuals with prediabetes.

Our ML analysis identified a combination of metabolic, anthropometric, and lipid parameters as key predictors of diabetes development. While glucose emerged as the strongest predictor, the high ranking of hand grip strength and AIPcum represents a novel finding. SHAP analysis revealed clinically relevant threshold effects: fasting glucose demonstrated an accelerated positive relationship with diabetes risk above 107 mg/dL, while AIPcum values below 0.89 exhibited protective effects. BMI showed a threshold effect around 25.85 kg/m², aligning with established clinical definitions of overweight status.⁴¹ These thresholds could serve as valuable reference points for clinicians developing personalized preventive strategies.

Interaction analyses revealed complex interrelationships between AIPcum and other metabolic factors. The synergistic effect between AIPcum and BMI suggests that lipid abnormalities may amplify the diabetogenic effects of excess adiposity.⁴² The interaction between AIPcum and glucose indicates that atherogenic dyslipidemia may lower the threshold at which glycemic abnormalities progress to diabetes. Notably, elevated AIPcum could counteract the benefits of normal BMI, suggesting that “metabolically unhealthy normal weight” individuals might benefit from lipid-focused interventions.⁴³

One of the most intriguing findings was the identification of hand grip strength as the second most important predictor of diabetes development. This finding adds to growing evidence suggesting muscular strength plays a significant role in metabolic health.^44,45 Several mechanisms might explain this association: decreased muscle strength often reflects sarcopenia, which is associated with impaired glucose disposal⁴⁶; myokines secreted during muscle contraction improve insulin sensitivity⁴⁷; and muscle-adipose tissue crosstalk plays a critical role in metabolic homeostasis.⁴⁸

These findings align with recent longitudinal studies linking grip strength to diabetes risk.^49,50 Our study extends these findings by demonstrating that hand grip strength remains a strong predictor even after accounting for AIP and other metabolic factors, specifically in individuals with prediabetes. Higher hand grip strength may provide protection against diabetes development even with elevated AIPcum, suggesting that interventions to improve muscular strength might mitigate the adverse effects of atherogenic dyslipidemia on glucose metabolism.

Hand grip strength assessment offers several advantages in clinical settings: it is inexpensive, non-invasive, and easily measured. Unlike some risk factors, grip strength is modifiable through targeted interventions such as resistance training.⁵¹ These findings highlight the importance of including physical function parameters in diabetes risk evaluation and suggest that resistance training should be an integral component of diabetes prevention strategies, particularly for individuals with prediabetes and unfavorable lipid profiles.

Our interpretable ML approach demonstrates clinically meaningful predictive capability (AUROC = 0.760) comparable to established diabetes risk scores,^52,53 while offering several distinct advantages over traditional risk assessment methods. Beyond capturing complex non-linear relationships and interactions that conventional models miss, our SHAP-enhanced methodology provides transparent, actionable insights that can be readily translated into clinical decision support tools. The analysis revealed clinically relevant thresholds and factor interactions that enable personalized risk stratification, while SHAP force plots allow clinicians to visualize individual risk contributions and explain to patients why they are at high risk and which specific factors drive their risk profile.

This interpretability has significant implications for clinical practice and patient engagement. Clinicians can develop targeted intervention strategies based on each patient's dominant risk factors—for instance, directing patients with elevated AIPcum but normal BMI toward lipid-focused interventions, while recommending resistance training programs for those with reduced hand grip strength. This personalized approach, enabled by transparent AI explanations, enhances patient understanding and engagement while informing more precise, individualized prevention strategies that address each patient's specific risk determinants rather than applying generic interventions.

In head-to-head comparisons against TyG, LAP, VAI, and METS-IR, AIPcum delivered the largest—albeit modest—increment in discrimination with acceptable calibration and slightly higher clinical net benefit on decision curves. Given the similar magnitude of improvements across indices, the parsimony and interpretability of AIPcum favor its practical use.

Despite its strengths, our study has several limitations. First, we did not include oral glucose tolerance testing (OGTT) in baseline assessments. While OGTT is the gold standard for diagnosing impaired glucose tolerance (IGT), our use of FPG and HbA1c reflects standard practice in large-scale epidemiological studies and routine clinical screening. The absence of OGTT may have resulted in misclassification of some participants with undiagnosed diabetes as prediabetic or exclusion of individuals with isolated IGT, potentially influencing the observed AIP-diabetes associations. This limitation was imposed by the CHARLS study design, which prioritized feasibility in large-scale population screening. Future studies should incorporate the OGTT to provide more precise prediabetes classification. Second, the relatively short follow-up period may not fully capture long-term AIP dynamics and diabetes development. We were limited to only two time points for AIP measurements, constraining our ability to characterize more complex trajectories. Third, despite comprehensive confounder adjustment, residual confounding remains a concern, particularly regarding lifestyle factors. We had limited capacity to account for lifestyle interventions that significantly influence diabetes development. While we collected data on smoking, alcohol consumption, and physical activity (IPAQ), several limitations affected the comprehensive incorporation of these factors. Notably, detailed dietary information was unavailable in the CHARLS database, preventing adjustment for nutritional patterns that directly influence TG and HDL-C levels—the key AIP components. Additionally, physical activity data exceeded our 30% missing data threshold and was excluded from ML models to maintain analytical rigor. Given that dietary patterns and physical activity are established modifiers of lipid metabolism and diabetes risk, their incomplete assessment represents potential unmeasured confounding that could influence observed associations between AIP dynamics and diabetes progression.

Fourth, generalizability may be limited to Chinese adults aged ≥45 years, and—critically—medication data in CHARLS are available only as self-reported broad categories without drug class, dose, or duration. As a result, we could not adjust for or exclude specific glucose- or lipid-affecting agents (e.g. statins/fibrates, glucocorticoids, thiazides, β-blockers, atypical antipsychotics, calcineurin inhibitors, and selected antiretrovirals), and we did not exclude participants based on medication use. We state this explicitly as a major limitation. Such incomplete medication detail can plausibly introduce residual confounding in both directions, for example, statins/fibrates may lower TG and/or raise HDL-C and thereby reduce AIP (biasing our associations toward the null), whereas glucocorticoids or some antihypertensives may increase diabetes risk (biasing away from the null). Future studies should stratify analyses by medication use or exclude users of relevant pharmacologic interventions when detailed drug information is available.

Fifth, our machine learning models demonstrated notable performance decline from training to test sets, with the RF model AUROC declining from 0.877 to 0.760 (13.3% decrease). While test set performance still represents good discriminative ability comparable to published diabetes risk prediction models, this performance gap raises concerns about generalizability to external populations. This limitation may reflect the inherent complexity of predicting diabetes conversion in prediabetic populations, modest sample size, and class imbalance (15.0% conversion rate). Despite implementing rigorous cross-validation, regularization, and class weighting strategies, external validation in independent cohorts remains essential.

Future studies should extend follow-up periods beyond 5–10 years with regular AIP assessments to elucidate the temporal dynamics between atherogenic dyslipidemia and diabetes development. Validation across diverse populations is necessary as AIP-diabetes relationships likely vary by ethnicity, age, and geography. Interventional studies examining AIP reduction through targeted lifestyle modifications and pharmacological approaches in prediabetic individuals would establish causality and inform preventive strategies. Most importantly, external validation of our ML models in independent cohorts is crucial to assess true generalizability and clinical utility beyond the current study population.

Conclusions

Cumulative AIP exposure and unfavorable AIP change patterns independently predict prediabetes-to-diabetes progression. Our interpretable ML approach not only enhances predictive accuracy compared to traditional methods but also provides clinically actionable insights through individual-level risk explanations and identification of key interaction effects. Integrating these ML-derived insights into clinical decision support systems may revolutionize diabetes risk assessment by enabling more precise, personalized prevention strategies and improving patient engagement through transparent risk communication.

Supplemental Material

sj-xlsx-1-dhj-10.1177_20552076251390574 - Supplemental material for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes

Supplemental material, sj-xlsx-1-dhj-10.1177_20552076251390574 for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes by Zhan Chen, Yue Zhang and Liming Wang in DIGITAL HEALTH

Supplemental Material

sj-xlsx-2-dhj-10.1177_20552076251390574 - Supplemental material for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes

Supplemental material, sj-xlsx-2-dhj-10.1177_20552076251390574 for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes by Zhan Chen, Yue Zhang and Liming Wang in DIGITAL HEALTH

Supplemental Material

sj-xlsx-3-dhj-10.1177_20552076251390574 - Supplemental material for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes

Supplemental material, sj-xlsx-3-dhj-10.1177_20552076251390574 for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes by Zhan Chen, Yue Zhang and Liming Wang in DIGITAL HEALTH

Supplemental Material

sj-xlsx-4-dhj-10.1177_20552076251390574 - Supplemental material for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes

Supplemental material, sj-xlsx-4-dhj-10.1177_20552076251390574 for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes by Zhan Chen, Yue Zhang and Liming Wang in DIGITAL HEALTH

Supplemental Material

sj-docx-5-dhj-10.1177_20552076251390574 - Supplemental material for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes

Supplemental material, sj-docx-5-dhj-10.1177_20552076251390574 for Dynamics of the atherogenic index of plasma and diabetes conversion: Insights from machine learning and longitudinal pattern analysis in middle-aged and older Chinese adults with prediabetes by Zhan Chen, Yue Zhang and Liming Wang in DIGITAL HEALTH

Footnotes

Abbreviations

Acknowledgements

The authors express their sincere gratitude to the participants of the CHARLS for their valuable contributions. We thank the research team at the National School of Development, Peking University, for their meticulous data collection and management.

ORCID iD

Yue Zhang

Ethics approval and consent to participate

The study protocol was reviewed and approved by the Institutional Review Board of Peking University (IRB00001052-11015). Written informed consent was obtained from all participants prior to their inclusion in the CHARLS.

Consent for publication

Not applicable.

Authors’ contributions

YZ, LW, and ZC conceived the study design and conceptualization. YZ and ZC performed data analysis and statistical interpretation. ZC drafted the initial manuscript. YZ and LW critically revised the manuscript for important intellectual content. All authors read and approved the final manuscript.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The datasets used and analyzed during the current study are available from the CHARLS database. Researchers can access these data through a formal application process at .

Supplemental material

Supplemental material for this article is available online.

References

Magliano

Boyko

and committee IDFDAtes . IDF Diabetes Atlas. In Idf diabetes atlas. Brussels: International Diabetes Federation, 2021, p.1.

Tabák

Herder

Rathmann

, et al. Prediabetes: a high-risk state for diabetes development. Lancet 2012; 379: 2279–2290.

Wang

Peng

Zhao

, et al. Prevalence and treatment of diabetes in China, 2013–2018. Jama 2021; 326: 2498–2506.

Parhofer

. Diabetic dyslipidemia. Metabolism 2014; 63: 1469–1479.

Taskinen

Borén

. New insights into the pathophysiology of dyslipidemia in type 2 diabetes. Atherosclerosis 2015; 239: 483–495.

Dobiásová

Frohlich

. The plasma parameter log (TG/HDL-C) as an atherogenic index: correlation with lipoprotein particle size and esterification rate in apoB-lipoprotein-depleted plasma (FER(HDL)). Clin Biochem 2001; 34: 583–588.

Gao

Zheng

, et al. Atherogenic index of plasma (AIP): a novel predictive indicator for the coronary artery disease in postmenopausal women. Lipids Health Dis 2018; 17: 197.

Zhu

Zhou

, et al. Atherogenic index of plasma is a novel and better biomarker associated with obesity: a population-based cross-sectional study in China. Lipids Health Dis 2018; 17: 37.

Zhou

Yang

, et al. Predictive performance of lipid parameters in identifying undiagnosed diabetes and prediabetes: a cross-sectional study in eastern China. BMC Endocr Disord 2022; 22: 76.

10.

Jasim

Mahmood

Ad'hiah

. Significance of lipid profile parameters in predicting pre-diabetes. Arch Razi Inst 2022; 77: 277–284.

11.

Simon

Peterson

Castro

, et al. Predicting diabetes clinical outcomes using longitudinal risk factor trajectories. BMC Med Inform Decis Mak 2020; 20: 6.

12.

Nguyen

Tung

Nguyen

, et al. The utilization of machine learning algorithms for assisting physicians in the diagnosis of diabetes. Diagnostics (Basel) 2023; 13: 2087.

13.

Ahmad

Teredesai

Eckert

. Interpretable machine learning in healthcare. In: 2018 IEEE International Conference on Healthcare Informatics (ICHI). 4–7 June 2018, pp.447–447.

14.

Lundberg

Lee

S-I

. A unified approach to interpreting model predictions. In Proceedings of the 31st international conference on neural information processing systems. Long Beach, California, USA: Curran Associates Inc., 2017, pp.4768–4777.

15.

Zhao

Smith

, et al. Cohort profile: the China Health and Retirement Longitudinal Study (CHARLS). Int J Epidemiol 2014; 43: 61–68.

16.

American Diabetes Association . 2. Classification and diagnosis of diabetes: standards of medical care in diabetes-2022. Diabetes Care 2022; 45: S17–S38.

17.

Chen

Crimmins

, et al. Venous blood-based biomarkers in the China Health and Retirement Longitudinal Study: rationale, design, and results from the 2015 wave. Am J Epidemiol 2019; 188: 1871–1877.

18.

Zhou

. Association between the atherogenic index of plasma and long-term risk of type 2 diabetes: a 12-year cohort study based on the Japanese population. Cardiovasc Diabetol 2025; 24(1): 50.

19.

Zheng

, et al. Relationship between the cumulative exposure to atherogenic index of plasma and ischemic stroke: a retrospective cohort study. Cardiovasc Diabetol 2023; 22(1): 313.

20.

Genolini

Falissard

. Kml: k-means for longitudinal data. Comput Stat 2010; 25: 317–328.

21.

Zhao

Crimmins

, et al. Prevalence, diagnosis, and management of diabetes mellitus among older Chinese: results from the China Health and Retirement Longitudinal Study. Int J Public Health 2016; 61: 347–356.

22.

Bai

Tao

Huang

, et al. Effects of physical activity on cognitive function among patients with diabetes in China: a nationally longitudinal study. BMC Public Health 2021; 21(1): 481.

23.

Han

Zhang

, et al. Systemic inflammation accelerates the adverse effects of air pollution on metabolic syndrome: findings from the China Health and Retirement Longitudinal Study (CHARLS). Environ Res 2022; 215: 114340.

24.

Jia

Yang

. Mini-Mental State Examination in Elderly Chinese: a population-based normative study. J Alzheimers Dis 2016; 53: 487–496.

25.

Chen

Mui

. Factorial validity of the Center for Epidemiologic Studies Depression Scale short form in older population in China. Int Psychogeriatr 2014; 26: 49–57.

26.

Troyanskaya

Cantor

Sherlock

, et al. Missing value estimation methods for DNA microarrays. Bioinformatics 2001; 17: 520–525.

27.

Donders

van der Heijden

Stijnen

, et al. Review: a gentle introduction to imputation of missing values. J Clin Epidemiol 2006; 59: 1087–1091.

28.

Speiser

Miller

Tooze

, et al. A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst Appl 2019; 134: 93–101.

29.

Guyon

Weston

Barnhill

, et al. Gene selection for cancer classification using support vector machines. Mach Learn 2002; 46: 389–422.

30.

Garcia

. Learning from imbalanced data. IEEE Trans Knowl Data Eng 2009; 21: 1263–1284.

31.

Saito

Rehmsmeier

. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One 2015; 10(3): e0118432.

32.

VanderWeele

Ding

. Sensitivity analysis in observational research: introducing the E-value. Ann Intern Med 2017; 167: 268–274.

33.

Jiang

, et al. Non-linear associations of atherogenic index of plasma with prediabetes and type 2 diabetes mellitus among Chinese adults aged 45 years and above: a cross-sectional study from CHARLS. Front Endocrinol (Lausanne) 2024; 15: 1360874.

34.

Dobiásová

. AIP–atherogenic index of plasma as a significant predictor of cardiovascular risk: from research to practice. Vnitr Lek 2006; 52: 64–71.

35.

Boden

. Obesity, insulin resistance and free fatty acids. Curr Opin Endocrinol Diabetes Obes 2011; 18: 139–143.

36.

Yazıcı

Sezer

. Insulin resistance, obesity and lipotoxicity. Adv Exp Med Biol 2017; 960: 277–304.

37.

Accili

Deng

Liu

. Insulin resistance in type 2 diabetes mellitus. Nat Rev Endocrinol 2025; 21: 413–426.

38.

von Eckardstein

Widmann

. High-density lipoprotein, beta cells, and diabetes. Cardiovasc Res 2014; 103: 384–394.

39.

Brunham

Kruit

Verchere

, et al. Cholesterol in islet dysfunction and type 2 diabetes. J Clin Invest 2008; 118: 403–408.

40.

Esser

Legrand-Poels

Piette

, et al. Inflammation as a link between obesity, metabolic syndrome and type 2 diabetes. Diabetes Res Clin Pract 2014; 105: 141–150.

41.

Tan

KCB

. Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies. Lancet 2004; 363: 157–163.

42.

Saltiel

Olefsky

. Inflammatory mechanisms linking obesity and metabolic disease. J Clin Invest 2017; 127: 1–4.

43.

Stefan

Schick

Häring

HU.

Causes, characteristics, and consequences of metabolically unhealthy normal weight in humans. Cell Metab 2017; 26: 292–300.

44.

Wittert

Vincent

, et al. Muscle grip strength predicts incident type 2 diabetes: population-based cohort study. Metabolism 2016; 65: 883–892.

45.

Lee

Jung

Bang

, et al. Association between muscle strength and type 2 diabetes mellitus in adults in Korea: data from the Korea National Health and Nutrition Examination Survey (KNHANES) VI. Medicine (Baltimore) 2018; 97: e10984.

46.

DeFronzo

Tripathy

. Skeletal muscle insulin resistance is the primary defect in type 2 diabetes. Diabetes Care 2009; 32: S157–S163.

47.

Severinsen

MCK

Pedersen

. Muscle-organ crosstalk: the emerging roles of myokines. Endocr Rev 2020; 41: 594–609.

48.

Romero-Corral

Somers

Sierra-Johnson

, et al. Normal weight obesity: a risk factor for cardiometabolic dysregulation and cardiovascular mortality. Eur Heart J 2010; 31: 737–746.

49.

Wang

Lee

Brellenthin

, et al. Association of muscular strength and incidence of type 2 diabetes. Mayo Clin Proc 2019; 94: 643–651.

50.

Yeung

CHC

Au Yeung

Fong

SSM

, et al. Lean mass, grip strength and risk of type 2 diabetes: a bi-directional Mendelian randomisation study. Diabetologia 2019; 62: 789–799.

51.

Strasser

Pesta

. Resistance training for diabetes prevention and therapy: experimental findings and molecular mechanisms. Biomed Res Int 2013; 2013: 805217.

52.

Lindström

Tuomilehto

. The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care 2003; 26: 725–731.

53.

Bang

Edwards

Bomback

, et al. Development and validation of a patient self-assessment score for diabetes risk. Ann Intern Med 2009; 151: 775–783.

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.

0.00 MB

0.06 MB

0.01 MB