Recalibration of a Non-Laboratory-Based Risk Model to Estimate Pre-Diabetes/Diabetes Mellitus Risk in Primary Care in Hong Kong

Abstract

Introduction/Objectives:

A non-laboratory-based pre-diabetes/diabetes mellitus (pre-DM/DM) risk prediction model developed from the Hong Kong Chinese population showed good external discrimination in a primary care (PC) population, but the estimated risk level was significantly lower than the observed incidence, indicating poor calibration. This study explored whether recalibrating/updating methods could improve the model’s accuracy in estimating individuals’ risks in PC.

Methods:

We performed a secondary analysis on the model’s predictors and blood test results of 919 Chinese adults with no prior DM diagnosis recruited from PC clinics from April 2021 to January 2022 in HK. The dataset was randomly split in half into a training set and a test set. The model was recalibrated/updated based on a seven-step methodology, including model recalibrating, revising and extending methods. The primary outcome was the calibration of the recalibrated/updated models, indicated by calibration plots. The models’ discrimination, indicated by the area under the receiver operating characteristic curves (AUC-ROC), was also evaluated.

Results:

Recalibrating the model’s regression constant, with no change to the predictors’ coefficients, improved the model’s accuracy (calibration plot intercept: −0.01, slope: 0.69). More extensive methods could not improve any further. All recalibrated/updated models had similar AUC-ROCs to the original model.

Conclusion:

The simple recalibration method can adapt the HK Chinese pre-DM/DM model to PC populations with different pre-test probabilities. The recalibrated model can be used as a first-step screening tool and as a measure to monitor changes in pre-DM/DM risks over time or after interventions.

Keywords

pre-diabetes, risk estimation, risk prediction model, early detection, model recalibration

Introduction

Early detection and management of diabetes mellitus (DM) are important to prevent complications and premature mortality. Identifying individuals with pre-diabetes (pre-DM), which is potentially reversible, would be most effective to prevent progression to DM. Multivariable risk prediction models have been developed to facilitate the early detection of individuals with pre-DM and DM.^1,2 Non-laboratory-based risk models could identify and triage those with higher risks for targeted diagnostic blood tests and preventive interventions for better allocation of resources.³ Given the high prevalence of undiagnosed pre-DM/DM in Hong Kong,⁴ 2 new Hong Kong (HK) Chinese non-laboratory-based pre-DM/DM risk prediction models were developed from a population-representative Population Health Survey (PHS) 2014/15 dataset⁴ using logistic regression (LR) and machine learning (ML) methods, respectively.⁵ Both models showed equally good external validity in discriminating pre-DM/DM cases from non-cases in a Chinese adult population recruited from primary care (PC) clinics in Hong Kong.⁶ However, both models showed poor external calibration, as indicated by the significant differences between the absolute pre-DM/DM risks predicted by the models and the observed incidence. Calibration plots of the models indicated that the models tended to systematically underestimate the absolute pre-DM/DM risks for individuals in PC.

The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement recommends that risk models should be recalibrated/updated based on the characteristics of the intended population if poor calibration is observed during validation,⁷ as poorly calibrated risk models could have a lower clinical utility in practice.⁸ In our case, the HK Chinese risk models appear to underestimate the level of absolute pre-DM/DM risks for individuals in the PC population, who tend to have a higher prevalence of pre-DM/DM than the general population and could therefore provide them with a false sense of reassurance regarding their pre-DM/DM risk levels. Furthermore, absolute risk estimates can be used to guide the intensity of interventions and evaluate their effectiveness.^9-13 Recalibrating/updating the HK Chinese pre-DM/DM risk models is required to enable their applications as reliable outcome measures in PC.

Previous studies have shown that recalibrating existing risk models based on the characteristics of risk factors and outcome incidence of the intended population can substantially improve the model’s external predictive accuracy.^14,15 Model recalibration is often the preferred method to generalize existing models to other populations. It builds on the associations established between risk factors and the outcome in the original population while enhancing the prediction accuracy through recalibrating the models’ prediction algorithm to accommodate different predictor distributions and pre-test probability in another population.¹⁶ In addition to model recalibration, more extensive methods, such as re-estimating the regression coefficients of individual predictors (model revision) and including additional predictors that were not available during the original model development process (model extension), have also been deployed to further enhance the predictive performance of models in external populations.

As we found no significant difference in the external performance between the 2 (LR and ML) HK Chinese models in the PC population,⁶ this study aimed to recalibrate/update the LR model, which is more straightforward and transparent than the ML model. We evaluated whether recalibrating, revising and/or extending the HK Chinese non-laboratory-based pre-DM/DM risk prediction LR model could improve the accuracy of estimating pre-DM/DM risks, as well as the discriminatory ability in case-finding of pre-DM/DM in the Hong Kong PC population.

Subjects

Study Design and Data Source

This was a secondary analysis of data on predictors of the model and blood test results of 919 Chinese adults with no prior DM diagnosis who were recruited from PC clinics in Hong Kong in our study on the external validity of HK Chinese pre-DM/DM risk prediction models.⁶ We randomly split the dataset in half into a training set and a test set. We recalibrated/updated the HK Chinese pre-DM/DM risk prediction LR model based on the training set and evaluated the performance of the updated models using the test set. The validation study population was a convenience voluntary sample recruited from public/private PC clinics from 8th April 2021 to 19th January 2022 in Hong Kong. Inclusion criteria were adults between 18 and 84 years with no prior doctor-diagnosed DM, coronary heart disease, stroke, chronic kidney disease, cancer or anemia. Exclusion criteria were individuals who were non-Chinese, could not communicate in Chinese/English, were pregnant, or were too ill to participate. Details on the study population, participant recruitment, and study procedures of the PC validation study are available in the published study protocol.¹⁷ The study was approved by the institutional review board of The University of Hong Kong/Hong Kong Hospital Authority Hong Kong West Cluster (UW19-831) and Hong Kong Hospital Authority Kowloon Central/Kowloon East Cluster (REC(KC/KE)-21-0042/ER-3). The study is registered at the US ClinicalTrial.gov (NCT04881383) and the HKU clinical trials registry (HKUCTR-2808).
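The random half split described above can be sketched as follows. This is a minimal illustration only: the seed, the function name `split_in_half` and the use of consecutive integers as participant IDs are assumptions, not details reported by the study.

```python
import random

def split_in_half(ids, seed=2024):
    """Shuffle participant IDs reproducibly and split them into two halves."""
    rng = random.Random(seed)      # arbitrary, hypothetical seed
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = len(shuffled) // 2       # integer division: 919 // 2 = 459
    return shuffled[:cut], shuffled[cut:]

participant_ids = range(1, 920)    # 919 hypothetical participant IDs
training_ids, test_ids = split_in_half(participant_ids)
print(len(training_ids), len(test_ids))
```

With 919 participants, this yields sets of 459 and 460, which matches the set sizes reported in Table 2.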

In addition to the original model predictors,⁵ the PC validation study also collected data on other risk factors of pre-DM/DM, including the presence/absence of a family history of DM and weekly vegetable consumption. The dataset also included participants’ blood test results on oral glucose tolerance test (OGTT) and hemoglobin A1c (HbA1c) levels, which were used to diagnose pre-DM/DM according to the World Health Organization and American Diabetes Association’s definitions.^18,19 The incidence of pre-DM/DM in the PC validation study population was 53.43% (n = 491), comprising pre-DM in 49.18% (n = 452) and DM in 4.24% (n = 39). The training set and the test set each contained at least 100 cases of pre-DM/DM after the random splitting, which achieved an acceptable statistical power to detect any effect of model recalibration/update on predictive performance.²⁰

Materials and Methods

Outcome Measures

The primary outcome measure of this study was the calibration accuracy of the recalibrated/updated model in the test set, measured by the intercept and calibration slope on the calibration plot and the concordance between the predicted absolute pre-DM/DM risk estimates and the observed incidence. Ideally, the model should have perfect calibration with an intercept of 0 and slope of 1 in the calibration plot.²¹ The secondary outcome was the discrimination of the recalibrated/updated models in detecting pre-DM/DM cases, as evaluated by the area under the receiver operating characteristic curves (AUC-ROC).
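As a hedged illustration of how these two quantities are commonly defined (the exact estimation procedure is not stated here, so this formulation is an assumption), the observed outcome Y can be regressed on the logit of the predicted risk in the test set:

```latex
\operatorname{logit}\bigl(\Pr(Y = 1)\bigr) = a + b \cdot \operatorname{logit}(\hat{p}),
\qquad
\operatorname{logit}(p) = \ln\frac{p}{1 - p}
```

Here b is the calibration slope and a is the intercept, so a = 0 and b = 1 correspond to perfect calibration. When b is fixed at 1, a positive a indicates that risks are underestimated on average, and a slope below 1 suggests that the predictions are too extreme.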

Statistical Analyses

The original multivariable LR model estimates the absolute pre-DM/DM risk from the weighted sum of the non-laboratory-based predictors, namely age, body mass index (BMI), waist-hip ratio (WHR), smoking status, sleep duration, weekly fruit consumption and amount of vigorous activity per week.⁵ It also contains 2 higher-order terms, that is, age² and age × sleep duration. Altogether, the model can be summarized as: Pre-DM/DM risk = 1/(1 + e^{−linear predictor}), where linear predictor = α + β₁ × predictor₁ + . . . + β₉ × predictor₉.⁵ Here, α is the regression constant, and β₁ to β₉ are the regression coefficients of the predictors. Poor calibration was indicated by the calibration plot of the original LR model in external validation on the PC population (intercept: 1.79 [1.64, 1.94], calibration slope: 0.74 [0.60, 0.87]).
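The logistic transformation in the formula above can be sketched in a few lines. The function name `predicted_risk` and the two-predictor coefficients below are hypothetical and chosen only to make the arithmetic easy to follow. They are not the published coefficients, whose predictor coding and units are not restated here.

```python
import math

def predicted_risk(alpha, betas, values):
    """Return 1 / (1 + exp(-LP)), where LP = alpha + sum(beta_i * x_i)."""
    if len(betas) != len(values):
        raise ValueError("need exactly one value per coefficient")
    linear_predictor = alpha + sum(b * x for b, x in zip(betas, values))
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical two-predictor model, NOT the published model.
example_alpha = -2.0
example_betas = [0.03, 0.5]    # illustrative: per year of age, smoker (0/1)
example_person = [50, 1]       # a 50-year-old smoker

print(round(predicted_risk(example_alpha, example_betas, example_person), 3))
```

In this toy case the linear predictor is −2.0 + 1.5 + 0.5 = 0, so the predicted risk is 0.5. Any increase in the regression constant raises every individual's linear predictor by the same amount, which is the mechanism that the simplest recalibration method relies on.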

Using the training set, we recalibrated/updated the model using the methodologies recommended by Janssen et al²² and Steyerberg et al²³ (Table 1). In brief, we applied 7 step-wise methods in an attempt to recalibrate/update the model in which the first 2 are considered as model recalibrating methods, the third and fourth are considered model revising methods that revise the regression coefficients of the predictors, and the final 3 are model extending methods that extend the model with the additional predictors.

Table 1.

Step-Wise Methodology Used to Recalibrate and Update Multivariable Logistic Regression Risk Prediction Model, as Proposed by Janssen et al and Steyerberg et al.

	Parameters of the models that are updated	Description
Model recalibration	1. α	Known as the “calibration in the large.” This method updated only the regression constant of the model to offer predicted risk estimates that match with the observed incidence.
Model recalibration	2. α + β_overall	Known as the “logistic recalibration.” On top of updating the regression constant, this method multiplied an overall update coefficient to the coefficients of each original predictor.
Model revision	3. α + β_overall + γ_{1…9 | P < .05}	This method examines whether any individual predictor has a significantly different effect on the risk of the outcome in the training set. Each predictor was tested in a forward stepwise manner. On top of multiplying the overall update coefficient, the coefficient of a predictor was also updated individually, by multiplying a respective additional revision factor, if a significant relationship to the prediction outcome in the training set (z value, P < .05) and an improvement in the relative goodness of fit of the model (lowered Akaike information criterion [AIC]) were found.
Model revision	4. α + β_{1…9}	This method re-estimates the coefficients of all the predictors based on the training set without selection.
Model extension	5. α + β_overall + γ_{1…9 | P < .05} + β_{10…11 | P < .05}	This method built on method 3 and updated the model by adding additional predictors that were unavailable in model development. The nine original predictors were first tested in a forward stepwise manner, and those that had significant effects in the training set were updated. Additional predictors that had a significant effect on outcome risk, indicated by z value (P < .05) and a lowered AIC, were then added one at a time to extend the model.
Model extension	6. α + β_{1…9} + β_{10…11 | P < .05}	This method built on method 4. Additional predictors that had a significant effect on outcome risk, indicated by z value (P < .05) and a lowered AIC, were added one at a time to extend the model.
Model extension	7. α + β_{1…12}	This method re-estimates the coefficients of the original predictors and includes all available additional predictors without selection.
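The two recalibration methods in Table 1 can be sketched as small logistic regressions on the original model's linear predictor (LP). The sketch below is an illustration under stated assumptions, not the authors' code: the data are simulated, the function names are hypothetical, and a plain Newton-Raphson fit stands in for whatever routine the analysis actually used.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def recalibrate_intercept(lp, y, iters=50):
    """Method 1: estimate a in logit(p) = a + LP (slope fixed at 1)."""
    a = 0.0
    for _ in range(iters):
        p = [sigmoid(a + v) for v in lp]
        grad = sum(yi - pi for yi, pi in zip(y, p))
        hess = sum(pi * (1.0 - pi) for pi in p)
        a += grad / hess                      # one-dimensional Newton step
    return a

def recalibrate_logistic(lp, y, iters=50):
    """Method 2: estimate a and b in logit(p) = a + b * LP."""
    a, b = recalibrate_intercept(lp, y), 1.0  # start from the method 1 fit
    for _ in range(iters):
        p = [sigmoid(a + b * v) for v in lp]
        w = [pi * (1.0 - pi) for pi in p]
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * v for yi, pi, v in zip(y, p, lp))
        h00 = sum(w)
        h01 = sum(wi * v for wi, v in zip(w, lp))
        h11 = sum(wi * v * v for wi, v in zip(w, lp))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det      # two-dimensional Newton step
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Simulated training data (hypothetical): the original LP underestimates risk.
rng = random.Random(1)
lp = [rng.gauss(-2.0, 1.0) for _ in range(459)]
y = [1 if rng.random() < sigmoid(1.8 + 0.8 * v) else 0 for v in lp]

a1 = recalibrate_intercept(lp, y)
a2, b2 = recalibrate_logistic(lp, y)
print("method 1 intercept adjustment:", round(a1, 3))
print("method 2 intercept and overall coefficient:", round(a2, 3), round(b2, 3))
```

After method 1, the mean predicted risk equals the observed incidence in the data used for fitting, which is why this method is called calibration in the large. Method 2 additionally rescales all original coefficients by the same factor b.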

We evaluated the performance, i.e. the calibration (calibration plot) and discrimination (AUC-ROC), of each updated model in the split test set. DeLong’s test was used to compare the AUC-ROCs of the updated models to that of the original model.²⁴ All statistical analyses were done using R 3.5.1 and IBM SPSS Statistics version 26.

Results

The characteristics of participants were similar between the training and test sets (Table 2). The mean ages of the training and test sets were 51.7 and 51.0 years, respectively. The proportions of females were 66.7% (n = 306) and 66.3% (n = 305) in the training and test sets, respectively. The pre-DM/DM incidences of the training set and the test set were 54.2% (n = 249) and 52.6% (n = 242), respectively. The calibration plot of the original model in the test set is shown in Figure 1.

Table 2.

Participants’ Characteristics in the Split Training Set and the Test Set.

	Training set (N = 459)	Test set (N = 460)	t-Test
	Mean (SD)		P-value
Age	51.7 (12.92)	51.0 (14.24)	0.385
BMI (kg/m²)	23.4 (3.64)	23.3 (3.65)	0.742
WHR	0.84 (0.072)	0.85 (0.071)	0.339
Waist (cm)	81.8 (10.27)	82.3 (10.30)	0.509
Hip (cm)	96.8 (7.01)	96.8 (7.19)	0.981
Mean SBP (mmHg)	121.4 (18.83)	120.2 (18.24)	0.325
Mean DBP (mmHg)	72.2 (10.59)	71.7 (10.13)	0.479
Vigorous recreational activity (min/week)	42.6 (101.64)	39.45 (114.90)	0.657
Fruit consumption (serves/month)	37.8 (25.36)	34.7 (23.53)	0.059
Sleep duration (hours/day)	6.8 (1.26)	6.7 (1.24)	0.620
	% (n)		Chi-square test (p)
Pre-DM/DM incidence	54.25 (249)	52.61 (242)	0.618
Smokers	5.01 (23)	5.43 (25)	0.773

Abbreviations: BMI, body mass index; DBP, diastolic blood pressure; SBP, systolic blood pressure; WHR, waist hip ratio.

Figure 1.

Calibration plot of the original model to detect pre-diabetes mellitus and diabetes mellitus on the test set (N = 460). The x-axis is the predicted risk estimates of pre-DM/DM, and the y-axis is the observed case incidence. The curves were fitted based on restricted cubic splines. At the bottom of the graphs, histograms of the predicted risks are shown for the participants with (1) and without (0) pre-DM/DM.

Table 3 summarizes the updates applied to each recalibrated/updated model and their respective predictive performance in the test set. While all updated models improved calibration, there was no significant difference in discrimination between any of the 7 updated models (AUC-ROC: 0.742-0.750) and the original model (AUC-ROC: 0.746). At the study’s proposed sensitivity of 75%,¹⁷ both of the recalibrated models (based on methods 1-2) had a specificity, positive predictive value (PPV) and negative predictive value (NPV) of 0.60, 0.67 and 0.68, respectively. These were identical to those of the original model because the recalibration methods apply the same shift and scaling to every individual’s linear predictor and therefore do not change the ranking of the predicted risks. At a sensitivity of 75%, the specificities, PPVs and NPVs of the updated models (based on methods 3-7) ranged from 0.57 to 0.61, 0.66 to 0.68, and 0.68 to 0.69, respectively, which were not statistically different from those of the original model.
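The unchanged discrimination of the two recalibrated models follows from the fact that the AUC-ROC depends only on the ranking of predicted risks. The sketch below illustrates this with eight hypothetical linear predictors and outcomes. The shift of 1.796 and the pair 1.479 and 0.792 are the Model 1 and Model 2 figures from Table 3, used here purely as illustrative numbers, and the way they are combined with the linear predictor is an assumption.

```python
import math

def risk(lp):
    return 1.0 / (1.0 + math.exp(-lp))

def auc(scores, labels):
    """Probability that a random case scores higher than a random non-case.

    Ties count as one half (the Mann-Whitney formulation of the AUC-ROC).
    """
    cases = [s for s, l in zip(scores, labels) if l == 1]
    non_cases = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))

# Hypothetical linear predictors and observed outcomes (1 = pre-DM/DM).
lps = [-3.1, -2.4, -2.0, -1.7, -1.2, -0.9, -0.3, 0.4]
outcomes = [0, 0, 1, 0, 1, 0, 1, 1]

original = [risk(v) for v in lps]
method1 = [risk(v + 1.796) for v in lps]           # intercept shift only
method2 = [risk(1.479 + 0.792 * v) for v in lps]   # shift plus positive slope

print(auc(original, outcomes), auc(method1, outcomes), auc(method2, outcomes))
```

All three AUC values are the same because adding a constant or multiplying by a positive factor never reorders the individuals. Sensitivity and specificity at a rank-based cut-off are unchanged for the same reason, although the absolute risk attached to each individual is not.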

Table 3.

Model Discrimination Performance of Models Obtained by Different Recalibration/Update Methods in the Test Set (N = 460).

				At sensitivity of 0.75
			Discrimination (AUC-ROC)	Specificity	PPV	NPV
Original LR model			0.746	0.60	0.67	0.68
Original coefficients
β_age	0.085
β_BMI	0.125
β_WHR	0.230
β_smoker	0.556
β_{sleep duration}	−0.972
β_{vigorous recreational activity time}	−0.003
β_{fruit consumption}	−0.004
β_{age²}	−0.001
β_{age*sleep duration}	0.015
Original regression constant	−6.059
				At sensitivity of 0.75
	Update(s)		Discrimination (AUC-ROC)	Specificity	PPV	NPV
Model 1: Recalibration			0.746	0.60	0.67	0.68
Adjustment to the original regression constant (α)	1.796 (0.58, 1.00)
Model 2: Recalibration			0.746	0.60	0.67	0.68
Adjustment to the original regression constant (α)	1.479 (1.28, 1.67)
Overall update coefficient (β_overall)	0.792 (0.69, 0.90)
Model 3: Revision			0.750	0.61	0.68	0.69
Adjustment to the original regression constant (α)	−6.675 (−8.76, −4.59)
Overall update coefficient (β_overall)	0.364 (0.20, 0.53)
Additional revision factor (γ) applied on individual predictors
γ_WHR	7.323 (5.41, 9.23)
γ_Age	0.024 (0.01, 0.04)
Model 4: Revision			0.745	0.57	0.66	0.68
Re-estimated regression constant (α)	−10.549 (−14.09, −7.00)
Re-estimated coefficients (β)
β_age	0.047 (−0.03, 0.13)
β_BMI	0.055 (0.02, 0.09)
β_WHR	8.271 (6.31, 10.23)
β_smoker	0.171 (−0.33, 0.67)
β_{sleep duration}	0.026 (−0.37, 0.42)
β_{vigorous recreational activity time}	−0.003 (−0.004, −0.002)
β_{fruit consumption}	0.003 (−0.002, 0.007)
β_{age²}	−0.092 (−0.62, 0.43)
β_{age*sleep duration}	−0.084 (−0.56, 0.39)
Model 5: Extension			5a: 0.750; 5b: 0.746	5a: 0.60; 5b: 0.59	5a: 0.68; 5b: 0.67	5a: 0.69; 5b: 0.68
Adjustment to the original regression constant (α)	−6.701 (−8.79, −4.61)	−6.611 (−8.70, −4.52)
Overall update coefficient (β_overall)	0.362 (0.20, 0.52)	0.359 (0.20, 0.52)
Additional revision factor (γ) applied on individual predictors
γ_WHR	7.337 (5.42, 9.25)	7.286 (5.37, 9.20)
γ_Age	0.025 (0.01, 0.04)	0.026 (0.01, 0.04)
Coefficient of the additional predictor (β)
Model 5a—inclusion of family history of DM
β_{family history of DM}	0.031 (−0.11, 0.18)	—
Model 5b—inclusion of vegetable consumption
β_{vegetable consumption}	—	−0.002 (−0.006, 0.001)
Model 6: Extension			6a: 0.746; 6b: 0.741	6a: 0.57; 6b: 0.60	6a: 0.66; 6b: 0.67	6a: 0.68; 6b: 0.68
Re-estimated regression constant (α)	−10.526 (−14.08, −6.98)	−10.533 (−14.07, −6.99)
Re-estimated coefficients (β)
β_age	0.046 (−0.03, 0.13)	0.051 (−0.03, 0.13)
β_BMI	0.055 (0.02, 0.09)	0.056 (0.02, 0.09)
β_WHR	8.275 (6.32, 10.23)	8.168 (6.21, 10.13)
β_smoker	0.173 (−0.33, 0.67)	0.125 (−0.38, 0.63)
β_{sleep duration}	0.023 (−0.37, 0.42)	0.036 (−0.36, 0.43)
β_{vigorous recreational activity time}	−0.003 (−0.004, −0.002)	−0.003 (−0.004, −0.002)
β_{fruit consumption}	0.003 (−0.002, 0.007)	0.004 (−0.001, 0.008)
β_{age²}	−0.095 (−0.62, 0.43)	−0.078 (−0.60, 0.45)
β_{age*sleep duration}	−0.079 (−0.55, 0.39)	−0.094 (−0.57, 0.38)
Coefficient of the additional predictor (β)
Model 6a—inclusion of family history of DM
β_{family history of DM}	0.027 (−0.12, 0.17)	—
Model 6b—inclusion of vegetable consumption
β_{vegetable consumption}	—	−0.003 (−0.007, 0.001)
Model 7: Extension			0.742	0.60	0.67	0.68
Re-estimated regression constant (α)	−10.511 (−14.06, −6.96)
Re-estimated coefficients (β)
β_age	0.050 (−0.03, 0.13)
β_BMI	0.056 (0.02, 0.09)
β_WHR	8.172 (6.21, 10.13)
β_smoker	0.127 (−0.38, 0.63)
β_{sleep duration}	0.033 (−0.36, 0.43)
β_{vigorous recreational activity time}	−0.003 (−0.004, −0.002)
β_{fruit consumption}	0.004 (−0.001, 0.008)
β_{age²}	−0.082 (−0.61, 0.44)
β_{age*sleep duration}	−0.090 (−0.56, 0.38)
β_{family history of DM}	0.027 (−0.12, 0.17)
β_{vegetable consumption}	−0.003 (−0.0070, 0.0005)

The calibration plots of the recalibrated models had an intercept and slope of −0.01 [−0.22, 0.20] and 0.69 [0.51, 0.87] for method 1, and −0.02 [−0.22, 0.18] and 0.88 [0.65, 1.10] for method 2 (Figure 2a and b). In method 3, we found that WHR and age had significantly stronger effects on predicting pre-DM/DM risks in the training set. Thus, on top of applying the overall update coefficient (β_overall: 0.364), additional revision factors (γ_WHR: 7.323, γ_Age: 0.024) were applied to their respective coefficients. The calibration plot of the updated model 3 showed an intercept and slope of −0.07 [−0.28, 0.14] and 0.88 [0.67, 1.09], respectively (Figure 2c). In method 4, which re-estimated the coefficients of all predictors without selection, we noted that the coefficients of most predictors were different from their respective original values, except for weekly vigorous recreational activity time. The intercept and calibration slope of the updated model 4 were −0.08 [−0.29, 0.13] and 0.84 [0.64, 1.04], respectively (Figure 2d).

Figure 2.

Calibration plots of the recalibrated/updated models to detect pre-diabetes mellitus and diabetes mellitus on the test set (N = 460). The x-axis is the predicted risk estimates of pre-DM/DM, and the y-axis is the observed case incidence. The curves were fitted based on restricted cubic splines. At the bottom of the graphs, histograms of the predicted risks are shown for the participants with (1) and without (0) pre-DM/DM. (a) Recalibrated model in method 1. (b) Recalibrated model in method 2. (c) Revised model in method 3. (d) Revised model in method 4. (e) Extended model (a) in method 5. (f) Extended model (b) in method 5. (g) Extended model (a) in method 6. (h) Extended model (b) in method 6. (i) Extended model in method 7.

In methods 5 to 7, we added further predictors, namely the presence/absence of a family history of DM and/or weekly vegetable consumption, to extend the original model. The additional predictors did not contribute any significant additional effect on predicting the absolute pre-DM/DM risks in any of the extended models. In method 5, we added the additional predictors one at a time to the updated model 3 to create 2 extended models (models 5a and 5b). The calibration plots indicated an intercept and slope of −0.07 [−0.28, 0.14] and 0.88 [0.67, 1.09] for model 5a, and −0.06 [−0.27, 0.15] and 0.87 [0.66, 1.07] for model 5b (Figure 2e, Model 5a: extended model with the presence/absence of a family history of DM, and Figure 2f, Model 5b: extended model with weekly vegetable consumption). In method 6, we added the additional predictors one at a time to the updated model 4 to create 2 extended models (models 6a and 6b). The calibration plots indicated an intercept and slope of −0.08 [−0.29, 0.13] and 0.84 [0.64, 1.04] for model 6a, and −0.06 [−0.27, 0.15] and 0.82 [0.63, 1.02] for model 6b (Figure 2g, Model 6a: extended model with the presence/absence of a family history of DM, and Figure 2h, Model 6b: extended model with weekly vegetable consumption). In method 7, we included both additional predictors in the model and re-estimated the coefficients of all the original predictors. The intercept and calibration slope of the extended model in method 7 were −0.06 [−0.27, 0.15] and 0.82 [0.63, 1.02], respectively (Figure 2i).

Discussion

This study assessed whether recalibrating/updating the HK Chinese non-laboratory-based LR risk prediction model could improve its performance, for example, calibration and discrimination, in a PC population in Hong Kong. We found that a simple recalibration of the model’s regression constant was sufficient to improve the calibration, that is, the accuracy in estimating the absolute pre-DM/DM risks for individuals in the PC population. It should be noted that re-estimating the regression coefficients or extending the model with additional predictors, including a family history of DM and weekly consumption of vegetables, did not improve its calibration any more than that obtained by simple model recalibration. Furthermore, we demonstrated the robustness of the original model’s validity in terms of its discrimination (AUC-ROC) between pre-DM/DM cases and non-cases, which did not change significantly despite different update methods.

Our findings support those reported by Masconi et al,²⁵ where model recalibration was sufficient to improve the calibration of 5 existing DM models. The same study also found that total re-estimation of all the regression coefficients could not improve the models’ performance any further,²⁵ potentially because the data from the external populations could not add more information to the predictor-outcome associations established in the original development population. In contrast, Xu et al²⁶ reported a significant improvement in discrimination when they revised all the coefficients of the well-established Framingham DM risk model according to an older Chinese adult population. Findings from previous studies seem to indicate that when a model is applied to an external population within the same culture as the development population, the relative effects of the predictors on the outcome could remain largely unchanged,^25,26 so extensive update methods might not be needed to improve the model’s performance. However, if the external population had vast cultural or racial differences from the development population, differences in genetic predisposition related to pre-DM/DM could alter the relative effects of the predictors on the outcome, and more extensive update methods to re-estimate the coefficients of the predictors might be needed. Since the development and external validation populations were both derived from the Hong Kong Chinese population in our case,^5,17 we demonstrated that simple recalibration to adjust for the difference in incidence was the most appropriate method to adapt the HK Chinese risk model for application in the Hong Kong primary care setting.

We noted that including additional DM risk factors, that is, a family history of DM and weekly vegetable consumption, did not significantly improve the model’s performance. The variability of these risk factors in the split training set might not be large enough to contribute a significant prediction effect to the outcome. This result aligns with a previous study conducted by Simmons et al,²⁷ which also found no improvement in prediction accuracy when additional dietary predictors were included to extend an existing DM risk model. Another potential explanation is that several metabolic risk factors found to be strongly associated with pre-DM/DM were already included as predictors in the original model, for example, age, BMI, and WHR, and could have dominated the potential effects of the additional factors on predicting pre-DM/DM risk. Also, since fruit consumption was already included as a predictor, the lack of improvement when adding vegetable consumption to the model could be related to the multicollinearity effect of these 2 factors. When developing the original LR model, Dong et al⁵ applied the assessment of multicollinearity and bidirectional stepwise consideration of factors based on the Akaike information criterion (AIC), which helped to ensure the robustness of the model. Furthermore, the development population was derived from a sizeable population-representative dataset. This provided sufficient power for precise estimates of the association coefficients between the predictors and pre-DM/DM risk,⁵ thereby supporting its generalisability to an external local population.

Given the robustness of the model, we confirmed that the original model was valid for screening to differentiate the high-risk individuals from low-risk individuals for further testing that confirms the diagnosis of pre-DM/DM in the Hong Kong Chinese PC population. The re-calibrated HK Chinese risk prediction model could also be used to accurately estimate the absolute pre-DM/DM risks for individuals presenting to PC, who tend to have a higher prevalence of pre-DM/DM than the general population. The HK Chinese pre-DM/DM risk prediction model is novel in that it includes non-laboratory-based lifestyle predictors, for example, sleep duration, fruit consumption and amount of weekly vigorous activity,⁵ which emphasizes the behavior-disease-risk link of pre-DM/DM, thereby motivating individuals to adopt positive lifestyle behavioral changes to prevent DM progression. Individuals could also use the estimated absolute risk predicted by the recalibrated model to assess and monitor their efforts in behavioral changes. Based on the self-regulation model and the attribution theory, individuals would perceive their actions as more effective if they observe an agreement on the changes they have made with their anticipated effects.²⁸

There were a few limitations in this study. First, we deployed convenience voluntary sampling that included self-referral and snowball sampling during participant recruitment in the validation study. This method could have attracted individuals with higher pre-DM/DM risks to participate. Thus, the study population’s incidence might not accurately represent the actual incidence of pre-DM/DM in primary care in Hong Kong. Second, we could not access the original development PHS 2014/15 dataset and, therefore, could not combine the development and validation population to re-estimate the coefficients in the updated models (method 4), as recommended by Janssen et al.²² Third, the diagnosis of pre-DM/DM was based on 1 single blood test of OGTT and blood HbA1c level in our study, which could have overestimated the case incidence. Also, as the PHS 2014/15 did not include postprandial blood glucose level by OGTT for case definition,⁴ the current definition might have overestimated the incidence in our study. Nonetheless, the incidence in the study population remained high (48.42%; n = 445) when we used the same case definition that PHS 2014/15 used.⁴ A repeat test within 1 month may be considered to confirm the diagnosis in future studies, but this may increase the burden on the participants and research resources. Fourth, the model recalibration/update results were specific to the Hong Kong Chinese PC population and may not be generalizable to Chinese populations in other parts of the world due to potential lifestyle and environmental differences. Further recalibration/update of the HK Chinese non-laboratory-based risk model should be carried out before its application to other Chinese populations.

Our study found that simple recalibration was sufficient to improve the model’s calibration accuracy but did not significantly improve discrimination in case finding of pre-DM/DM. We conclude that the HK Chinese non-laboratory-based pre-DM/DM risk prediction model can be used as a first-step screening tool. With its dichotomous prediction outcome of a likely versus unlikely case of pre-DM/DM, it can help to identify high-risk individuals for further blood tests to detect pre-DM/DM in asymptomatic Chinese adults presenting to primary care. In contrast, by taking the high prevalence of pre-DM/DM in primary care into consideration, the absolute risk levels estimated by the recalibrated HK Chinese pre-DM/DM risk prediction model can serve as a reliable non-laboratory-based measure to monitor changes in individuals’ risk level of pre-DM/DM over time or following interventions.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Health and Medical Research Fund, Health Bureau, Government of the Hong Kong Special Administrative Region (reference number: 17181641). The funding organisation did not play any role in the design and conduct of the study, collection, management, analysis, interpretation of the data, or manuscript preparation.

ORCID iDs

Will H. G. Cheng

Cindy L. K. Lam

References

1. Cheng WH-G, Dong, et al. Non-laboratory-based risk prediction tools for undiagnosed pre-diabetes: a systematic review. Diagnostics. 2023;13(7):1294.

2. Barber, Davies, Khunti, Gray LJ. Risk assessment tools for detecting those with pre-diabetes: a systematic review. Diabetes Res Clin Pract. 2014;105(1):1-13.

3. Khunti, Gillies, Taub, et al. A comparison of cost per case detected of screening strategies for type 2 diabetes and impaired glucose regulation: modelling study. Diabetes Res Clin Pract. 2012;97(3):505-513.

4. Department of Health, HKSAR Government. Surveillance and Epidemiology Branch, Centre for Health Protection. Report of Population Health Survey 2014/15. 2017.

5. Dong, Tse TYE, Mak, et al. Non-laboratory-based risk assessment model for case detection of diabetes mellitus and pre-diabetes in primary care. J Diabetes Investig. 2022;13(8):1374-1386.

6. Cheng, Dong, Tse, Wong, Lam. External validation of new non-laboratory-based pre-DM/DM risk prediction models for case finding in HK primary care. Presented at: International Diabetes Federation Western Pacific Region Congress 2023 and the 15th Scientific Meeting of the Asian Association for the Study of Diabetes; 2023; Kyoto, Japan. Accessed January 3, 2024. https://onlinelibrary.wiley.com/doi/abs/10.1111/jdi.14082

7. Collins, Reitsma, Altman, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55-63.

8. Van Calster, McLernon, Van Smeden, Wynants, Steyerberg EW. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17(1):1-7.

9. D’Agostino, Russell, Huse, et al. Primary and subsequent coronary risk appraisal: new results from the Framingham study. Am Heart J. 2000;139(2):272-281.

10. Wilson, D’Agostino, Levy, Belanger, Silbershatz, Kannel WB. Prediction of coronary heart disease using risk factor categories. Circulation. 1998;97(18):1837-1847.

11. Grundy, Balady, Criqui, et al. Primary prevention of coronary heart disease: guidance from Framingham: a statement for healthcare professionals from the AHA Task Force on Risk Reduction. Circulation. 1998;97(18):1876-1887.

12. Wood, De Backer, Faergeman, Graham, Mancia, Pyörälä. Prevention of coronary heart disease in clinical practice: recommendations of the Second Joint Task Force of European and other Societies on Coronary Prevention. Atherosclerosis. 1998;140(2):199-270.

13.

Pearson

Blair

Daniels

, et al. AHA guidelines for primary prevention of cardiovascular disease and stroke: 2002 update: consensus panel guide to comprehensive risk reduction for adult patients without coronary or other atherosclerotic vascular diseases. Circulation. 2002;106(3):388-391.

14.

Liu

Hong

D’Agostino

RB , et al. Predictive value for the Chinese population of the Framingham CHD risk assessment tool compared with the Chinese Multi-Provincial Cohort Study. JAMA. 2004;291(21):2591-2599.

15.

Marrugat

d’Agostino

Sullivan

, et al. An adaptation of the Framingham coronary heart disease risk function to European Mediterranean areas. J Epidemiol Community Health. 2003;57(8):634-638.

16.

Chow

Joshi

Celermajer

Patel

Neal

BC.

Recalibration of a Framingham risk equation for a rural population in India. J Epidemiol Community Health. 2009;63(5):379-385.

17.

Dong

Cheng

WHG

Tse

ETY

, et al. Development and validation of a diabetes mellitus and prediabetes risk prediction function for case finding in primary care in Hong Kong: a cross-sectional study and a prospective study protocol paper. BMJ Open. 2022;12(5):e059430.

18.

World Health Organization. Definition and diagnosis of diabetes mellitus and intermediate hyperglycaemia: report of a WHO/IDF consultation; 2006.

19.

American Diabetes Association. Classification and diagnosis of diabetes: standards of medical care in diabetes—2019. Diabetes Care. 2019;42(Supplement_1):S13-S28.

20.

Vergouwe

Steyerberg

Eijkemans

Habbema

JDF

. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol. 2005;58(5):475-483.

21.

Van Calster

Nieboer

Vergouwe

De Cock

Pencina

Steyerberg

EW.

A calibration hierarchy for risk models was defined: from utopia to empirical data. J Clin Epidemiol. 2016;74:167-176.

22.

Janssen

Moons

Kalkman

Grobbee

Vergouwe

Updating methods improved the performance of a clinical prediction model in new patients. J Clin Epidemiol. 2008;61(1):76-86.

23.

Steyerberg

Borsboom

van Houwelingen

Eijkemans

Habbema

JDF

. Validation and updating of predictive logistic regression models: a study on sample size and shrinkage. Stat Med. 2004;23(16):2567-2586.

24.

DeLong

Clarke-Pearson

. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988:837-845.

25.

Masconi

Matsha

Erasmus

Kengne

AP.

Effect of model updating strategies on the performance of prevalent diabetes risk prediction models in a mixed-ancestry population of South Africa. PloS One. 2019;14(2):e0211528.

26.

Jiang

Schooling

Zhang

Cheng

Lam

Prediction of 4-year incident diabetes in older Chinese: recalibration of the Framingham diabetes score on Guangzhou Biobank Cohort Study. Prev Med. 2014;69:63-68.

27.

Simmons

Harding

Wareham

Griffin

Team

ENP

. Do simple questions about diet and physical activity help to identify those at risk of Type 2 diabetes? Diabetic Med. 2007;24(8):830-835.

28.

Marteau

Weinman

Self-regulation and the behavioural response to DNA risk information: a theoretical analysis and framework for future research. Soc Sci Med. 2006;62(6):1360-1368.