
Abstract

The glycemia risk index (GRI) is an emerging metric designed to quantify the risk of both hypo- and hyperglycemia, providing a combined assessment of glycemic control quality. A high GRI is associated with an increased risk of diabetic complications. In this study, we leverage long-term continuous glucose monitoring (CGM) data to develop and validate predictive models for a high GRI (>60) in individuals with type 1 diabetes (T1D). We assessed over 250 000 days of measurements collected over four years from 736 patients with T1D. Our modeling approach shows promise for predicting glycemic control quality (area under the receiver operating characteristic curve [ROC-AUC] of 0.87) six to nine months from baseline. However, additional analysis and validation are imperative to determine its full clinical utility.

Keywords

glycemia risk index, type 1 diabetes, machine learning, prediction model, glycemic control

Introduction

Type 1 diabetes (T1D) is a chronic autoimmune condition characterized by the destruction of insulin-producing beta cells in the pancreas, leading to lifelong dependency on exogenous insulin therapy. Despite advances in diabetes management, maintaining optimal glycemic control remains a significant challenge for individuals with T1D, often resulting in periods of hyperglycemia (high blood glucose levels) and hypoglycemia (low blood glucose levels).^1,2 Continuous glucose monitoring (CGM) technology has revolutionized diabetes management by providing real-time glucose readings, enabling more precise insulin dosing and timely interventions to prevent extreme glycemic events. The CGM also serves as a useful tool for assessing and quantifying the quality of glycemic control.²

The glycemia risk index (GRI)³ is an emerging metric designed to quantify the risk of both hypo- and hyperglycemia, offering a combined assessment of glycemic control quality. A high GRI is associated with increased risks of diabetic complications, including cardiovascular diseases, nephropathy, and retinopathy, as well as with reduced quality of life, increased diabetes-related stress, and lower satisfaction with treatment.^3-6 Accurate prediction of glycemic control quality is important for developing personalized treatment strategies, minimizing complications, and improving the quality of life for individuals with T1D.⁷

To date, most predictive initiatives in this domain have focused on short-term prediction horizons, typically within minutes or hours.^8-11 While these models provide valuable insights for immediate glucose management, they fall short of addressing longer-term glycemic patterns. A limited number of studies have explored mid-range prediction horizons, such as week-to-week forecasts, which are crucial for understanding and managing more extended glycemic trends.^12-14 A longer prediction horizon could be particularly beneficial in targeting patients at risk, allowing for more timely interventions and more effective long-term management strategies.

In this study, we leverage long-term CGM data to develop and validate predictive models for high GRI in individuals with T1D. By utilizing advanced machine-learning algorithms, we aim to identify key patterns and features within the CGM data that indicate elevated glycemic risk.

Methods

Data Material

We analyzed data from the previously published T1DiabetesGranada study,¹⁵ which encompasses over 250 000 days of measurements collected over four years from 736 patients with T1D in Granada, Spain. The primary flash glucose monitoring (FGM) device used during the study was the FreeStyle Libre 2, with some initial use of the first version, FreeStyle Libre 1.

The cohort’s characteristics included a mean age of 40.3 ± 15.8 years, ranging from 12 to 81 years, with 373 female patients (50.68%) and 363 male patients (49.32%). On average, patients had 350.2 ± 284.2 days of glucose measurements.

Patients were included in our analysis if FGM data covering at least 14 days of monitoring with ≥70% wear time¹⁶ were available both at baseline (0-30 days) and six to nine months after the end of the baseline period.
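The inclusion criterion can be illustrated with a minimal Python sketch. It assumes one stored reading every 15 minutes (96 per day) and defines wear time as observed readings divided by expected readings over the monitored days. The function name `window_is_sufficient` and this exact wear-time definition are illustrative assumptions, not the study's implementation.

```python
from datetime import datetime, timedelta

READINGS_PER_DAY = 96  # assumption: one stored reading every 15 minutes


def window_is_sufficient(timestamps, min_days=14, min_wear=0.70):
    """Return True if a window has >= min_days monitored days and >= min_wear wear time."""
    days = {t.date() for t in timestamps}
    if len(days) < min_days:
        return False
    # Wear time: observed readings relative to the readings expected on those days
    wear = len(timestamps) / (len(days) * READINGS_PER_DAY)
    return wear >= min_wear


# Hypothetical 20-day window with a full 15-minute grid
start = datetime(2020, 1, 1)
full = [start + timedelta(minutes=15 * i) for i in range(20 * READINGS_PER_DAY)]
worn = [t for i, t in enumerate(full) if i % 4 != 0]    # 75% of readings kept
sparse = [t for i, t in enumerate(full) if i % 4 == 0]  # 25% of readings kept
short = full[: 10 * READINGS_PER_DAY]                   # only 10 days of data

print(window_is_sufficient(worn), window_is_sufficient(sparse), window_is_sufficient(short))
```

Under this definition, a patient would be retained only if both the baseline window and the follow-up window pass the check.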

Approach

We developed binary classification models to predict a high^17,18 GRI (>60, zones D-E) or low GRI (≤60) within a six- to nine-month period from baseline using predictors extracted from the FGM data at baseline. The threshold of GRI >60 was chosen based on prior studies that have identified GRI >60 as indicative of increased glycemic instability and a higher risk of adverse glycemic outcomes.³ XGBoost was selected for its nonlinear modeling capabilities and its previously demonstrated high performance in the medical domain.¹⁹ XGBoost is an ensemble learning method that combines multiple decision trees to produce a prediction. The output is a probability of, in our case, having a high GRI at the six- to nine-month follow-up.
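The outcome definition can be sketched in a few lines of Python. The zone boundaries (A to E in steps of 20 GRI points) follow the published GRI definition rather than anything stated in this text, and the function names are hypothetical.

```python
def gri_zone(gri):
    """Map a GRI value (0-100) to zones A-E, assuming 20-point zone widths."""
    if not 0 <= gri <= 100:
        raise ValueError("GRI is expected to lie between 0 and 100")
    for upper, zone in ((20, "A"), (40, "B"), (60, "C"), (80, "D"), (100, "E")):
        if gri <= upper:
            return zone


def high_gri_label(gri, threshold=60):
    """Binary outcome used for classification: 1 if GRI > threshold, else 0."""
    return int(gri > threshold)


print(gri_zone(60), gri_zone(60.1), high_gri_label(60), high_gri_label(61))
```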

A 5-fold cross-validation approach was employed to estimate the performance of the modeling approach. The data set was divided into five subsets (or folds). In each iteration, four folds were used to train the model, and the remaining fold was held out for testing. This process was repeated five times, such that each fold was used as the test set exactly once. This approach keeps training and testing data strictly separated, reducing the risk of an overfitted, overly optimistic result and providing a reliable estimate of the model’s performance on unseen data. It also maximizes the use of available data for modeling while maintaining strict evaluation standards.²⁰ To achieve optimal performance, the XGBoost hyperparameters (learning rate, tree depth, and number of estimators) were tuned using a nested cross-validation on the training set within each iteration of the outer cross-validation.
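The nested scheme can be made concrete with a small index-splitting sketch in plain Python. It only shows how the data are partitioned; model fitting and tuning are omitted. The number of inner folds (3), the absence of stratification, and the function names are assumptions made for illustration, as the text does not specify them.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (train, test, inner_splits) for each outer fold.

    inner_splits is a list of (inner_train, inner_val) pairs drawn only from
    the outer training indices, so the outer test fold is never used for tuning.
    """
    folds = k_fold_indices(n_samples, outer_k, seed)
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        inner_folds = [train[m::inner_k] for m in range(inner_k)]
        inner_splits = []
        for m, val in enumerate(inner_folds):
            inner_train = [j for f, fold in enumerate(inner_folds) if f != m for j in fold]
            inner_splits.append((inner_train, val))
        yield train, test, inner_splits


# 434 is the number of patients analyzed in this study
splits = list(nested_cv_splits(434))
for train, test, inner_splits in splits:
    print(len(train), len(test), [len(val) for _, val in inner_splits])
```

In each outer iteration, hyperparameters would be chosen on the inner splits, the model refit on the full outer training set, and performance measured once on the held-out outer fold.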

To assess the clinical usefulness of our model, we used a net benefit approach,²¹ which balances the advantages of correctly identifying at-risk individuals (true positives) against the drawbacks of incorrectly labeling individuals as high risk (false positives) at different threshold probabilities (p_t). This method helps determine whether using the model leads to better decision-making compared to alternative strategies, such as treating everyone as high risk or assuming no one is at risk. The net benefit is calculated by assigning weight to false positives based on a chosen risk threshold—reflecting how much harm a false alarm causes compared to missing a true case. By plotting net benefit across different risk thresholds, we can visualize how well our model performs in guiding clinical decisions:

\text{Net Benefit} = \frac{\text{True positives}}{N} - \frac{\text{False positives}}{N} \cdot \frac{p_{t}}{1 - p_{t}}
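The net benefit formula translates directly into code. The following Python sketch evaluates it for a model and for the "treat all" strategy; "treat none" has a net benefit of 0 by definition. The toy labels and probabilities are hypothetical, and classifying a patient as high risk when the predicted probability is at or above p_t is a convention assumed here.

```python
def net_benefit(y_true, y_prob, p_t):
    """Net benefit of flagging patients whose predicted probability is >= p_t."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= p_t and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= p_t and y == 0)
    return tp / n - fp / n * p_t / (1 - p_t)


def net_benefit_treat_all(y_true, p_t):
    """Net benefit when every patient is treated as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * p_t / (1 - p_t)


# Hypothetical toy data: 3 of 10 patients have the outcome
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]

for p_t in (0.2, 0.5):
    print(p_t, net_benefit(y_true, y_prob, p_t), net_benefit_treat_all(y_true, p_t))
```

Evaluating both functions over a grid of thresholds and plotting the results gives a decision curve of the kind described above.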

Predictors

Several metrics that have been linked to abnormal glucose control were calculated from the baseline FGM data. The predictors included conventional statistical metrics (mean, median, standard deviation, coefficient of variation, interquartile range), GRI metrics³ (GRI, GRI hyper component, GRI hypo component), and metrics related to time in ranges^16,22: time in range (TIR), between 70 and 180 mg/dL; time below range (TBR1), between 54 and 69 mg/dL; time below range (TBR2), below 54 mg/dL; time above range (TAR1), between 181 and 250 mg/dL; time above range (TAR2), exceeding 250 mg/dL; and time in tight range (TITR).²³ Furthermore, age and gender at baseline were included as predictors.
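A minimal Python sketch of the range-based predictors follows. The range limits are those given above. The TITR limits (70 to 140 mg/dL) and the GRI weighting (hypo component = TBR2 + 0.8 × TBR1, hyper component = TAR2 + 0.5 × TAR1, GRI = 3.0 × hypo + 1.6 × hyper, capped at 100) are taken from the published definitions rather than from this text, and the function name is hypothetical.

```python
def range_metrics(glucose_mg_dl):
    """Percent time in each glucose range plus GRI components for one window."""
    n = len(glucose_mg_dl)

    def pct(condition):
        return 100.0 * sum(1 for g in glucose_mg_dl if condition(g)) / n

    m = {
        "TBR2": pct(lambda g: g < 54),
        "TBR1": pct(lambda g: 54 <= g < 70),
        "TIR": pct(lambda g: 70 <= g <= 180),
        "TITR": pct(lambda g: 70 <= g <= 140),  # assumed tight-range limits
        "TAR1": pct(lambda g: 180 < g <= 250),
        "TAR2": pct(lambda g: g > 250),
    }
    # GRI components, weighted as in the published GRI definition
    m["GRI_hypo"] = m["TBR2"] + 0.8 * m["TBR1"]
    m["GRI_hyper"] = m["TAR2"] + 0.5 * m["TAR1"]
    m["GRI"] = min(100.0, 3.0 * m["GRI_hypo"] + 1.6 * m["GRI_hyper"])
    return m


# Hypothetical window of 100 readings (mg/dL)
readings = [50] * 2 + [60] * 3 + [100] * 50 + [160] * 25 + [200] * 15 + [300] * 5
metrics = range_metrics(readings)
print({k: round(v, 1) for k, v in metrics.items()})
```

For this toy window the hypo component is 4.4 and the hyper component is 12.5, giving a GRI of about 33.2.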

Assessment and Explainability

Assessment of model performance involved the use of the area under the receiver operating characteristic curve (ROC-AUC) across folds, the distribution of predicted values, a calibration plot including the Brier score,²⁴ and a net benefit plot.²¹ To enhance model interpretability, feature importance and explainability were evaluated through SHAP (SHapley Additive exPlanations) values.²⁵
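For readers unfamiliar with these two measures, the sketch below computes both from first principles in plain Python. The Brier score is the mean squared difference between predicted probability and outcome, and the ROC-AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties counted as one half). The toy data are hypothetical; in practice, library routines such as those in Scikit-learn would normally be used.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy example
y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.1]
print(brier_score(y_true, y_prob), roc_auc(y_true, y_prob))
```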

All analyses were performed using MATLAB (R2021b) and Python (v3) with Scikit-learn (v1.5.0) for machine-learning utilities, SHAP (v0.42.1), and XGBoost (v1.7.5).

Results

The analysis included 434 patients with T1D who had sufficient FGM measurements at baseline and at the six- to nine-month follow-up. At follow-up, 138 patients exhibited high GRI scores (>60).

Figure 1 presents the assessment characteristics of the models for predicting patients with high GRI scores. The models achieved an average ROC-AUC of 0.87 (SD = 0.05) across the cross-validation folds, indicating good discrimination: the approach can effectively distinguish between patients with high and low GRI scores. In other words, it can identify a high proportion of the patients with high GRI at follow-up (true positive rate) without falsely flagging a high proportion of the patients with low GRI at follow-up (false positive rate).

Figure 1. (a) ROC curves for each cross-validation fold and their average, (b) normalized predictions for patients meeting the low or high GRI (>60) criterion, (c) calibration plot and Brier score for the prediction model, (d) net benefit curves for the model, treat all, and treat none, (e) violin plot of GRI in the predicted groups from baseline to 21 months, and (f) violin plot of TAR2 in the predicted groups from baseline to 21 months.

The Brier score of 0.13 reflects the model’s calibration or accuracy in predicting probabilities. It measures the mean squared difference between the predicted probabilities and the actual outcomes, with lower values indicating better calibration. A score of 0 indicates perfect predictions, while a score of 1 is the worst possible score. In addition, the net benefit plot compares the clinical utility of the model with two strategies: treating all patients and treating none. The net benefit quantifies the trade-off between the true positives captured and the false positives incurred at different threshold probabilities. A higher net benefit line for the model demonstrates its utility in identifying at-risk patients while minimizing unnecessary interventions. Furthermore, a suggested cutoff probability of 0.5 identified a group of patients at risk who exhibited persistently higher GRI values from baseline through the intervals of six to nine months, 12 to 15 months, and 18 to 21 months in the future, as observed in Figure 1e. The proposed cutoff demonstrates good classification performance, achieving a true positive rate of 0.70, a false positive rate of 0.10, a positive predictive value of 0.76, a negative predictive value of 0.86, and an overall accuracy of 0.83. A total of 11% had an initial baseline GRI < 60.
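The classification metrics reported for the 0.5 cutoff are linked through the confusion matrix. As a rough consistency check, the Python sketch below derives the positive predictive value, negative predictive value, and accuracy from the reported true and false positive rates and the cohort composition (434 patients, 138 with high GRI at follow-up). Because the inputs are rounded to two decimals and the study's values come from cross-validation, the derived values are expected to agree only approximately; the function name is hypothetical.

```python
def implied_metrics(n_total, n_pos, tpr, fpr):
    """Derive PPV, NPV, and accuracy from TPR, FPR, and class counts."""
    n_neg = n_total - n_pos
    tp = tpr * n_pos   # expected true positives
    fn = n_pos - tp    # expected false negatives
    fp = fpr * n_neg   # expected false positives
    tn = n_neg - fp    # expected true negatives
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / n_total,
    }


implied = implied_metrics(n_total=434, n_pos=138, tpr=0.70, fpr=0.10)
print({k: round(v, 3) for k, v in implied.items()})
```

The derived values fall within about 0.01 of the reported 0.76, 0.86, and 0.83, which suggests the reported figures are mutually consistent.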

For the explainability analysis, the mean SHAP value of GRI at baseline accounted for 49% of the model’s total impact, while the next most important predictors, standard deviation, TIR, TITR, and interquartile range (IQR), contributed 12%, 7%, 6%, and 6%, respectively. This indicates that additional features increase the models’ predictive capabilities beyond the baseline GRI score.
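One common way to express such shares is to average the absolute SHAP value of each feature over all patients and normalize by the sum across features. The Python sketch below shows this calculation on a small hypothetical SHAP matrix; whether the study used exactly this normalization is an assumption, and the function name is illustrative.

```python
def shap_importance_share(shap_values, feature_names):
    """Percent of total mean |SHAP| attributable to each feature.

    shap_values: list of rows, one per patient, one SHAP value per feature.
    """
    n = len(shap_values)
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: 100.0 * v / total for name, v in zip(feature_names, mean_abs)}


# Hypothetical SHAP values for three patients and three features
toy_shap = [
    [0.4, -0.1, 0.0],
    [-0.6, 0.3, 0.1],
    [0.5, -0.2, -0.2],
]
shares = shap_importance_share(toy_shap, ["GRI", "SD", "TIR"])
print({k: round(v, 1) for k, v in shares.items()})
```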

Conclusions

In this investigation, we have successfully developed and internally validated a machine-learning approach to predict high GRI scores at six to nine months from baseline within a cohort of individuals with T1D. This underscores the potential for identifying patients at risk of sustained poor glucose control over an extended prediction horizon. While managing patients with a high baseline GRI is important, a six- to nine-month predictive model offers added clinical utility by identifying at-risk patients earlier, enabling more tailored and timely interventions. The model should not replace baseline management strategies but augment them with a longer-term view. A six- to nine-month predictive window could enable health care providers to identify patients who might currently have a GRI ≤60 but are at risk of exceeding this threshold in the future. This allows earlier intervention, such as lifestyle modifications, medication adjustments, or increased monitoring, potentially preventing a decline in health. In addition, it could support better planning and resource allocation, as high-risk patients can be flagged earlier for targeted care.

Within our study, the group predicted to be at high risk of high GRI scores at six to nine months continued to exhibit elevated GRI scores 21 months from baseline, despite a regression toward the predicted low-risk group.

The approach demonstrated clear differentiation between the two classes, with a relatively low Brier score suggesting the potential clinical utility of the model’s output probabilities for assessing long-term risk of poor glucose control in individual patients. In addition, a net benefit effect was observed when compared to simplistic strategies such as treating all patients as at-risk or none at all.

Baseline GRI score emerged as the most significant predictor for future poor glucose control, and the inclusion of additional features in the combined model improved its predictive capability substantially. To our knowledge, this is the first study to explore the identification of individuals at risk of poor glucose control quality using the GRI metric. However, our findings align with similar studies, such as those reported by Hilliard et al,²⁶ who employed latent group-based trajectory modeling to identify subgroups exhibiting sustained elevated HbA1c levels over 18 to 24 months in 150 T1D patients.

Despite the valuable insights provided by our study, it is important to acknowledge its limitations. The relatively small number of events within the analyzed cohort presents a challenge, affecting the robustness of our findings. Therefore, caution should be exercised when extrapolating these results to a broader population. Further validation is necessary, highlighting the importance of replicating our model’s performance across diverse data sets and populations of individuals with T1D. In addition, a key limitation of this study is that all CGM data were derived exclusively from the FreeStyle Libre system. While this approach ensures consistency in glucose measurement and GRI calculation, it may limit the generalizability of our findings to individuals using other CGM devices, such as Dexcom or Medtronic systems. Future studies should explore whether our predictive modeling approach remains robust across different CGM technologies. Another limitation of our study is that we did not include demographic or socioeconomic factors beyond age and gender as predictors in our model. While these variables can influence diabetes management and glycemic outcomes, our primary focus was on CGM-derived metrics, which provide direct, continuous physiological insights into glycemic variability. In addition, demographic and socioeconomic data were not comprehensively available in our data set, which may have introduced bias if included selectively. Future research should explore the integration of these factors to determine whether combining physiological and social determinants of health can enhance long-term risk prediction and improve personalized diabetes management strategies.

In conclusion, while our approach shows promise for predicting glycemic control quality, additional analysis and validation are imperative to determine its full clinical utility.

Footnotes

Abbreviations

CGM, continuous glucose monitoring; FGM, flash glucose monitoring; GRI, glycemia risk index; ROC-AUC, area under the receiver operating characteristic curve; SHAP, SHapley Additive exPlanations; TAR, time above range; TBR, time below range; TIR, time in range; TITR, time in tight range.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Ethical Approval

Ethical approval and informed consent were obtained. The T1DiabetesGranada study was reviewed and approved by the Ethics Committee of Biomedical Research of the Province of Granada (CEIm/CEI GRANADA), protocol code K134665CRL, ethics portal code 0698-N-21.

ORCID iD

Simon Lebech Cichosz

References

1. Leslie, Evans-Molina, Freund-Brown, et al. Adult-onset type 1 diabetes: current understanding and challenges. Diabetes Care. 2021;44(11):2449-2456. doi:10.2337/DC21-0770.

2. Holt RIG, Devries, Hess-Fischl, et al. The management of type 1 diabetes in adults. A consensus report by the American Diabetes Association (ADA) and the European Association for the Study of Diabetes (EASD). Diabetes Care. 2021;44:2589-2625. doi:10.2337/DCI21-0043.

3. Klonoff, Wang, Rodbard, et al. A glycemia risk index (GRI) of hypoglycemia and hyperglycemia for continuous glucose monitoring validated by clinician ratings. J Diabetes Sci Technol. 2023;17(5):1226-1242. doi:10.1177/19322968221085273.

4. Yoo, Kim. Association between continuous glucose monitoring-derived glycemia risk index and albuminuria in type 2 diabetes. Diabetes Technol Ther. 2023;25(10):726-735. doi:10.1089/DIA.2023.0165.

5. Díaz-Soto, Pérez-López, Férnandez-Velasco, et al. Quality of life, diabetes-related stress and treatment satisfaction are correlated with glycemia risk index (GRI), time in range and hypoglycemia/hyperglycemia components in type 1 diabetes. Endocrine. 2024;86(1):186-193. doi:10.1007/S12020-024-03846-9.

6. Wang, et al. Association between glycaemia risk index (GRI) and diabetic retinopathy in type 2 diabetes: a cohort study. Diabetes Obes Metab. 2023;25(9):2457-2463. doi:10.1111/DOM.15068.

7. American Diabetes Association Professional Practice Committee. 6. Glycemic targets: standards of medical care in diabetes-2022. Diabetes Care. 2022;45:S83-S96. doi:10.2337/DC22-S006.

8. Fleischer, Hansen, Cichosz. Hypoglycemia event prediction from CGM using ensemble learning. Front Clin Diabetes Healthc. 2022;3:1066744. doi:10.3389/FCDHC.2022.1066744.

9. Cichosz, Jensen, Hejlesen. Short-term prediction of future continuous glucose monitoring readings in type 1 diabetes: development and validation of a neural network regression model. Int J Med Inform. 2021;151:104472. doi:10.1016/J.IJMEDINF.2021.104472.

10. Cichosz, Kronborg, Jensen, Hejlesen. Penalty weighted glucose prediction models could lead to better clinically usage. Comput Biol Med. 2021;138:104865. doi:10.1016/j.compbiomed.2021.104865.

11. Woldaregay, Årsand, Botsis, et al. Data-driven blood glucose pattern classification and anomalies detection: machine-learning applications in type 1 diabetes. J Med Internet Res. 2019;21(5):e11030. doi:10.2196/11030. Accessed April 1, 2025. https://www.jmir.org/2019/5/e11030

12. Lebech Cichosz, Hasselstrøm Jensen, Schou Olesen. Development and validation of a machine learning model to predict weekly risk of hypoglycemia in patients with type 1 diabetes based on continuous glucose monitoring. Diabetes Technol Ther. 2024;26(7):457-466. doi:10.1089/DIA.2023.0532. https://www.liebertpub.com/doi/10.1089/dia.2023.0532

13. Giammarino, Senanayake, Prahalad, et al. A machine learning model for week-ahead hypoglycemia prediction from continuous glucose monitoring data. J Diabetes Sci Technol. Published online March, 2024. doi:10.1177/19322968241236208.

14. Cichosz, Olesen, Jensen. Explainable machine-learning models to predict weekly risk of hyperglycemia, hypoglycemia, and glycemic variability in patients with type 1 diabetes based on continuous glucose monitoring. J Diabetes Sci Technol. Published online October 8, 2024. doi:10.1177/19322968241286907.

15. Rodriguez-Leon, Aviles-Perez, Banos, et al. T1DiabetesGranada: a longitudinal multi-modal dataset of type 1 diabetes mellitus. Sci Data. 2023;10:1-11. doi:10.1038/s41597-023-02737-4.

16. Battelino, Danne, Bergenstal, et al. Clinical targets for continuous glucose monitoring data interpretation: recommendations from the international consensus on time in range. Diabetes Care. 2019;42(8):1593-1603. doi:10.2337/DCI19-0028.

17. Klonoff, Wang, Rodbard

18. Piona, Marigliano, Roncarà, et al. Glycemia risk index as a novel metric to evaluate the safety of glycemic control in children and adolescents with type 1 diabetes: an observational, multicenter, real-life cohort study. Diabetes Technol Ther. 2023;25(7):507-512. doi:10.1089/DIA.2023.0040.

19. Chen, Guestrin. XGBoost: a scalable tree boosting system. Paper presented at the proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 2016; Barcelona, Spain.

20. Berrar. Cross-validation. Encyclopedia of Bioinformatics and Computational Biology: ABC of Bioinformatics. Amsterdam, the Netherlands: Elsevier Science; 2019:542-545. doi:10.1016/B978-0-12-809633-8.20349-X.

21. Vickers, Van Calster, Steyerberg. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ. 2016;352:i6. doi:10.1136/BMJ.I6.

22. Danne, Nimri, Battelino, et al. International consensus on use of continuous glucose monitoring. Diabetes Care. 2017;40:1631-1640. doi:10.2337/DC17-1600.

23. Beck. Is it time to replace time-in-range with time-in-tight-range? Maybe not. Diabetes Technol Ther. 2024;26(3):147-150. doi:10.1089/DIA.2023.0602.

24. Cohen, Goldszmidt. Properties and benefits of calibrated classifiers. Lecture Notes in Computer Science. London: Springer Nature; 2004;3202:125-136. doi:10.1007/978-3-540-30116-5_14.

25. Lundberg, Allen, Lee. A unified approach to interpreting model predictions. Paper presented at the 31st Conference on Neural Information Processing Systems (NIPS 2017); December 2017; Long Beach, CA.

26. Hilliard, Rausch, Dolan, Hood. Predictors of deteriorations in diabetes management and control in adolescents with type 1 diabetes. J Adolesc Health. 2013;52(1):28-34. doi:10.1016/J.JADOHEALTH.2012.05.009.

Predicting High Glycemia Risk Index Trajectory in Individuals With Type 1 Diabetes and Long-term Continuously Glucose Monitoring
