Imputation of Missing Continuous Glucose Monitor Data

Abstract

Background:

Continuous glucose monitors (CGMs) are used in research and clinical settings to characterize glycemic profiles by repeatedly measuring interstitial glucose at intervals on the order of minutes. Missing values from these devices are unavoidable. Data from the Glycemic Observation and Metabolic Outcomes in Mothers and Offspring (GO MOMs) study were used to investigate the impact of missing data on CGM summary metrics. Several imputation techniques were evaluated by comparing the mean relative bias (MRB) of summary metrics computed from imputed CGM data against the same metrics computed from the true data.

Methods:

We used 105 CGM profiles with nine days of complete glucose measurements and introduced strings of missing data using a zero-inflated negative binomial hurdle model. Overall missingness was introduced at 2%, consistent with GO MOMs data, and then increased to 5%, 10%, and 20%. Imputation approaches included single imputation, multiple imputation, machine learning techniques, and hot-deck imputation, in which missing values are replaced with the participant’s own observed values. Removing missing values prior to analysis (complete case analysis) was also evaluated.
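To make the simulation design concrete, the following is a minimal Python sketch, not the study's implementation. It assumes a 5-minute sampling interval, substitutes a simple geometric draw of gap lengths for the zero-inflated negative binomial hurdle model (whose parameters are not given here), and picks hot-deck donors uniformly at random from the same profile's observed readings. The function names `introduce_gaps`, `hot_deck_impute`, and `complete_case` are hypothetical.

```python
import math
import random
import statistics


def introduce_gaps(values, target_fraction, mean_gap_len=6, seed=0):
    """Blank out contiguous runs of readings until round(target_fraction * n)
    readings are missing (None). Gap lengths are geometric with the given mean;
    this is an assumed stand-in for the hurdle model described in the text."""
    rng = random.Random(seed)
    out = list(values)
    n = len(out)
    n_target = round(target_fraction * n)
    n_missing = 0
    while n_missing < n_target:
        # Geometric gap length with success probability 1 / mean_gap_len.
        length = 1
        while rng.random() > 1.0 / mean_gap_len:
            length += 1
        # Never remove more readings than are still needed.
        length = min(length, n_target - n_missing)
        start = rng.randrange(0, n - length + 1)
        for i in range(start, start + length):
            if out[i] is not None:
                out[i] = None
                n_missing += 1
    return out


def hot_deck_impute(values, seed=0):
    """Replace each missing reading with a randomly chosen observed reading
    from the same profile (assumed donor rule)."""
    rng = random.Random(seed)
    donors = [v for v in values if v is not None]
    if not donors:
        raise ValueError("profile has no observed values to use as donors")
    return [v if v is not None else rng.choice(donors) for v in values]


def complete_case(values):
    """Drop missing readings before analysis."""
    return [v for v in values if v is not None]


# Synthetic nine-day profile at an assumed 5-minute interval (288 readings/day).
_rng = random.Random(42)
profile = [110 + 25 * math.sin(2 * math.pi * i / 288) + _rng.gauss(0, 8)
           for i in range(9 * 288)]

gapped = introduce_gaps(profile, target_fraction=0.20)
imputed = hot_deck_impute(gapped)
observed = complete_case(gapped)

print("true mean:         ", round(statistics.fmean(profile), 2))
print("hot-deck mean:     ", round(statistics.fmean(imputed), 2))
print("complete-case mean:", round(statistics.fmean(observed), 2))
```

Because the gaps in this sketch are placed independently of the glucose values, the missingness is completely at random by construction, which is the setting where random-donor hot-deck imputation and complete case analysis would be expected to track the true mean closely.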

Results:

The MRB was minimal across most metrics and imputation methods at 2% overall missing data and increased with higher frequency of missing data, with trends depending on the metric and imputation method. Hot-deck imputation and complete case analysis showed consistently low MRB.
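As an illustration of how MRB can be computed, the sketch below assumes a common definition: the average, over profiles, of (imputed metric − true metric) / true metric. The abstract does not state the exact formula used, so this definition, the function name `mean_relative_bias`, and the example numbers are all assumptions made for illustration.

```python
import statistics


def mean_relative_bias(true_metrics, imputed_metrics):
    """Average relative difference between a summary metric computed on
    imputed data and the same metric computed on the true data, taken across
    profiles. Assumed definition; true metrics must be non-zero."""
    if len(true_metrics) != len(imputed_metrics):
        raise ValueError("need one true and one imputed metric per profile")
    relative_bias = [(imp - tru) / tru
                     for tru, imp in zip(true_metrics, imputed_metrics)]
    return statistics.fmean(relative_bias)


# Made-up per-profile mean glucose values (mg/dL), for illustration only.
true_means = [100.0, 120.0, 110.0]
imputed_means = [101.0, 117.0, 110.0]

mrb = mean_relative_bias(true_means, imputed_means)
print(f"MRB: {mrb:.4f}")
```

A value near zero indicates that, on average, the metric computed after imputation neither over- nor underestimates the metric from the complete data; positive and negative per-profile biases can cancel, so the spread of the per-profile values is also worth inspecting.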

Conclusions:

Missing CGM data are to be expected. For periods of wear with up to 20% missing data, hot-deck imputation and complete case analysis may be acceptable if data are missing completely at random. The explored imputation techniques are robust, but each has its own limitations, which should be considered when these techniques are implemented.

Keywords

continuous glucose monitoring, hot-deck imputation, machine learning, missing data, multiple imputation


