
Abstract

Background:

Continuous glucose monitors (CGMs) have become important tools for providing estimates of glucose to patients with diabetes. Recently, neural networks (NNs) have become a common method for forecasting glucose values using data from CGMs. One method of forecasting glucose values is a time-delay feedforward (FF) NN, but a change in the CGM location on a participant can increase forecast error in a FF NN.

Methods:

In response, we examined a NN with gated recurrent units (GRUs) as a method of reducing forecast error due to changes in sensor location.

Results:

We observed that for 13 participants with type 2 diabetes wearing blinded CGMs on both arms for 12 weeks (FreeStyle Libre Pro—Abbott), GRU NNs did not produce significantly different errors in glucose prediction due to sensor location changes (P < .05).

Conclusion:

We observe that GRU NNs can mitigate error in glucose prediction due to differences in CGM location.

Keywords

glucose forecast, continuous glucose monitoring, neural network, gated recurrent unit

Introduction

Diabetes mellitus is a metabolic disease ensuing from insufficient β-cell action in the pancreas. Common classifications of diabetes mellitus include type 1 diabetes (T1D), in which β-cells are destroyed by an autoimmune response, and type 2 diabetes (T2D), in which β-cells fail following initial insulin resistance.¹ In both types, loss of β-cell function leads to insufficient insulin secretion and hyperglycemia (chronically high blood glucose). Chronic hyperglycemia can lead to severe macrovascular and microvascular complications.²

A critical component in the treatment of diabetes and minimization of vascular complications is the ability for patients to accurately and precisely monitor their blood glucose level. Traditionally, this has been done by using a glucometer to measure small blood samples acquired through multiple daily fingersticks. Self-monitored blood glucose (SMBG) is an important tool for the treatment of diabetes especially when therapy includes insulin injections.^1-4

Recently, interstitial continuous glucose monitors (CGMs) have begun to replace the traditional SMBG method. The Food and Drug Administration (FDA) approved CGMs for non-adjunctive use in 2016, allowing patients and providers to exclusively use CGMs, without fingerstick confirmation, for treatment decisions.^5-8 With SMBG, patients frequently sample four to seven glucose measurements per day, which incompletely capture overnight and postprandial glycemic trends.^9,10 Conversely, with CGMs a small sensor is inserted under the skin to measure the glucose levels in the interstitial fluid. The major advantage of CGM devices is the ability to record glucose levels every 5 to 15 minutes for 14 days.¹¹ Continuous glucose monitor usage results in approximately ten times the number of glycemic measurements compared with SMBG; patients using CGMs also improve glycemic control and reduce glycemic variability compared with SMBG monitoring.¹² Although there is a well-known lag between glucose measured by CGM compared with SMBG, many patients and providers are moving toward exclusive use of CGMs for treatment decisions. Studies have shown that CGM readings can lag 4 to 14 minutes behind SMBG; CGM readings can be corrected via an algorithm with variables chosen with a least squares fit.^13,14

To further improve glycemic control, there has been a considerable body of work dedicated to mathematical forecasting of the glycemic profile from CGM data. Some of these methods include autoregressive models, deterministic models, support vector machines, and neural networks (NNs).^15-18 Notably, NN methods have effectively forecasted glucose behavior 30 minutes into the future^19-23 which is relevant to patient care since this time frame permits the patient to take action to avoid hypoglycemia. Equipping patients and providers with automated glycemic forecasting is desirable because it enables more individualized, patient-specific therapy.

While there are numerous methods available for glucose prediction, every algorithm depends on sensor accuracy. Continuous glucose monitors are approved to replace traditional SMBG, yet questions persist regarding accuracy, particularly during hypoglycemia and hyperglycemia.^24-27 Factors affecting accuracy include device-tissue motion and forces, patient physiology, calibration, and implantation site physiology, and solutions such as de-noising, sensor delays, and additional calibration have proven successful in reducing error.^26,28-34 Enhancing CGM accuracy is critical due to its importance in implementing effective treatment.^35,36

Recent work has highlighted sensor location between the left and the right arms as an important factor in CGM accuracy. Continuous glucose monitors on the market are approved for use on either arm, but a 2019 study by Liu et al reported significant differences between CGMs placed on the left and right arms of their participants.^8,37 Moving beyond real-time accuracy, our previous work examined CGM sensor agreement between different recommended locations in feedforward (FF) NN glycemic forecasting. In 10 of 13 participants, we observed significant differences in NN prediction performance from NNs trained on the right arm versus the left arm.³⁸ In this companion article, we examined NNs with gated recurrent units (GRUs) as a method for reducing forecasting error due to sensor location changes. Neural networks with GRUs offer notable benefits over other NN methods since GRUs can capture both long-term and short-term behaviors. By using a reset gate to nullify unimportant short-term information and an update gate to capture behavior of previous states, GRUs are more robust in managing variability. Since GRUs calculate recursively, the network has the effect of remembering performance across every forecast which improves on the long-term performance of FF NNs.^39-41 We hypothesize that GRU NNs trained from different manufacturer-approved CGM locations will not experience increases in forecast error (P < .05).

Data Acquisition

This is a secondary data analysis and companion study from a previously published study.³⁸ Thirteen subjects with T2D not previously using CGMs were eligible for this study. All participants wore one FreeStyle Libre Pro CGM device (Abbott, Alameda, California) on the back of each arm (two sensors per participant) for approximately 15 weeks. Devices were replaced every two weeks and placed per manufacturer’s instructions. Both sensors were blinded, meaning the subjects were unable to view the CGM data throughout the study and only received results after the study. Participants were instructed to continue monitoring their blood sugars and manage their diabetes as per their usual habits. The FreeStyle Libre Pro is one of the CGM systems approved for non-adjunctive, standalone use by the FDA and does not require calibration from fingerstick blood glucose measurements.⁸ Other important characteristics of the FreeStyle Libre Pro system include 14-day continuous monitoring, a 15-minute sampling rate,^42,43 and the ability to download the raw glucose data and time stamp from the sensor.

All data were collected from individuals participating in a University of Minnesota Institutional Review Board approved study, “Role of CGMS Usage in Predicting Risk of Hypoglycemia” (NCT03481530, ID:STUDY00002113). All participants provided written informed consent before participation. A breakdown of participant characteristics is presented in Table 1.

Table 1.

Characteristics of Individuals Participating in the “Role of CGMS Usage in Predicting Risk of Hypoglycemia” Study.

Participant characteristics
Gender		Height (cm)
Female	6	Mean ± SD	170.6 ± 9.7
Male	7	Range	156-184
Age		Weight (kg)
Mean ± SD	60.4 ± 12	Mean ± SD	99.7 ± 16.6
Range	45-79	Range	73-122
HbA1c (%)		Glucose observations per participant
Mean ± SD	7.5 ± 1.2	Mean ± SD	15 537 ± 6670
Range	5.8-10.2	Range	4683-24 202

Abbreviations: HbA1c, hemoglobin A1c; SD, standard deviation.

Methods and Materials

Problem Formulation

We previously identified the need to account for CGM location when building algorithms to forecast glucose values.³⁸ In response, this work aims to reduce NN forecast error due to sensor location. To test this, we isolated sensor location as a testable variable by comparing results between algorithms trained and tested on data from the same arm versus ones trained and tested on data from different arms. This resulted in construction of two groups of two NNs per participant, the left arm group and the right arm group. In the left arm group, an algorithm trained and tested on the left arm (Left-Left) was compared with one trained on the right arm and tested on the left arm (Right-Left). Conversely, in the right arm group an algorithm trained and tested on the right arm (Right-Right) was compared with one trained on the left arm and tested on the right arm (Left-Right). Within each group, the root mean square error (RMSE) was used to evaluate performance due to its use as a common metric in glucose forecasting evaluation (see Figure 1).^20,22,44-46

Figure 1.

Glucose forecasting algorithms were evaluated with the 5 × 2 CV method by comparing algorithms trained and tested on data from the same arm with algorithms trained on data from the opposite arm. Abbreviations: CGMs, continuous glucose monitors; CV, cross-validation; GRU, gated recurrent unit; NN, neural networks.

General Algorithm Criteria

While the focus of this work is mitigating error from change in sensor location, the algorithmic framework adhered to four general criteria. First, algorithms were formulated to provide clinically acceptable error rates when trained and tested with data from the same arm. Second, forecasts were made to be participant specific, because we believe that avoiding inter-participant differences is important when isolating sensor location as a testable variable. Third, multiple prediction horizons (PHs) of 15, 30, 45, and 60 minutes were considered for a robust analysis across time. Prediction horizon is defined as the time period between the current time and the forecasted value, and past works have shown that larger PHs increase forecast error.^20,21 Finally, only overlapping data were examined; in other words, data were used only from time periods in which both sensors were operating correctly.

Feedforward Neural Network

In our companion article, a FF NN with one hidden layer was implemented to forecast glucose values. Inputs, $x$, were built from the participant CGM time series by using three previous values of glucose with a bias, $x_{0}$. Outputs, $y$, were forecasts for each PH (15, 30, 45, and 60 minutes), and the ground truths, $r$, were CGM values at each PH.

\begin{array}{l} x = \{x_{0}, x_{t - 1}, x_{t - 2}, x_{t - 3}\} \\ r = \{x_{t}, x_{t + 1}, x_{t + 2}, x_{t + 3}\} \\ y = \{y_{t}, y_{t + 1}, y_{t + 2}, y_{t + 3}\} \end{array}

Since the FreeStyle Libre Pro has a sampling rate of 15 minutes, the inputs represented 45 minutes’ worth of glucose data. Data into the NNs were normalized to prevent saturation during training. Standard gradient descent presented by Alpaydin⁴⁷ was used for training. For a more detailed explanation, view the companion article Examining Sensor Agreement in Neural Network Blood Glucose Prediction.³⁸
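The input and ground-truth definitions above can be illustrated with a minimal sketch that slices an evenly sampled CGM series into lagged windows. The function name `make_windows`, its parameters, and the bias value of 1.0 are assumptions made for illustration; they are not taken from the companion article, and normalization is omitted because the text does not specify the scheme used.

```python
def make_windows(cgm, n_lags=3, n_horizons=4, bias=1.0):
    """Build (inputs, targets) rows from an evenly sampled CGM series.

    Each input row is [bias, x_{t-1}, x_{t-2}, x_{t-3}] (most recent first),
    and each target row is [x_t, x_{t+1}, x_{t+2}, x_{t+3}].
    The bias value of 1.0 is an assumption, not taken from the article.
    """
    n_rows = len(cgm) - n_lags - n_horizons + 1
    if n_rows <= 0:
        raise ValueError("series too short for the requested window")
    inputs, targets = [], []
    for t in range(n_lags, n_lags + n_rows):
        # Three readings immediately before time t, newest first.
        lagged = [cgm[t - k] for k in range(1, n_lags + 1)]
        inputs.append([bias] + lagged)
        # Readings at t .. t+3, i.e. the 15, 30, 45, and 60 minute horizons
        # when samples are 15 minutes apart.
        targets.append(list(cgm[t:t + n_horizons]))
    return inputs, targets


# Made-up readings in mg/dL, for illustration only.
series = [100, 110, 120, 130, 140, 150, 160, 170]
X, R = make_windows(series)
```

With a 15-minute sampling interval, the three lagged inputs span 45 minutes and the four targets correspond to the four PHs.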

NN With GRUs

We constructed a NN with GRUs to compare with the results from the time-delay FF NN in the companion study.³⁸ Gated recurrent units were first proposed in 2014 by Cho et al and have seen increasing use since their inception.^40,48 As in the FF NN, inputs to the GRU network, $x$, consisted of three lagged glucose data points with a bias, $x_{0}$. Outputs and ground truths were also identical to those of the FF NN, and backpropagation with standard gradient descent was used for training.

Gated recurrent units offer notable improvements over FF NNs. First, as the model is training, the network retains information about previous model states using a hidden state that is propagated through the network. During training, the NN determines how much information to use from previous model states with an update gate. This is also a feature of other recurrent neural networks. However, a GRU also contains a reset gate which evaluates the usefulness of previous hidden states and can reduce model dependency on unimportant information. Therefore, if a forecast has a strong dependency on short-term behavior, the algorithm will capture that behavior with its reset gate. Conversely, for reliance on long-term behavior, the model will frequently activate the update gate.⁴⁸ After each pass through a GRU, the output $y_{t}$ is passed to the next GRU as a hidden state $h_{t}$ where it is combined with the model inputs $x_{t}$ . The behavior of a GRU is illustrated in Figure 2.

Figure 2.

The structure of the gated recurrent unit contains a reset gate and an update gate which combine to capture long-term and short-term behavior.
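As a rough illustration of the reset and update gates described above, the following sketch computes a single GRU step in the form proposed by Cho et al. It is not the implementation used in this study: the parameter names (`Wz`, `Uz`, `bz`, and so on), the helper `init_params`, the hidden size, and the random initialization are all hypothetical, and inputs are assumed to be normalized.

```python
import math
import random


def sigmoid(a):
    # Numerically stable logistic function.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def affine(W, x, U, h, b):
    """Return W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_i
        for W_row, U_row, b_i in zip(W, U, b)
    ]


def gru_step(x, h_prev, p):
    """One GRU step: returns the new hidden state h_t."""
    # Update gate: how much of the previous hidden state to keep.
    z = [sigmoid(a) for a in affine(p["Wz"], x, p["Uz"], h_prev, p["bz"])]
    # Reset gate: how much of the previous state feeds the candidate.
    r = [sigmoid(a) for a in affine(p["Wr"], x, p["Ur"], h_prev, p["br"])]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(a) for a in affine(p["Wh"], x, p["Uh"], gated, p["bh"])]
    # Blend the previous state and the candidate state.
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h_prev, h_cand)]


def init_params(n_in, n_hidden, seed=0):
    """Hypothetical small random initialization, for illustration only."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for g in "zrh":
        p["W" + g] = mat(n_hidden, n_in)
        p["U" + g] = mat(n_hidden, n_hidden)
        p["b" + g] = [0.0] * n_hidden
    return p
```

When the update gate saturates near 1, the step returns the previous hidden state almost unchanged, which is how the unit carries long-term information; when the reset gate is near 0, the candidate state depends mainly on the current input.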

Algorithm Evaluation

To compare forecast performance, the 5 × 2 cross-validation (5 × 2 CV) method was used. This approach is commonly used to compare algorithms by performing five randomized replications of two-fold cross-validation. Data were split into two random, equal-sized groups: training set and validation set. For each set, five new predictors were trained and validated on the opposite set, leading to ten total predictors and a set of error metrics for Alpaydin’s combined 5 × 2 CV F-test.^49,50

F = \frac{\sum_{i = 1}^{5} \sum_{j = 1}^{2} {(p_{i}^{(j)})}^{2}}{2 \sum_{i = 1}^{5} s_{i}^{2}}

Alpaydin has shown that this statistic is approximately F distributed with 10 and 5 degrees of freedom, where $p_{i}^{(j)}$ is the difference between the error rates of the two compared algorithms on fold $j$ of replication $i$ and $s_{i}^{2}$ is the estimated variance of those differences within replication $i$. This F-test was used to determine whether compared algorithms exhibited statistically significant differences in error rates.⁴⁹
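To make the statistic concrete, here is a minimal sketch of the combined 5 × 2 CV F computation. The function name and the input layout (five pairs of fold errors per algorithm) are assumptions for illustration, and the numbers in the usage example are made up rather than taken from this study.

```python
def combined_5x2cv_f(err_a, err_b):
    """Combined 5 x 2 CV F statistic for two algorithms.

    err_a, err_b: five (fold1, fold2) error pairs, one pair per replication.
    Raises ZeroDivisionError if the two fold differences are identical in
    every replication (zero estimated variance).
    """
    numerator = 0.0
    variance_sum = 0.0
    for (a1, a2), (b1, b2) in zip(err_a, err_b):
        # Error differences on the two folds of this replication.
        p1, p2 = a1 - b1, a2 - b2
        p_mean = (p1 + p2) / 2.0
        numerator += p1 ** 2 + p2 ** 2
        # Estimated variance s_i^2 for this replication.
        variance_sum += (p1 - p_mean) ** 2 + (p2 - p_mean) ** 2
    return numerator / (2.0 * variance_sum)


# Made-up errors: algorithm A is 1 and 3 units worse on the two folds.
errors_a = [(11.0, 13.0)] * 5
errors_b = [(10.0, 10.0)] * 5
f_stat = combined_5x2cv_f(errors_a, errors_b)
```

The resulting value is compared against the upper critical value of the F distribution with 10 and 5 degrees of freedom, which standard tables give as about 4.74 at the .05 level.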

Two metrics were considered when assessing performance. First, the RMSE was used both to assess overall forecast performance and to compare performance within groups using the F-test (eg, performance between Left-Left and Right-Left predictors). Second, the mean absolute relative difference (MARD) was used as another assessment of general predictor performance due to its support in literature.^21,23

\begin{array}{l} E_{RMSE} = \frac{1}{5} \sum_{k = 1}^{5} \sqrt{\frac{1}{N_{k}} \sum_{t = 1}^{N_{k}} {(r^{t} - y^{t})}^{2}} \\ E_{MARD} = \frac{1}{5} \sum_{k = 1}^{5} \frac{1}{N_{k}} \sum_{t = 1}^{N_{k}} \frac{| y^{t} - r^{t} |}{y^{t}} \end{array}
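The two metrics can be sketched as follows. The function names are hypothetical, and the MARD denominator follows the equation as printed above (the forecast $y^{t}$); MARD is often defined relative to the reference value instead, so this choice should be treated as an assumption about the intended formula.

```python
import math


def rmse(r, y):
    """Root mean square error between references r and forecasts y."""
    return math.sqrt(sum((ri - yi) ** 2 for ri, yi in zip(r, y)) / len(r))


def mard(r, y):
    """Mean absolute relative difference, dividing by y as in the equation."""
    return sum(abs(yi - ri) / yi for ri, yi in zip(r, y)) / len(r)


def mean_over_sets(metric, sets):
    """Average a metric over (reference, forecast) pairs, one per test set."""
    return sum(metric(r, y) for r, y in sets) / len(sets)


# Made-up values in mg/dL, for illustration only.
reference = [100.0, 200.0]
forecast = [110.0, 190.0]
e_rmse = mean_over_sets(rmse, [(reference, forecast)] * 5)
e_mard = mean_over_sets(mard, [(reference, forecast)] * 5)
```

Multiplying the MARD by 100 gives a percentage, which is the unit used when MARD values are tabulated.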

Results

The data collected for this analysis were used to construct 40 glucose forecasting algorithms per participant, for a total of 520 glucose predictors. For each participant, the RMSE rates from two groups of 20 predictors were compared using 5 × 2 CV to evaluate changes in forecast error rate due to changes in physiological sensor location. Thus, the comparisons of interest are between the Left-Left and Right-Left algorithms as well as the Right-Right and Left-Right algorithms. Figure 3 shows an example forecast of the GRU NN.

Figure 3.

The gated recurrent unit neural networks were able to accurately forecast glucose for a variety of PHs. In this example, a single-day forecast is shown for participant 12. Abbreviations: CGM, continuous glucose monitor; PH, prediction horizon.
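The predictor counts reported in this section can be reproduced with a short enumeration. The breakdown used here (four train-test setups, each evaluated with five replications of two folds) is one reading consistent with the stated totals; the text does not spell out the breakdown, so it is an assumption.

```python
from itertools import product

# Train-test setups named in the problem formulation.
setups = ["Left-Left", "Right-Left", "Right-Right", "Left-Right"]
replications = range(1, 6)  # five replications (assumed breakdown)
folds = (1, 2)              # two folds per replication (assumed breakdown)

predictors = list(product(setups, replications, folds))
per_participant = len(predictors)        # 4 * 5 * 2
total = per_participant * 13             # 13 participants

# The left arm group holds the setups that are tested on the left arm.
left_group = [p for p in predictors if p[0].endswith("-Left")]
right_group = [p for p in predictors if p[0].endswith("-Right")]
```

Under this reading, each participant has 40 predictors split into two groups of 20, and the 13 participants together account for 520.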

Previous results in our companion publication using this methodology for time-delay FF NNs showed significant increases in error for 10 of 13 participants due to CGM location (P < .05).³⁸ When using the NN with GRUs, we observed no significant differences in error due to changes in sensor location (P < .05). Glycemic standard deviations for each patient are shown in Table 2. Root mean square error values are presented in Table 3, MARD values are shown in Table 4, the comparison between the FF NN and the GRU NN is shown in Figure 4, and the results of the GRU NN are shown in Figure 5.

Table 2.

The Glycemic Standard Deviations are the Standard Deviations Across the Entire CGM Data Set per Arm per Participant.

Participant	Left-arm CGM standard deviation (mg/dL)	Right-arm CGM standard deviation (mg/dL)
1	37.56	36.61
2	43.15	46.85
3	40.42	40.48
4	43.6	45.82
5	32.94	34.09
6	34.72	38.49
7	36.32	35.05
8	45.48	45.81
9	40.24	40.81
10	24.62	26
11	40.31	38.4
12	44.84	45.56
13	42.72	44.51

Abbreviation: CGM, continuous glucose monitor.

Table 3.

The RMSE Results from the Algorithm 5 × 2 Cross-Validation Process are Presented with their Standard Deviations. Between Train-Test Pairs, there were No Instances of Significant Increases in Error (P < .05).

Prediction horizon (minutes)	Algorithm setup (trained-tested)	Participant 1 RMSE (mg/dL)	Participant 2 RMSE (mg/dL)	Participant 3 RMSE (mg/dL)	Participant 4 RMSE (mg/dL)	Participant 5 RMSE (mg/dL)	Participant 6 RMSE (mg/dL)	Participant 7 RMSE (mg/dL)	Participant 8 RMSE (mg/dL)	Participant 9 RMSE (mg/dL)	Participant 10 RMSE (mg/dL)	Participant 11 RMSE (mg/dL)	Participant 12 RMSE (mg/dL)	Participant 13 RMSE (mg/dL)
15	Left-Left	13.11 ± 0.42	10.84 ± 0.3	17.92 ± 0.14	8.7 ± 0.48	10.35 ± 0.79	8.21 ± 0.33	14.69 ± 0.42	9.12 ± 0.72	8.42 ± 0.5	12.65 ± 0.18	9.77 ± 0.14	11.81 ± 0.47	12.58 ± 0.58
	Right-Left	13.27 ± 0.5	11.25 ± 0.43	17.2 ± 0.4	8.55 ± 0.46	10.13 ± 0.41	8.28 ± 0.64	13.9 ± 0.55	9.49 ± 0.54	7.87 ± 0.25	12.69 ± 0.17	10.05 ± 0.27	11.97 ± 0.64	12.43 ± 0.77
	Right-Right	12.86 ± 0.44	9.04 ± 0.4	18.21 ± 0.59	8.85 ± 0.5	10.42 ± 0.44	8.62 ± 0.73	13.94 ± 0.42	9.46 ± 0.54	8.15 ± 0.3	14.64 ± 0.22	12.1 ± 0.18	11.38 ± 0.86	12.84 ± 0.96
	Left-Right	12.76 ± 0.48	9.3 ± 0.49	18.91 ± 0.24	9.08 ± 0.59	10.74 ± 0.82	8.51 ± 0.41	14.7 ± 0.42	9.1 ± 0.65	8.71 ± 0.59	14.63 ± 0.21	12.05 ± 0.12	11.36 ± 0.41	13.09 ± 0.53
30	Left-Left	20.08 ± 0.3	15.79 ± 0.36	23 ± 0.15	15.76 ± 0.55	17.42 ± 0.64	14.89 ± 0.27	21.31 ± 0.34	14.98 ± 0.71	13.8 ± 0.39	16.55 ± 0.25	15.94 ± 0.19	20.57 ± 0.31	20.11 ± 0.48
	Right-Left	20.18 ± 0.52	15.89 ± 0.28	22.63 ± 0.26	15.82 ± 0.48	17.44 ± 0.76	14.93 ± 0.23	20.92 ± 0.37	15.21 ± 0.56	13.46 ± 0.14	16.54 ± 0.19	16 ± 0.22	20.54 ± 0.48	19.87 ± 0.7
	Right-Right	21.68 ± 0.32	15.9 ± 0.52	24.32 ± 0.49	15.88 ± 0.48	17.87 ± 0.59	15.21 ± 0.46	20.9 ± 0.37	15.15 ± 0.49	12.59 ± 0.31	17.56 ± 0.19	14.84 ± 0.18	19.71 ± 0.5	21 ± 1.05
	Left-Right	21.61 ± 0.33	16.27 ± 0.62	24.72 ± 0.3	15.95 ± 0.53	18 ± 0.78	15.19 ± 0.44	21.26 ± 0.43	14.91 ± 0.57	13.02 ± 0.52	17.62 ± 0.16	14.8 ± 0.26	19.8 ± 0.41	21.35 ± 0.51
45	Left-Left	25.45 ± 0.45	20.98 ± 0.4	28.13 ± 0.18	21.87 ± 0.55	22.34 ± 0.78	19.83 ± 0.32	25.5 ± 0.28	19.51 ± 0.64	17.96 ± 0.47	18.95 ± 0.22	20.19 ± 0.33	26.98 ± 0.22	25.27 ± 0.38
	Right-Left	25.52 ± 0.41	21.14 ± 0.36	27.82 ± 0.27	21.99 ± 0.48	22.52 ± 0.86	19.85 ± 0.27	25.35 ± 0.42	19.67 ± 0.54	17.73 ± 0.24	18.98 ± 0.18	20.33 ± 0.43	26.87 ± 0.39	25.15 ± 0.6
	Right-Right	26.74 ± 0.37	21.36 ± 0.44	29.39 ± 0.39	21.59 ± 0.55	23.1 ± 0.67	20.26 ± 0.44	25.26 ± 0.36	19.55 ± 0.44	16.76 ± 0.33	20.11 ± 0.2	19.9 ± 0.26	26.02 ± 0.5	26.67 ± 0.98
	Left-Right	26.74 ± 0.38	21.76 ± 0.51	29.73 ± 0.25	21.62 ± 0.56	23.13 ± 0.9	20.31 ± 0.42	25.41 ± 0.45	19.38 ± 0.49	17.09 ± 0.59	20.15 ± 0.18	19.84 ± 0.26	26.17 ± 0.38	26.82 ± 0.49
60	Left-Left	28.2 ± 0.47	24.37 ± 0.44	31.32 ± 0.07	26.8 ± 0.53	25.54 ± 0.71	23.39 ± 0.48	28.43 ± 0.31	23.32 ± 0.52	21.3 ± 0.43	20.61 ± 0.21	23.42 ± 0.34	31.37 ± 0.23	29.38 ± 0.39
	Right-Left	28.31 ± 0.4	24.43 ± 0.48	31.33 ± 0.09	26.95 ± 0.53	25.89 ± 0.75	23.42 ± 0.41	28.32 ± 0.4	23.43 ± 0.43	21.11 ± 0.25	20.63 ± 0.16	23.55 ± 0.39	31.29 ± 0.27	29.21 ± 0.54
	Right-Right	29.65 ± 0.45	25.69 ± 0.3	32.93 ± 0.36	26.18 ± 0.44	26.7 ± 0.75	24.01 ± 0.54	28.17 ± 0.37	23.28 ± 0.48	19.89 ± 0.33	21.63 ± 0.19	21.55 ± 0.3	30.54 ± 0.64	30.85 ± 0.95
	Left-Right	29.7 ± 0.43	26.04 ± 0.31	33.09 ± 0.34	26.19 ± 0.5	26.56 ± 0.99	24.1 ± 0.52	28.28 ± 0.46	23.14 ± 0.52	20.2 ± 0.54	21.74 ± 0.21	21.47 ± 0.25	30.68 ± 0.5	31.05 ± 0.61

Abbreviation: RMSE, root mean square error.

Table 4.

The MARD was Used to Evaluate Overall Predictor Performance. The Mean and Standard Deviations from the 5 × 2 Cross-Validation Process are Reported. Note that MARD Standard Deviation was Low Across all Prediction Horizons.

Prediction horizon (minutes)	Algorithm setup (trained-tested)	Participant 1 MARD (%)	Participant 2 MARD (%)	Participant 3 MARD (%)	Participant 4 MARD (%)	Participant 5 MARD (%)	Participant 6 MARD (%)	Participant 7 MARD (%)	Participant 8 MARD (%)	Participant 9 MARD (%)	Participant 10 MARD (%)	Participant 11 MARD (%)	Participant 12 MARD (%)	Participant 13 MARD (%)
15	Left-Left	7.41 ± 0.6	5.83 ± 0.39	10.24 ± 0.26	4.13 ± 0.29	4.17 ± 0.49	3.45 ± 0.16	5.98 ± 0.4	4.57 ± 0.39	3.51 ± 0.4	7.8 ± 0.12	3.68 ± 0.2	5.48 ± 0.28	6.56 ± 0.41
	Right-Left	7.65 ± 0.8	5.96 ± 0.2	9.92 ± 0.35	4.18 ± 0.41	4.01 ± 0.31	3.54 ± 0.43	5.63 ± 0.36	4.8 ± 0.25	3.2 ± 0.23	7.84 ± 0.1	3.75 ± 0.31	6.04 ± 0.42	6.57 ± 0.73
	Right-Right	7.29 ± 0.79	4.92 ± 0.25	10.71 ± 0.35	4.51 ± 0.41	4.13 ± 0.25	3.78 ± 0.53	5.67 ± 0.32	4.75 ± 0.21	3.33 ± 0.2	9.22 ± 0.19	4.69 ± 0.23	6.09 ± 0.51	7.09 ± 0.78
	Left-Right	7.08 ± 0.69	5.14 ± 0.44	11.03 ± 0.29	4.51 ± 0.38	4.37 ± 0.49	3.67 ± 0.18	5.99 ± 0.38	4.51 ± 0.34	3.67 ± 0.4	9.19 ± 0.16	4.66 ± 0.15	5.55 ± 0.29	7.17 ± 0.4
30	Left-Left	11.43 ± 0.87	8.72 ± 0.65	13.21 ± 0.26	7.22 ± 0.42	6.96 ± 0.45	6.26 ± 0.19	8.95 ± 0.41	7.61 ± 0.49	5.76 ± 0.4	10.37 ± 0.24	6.21 ± 0.33	9.77 ± 0.42	10.66 ± 0.42
	Right-Left	11.92 ± 1.22	8.51 ± 0.28	13.22 ± 0.44	7.47 ± 0.69	7.03 ± 0.46	6.29 ± 0.3	8.8 ± 0.37	7.79 ± 0.37	5.48 ± 0.27	10.38 ± 0.19	6.02 ± 0.27	10.14 ± 0.54	10.78 ± 0.94
	Right-Right	13.09 ± 1.07	8.77 ± 0.33	14.62 ± 0.43	7.96 ± 0.66	7.29 ± 0.29	6.64 ± 0.38	8.79 ± 0.38	7.68 ± 0.32	5.11 ± 0.24	11.11 ± 0.27	5.59 ± 0.27	10.35 ± 0.6	11.91 ± 1.15
	Left-Right	12.58 ± 0.88	9.14 ± 0.63	14.65 ± 0.33	7.83 ± 0.41	7.35 ± 0.41	6.61 ± 0.23	8.93 ± 0.38	7.49 ± 0.39	5.47 ± 0.43	11.12 ± 0.23	5.72 ± 0.34	10.03 ± 0.44	11.87 ± 0.46
45	Left-Left	15.28 ± 0.99	12.31 ± 0.79	16.6 ± 0.39	10.34 ± 0.51	9.21 ± 0.48	8.43 ± 0.23	11.08 ± 0.4	10.13 ± 0.57	7.66 ± 0.46	12.11 ± 0.28	8.06 ± 0.32	13.61 ± 0.79	13.75 ± 0.45
	Right-Left	15.99 ± 1	12.04 ± 0.33	16.79 ± 0.46	10.67 ± 0.82	9.4 ± 0.56	8.43 ± 0.31	11.07 ± 0.34	10.28 ± 0.45	7.35 ± 0.36	12.05 ± 0.18	8.1 ± 0.5	13.92 ± 0.63	13.98 ± 1.05
	Right-Right	16.99 ± 0.97	12.01 ± 0.48	18.24 ± 0.5	10.99 ± 0.82	9.77 ± 0.36	8.97 ± 0.41	11 ± 0.34	10.09 ± 0.39	7.02 ± 0.33	13.1 ± 0.36	8.13 ± 0.39	14.26 ± 0.75	15.31 ± 1.37
	Left-Right	16.28 ± 0.92	12.52 ± 0.74	18.08 ± 0.46	10.8 ± 0.54	9.7 ± 0.44	9.01 ± 0.32	11.02 ± 0.38	9.93 ± 0.47	7.38 ± 0.45	13.2 ± 0.33	8.12 ± 0.31	13.97 ± 0.85	15.14 ± 0.56
60	Left-Left	17.38 ± 0.99	14.5 ± 0.89	18.93 ± 0.51	13.15 ± 0.66	10.84 ± 0.46	10.11 ± 0.31	12.75 ± 0.3	12.42 ± 0.63	9.33 ± 0.51	13.47 ± 0.26	9.68 ± 0.31	16.52 ± 0.92	16.37 ± 0.48
	Right-Left	18.22 ± 0.88	14.23 ± 0.57	19.49 ± 0.61	13.59 ± 0.89	11.06 ± 0.56	10.11 ± 0.4	12.77 ± 0.37	12.49 ± 0.48	8.97 ± 0.39	13.28 ± 0.24	9.67 ± 0.49	16.94 ± 0.81	16.51 ± 1.12
	Right-Right	19.4 ± 0.95	14.77 ± 0.74	20.98 ± 0.65	13.76 ± 0.91	11.65 ± 0.32	10.85 ± 0.46	12.64 ± 0.34	12.21 ± 0.4	8.48 ± 0.35	14.15 ± 0.34	8.73 ± 0.46	17.43 ± 0.91	17.91 ± 1.41
	Left-Right	18.66 ± 0.78	15.17 ± 0.76	20.58 ± 0.59	13.47 ± 0.7	11.53 ± 0.46	10.89 ± 0.35	12.64 ± 0.37	12.12 ± 0.54	8.91 ± 0.52	14.47 ± 0.26	8.72 ± 0.35	17.02 ± 1.05	17.82 ± 0.61

Abbreviation: MARD, mean absolute relative difference.

Figure 4.

Error rates for the same arm algorithms (eg, Left-Left) between the two NNs were compared. The FF NN and GRU NN produced many significant instances (P < .05) of different forecast root mean square errors. Significant instances are denoted by an asterisk (*) of the same color as the corresponding PH. Abbreviations: FF, feedforward; GRU, gated recurrent unit; NN, neural networks; PH, prediction horizon.

Figure 5.

The results of the forecasts show that there are no participants which have significant increases (P < .05) in forecast error due to changes in sensor location, across all PHs. Comparisons in this figure consist of predictors Left-Left versus Right-Left and Right-Right versus Left-Right. Mean and standard deviations are reported in Table 3. Abbreviations: GRU, gated recurrent unit; PH, prediction horizon.

Forecast RMSEs for same arm predictions (ie, Left-Left and Right-Right) were compared with the FF NN from the companion article. For most participants, 10 of 13, the GRU NN and the FF NN produced significantly different forecast RMSEs (P < .05). Of 20 possible comparisons (two for each of the ten with significant differences), the GRU NN produced larger error than the FF NN in 14 instances. The errors for approximately 75% of GRU NN forecasts were within 10% of the forecasts made by the time-delay FF NN while over 90% of GRU NN forecasts were within 20%. These results are shown in Figure 4. All forecast RMSEs for the FF NN are reported in our companion article.³⁸

These results were independent of differences in variance between training and testing data sets and differences in baseline participant physiological characteristics such as glycemic standard deviation between left and right arm CGMs, age, height, five-year hypoglycemia risk score,⁵¹ blood pressure, fructosamine levels, resting heart rate, and hemoglobin A1c. The standard deviations of the glycemic profile were acquired by measuring the standard deviation across the entire data set per CGM per participant. The glycemic standard deviations are shown in Table 2. We observed no meaningful differences in forecast performance due to overall glycemic standard deviations.

In terms of general performance, we observed low variance between the RMSE and the MARD across forecasting algorithms. Root mean square error values and MARD values are presented in Tables 3 and 4, respectively, and RMSE graphical results are presented in Figure 5.

Training was completed on a Windows 11 desktop machine with an Intel i7-10700 2.9 GHz CPU, 32.0 GB of RAM, and an NVIDIA Quadro M2000 GPU. Average training time for convergence to a data set of 1000 randomly selected glucose inputs was 2.05 seconds for the GRU NN and 0.94 seconds for the FF NN. For 10 000 randomly selected inputs, the average time for the GRU NN was 9.02 seconds compared with 3.16 seconds for the FF NN.

Discussion

In this work, we made the novel observation that GRU NNs can mitigate forecast error due to changes in CGM location. Our previous work identified that CGM location is important when forecasting glucose values using NN methods.³⁸ While participant-specific time-delay FF NNs can accurately forecast values for algorithms trained and tested on the same arm, they remain inadequate when the sensor data are acquired from a different location. We believe that GRUs are particularly helpful due to their ability to capture both long-term and short-term dependencies. Furthermore, these algorithms perform comparably to the time-delay FF NNs from our previous work. Approximately 75% of forecasts with the GRU NN were within 10% of the error of the time-delay FF NN, and over 90% of forecasts were within 20%.

From a clinical perspective, this observation suggests that glucose prediction with NN methods would be much better served by using networks with GRUs. While time-delay FF NNs can capture behavior using data solely from one arm, GRU NNs are clearly better at mitigating error due to sensor location changes.

We believe this study has numerous strengths. First, we examined a population with T2D and a diverse range of five-year risk for hypoglycemia. With the use of two sensors and 15 weeks of data collection per participant, we analyzed an abundant, varied data set. Furthermore, we utilized the same testing procedure as in our previous work identifying sensor location error, which strengthens the significance of our results. In addition, we completed our analysis without any use of synthetic data.

We also identified several weaknesses. First, by using GRU NNs, the training time for each forecasting algorithm increased markedly. On average, the time to convergence for randomly selected inputs was two to three times longer for the GRU NN compared with the FF NN. On a larger data set, performing these calculations would be time consuming and computationally intensive. It is likely that specialized hardware would be needed to reduce calculation time. Second, differences were observed in error rates for same arm predictions (ie, Left-Left and Right-Right) between the GRU NN and the FF NN. In most (but not all) cases, the GRU yielded higher forecast error than the FF NN, which presents users with a tradeoff between same arm error and mitigation of error due to CGM location changes. In addition, we did not have the opportunity to test other CGM systems, and we do not know how these results will apply to other CGM devices. Participants also placed CGMs on manufacturer-approved locations only, meaning that the effects of CGM placement at other body locations are unknown. Furthermore, these forecasting algorithms were only tested on individuals with T2D, and we do not know whether these results will be replicated in individuals with T1D.

Finally, participant-specific forecasts likely improved forecast error overall. While participant-specific forecasts were a requirement for isolating sensor location as a testable variable, we recognize that this was a limitation of this study since our methods were not exposed to additional glycemic variance resulting from forecasting with multiple participant data sets.

Conclusion

Accurate and precise glucose forecasting continues to be a long-term goal for diabetes therapy, and we believe that GRU NNs can significantly reduce error due to changes in CGM location. Furthermore, we believe clinicians and researchers should prioritize the effects of sensor location on their predictions. We strongly believe that any prediction method should account for both long-term and short-term dependencies, and we believe the error robustness of GRU NNs merits further investigation for glucose forecasting.

Footnotes

Abbreviations

AHC, Academic Health Center; CGM, continuous glucose monitor; FDA, Food and Drug Administration; HbA1c, hemoglobin A1c; NIH, National Institutes of Health; NN, neural network; FF, feedforward; GRU, gated recurrent unit; MARD, mean absolute relative difference; PH, prediction horizon; RMSE, root mean square error; SMBG, self-monitored blood glucose; T1D, type 1 diabetes; T2D, type 2 diabetes; 5 × 2 CV, 5 × 2 cross-validation.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the National Institutes of Health National Center for Advancing Translational Sciences (UL1TR002494) and the University of Minnesota Academic Health Center (ACH-FRD-17-08 to L.S.C.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health’s National Center for Advancing Translational Sciences.

ORCID iDs

Aaron P. Tucker

Lisa S. Chow

References

1. American Diabetes Association. Standards of medical care in diabetes. Diabetes Care. 2005;28(suppl 1):S4-S36.

2. American Diabetes Association. Diagnosis and classification of diabetes mellitus. Diabetes Care. 2008;31(suppl 1):S55-S60. doi:10.2337/dc08-S055.

3. American Diabetes Association. Nutrition recommendations and interventions for diabetes: a position statement of the American Diabetes Association. Diabetes Care. 2008;31(suppl 1):S61-S78. doi:10.2337/dc08-S061.

4. Klonoff DC. Benefits and limitations of self-monitoring of blood glucose. J Diabetes Sci Technol. 2007;1:130-132. doi:10.1177/193229680700100121.

5. Castle, Jacobs PG. Nonadjunctive use of continuous glucose monitoring for diabetes treatment decisions. J Diabetes Sci Technol. 2016;10(5):1169-1173. doi:10.1177/1932296816631569.

6. Diabetes Technology and Therapeutics. FDA advisory panel votes to recommend non-adjunctive use of Dexcom G5 Mobile CGM. Diabetes Technol Ther. 2016;18(8):512-516. doi:10.1089/dia.2016.07252.mr.

7. U.S. Food and Drug Administration. P120005/S041: FDA Summary of Safety and Effectiveness Data (SSED); 2016.

8. Food and Drug Administration. P160030: FDA Summary of Safety and Effectiveness Data; 2017.

9. Diabetes Control and Complications Trial Research Group; Nathan, Genuth, et al. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. N Engl J Med. 1993;329(14):977-986. doi:10.1056/NEJM199309303291401.

10.

Kilpatrick

Rigby

Atkin

SL.

The effect of glucose variability on the risk of microvascular complications in type 1 diabetes. Diabetes Care. 2006;29(7):1486-1490. doi:10.2337/dc06-0293.

11.

Klonoff

Ahn

Drincic

Continuous glucose monitoring: a review of the technology and clinical use. Diabetes Res Clin Pract. 2017;133:178-192. doi:10.1016/j.diabres.2017.08.005.

12.

Beck

Riddlesworth

Ruedy

, et al. Effect of continuous glucose monitoring on glycemic control in adults with type 1 diabetes using insulin injections the diamond randomized clinical trial. JAMA. 2017;317:371-378. doi:10.1001/jama.2016.19975.

13.

Zaharieva

Turksoy

McGaugh

, et al. Lag time remains with newer real-time continuous glucose monitoring technology during aerobic exercise in adults living with type 1 diabetes. Diabetes Technol Ther. 2019;21(6):313. doi:10.1089/DIA.2018.0364.

14.

Schmelzeisen-Redeker

Schoemaker

Kirchsteiger

Freckmann

Heinemann

del Re

. Time delay of CGM sensors: relevance, causes, and countermeasures. J Diabetes Sci Technol. 2015;9(5):1006. doi:10.1177/1932296815590154.

15.

Georga

Protopappas

Ardigò

, et al. Multivariate prediction of subcutaneous glucose concentration in type 1 diabetes patients based on support vector regression. IEEE J Biomed Health Inform. 2013;17(1):71-81. doi:10.1109/TITB.2012.2219876.

16.

Oviedo

Vehí

Calm

Armengol

A review of personalized blood glucose prediction strategies for T1DM patients. Int J Numer Method Biomed Eng. 2017;33(6):1-21. doi:10.1002/cnm.2833.

17.

Steil

Hipszer

Reifman

Update on mathematical modeling research to support the development of automated insulin delivery systems. J Diabetes Sci Technol. 2010;4:759-769. doi:10.1177/193229681000400334.

18.

Zhao

Dassau

Jovanovič

Zisser

Doyle

Seborg

DE.

Predicting subcutaneous glucose concentration using a latent-variable-based statistical method for type 1 diabetes mellitus. J Diabetes Sci Technol. 2012;6(3):617-633. doi:10.1177/193229681200600317.

19.

Baghdadi

Nasrabadi

AM.

Controlling blood glucose levels in diabetics by neural network predictor. Annu Int Conf IEEE Eng Med Biol Soc. 2007;2007:3216-3219. doi:10.1109/IEMBS.2007.4353014.

20.

Zecchin

Facchinetti

Sparacino

de Nicolao

Cobelli

Neural network incorporating meal information improves accuracy of short-time prediction of glucose concentration. IEEE Trans Biomed Eng. 2012;59(6):1550-1560. doi:10.1109/TBME.2012.2188893.

21.

Liu

Zhu

Herrero

Georgiou

GluNet: a deep learning framework for accurate glucose forecasting. IEEE J Biomed Health Inform. 2020;24(2):414-423. doi:10.1109/JBHI.2019.2931842.

22.

Perez-Gandia

Facchinetti

Sparacino

, et al. Artificial neural network algorithm for online glucose prediction from continuous glucose monitoring. Diabetes Technol Ther. 2010;12(1):81-88.

23.

Daniels

Liu

Herrero

Georgiou

Convolutional recurrent neural networks for glucose prediction. IEEE J Biomed Health Inform. 2020;24(2):603-613. doi:10.1109/JBHI.2019.2908488.

24.

Tsalikian

Beck

Tamborlane

, et al. Accuracy of the GlucoWatch G2 Biographer and the continuous glucose monitoring system during hypoglycemia experience of the Diabetes Research in Children Network. Diabetes Care. 2004;27(3):722-726.

25.

Fox

Beck

Steffes

, et al. Relative accuracy of the BD Logic ® and FreeStyle ® blood glucose meters. Diabetes Technol Ther. 2007;9(2):165-168.

26.

Rebrin

Sheppard

Steil

GM.

Use of subcutaneous interstitial fluid glucose to estimate blood glucose: revisiting delay and sensor offset. J Diabetes Sci Technol. 2010;4:1087-1098. doi:10.1177/193229681000400507.

27.

Edelman

Argento

Pettus

Hirsch

IB.

Clinical implications of real-time and intermittently scanned continuous glucose monitoring. Diabetes Care. 2018;41(11):2265-2274. doi:10.2337/dc18-1150.

28.

Helton

Ratner

Wisniewski

NA.

Biomechanics of the sensor—tissue interface—effects of motion, pressure, and design on sensor performance and the foreign body response—part I: theoretical framework. J Diabetes Sci Technol. 2011;5:632-646. doi:10.1177/193229681100500317.

29.

Helton

Ratner

Wisniewski

NA.

Biomechanics of the sensor—tissue interface—effects of motion, pressure, and design on sensor performance and foreign body response—part II: examples and application. J Diabetes Sci Technol. 2011;5(3):647-656. doi:10.1177/193229681100500318.

30.

Haidar

Elleri

Kumareswaran

, et al. Pharmacokinetics of insulin aspart in pump-treated subjects with type 1 diabetes: reproducibility and effect of age, weight, and duration of diabetes. Diabetes Care. 2013;36(10):173-174. doi:10.2337/dc13-0485.

31.

Heinemann

Variability of insulin absorption and insulin action. Diabetes Technol Ther. 2002;4(5):673-682. doi:10.1089/152091502320798312.

32.

Castle

Ward

WK.

Amperometric glucose sensors: sources of error and potential benefit of redundancy. J Diabetes Sci Technol. 2010;4(1):221-225. doi:10.1177/193229681000400127.

33.

Facchinetti

Sparacino

Cobelli

Online denoising method to handle intraindividual variability of signal-to-noise ratio in continuous glucose monitoring. IEEE Trans Biomed Eng. 2011;58(9):2664-2671. doi:10.1109/TBME.2011.2161083.

34.

Facchinetti

Sparacino

Guerra

, et al. Real-time improvement of continuous glucose monitoring accuracy. Diabetes Care. 2013;36:793-800. doi:10.2337/dc12-0736.

35.

Facchinetti

Continuous glucose monitoring sensors: past, present and future algorithmic challenges. Sensors (Basel). 2016;16(12):2093. doi:10.3390/s16122093.

36.

Hovorka

Nodale

Haidar

Wilinska

ME.

Assessing performance of closed-loop insulin delivery systems by continuous glucose monitoring: drawbacks and way forward. Diabetes Technol Ther. 2013;15(1):4-12. doi:10.1089/dia.2012.0185.

37.

Liu

Shah

Eid

Shek

Differences in glucose levels between left and right arm. J Diabetes Sci Technol. 2019;13(4):794-795. doi:10.1177/1932296819851123.

38.

Tucker

Erdman

Schreiner

Chow

LS.

Examining sensor agreement in neural network blood glucose prediction. J Diabetes Sci Technol. 2021. doi:10.1177/19322968211018246.

39.

Chung

Gulcehre

Cho

Bengio

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. CoRR. 2014; abs/1412.3555. https://arxiv.org/abs/1412.3555v1. Accessed January 27, 2022.

40.

Cho

van Merrienboer

Bahdanau

Bengio

. On the properties of neural machine translation: encoder–decoder approaches. Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics, and Structure in Statistical Translation. Published online 2015:103-111. doi:10.3115/v1/w14-4012.

41.

Allam

Nossair

Gomma

Ibrahim

Abdelsalam

IFIP AICT 363—A Recurrent Neural Network Approach for Predicting Glucose Concentration in Type-1 Diabetic Patients. International Federation for Information Processing (IFIP); 2011.

42.

Bolinder

Antuna

Geelhoed-Duijvestijn

Kröger

Weitgasser

Novel glucose-sensing technology and hypoglycaemia in type 1 diabetes: a multicentre, non-masked, randomised controlled trial. Lancet. 2016;388:2254-2263. doi:10.1016/S0140-6736(16)31535-5.

43.

Haak

Hanaire

Ajjan

Hermanns

Riveline

Rayman

Flash glucose-sensing technology as a replacement for blood glucose monitoring for the management of insulin-treated type 2 diabetes: a multicenter, open-label randomized controlled trial. Diabetes Ther. 2017;8(1):55-73. doi:10.1007/s13300-016-0223-6.

44.

Mougiakakou

Prountzou

Iliopoulou

Nikita

Vazeou

Bartsocas

CS.

Neural network based glucose—insulin metabolism models for children with type 1 diabetes. Conf Proc IEEE Eng Med Biol Soc. 2006;2006:3545-3548. doi:10.1109/IEMBS.2006.260640.

45.

Robertson

Lehmann

Sandham

Hamilton

Blood glucose prediction using artificial neural networks trained with the AIDA diabetes simulator: a proof-of-concept pilot study. J Electr Comput Eng. 2011;2011:1-11 doi:10.1155/2011/681786.

46.

Pappada

Cameron

Rosman

, et al. Neural network-based real-time prediction of glucose in patients with insulin-dependent diabetes. Diabetes Technol Ther. 2011;13(2):135-141. doi:10.1089/dia.2010.0104.

47.

Alpaydin

In: Dietterich

Bishop

Heckerman

Jordan

Kearns

, eds. Introduction to Machine Learning. 3rd ed. London, England: The MIT Press; 2014.

48.

Cho

van Merriënboer

Gulcehre

, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference. Published online 2014:1724-1734. doi:10.3115/v1/d14-1179.

49.

Alpaydin

Combined 5 x 2 cv F test for comparing supervised classification learning algorithms. Neural Comput. 1999;11(8):1885-1892. doi:10.1162/089976699300016007.

50.

Dietterich

TG.

Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput. 1998;10:1895-1923. doi:10.1007/978-3-319-50926-6_6.

51.

Chow

Zmora

Seaquist

Schreiner

PJ.

Development of a model to predict 5-year risk of severe hypoglycemia in patients with type 2 diabetes. BMJ Open Diabetes Res Care. 2018;6(1):e000527. doi:10.1136/bmjdrc-2018-000527.

Neural Networks With Gated Recurrent Units Reduce Glucose Forecasting Error Due to Changes in Sensor Location
