
Abstract

Background:

Data-driven models of the glucose-insulin metabolism have recently emerged as an effective framework for realistic virtual patient modeling in diabetes. The growing demand for personalized therapies requires precise and individualized models that align naturally with machine learning models trained on patient-specific data. Using deep generative models such as generative adversarial networks opens new possibilities for incorporating previously unmodeled physiological phenomena into simulations.

Methods:

In this study, we developed a new extended version of our conditional Wasserstein generative adversarial network model by incorporating aerobic exercise intensity data from the T1DEXI dataset, along with insulin administration and carbohydrate consumption data. We use an aerobic physical activity model to describe the effects of immediate and prolonged exercise on glycemia from recorded discrete intensity levels. This enables the network to retain contextual information about recent aerobic physical activity. A total of 1479 days of data from 56 patients, including 308 exercise sessions, were used to train and validate our model.

Results:

We evaluated the model to ensure that it replicates real-world data from the T1DEXI study in terms of mean blood glucose, time below range, time in range, time above range, and time in tight range, both in aggregate and when separated by active and sedentary days. In addition, the model reproduces aerobic exercise-induced glucose drops.

Conclusions:

This new model provides a more reliable, extended framework for in silico trials that incorporate physical activity scenarios, which has the potential to be used in the design and validation of automated insulin delivery.

Keywords

aerobic exercise, generative deep learning, type 1 diabetes, virtual twins

Introduction

Physical activity (PA) is a pillar of a healthy lifestyle, particularly for individuals living with type 1 diabetes (T1D): it enhances insulin sensitivity, supports cardiovascular health, and reduces overall insulin requirements.¹ Insulin doses must be adjusted, as exercise alters glucose levels both during the activity and in the hours that follow, even when insulin delivery is automated.^2,3 Ignoring PA in therapeutic decision-making may result in suboptimal insulin management, unnecessary hypoglycemic or hyperglycemic episodes, and a less personalized treatment strategy. Explicitly including PA information in models of the glucose-insulin system can improve simulation tools and decision support systems, helping individuals safely integrate exercise into their diabetes management routines.⁴

Existing literature on modeling the effects of exercise on glucose dynamics for T1D has predominantly relied on traditional physiological modeling approaches. Such approaches include extensions of the Bergman Minimal Model,⁵ integration of PA into white-box physiological frameworks,⁶ and control-oriented methods employing first-order transfer functions with time delays to represent exercise-induced glucose disturbances.⁷ Physiological models remain a cornerstone in this and other related fields. They offer a transparent representation of physiological mechanisms,⁸ enabling clinicians to interpret and trust the model’s behavior. Despite this, they face substantial limitations. Primarily, these models often rely on simplified assumptions for a multitude of their components. Such assumptions include linear responses, constant metabolic rates, and uniform physiological parameters across individuals. Static parameters typically include fixed insulin sensitivity, glucose effectiveness or response rates that do not adapt to changing physiological conditions, such as varying exercise intensities or types. This rigidity complicates the personalization of models and shows the inherent conflict present between model complexity and data collection in the clinical environment.⁹ The UVA/Padova model is among the most widely adopted physiological models for simulation,^8,10 including nonlinear components for insulin-dependent glucose utilization and absorption, but still features fixed physiological parameters and linear components. The calibration of such models also demands the acquisition of invasive, multi-level data, which is often inaccessible, limiting their practical scalability.¹¹

Given the high variability of PA within and between individuals, only a handful of explicit PA simulators exist. Clinicians instead rely on empirical rules such as the consensus guidelines established by Riddell et al.¹² Most published PA modules are ad hoc solutions that are directly attached to glucose-insulin models that have already been developed. For instance, Dalla Man et al⁶ simulated both immediate and delayed exercise effects, yet they assumed unverified sensitivity dynamics and primarily tracked activity through heart rate (HR). Heart rate does correlate with exercise intensity but returns to its baseline minutes after exercise and is influenced by age, fitness, hydration, and venous return. Consequently, it provides minimal information regarding the prolonged glycemic impact.^13,14 These shortcomings leave a clear gap for more faithful, data-driven representations.

Data-driven approaches offer a more flexible alternative.¹⁵ In particular, generative models can capture the joint distribution of physiological and behavioral variables, allowing simulation of individual variability and lifestyle patterns. Generative adversarial networks (GANs) can approximate complex systems,¹⁶ and their variants can generate realistic data when conditioned to drivers such as insulin or meals.¹⁷ We have already shown that GANs reproduce glucose trajectories with high fidelity.^18,19

Inter-patient and intra-patient variability are inherent to the physiology of real T1D patients. To represent this variability, generative models have previously been used to capture the unpredictability of factors such as glucose levels or the rate of carbohydrate appearance.²⁰ Interest is growing in precision-medicine applications:²¹ time-series GANs for personalized insulin dosing^22,23 and data augmentation for glucose prediction.^24,25

Adding inputs naively can degrade data-driven models.²⁶ In earlier work, we developed two GAN-based virtual twins, in which events are transformed into physiologically meaningful vectors. The first model, a pixel-to-pixel GAN, translated one-dimensional plasma insulin (PI) values into grayscale images, then back into blood glucose (BG) images, generating BG curves conditioned solely on PI dynamics.¹⁸ The subsequent sequence-to-sequence GAN retained PI and replaced raw meal announcements with the carbohydrate rate of appearance (RA). Shifting inputs by 90 minutes captured physiological delays, while a Wasserstein loss and 1D convolutional layers learned temporal structure and produced realistic BG trajectories.¹⁹

The present study aims to extend this framework to aerobic PA. A bi-exponential filter converts discrete intensity readings into a continuous signal that represents both the immediate glucose drop during exercise and the prolonged post-exercise rise in insulin sensitivity.²⁷ This PA signal becomes a third conditioning input to the GAN, enabling simultaneous modeling of short- and long-term exercise effects.

To the best of our knowledge, this is the first time that a data-driven model of aerobic PA for glycemic simulation has been proposed. Furthermore, we extend the GAN-based methodology by incorporating a third input to condition the generation of glucose values while preserving realistic behavior. Finally, we propose a method for intensity-based translation of PA sessions to represent both long-term and short-term aerobic exercise–induced glucose dynamics, incorporating it as an input to the GAN.

Methods

Dataset

The T1DEXI study²⁷ monitored adults with T1D for a period of four weeks, during which they engaged in six video-guided aerobic, interval, or resistance workouts. Participants recorded several data streams, including continuous glucose monitoring (CGM), insulin delivery, carbohydrate intake, and exercise intensity. In the present study, we analyzed 56 individuals who met the following criteria: (1) completed ≥ 1 prescribed aerobic workout and (2) used a standard open-loop insulin pump. Prescribed aerobic exercise in T1DEXI was defined a priori by 30-minute study videos designed to keep HR at 70% to 80% of age-predicted HRmax. Participants simply performed these prelabeled sessions.

The intensity of exercise was extracted directly from the participant-reported activity intensity levels recorded in the T1DEXI dataset. Exercise logs classified each session as “light” (0), “moderate” (1), or “vigorous” (2). For modeling we shift labels to 1-3 (Supplemental Table S1). Baseline characteristics appear in Table 1.

Table 1.

Participant Demographics and Characteristics.

Characteristic	Value
Total Participants	56
Recorded Days (μ ± σ)	26.4 ± 3.4 days
Sex	Female: 41 (73.2%)
	Male: 15 (26.8%)
Race	White: 50 (89.3%)
	Asian: 2 (3.6%)
	Black: 2 (3.6%)
	Others: 2 (3.6%)
Age (μ ± σ)	34.8 ± 12.3 years
HbA1c (μ ± σ)	6.6 ± 0.8%
Diabetes Onset (μ ± σ)	18.7 ± 9.3 years
Total Daily Insulin (μ ± σ)	Basal: 17.1 ± 11.5 U/day
	Bolus: 18.9 ± 11.0 U/day
All Daily Vigorous Physical Activities (μ ± σ)	3.6 ± 1.7 days/week
	44.4 ± 34.2 minutes/day
Body Measurements (μ ± σ)	Height: 66.4 ± 3.5 in
	Weight: 161.9 ± 30.7 lbs
	BMI: 25.7 ± 4.1 kg/m²

Data Processing and Integration

Raw XPT files (CGM, insulin, carbs, PA intensities, HR, etc) were merged, and CGM values and hourly pump basal rates were resampled to 5 minutes (see pipeline in Figure 1). Duplicates within 5 minutes were dropped, gaps shorter than 30 minutes were spline-interpolated, and CGM readings below 40 mg/dL were removed. Series with more than 100 consecutive missing samples were split into distinct patient IDs.

Figure 1.

Schematic representation of the data extraction and preprocessing.

The cleaned table was saved as a CSV file. Then, discrete inputs were converted to continuous curves (see the Physiological modeling and PA modeling subsections). Blood glucose data were log-transformed to balance hypoglycemic and hyperglycemic events.²⁸ This log-transformation aimed to improve the representation of hypoglycemic events, which were underrepresented compared with hyperglycemic events.

$$\text{Transformed BG} = 1.794 \times \left( (\ln (bg))^{1.026} - 1.861 \right) \tag{1}$$
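As a rough illustration, equation (1) can be written as a small function. The constants coincide with the mmol/L form of the commonly used glucose symmetrization transform, so this sketch assumes that `bg` is expressed in mmol/L and converts CGM readings from mg/dL with a factor of 18.016. The unit handling and the function names are assumptions made for this example, not details stated in the text.

```python
import math

MGDL_PER_MMOLL = 18.016  # assumed glucose conversion factor (mg/dL per mmol/L)


def transform_bg_mmol(bg_mmol):
    """Equation (1) applied to a glucose value in mmol/L (assumed unit).

    Only defined for bg_mmol > 1, where ln(bg) is positive.
    """
    return 1.794 * (math.log(bg_mmol) ** 1.026 - 1.861)


def transform_bg_mgdl(bg_mgdl):
    """Convenience wrapper for CGM readings reported in mg/dL."""
    return transform_bg_mmol(bg_mgdl / MGDL_PER_MMOLL)
```

Under these assumptions the transformed scale is roughly symmetric around a glucose value near 6.25 mmol/L, so low and high readings occupy comparable ranges.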

Auto-normalization was applied to BG, PI, and RA. These were min-max scaled for each patient to the range [−1, 1], guaranteeing a full span per individual. Meanwhile, PA was scaled once for the entire dataset. This ensured that only level 3 sessions reached ±1. After scaling, BG values were grouped into sliding-window vectors of length 25 to match the model’s prediction horizon shape.
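The scaling and windowing steps can be sketched as follows. This is a minimal pure-Python illustration: the helper names `minmax_scale` and `sliding_windows` are hypothetical, and the stride of one sample between windows is an assumption.

```python
def minmax_scale(values, lo=None, hi=None):
    """Scale values linearly to [-1, 1].

    With lo/hi omitted, the series' own range is used (per-patient scaling).
    Passing dataset-wide lo/hi gives the global scaling described for PA.
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        return [0.0 for _ in values]  # constant series: map to mid-range
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def sliding_windows(series, length=25):
    """Return every contiguous window of `length` samples (stride 1, assumed)."""
    return [series[i:i + length] for i in range(len(series) - length + 1)]


# Per-patient scaling spans the full range for each individual...
bg_scaled = minmax_scale([40.0, 220.0, 400.0])

# ...whereas global scaling keeps differences between subjects visible.
pa_patient_a = [0.0, 0.5, 1.0]   # hypothetical PA signals
pa_patient_b = [0.0, 0.2, 0.4]
global_hi = max(pa_patient_a + pa_patient_b)
pa_b_scaled = minmax_scale(pa_patient_b, lo=0.0, hi=global_hi)
```

With global bounds, the second hypothetical patient never reaches +1, which is the property the text relies on to preserve inter-subject intensity differences.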

Physiological modeling

Discrete insulin and carbohydrate inputs were converted into continuous absorption curves using the short-acting insulin and meal absorption models described by Hovorka.²⁹ Specifically, the model defined by equation (2) gives us the insulin concentration in plasma or PI. In our notation, the original symbol for insulin state $I (t)$ from Hovorka is replaced with $PI (t)$ for clarity.

$$\begin{aligned} \frac{d S_{1} (t)}{d t} &= u_{i} (t) - \frac{1}{τ_{S}} S_{1} (t) \\ \frac{d S_{2} (t)}{d t} &= \frac{1}{τ_{S}} S_{1} (t) - \frac{1}{τ_{S}} S_{2} (t) \\ \frac{d PI (t)}{d t} &= \frac{S_{2} (t)}{τ_{S} \cdot V_{I}} - k_{e} \cdot PI (t) \end{aligned} \tag{2}$$

where $u_{i} (t)$ represents insulin administration, $S_{1} (t)$ and $S_{2} (t)$ are insulin-subcutaneous compartments, $PI (t)$ represents the plasma insulin concentration, $V_{I}$ is the insulin distribution volume, $k_{e}$ (min^-1) is the elimination rate, and $τ_{S}$ (min) is a time constant that defines the duration of insulin action.

Carbohydrate absorption is modeled using equation (3) to produce a continuous rate of appearance $RA (t)$ .

$$\begin{aligned} \frac{d D_{1} (t)}{d t} &= A_{G} D (t) - \frac{1}{τ_{D}} D_{1} (t) \\ \frac{d D_{2} (t)}{d t} &= \frac{1}{τ_{D}} D_{1} (t) - \frac{1}{τ_{D}} D_{2} (t) \\ RA (t) &= \frac{D_{2} (t)}{τ_{D}} \end{aligned} \tag{3}$$

where $D (t)$ is the amount of carbohydrates ingested (grams of carbohydrate), $D_{1} (t)$ and $D_{2} (t)$ are compartment states, $A_{G}$ is a meal absorption parameter, and $τ_{D}$ is the carbohydrate absorption time constant.
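Equations (2) and (3) can be integrated numerically to turn discrete boluses and meals into continuous PI and RA curves. The sketch below uses a forward-Euler scheme with a 1-minute step. The function name, the integration scheme, and all parameter values are illustrative assumptions, since the values used in the study are not given here.

```python
def simulate_pi_ra(insulin, carbs, dt=1.0,
                   tau_s=55.0, v_i=8.4, k_e=0.138,   # assumed insulin parameters
                   a_g=0.8, tau_d=40.0):             # assumed meal parameters
    """Forward-Euler integration of equations (2) and (3).

    insulin: insulin input u_i(t) per time step (units per minute)
    carbs:   carbohydrate input D(t) per time step (grams per minute)
    Returns two lists: plasma insulin PI(t) and rate of appearance RA(t).
    """
    s1 = s2 = pi = d1 = d2 = 0.0
    pi_trace, ra_trace = [], []
    for u_i, d in zip(insulin, carbs):
        # Right-hand sides evaluated at the current state
        ds1 = u_i - s1 / tau_s
        ds2 = s1 / tau_s - s2 / tau_s
        dpi = s2 / (tau_s * v_i) - k_e * pi
        dd1 = a_g * d - d1 / tau_d
        dd2 = d1 / tau_d - d2 / tau_d
        # Euler update of all compartments
        s1 += dt * ds1
        s2 += dt * ds2
        pi += dt * dpi
        d1 += dt * dd1
        d2 += dt * dd2
        pi_trace.append(pi)
        ra_trace.append(d2 / tau_d)
    return pi_trace, ra_trace
```

For example, passing a single nonzero entry at the start of `insulin` and `carbs` yields smooth curves that rise, peak, and decay, which is the kind of continuous input the generator is conditioned on.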

PA modeling

The aerobic exercise module is a preprocessing step that converts the discrete intensity log into a single, dimensionless PA signal conveying both the acute and long-term effects of a workout. The model’s objective is not to mechanistically describe physiological processes in detail, but rather to provide the GAN with a single, compact variable that captures “recent exercise history.”

$$\begin{aligned} \frac{d T_{1} (t)}{d t} &= - \frac{T_{1} (t)}{τ_{1} (u_{ei} (t))} + A (t) \\ \frac{d T_{2} (t)}{d t} &= - \frac{T_{2} (t)}{τ_{2} (u_{ei} (t))} + \frac{T_{1} (t)}{τ_{1} (u_{ei} (t))} \end{aligned} \tag{4}$$

where $u_{ei} (t) \in \{0, 1, 2, 3\}$ represents the discrete exercise intensity of the prescribed activities, extracted from the dataset exercise session log files, $T_{1} (t)$ is a unitless compartment representing immediate fatigue/tiredness, $T_{2} (t)$ is a unitless compartment representing residual fatigue/tiredness, $τ_{1}$ and $τ_{2}$ are intensity-dependent time constants (min) that control how quickly $T_{1}$ and $T_{2}$ decay toward zero after the PA ends, and $A (t)$ is defined as follows:

$$A (t) = \begin{cases} 0, & \text{if intensity } u_{ei} (t) = 0, \\ 1, & \text{if intensity } u_{ei} (t) \in \{1, 2, 3\}. \end{cases} \tag{5}$$

The first compartment $T_{1}$ captures the rapid onset and immediate effects associated with aerobic exercise. These effects reflect mechanisms such as acute glucose uptake by muscle cells. In contrast, $T_{2}$ represents the more gradual and longer-lasting post-exercise effects, such as increased insulin sensitivity and sustained muscle glucose uptake. The PA vector values are derived directly from $T_{2}$. The time constants $τ_{1}$ and $τ_{2}$ were selected heuristically to model faster decay following low-intensity PA (approximately 2 hours) and a slower decay following high-intensity PA (around 3 hours). Supplemental Figures S1-S3 illustrate the temporal dynamics of the $T_{1}$ and $T_{2}$ compartments for each of the exercise intensities, showing the different decay profiles. It is important to note that in this modeling approach higher-intensity exercise does not increase $A (t)$. Instead, differences in exercise intensity are modeled by defining distinct values for the time constants $τ_{1}$ and $τ_{2}$ corresponding to each intensity level.

$$(τ_{1}, τ_{2}) = \begin{cases} (1, 1) & \text{if } u_{ei} (t) = 0 \\ (20, 15) & \text{if } u_{ei} (t) = 1 \\ (15, 35) & \text{if } u_{ei} (t) = 2 \\ (5, 45) & \text{if } u_{ei} (t) = 3 \end{cases} \tag{6}$$
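A literal reading of equations (4) to (6) can be sketched with a forward-Euler integrator. In this sketch the time constants are looked up from the current intensity at every step, so after a session ends the row for intensity 0 applies. Whether the original implementation instead keeps the session's time constants during the decay phase is not stated here, so this behavior, the function name `pa_signal`, and the 0.5-minute step are assumptions.

```python
# Time constants (tau_1, tau_2) in minutes for each intensity, from equation (6)
TAU = {0: (1.0, 1.0), 1: (20.0, 15.0), 2: (15.0, 35.0), 3: (5.0, 45.0)}


def pa_signal(intensity_per_min, dt=0.5):
    """Integrate equations (4)-(5) and return T2 sampled once per minute.

    intensity_per_min: sequence of intensity codes (0-3), one per minute.
    dt: Euler step in minutes (assumed; must be smaller than the smallest tau).
    """
    t1 = t2 = 0.0
    substeps = int(round(1.0 / dt))
    trace = []
    for u in intensity_per_min:
        tau1, tau2 = TAU[u]
        a = 1.0 if u > 0 else 0.0          # equation (5)
        for _ in range(substeps):
            dt1 = -t1 / tau1 + a           # first line of equation (4)
            dt2 = -t2 / tau2 + t1 / tau1   # second line of equation (4)
            t1 += dt * dt1
            t2 += dt * dt2
        trace.append(t2)
    return trace


# Hypothetical log: 10 min rest, 30 min vigorous exercise, 60 min rest
example = pa_signal([0] * 10 + [3] * 30 + [0] * 60)
```

Because $A(t)$ is the same for every nonzero intensity, the difference between sessions comes only from the time constants: in this sketch a 30-minute level 3 session accumulates a larger $T_{2}$ than a level 1 session of the same length.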

System Architecture

Building on the previously validated conditional Wasserstein GAN (cWGAN) framework,³⁰ we adopt a sequence-to-sequence design (Figure 2). Both the generator and the critic are conditioned on three physiological streams: RA, PI, and PA.

Figure 2.

Diagram of the GAN model.

The generator (Supplemental Figure S4) concatenates a five-dimensional Gaussian latent vector with the single-time-point inputs $(RA_{t}, PI_{t}, PA_{t})$ through an initial dense layer. Then, it upsamples with three Conv1D-Transpose blocks (filters $512 \to 256 \to 128$, kernel = 1, LeakyReLU $α = 0.2$). A final tanh layer outputs a 25-point BG trajectory scaled to $[-1, 1]$.

The critic (Supplemental Figure S5) receives the same conditional triplet plus a 25-point BG vector that is alternately real or generated. Five Conv1D layers (filters $64 \to 128 \to 256 \to 512 \to 1024$; kernels 9, 5, 3; LeakyReLU $α = 0.2$) progressively down-sample the sequence, and a linear head returns the Wasserstein score. This symmetric conditioning ensures that both networks learn to link RA, PI, and PA with physiologically plausible glucose dynamics.

Loss functions

Training the cWGAN relies on three complementary objectives.

Critic Loss ( $ℒ_{C}$ ): The critic obeys the Wasserstein formulation, assigning higher scores to real than to synthetic sequences by minimizing:

$$ℒ_{C} = E_{(x, y) \sim ℙ_{data}} [- C (x, y)] + E_{z \sim p_{z}, y \sim ℙ_{data}} [C (G (z, y), y)] \tag{7}$$

where $x$ is the real blood glucose profile, $y$ represents the conditional inputs (PI, RA, PA), $z \sim p_{z}$ is a random noise vector sampled from a standard normal distribution, $G (z, y)$ is the generator output conditioned on $y$, and $C (\cdot)$ is the critic’s output score. The symbol $E$ denotes the expected value, which in this context corresponds to averaging over samples from the specified distributions.

Generator Adversarial Loss ( $ℒ_{G}$ ): generator adversarial loss rewards sequences that the critic judges real:

$$ℒ_{G} = - E_{z \sim p_{z}, y \sim ℙ_{data}} [C (G (z, y), y)] \tag{8}$$

Reconstruction Loss ( $ℒ_{2}$ ): to encourage fidelity of individual trajectories we add an L2 norm regularization term:

$$ℒ_{2} = E_{(x, y) \sim ℙ_{data}, z \sim p_{z}} \left[ \| G (z, y) - x \|_{2}^{2} \right] \tag{9}$$

where $\| \cdot \|_{2}^{2}$ represents the squared L2 norm or mean squared error.

Final Generator Loss: the generator is optimized using a weighted combination of adversarial and reconstruction losses. In our implementation, the reconstruction term is emphasized to improve sample fidelity:

$$ℒ_{total} = ℒ_{G} + 15 \cdot ℒ_{2} \tag{10}$$
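To make the relationship between equations (7) to (10) concrete, the batch estimates of each loss can be written in a few lines. This is a framework-free sketch that operates on plain lists of critic scores and trajectories. The function names are hypothetical, and the reconstruction term is implemented as the squared L2 norm per trajectory averaged over the batch, which is one reading of equation (9).

```python
def mean(values):
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """Equation (7): batch average of -C(x, y) plus batch average of C(G(z, y), y)."""
    return mean([-s for s in real_scores]) + mean(fake_scores)


def generator_adversarial_loss(fake_scores):
    """Equation (8): negative mean critic score on generated samples."""
    return -mean(fake_scores)


def reconstruction_loss(real_batch, fake_batch):
    """Equation (9): squared L2 distance per trajectory, averaged over the batch."""
    return mean([
        sum((f - r) ** 2 for f, r in zip(fake, real))
        for real, fake in zip(real_batch, fake_batch)
    ])


def total_generator_loss(fake_scores, real_batch, fake_batch, weight=15.0):
    """Equation (10): adversarial term plus weighted reconstruction term."""
    return (generator_adversarial_loss(fake_scores)
            + weight * reconstruction_loss(real_batch, fake_batch))
```

The critic loss decreases when real sequences score higher than generated ones, while the generator loss decreases when generated sequences score higher and lie closer to the paired real trajectory.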

Training procedure

We used Python 3.9.19 with TensorFlow version 2.10 running on an RTX 4070 Ti. Specific training parameters included a batch size of 64, learning rates of $8 \times 10^{- 6}$ for both networks, weight clipping threshold of 0.01, and a critic-to-generator update ratio of 5:1.³⁰ Real samples were drawn randomly from scaled patient data. Fake samples were generated by passing random latent vectors, along with conditional inputs (PI, RA, PA) through the generator. Both real and fake samples were then assessed by the critic. Training ended after 24 hours 38 minutes due to early stopping after 22 epochs (116 880 steps) based on distributional distance. We used the entire dataset for training (Algorithm 1) and simulation (Algorithm 2).

Algorithm 1.

Training of the cWGAN.

Require: Preprocessed dataset with inputs (BG, PI, RA, PA)
1. Initialize generator $G$ and critic $C$ weights with RandomNormal initializer ($σ = 0.02$)
2. Set learning rate $α = 8 \times 10^{-6}$, batch size = 64, $c = 0.01$, $n_{critic} = 5$, $λ = 15$
3. WHILE JSD has not converged DO:
4. # Update Critic
5. FOR $step = 1, \dots, n_{critic}$ DO:
6. Sample a batch of real data: $(BG_{real}, PI, RA, PA)$
7. Sample latent vectors $z \sim N (0, I)$
8. Generate fake samples: $BG_{fake} = G (z, PI, RA, PA)$
9. $ℒ_{C} \leftarrow (+1) \cdot C (BG_{fake}, PI, RA, PA) + (-1) \cdot C (BG_{real}, PI, RA, PA)$
10. $g_{w_{C}} \leftarrow \nabla_{w_{C}} [ℒ_{C}]$
11. $w_{C} \leftarrow w_{C} + α \cdot RMSProp (w_{C}, g_{w_{C}})$
12. $w_{C} \leftarrow clip (w_{C}, -c, c)$
13. END FOR
14. # Update Generator
15. Sample a new batch of real data: $(BG_{real}, PI, RA, PA)$
16. Sample latent vectors $z \sim N (0, I)$
17. Generate fake samples: $BG_{fake} = G (z, PI, RA, PA)$
18. $ℒ_{2} \leftarrow \| BG_{real} - BG_{fake} \|_{2}^{2}$
19. $ℒ_{G} \leftarrow (-1) \cdot C (BG_{fake}, PI, RA, PA)$
20. $ℒ_{total} \leftarrow ℒ_{G} + λ \cdot ℒ_{2}$
21. $g_{w_{G}} \leftarrow \nabla_{w_{G}} [ℒ_{total}]$
22. $w_{G} \leftarrow w_{G} + α \cdot RMSProp (w_{G}, g_{w_{G}})$
23. END WHILE

$G$ = Generator, $C$ = Critic, $z$ = Latent vector, $α$ = Learning rate, $c$ = Weight clip value, $n_{critic}$ = Critic updates per generator update, $λ$ = L2 loss weight, $ℒ_{C}$ = Critic’s loss, $ℒ_{G}$ = Generator’s loss, $w_{C}$ = Weights of the critic, $w_{G}$ = Weights of the generator, $g_{w_{C}}$ = Gradient of critic’s weights, $g_{w_{G}}$ = Gradient of generator’s weights, JSD = Jensen-Shannon Divergence.

Algorithm 2.

Simulation of Blood Glucose Curve Using Trained Generator.

Require: Trained Generator $G$, preprocessed dataset with inputs (PI, RA, PA), the dataset length $L$ and prediction horizon PH = 25 samples
1. Initialize $BG_{gen} \leftarrow 0$ tensor of shape $[L, L]$
2. FOR $i = 0, \dots, L$ DO:
3. Extract slice from the inputs: $(z, PI_{i}, RA_{i}, PA_{i})$
4. Inference $BG_{gen_{i}} \leftarrow G (z, PI_{i}, RA_{i}, PA_{i})$
5. Insert $BG_{gen_{i}}$ with size $(1, PH)$ into $BG_{gen}$ at shifted position $i$
6. END FOR
7. Replace all empty positions in $BG_{gen}$ with NaN
8. $BG_{scaled} \leftarrow$ Compute vertical mean of $BG_{gen}$ ignoring NaNs
9. $BG_{unscaled} \leftarrow$ Reverse min-max scaling from $BG_{scaled}$

$G$ = Generator, $C$ = Critic, $z$ = Latent vector, PH=Prediction Horizon, $x_{\max}$ = saved scaling parameter with the maximum value of the data, $x_{\min}$ = saved scaling parameter with the minimum value of the data.
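The stitching step of Algorithm 2, in which overlapping 25-point predictions are averaged column-wise while ignoring empty cells, can be sketched without building the full $[L, L]$ tensor. The version below keeps running sums and counts per time index, which gives the same NaN-ignoring mean. The function names and the inverse scaling formula (the reverse of a min-max mapping to [−1, 1]) are assumptions for illustration.

```python
def stitch_windows(windows, total_length):
    """Average overlapping windows, where window i starts at time index i.

    Equivalent to placing each window on row i of a matrix at column offset i
    and taking the column-wise mean over the filled cells only.
    """
    sums = [0.0] * total_length
    counts = [0] * total_length
    for i, window in enumerate(windows):
        for j, value in enumerate(window):
            k = i + j
            if k < total_length:        # drop predictions past the series end
                sums[k] += value
                counts[k] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


def unscale(values, x_min, x_max):
    """Invert a min-max scaling to [-1, 1] using the saved x_min and x_max."""
    return [(v + 1.0) / 2.0 * (x_max - x_min) + x_min for v in values]
```

In practice `windows` would hold one generator output per time step, each of length PH. The unscaled values remain in the transformed glucose domain, so the inverse of the log transform would still be needed to recover mg/dL.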

Validation

Model plausibility is assessed in three stages. First, statistical equivalence is tested by comparing mean BG, coefficient of variation, time below range (TBR), time in range (TIR), time above range (TAR), and time in tight range (TITR) between real and generated days. The Shapiro-Wilk test is used to check normality, and two-sided Student’s t-tests are used to check for group differences. Structural similarity is visualized using t-distributed stochastic neighbor embedding (t-SNE) on real and synthetic windows. The physiological response to exercise is examined by calculating the glucose drop for each workout, a validation strategy also used by Fushimi et al.³¹ This is defined as the difference between the 10-minute pre-exercise baseline and the nadir during activity. These drops and the previously mentioned glycemic metrics are then contrasted across active versus sedentary days, and real versus generated patients. An active day is defined as the 24-hour period following the end of an exercise activity, while sedentary periods are any days that fall outside this definition.^27,32 Finally, we probe causal consistency through convergent cross-mapping (CCM), to confirm that PI, RA, and PA drive the generated BG dynamics.
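The glycemic metrics and the per-workout glucose drop can be computed along the following lines. The glucose thresholds are the commonly used consensus ranges (below 70 mg/dL for TBR, 70 to 180 mg/dL for TIR, above 180 mg/dL for TAR, 70 to 140 mg/dL for TITR) and are assumed here, since the text does not list them. The baseline is taken as the mean of the two 5-minute samples before exercise, which is one interpretation of the 10-minute baseline, and the drop is reported as nadir minus baseline so that decreases are negative.

```python
import statistics


def glycemic_metrics(bg_mgdl):
    """Summary metrics for a glucose trace in mg/dL (thresholds are assumed)."""
    n = len(bg_mgdl)
    mean_bg = statistics.fmean(bg_mgdl)

    def pct(predicate):
        return 100.0 * sum(1 for v in bg_mgdl if predicate(v)) / n

    return {
        "mean_bg": mean_bg,
        "cv": 100.0 * statistics.pstdev(bg_mgdl) / mean_bg,
        "tbr": pct(lambda v: v < 70),
        "tir": pct(lambda v: 70 <= v <= 180),
        "tar": pct(lambda v: v > 180),
        "titr": pct(lambda v: 70 <= v <= 140),
    }


def glucose_drop(bg_mgdl, start, end, baseline_samples=2):
    """Nadir during exercise minus the pre-exercise baseline.

    start/end: sample indices of the workout (end exclusive); start must be
    at least baseline_samples. With 5-minute data, two samples cover 10 minutes.
    """
    baseline = statistics.fmean(bg_mgdl[start - baseline_samples:start])
    nadir = min(bg_mgdl[start:end])
    return nadir - baseline
```

Applying both functions to real and generated traces for the same patients yields the paired values that the statistical tests compare.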

Results

After training, the generator alone is used to produce glucose traces based on real conditional inputs. Statistical validation began with Shapiro-Wilk tests, which confirmed normality for all metrics except TBR (see Supplemental Table S2). Two-tailed Student’s t-tests on these metrics (Table 2) revealed that there are no statistically significant differences between synthetic and real glycemic metrics $(p > 0.05)$ , except for CV, where synthetic data exhibited a higher median and upper quartile, indicating minor discrepancies at the population level.

Table 2.

Comparison of Median (Q1-Q3) Glycemic Outcomes for Generated Versus Real Data Grouped by Patients (56 patients).

Type	Metric	Real	Generated
Overall	Mean BG (mg/dL)	146.7(129.4−156.1)	152.4(137.2−158.3)
	CV (%)	34.21(31.51−37.32)	39.38*(32.33−46.97)
	TBR (%)	3.36(1.54−5.99)	2.41(1.16−3.98)
	TIR (%)	74.97(64.30−80.83)	74.55(69.56−80.78)
	TAR (%)	21.37(13.26−32.25)	22.48(15.81−27.54)
	TITR (%)	47.39(40.60−60.63)	51.56(40.09−60.35)

Abbreviations: Mean BG, average blood glucose; CV, coefficient of variation; TBR, time below range; TIR, time in range; TAR, time above range; TITR, time in tight range.

Asterisk (*) indicates a statistically significant difference $(P < 0.05)$ .

Exercise validation first contrasted glucose drops during cardiovascular activity (Table 3). Since the Shapiro-Wilk test rejected the normality assumption for both real and synthetic glucose, we applied an independent two-sample Kolmogorov-Smirnov test between the two distributions of glucose drops, which found no significant difference between real and generated glucose responses to exercise ( $P = 0.535$ ).

Table 3.

Glucose Variations During Aerobic Exercise Events (N = 308) With Kolmogorov-Smirnov Test Results.

Metric	Real	Generated
Median Glucose Drop (mg/dL)	−23.8	−20.1
Mean Glucose Drop (mg/dL)	−31.2	−34.4
Std Glucose Drop (mg/dL)	36.0	43.5
IQR (Q1, Q3) (mg/dL)	(−48.5, −3.5)	(−49.5, −2.8)
KS test p-value: 0.535
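For readers unfamiliar with the test, the two-sample Kolmogorov-Smirnov statistic is the largest vertical distance between the empirical cumulative distribution functions of the two samples. The sketch below computes only that statistic in pure Python. It does not reproduce the p-value calculation, and the function name is hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every observation <= x
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d
```

A statistic near 0 means the two distributions of glucose drops are close everywhere, while a value of 1 means they do not overlap at all.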

We then compared glycemic metrics between real and generated data, splitting active and sedentary days (Table 4). When comparing active against sedentary days, we observed that mean BG, CV, and TAR were decreased, whereas TBR, TIR, and TITR increased, for both real and generated cohorts. Again, CV was the only metric to show statistically significant difference, being overestimated in generated patients.

Table 4.

Comparative Glycemic Outcomes for Generated Versus Real Data Grouped by Patients (56 patients) in Active and Sedentary States.

Type	Metric	Real	Generated
Sedentary	Mean BG (mg/dL)	144.7(129.7−156.9)	153.1(137.7−160.7)
	CV (%)	34.14(30.89−37.31)	39.49*(33.44−47.66)
	TBR (%)	3.55(1.30−6.24)	2.50(0.69−4.02)
	TIR (%)	73.52(64.71−81.07)	75.21(69.36−81.88)
	TAR (%)	22.31(12.44−32.08)	22.31(16.04−26.26)
	TITR (%)	48.05(41.24−60.58)	50.04(39.54−60.35)
Active	Mean BG (mg/dL)	139.1(121.8−151.6)	144.1(133.7−159.4)
	CV (%)	32.67(27.97−35.96)	37.60*(29.34−45.40)
	TBR (%)	3.92(1.62−6.89)	3.04(1.59−7.11)
	TIR (%)	77.23(68.26−87.41)	77.08(69.07−83.67)
	TAR (%)	18.80(7.81−28.47)	18.63(11.55−26.59)
	TITR (%)	53.13(41.97−69.31)	52.86(40.19−59.64)

Abbreviations: mean BG, average blood glucose; CV, coefficient of variation; TBR, time below range; TIR, time in range; TAR, time above range; TITR, time in tight range.

Asterisk (*) indicates statistically significant differences ( $P < 0.05$ ).

Figure 3 shows real and generated blood glucose over three days for a sample patient, including input data. The range includes two days with exercise and one without.

Figure 3.

Comparison of real (blue) and generated (red) blood glucose values for Patient 187 over a three-day period. Includes corresponding inputs: PI (green), RA (yellow), and PA (purple). Explore the interactive version at https://timeseries-bg-viz.vercel.app/.

A three-dimensional (3D) t-SNE embedding was used for qualitative distribution analysis (Figure 4). The overlap of real and synthetic point clouds indicates the cWGAN’s ability to replicate the underlying data structure.

Figure 4.

A 3D t-SNE visualization. The axes represent the components learned by the t-SNE dimensionality reduction technique. These axes do not correspond to physical variables but are inferred representations of the data’s intrinsic patterns. Explore the interactive version at: https://tsne-bg-viz.vercel.app/.

Causality was assessed via CCM. The causality coefficients for each input (PI, RA, and PA) are shown in Figure 5. All three inputs exhibit a statistically significant causal effect on glucose, while also demonstrating comparable causal influence on both real and generated glucose dynamics.

Figure 5.

Diagram of causality in the model computed using Convergent Cross Mapping between inputs and outputs.

Discussion

In this study we expanded our data-based virtual twins with PA information. This resulted in a more generalizable tool than previous black-box solutions. Despite an overestimated CV and a slight increase in mean BG, the model reproduces exercise-induced glucose drops. These discrepancies may partly stem from the population-level nature of the model, which assumes a uniform response from a heterogeneous cohort. The model captures the long-term effects during active days, mimicking the expected decrease in mean BG and TAR, with the corresponding time distributed into higher TBR, TIR, and TITR.

Blood glucose, PI, and RA were auto-scaled to $[-1, 1]$ to match the generator’s tanh output and mitigate mode collapse.^30,33 Physical activity was globally (not individually) scaled so inter-subject intensity differences were preserved. Per-subject normalization would have hidden whether a participant ever engaged in truly vigorous exercise.

Causality tests using CCM overcome Granger’s limits for weakly coupled variables. Convergent cross-mapping confirmed measurable causal links from all three inputs to BG ($PI > RA > PA$), reproducing the ranking seen in the real dataset.³⁴ The model reproduces both acute glucose drops during the prescribed aerobic sessions and delayed changes from altered insulin sensitivity, treating PA as a quantified driver rather than a random disturbance.

Nonetheless, the model is valid under the conditions for which it was developed: 30-minute guided aerobic exercise sessions carried out by the study cohort of T1D patients with open-loop insulin pumps. Extrapolation beyond these bounds may yield inaccurate predictions. Interval and resistance exercises, multiple daily injections, and hybrid closed-loop therapies have yet to be modeled. Adding nonlinear exercise filters and individual metabolic parameters could sharpen realism, and an explicit two-way interaction between insulin and PA should replace the current fixed-decay model. Tailoring the model to each patient could also improve its performance.

One notable feature of our approach is that we use the entire dataset to train our generative model. Unlike deterministic predictors, which map inputs to a single target and are judged by test-set accuracy, GANs learn the entire joint distribution and create synthetic samples drawn from that distribution. Thus, classic train/validation/test splits are not well suited. To faithfully capture all the variability and rare events in the data, the generative model benefits from all available examples. Several works in related fields using deep neural networks³⁵ and GANs,^36,37 as well as in diabetes using probabilistic modeling for eating patterns,³⁸ demonstrate the use of the entire dataset.

This generative approach aligns with the observed intra-patient variability in T1D, where glucose responses vary even with the same inputs. Therefore, our GAN appends a Gaussian noise vector to each insulin-carbohydrate-exercise tuple. This latent encoding captures plausible physiological states, enabling the network to learn the full conditional distribution $P (glucose | insulin, carbs, exercise)$ and output a unique, realistic trajectory with each iteration. Since the model is stochastic, point-error metrics like RMSE are not informative. Instead, we check how well synthetic and real distributions and key glycemic metrics align to ensure that each virtual twin faithfully reproduces day-to-day variability.

We projected glucose profiles onto a 3D t-SNE map, where nearby points exhibit similar short-term evolution. While the absolute geometry of the clouds (loops and arcs) primarily reflects the natural diurnal rhythm, what matters is how the two point clouds co-occupy regions of that shape. The two point clouds overlap with comparable density across the manifold, confirming that the GAN preserves the underlying temporal structure.

Future work will train equivalent modules for the interval and resistance-training records of the T1DEXI dataset, enabling comprehensive simulation across multiple exercise modalities.

Conclusions

We present the first data-driven virtual twin that incorporates aerobic exercise, insulin, and carbohydrate intake for individuals with T1D. Trained on 1479 real-world days from the T1DEXI study, our cWGAN learns $P (glucose | insulin, carbs, exercise)$ directly from data, without relying on predefined mechanistic equations. The model reproduces clinical metrics (mean glucose, TIR/TAR/TBR) and captures both the rapid drop in glucose during workouts and the hours-long rise in insulin sensitivity that follows.

Our results confirm that GAN-based virtual twins can represent the intertwined dynamics of insulin, meals, and PA. This provides a scalable foundation for personalized decision support and in silico trials.

Supplemental Material

sj-docx-1-dst-10.1177_19322968251364291 – Supplemental material for Including Aerobic Exercise Into Data-Based Virtual Twins for Glycemic Simulation

Supplemental material, sj-docx-1-dst-10.1177_19322968251364291 for Including Aerobic Exercise Into Data-Based Virtual Twins for Glycemic Simulation by Oriol Bustos, Omer Mujahid, Iván Contreras, Aleix Beneyto and Josep Vehi in Journal of Diabetes Science and Technology

Footnotes

Acknowledgements

None.

Abbreviations

BG, blood glucose; CGM, continuous glucose monitoring; CV, coefficient of variation; cWGAN, conditional Wasserstein generative adversarial network; GANs, generative adversarial networks; HR, heart rate; PA, physical activity; PI, plasma insulin; RA, rate of appearance of carbohydrates; T1D, type 1 diabetes; TAR, time above range; TBR, time below range; TIR, time in range; TITR, time in tight range; t-SNE, t-distributed stochastic neighbor embedding.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by grant 2024-PROD-00051 funded by AGAUR, by grant PID2022-137723OB-C22 funded by MCIN/AEI/ and by 2021 SGR 01598 funded by the Autonomous Government of Catalonia.

Supplemental Material

Supplemental material for this article is available online.

References

1. Colberg, Sigal, Yardley, et al. Physical activity/exercise and diabetes: a position statement of the American diabetes association. Diabetes Care. 2016;39(11):2065-2079.

2. Zaharieva, Riddell MC. Insulin management strategies for exercise in diabetes. Can J Diabetes. 2017;41(5):507-516.

3. Moser, Zaharieva, Adolfsson, et al. The use of automated insulin delivery around physical activity and exercise in type 1 diabetes: a position statement of the European association for the study of diabetes (EASD) and the international society for pediatric and adolescent diabetes (ISPAD). Diabetologia. 2025;68(2):255-280.

4. Bergford, Riddell, Jacobs, et al. The type 1 diabetes and EXercise initiative: predicting hypoglycemia risk during exercise for participants with type 1 diabetes using repeated measures random forest. Diabetes Technol Ther. 2023;25(9):602-611.

5. Roy, Parker RS. Dynamic modeling of exercise effects on plasma glucose and insulin levels. J Diabetes Sci Technol. 2007;1(3):338-347.

6. Man, Breton, Cobelli. Physical activity into the meal glucose-insulin model of type 1 diabetes: in silico studies. J Diabetes Sci Technol. 2009;3(1):56-67.

7. Jaloli, Cescon, eds. Modeling physical activity impact on glucose dynamics in people with type 1 diabetes for a fully automated artificial pancreas. In: 2023 IEEE Conference on Control Technology and Applications (CCTA). Bridgetown: IEEE; 2023:546-551.

8. Pompa, Panunzi, Borri, De Gaetano. A comparison among three maximal mathematical models of the glucose-insulin system. PLoS ONE. 2021;16(9):e0257789.

9. Hose, Lawford, Halliday, Rafiroiu, Lungu. Challenges and progress in the application of physiological models for clinical decision support in cardiovascular medicine. IOP Conf Ser Mater Sci Eng. 2022;1254:012005.

10. Man, Micheletto, Breton, Kovatchev, Cobelli. The UVA/PADOVA type 1 diabetes simulator: new features. J Diabetes Sci Technol. 2014;8(1):26-34.

11. Barros, Paci, Tervonen, et al. From multiscale biophysics to digital twins of tissues and organs: future opportunities for in-silico pharmacology. IEEE Trans Mol Biol Multiscale Commun. 2024;10:576-594.

12. Riddell, Gallen, Smart, et al. Exercise management in type 1 diabetes: a consensus statement. Lancet Diabetes Endocrinol. 2017;5(5):377-390.

13. Muñoz-López, Naranjo-Orellana. Individual versus team heart rate variability responsiveness analyses in a national soccer team during training camps. Sci Rep. 2020;10(1):11726.

14. Lauer, Okin, Larson, Evans, Levy. Impaired heart rate response to graded exercise: prognostic implications of chronotropic incompetence in the Framingham heart study. Circulation. 1996;93(8):1520-1526.

15. Zhu, Herrero, Georgiou. Deep learning for diabetes: a systematic review. IEEE J Biomed Health Inform. 2020;25(7):2744-2757.

16. Goodfellow, Pouget-Abadie, Mirza, et al. Generative adversarial networks. Commun ACM. 2020;63(11):139-144.

17. Mirza. Conditional generative adversarial nets [published online ahead of print November 6, 2014]. arXiv. doi:10.48550/arXiv.1411.1784.

18. Mujahid, Contreras, Beneyto, Conget, Gimenez, Vehi. Conditional synthesis of blood glucose profiles for T1D patients using deep generative models. Mathematics. 2022;10(20):3741.

19. Mujahid, Contreras, Beneyto, Vehi. Generative deep learning for the development of a type 1 diabetes simulator. Commun Med. 2024;4(1):51.

20. Noguer, Contreras, Beneyto, Vehi. Modelling rate of exogenous glucose appearance for biomedical applications using conditional generative models. IFAC-Pap. 2024;58(23):127-132.

21. Ghebrehiwet, Zaki, Damseh, Mohamad MS. Revolutionizing personalized medicine with generative AI: a systematic review. Artif Intell Rev. 2024;57(5):128.

22. Yoon, Jarrett, der Schaar. Time-series generative adversarial networks. Adv Neural Inf Process Syst. 2019;32:5508-5518.

23. Schürch, Allam, et al. Generating personalized insulin treatments strategies with deep conditional generative time series models [published online ahead of print November 13, 2023]. arXiv. doi:10.48550/arXiv.2309.16521.

24. Kalita, Sharma, Mirza KB. Continuous glucose, insulin and lifestyle data augmentation in artificial pancreas using adaptive generative and discriminative models. IEEE J Biomed Health Inform. 2024;28:4963-4974.

25. Noguer, Contreras, Mujahid, Beneyto, Vehi. Generation of individualized synthetic data for augmentation of the type 1 diabetes data sets using deep learning models. Sensors. 2022;22(13):4944.

26. Nemat, Khadem, Elliott, Benaissa. Data-driven blood glucose level prediction in type 1 diabetes: a comprehensive comparative analysis. Sci Rep. 2024;14(1):21863.

27. Riddell, Gal, et al. Examining the acute glycemic effects of different types of structured exercise sessions in type 1 diabetes in a real-world setting: the type 1 diabetes and exercise initiative (T1DEXI). Diabetes Care. 2023;46(4):704-713.

28. Kovatchev, Cox, Gonder-Frederick, Clarke. Symmetrization of the blood glucose measurement scale and its applications. Diabetes Care. 1997;20(11):1655-1658.

29. Hovorka, Canonico, Chassin, et al. Nonlinear model predictive control of glucose concentration in subjects with type 1 diabetes. Physiol Meas. 2004;25(4):905-920.

30. Arjovsky, Chintala, Bottou. Wasserstein generative adversarial networks. Int Conf Mach Learn. 2017;70:214-223.

31. Fushimi, Aiello, Cho, et al. Online classification of unstructured free-living exercise sessions in people with type 1 diabetes. Diabetes Technol Ther. 2024;26(10):709-719. doi:10.1089/dia.2023.0528.

32. Cho, Aiello, Ozaslan, et al. Design of a real-time physical activity detection and classification framework for individuals with type 1 diabetes. J Diabetes Sci Technol. 2024;18(5):1146-1156.

33. Lin, Jain, Wang, Fanti, Sekar, eds. Using gans for sharing networked time series data: challenges, initial promise, and open questions. In: Proceedings of the ACM Internet Measurement Conference. New York, NY: Association for Computing Machinery; 2020:464-483.

34. Sugihara, May, et al. Detecting causality in complex ecosystems. Science. 2012;338(6106):496-500.

35. Lakshminarayanan, Pritzel, Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. Adv Neural Inf Process Syst. 2017;30:6405-6416.

36. Clark, Donahue, Simonyan. Adversarial video generation on complex datasets. arXiv:1907.06571, 2019. https://arxiv.org/abs/1907.06571

37. Smith, Meger. Improved adversarial systems for 3D object generation and reconstruction. arXiv:1707.09557, 2017. https://arxiv.org/abs/1707.09557

38. Aiello, Toffanin, Magni, De Nicolao. Model-based identification of eating behavioral patterns in populations with type 1 diabetes. Control Eng Pract. 2022;123:105128. https://www.sciencedirect.com/science/article/pii/S0967066122000430. Accessed July 24, 2025.
