Abstract

Unmeasured baseline information in left-truncated data situations frequently occurs in observational time-to-event analyses. For instance, a typical timescale in trials of antidiabetic treatment is “time since treatment initiation”, but individuals may have initiated treatment before the start of longitudinal data collection. When the focus is on baseline effects, one widespread approach is to fit a Cox proportional hazards model incorporating the measurements at delayed study entry. This has been criticized because of the potential time dependency of covariates. We tackle this problem by using a Bayesian joint model that combines a mixed-effects model for the longitudinal trajectory with a proportional hazards model for the event of interest incorporating the baseline covariate, possibly unmeasured in the presence of left truncation. The novelty is that our procedure is not used to account for non-continuously monitored longitudinal covariates in right-censored time-to-event studies, but to utilize these trajectories to make inferences about missing baseline measurements in left-truncated data. Simulating times-to-event depending on baseline covariates we also compared our proposal to a simpler two-stage approach which performed favorably. Our approach is illustrated by investigating the impact of baseline blood glucose levels on antidiabetic treatment failure using data from a German diabetes register.

Keywords

Baseline covariates, delayed entry, joint model, left truncation, time origin

1. Introduction

The motivating data example for this work is a population-based study in patients diagnosed with type 2 diabetes and under first-line oral antidiabetic drug (OAD) medication.¹ One aim of the study was to explore prognostic baseline factors associated with treatment failure. The data source was the DIabetis Versorgungs-Evaluation (DIVE) register, a German prospective, observational, multicenter diabetes register established in 2011.² The natural timescale for analyzing OAD treatment failure is “time since the start of first OAD medication.” However, registers collect data in calendar time. Therefore, baseline covariate information such as relative glycated hemoglobin level (hemoglobin A1c (HbA1c)) is typically not available for those patients who had been assigned to OAD before the establishment of the DIVE register (see Figure 1 for a graphical visualization). These patients have a delayed study entry, a common phenomenon in observational studies^3,4: In the DIVE example, these patients are included in the register after their start of OAD treatment. Their starting time of OAD treatment will typically be known, but not necessarily information such as HbA1c values at the start of treatment. A direct consequence of this phenomenon is left truncation. The powerful time-to-event framework based on modern counting process theory naturally includes left truncation,⁵ but incorporating unmeasured baseline information as a consequence of delayed study entry is challenging.^4,6

Figure 1.

Data situation of the DIVE register: the timescale of interest is “time since start of first OAD medication”. Prospective observations enter the study at $t = 0$ and the baseline HbA1c level can be measured. Patients assigned to OAD medication before the establishment of the register have unknown baseline HbA1c levels and enter the study at some time $t > 0$. This illustration is adapted from Bluhmki et al.⁷ DIVE: DIabetis Versorgungs-Evaluation; OAD: oral antidiabetic drug; HbA1c: hemoglobin A1c.

One common approach is to fit a Cox proportional hazards model incorporating the study entry measurements.⁸ Keiding and Knuimann⁶ argued that this is only reasonable for time-invariant (e.g. gender) or derived covariates (e.g. age) but not for time-dependent covariates such as biomarkers. On the one hand, the time of study entry because of the start of a register in calendar time may not have a meaningful interpretation on the patient level. On the other hand, the value at study entry may differ from the value at baseline, distorting the analysis. For example, the HbA1c level usually declines after OAD initiation, increases afterwards, and plateaus at the end, leading to a piecewise linear longitudinal trajectory with two breakpoints.^9,10 Hence, HbA1c values at study entry after OAD treatment initiation will already be medically regulated and are therefore not suited for baseline analysis. As a consequence, Keiding and Knuimann⁶ warned against such analyses but did not offer a solution for the aim of analyzing, for example, baseline HbA1c in the complete register data set.

To fill this gap, we propose a statistical framework based on joint models that combine a model for the time-dependent covariates with a model for the event of interest. Typically, joint models focus on the effect of a longitudinal marker on time-to-event in the presence of right-censoring. Here, the challenge is that markers are usually not monitored continuously, and a longitudinal submodel is used to account for marker values missing at the observed event times. In contrast, our focus is on left truncation and missing covariate values at baseline for patients with delayed study entry.

The joint model framework also simultaneously accounts for measurement errors in the longitudinal covariate. Therefore, our proposed approach accounts not only for unmeasured but also for mismeasured baseline information. In contrast, the standard Cox proportional hazards model does not account for measurement errors in the covariates.

The two submodels of the joint model are linked with shared parameters, where random effects capture individual-specific dependencies.^11–13 There are several approaches to inference for joint models: boosting¹⁴ and, more commonly, maximum likelihood methods^12,13,15 and Bayesian approaches.¹⁶ Joint models are implemented in several statistical software programs (see Yuen and Mackinnon¹⁷ and Furgal et al.¹⁸ for overviews and comparisons), and both the maximum likelihood and the Bayesian approaches have already been extended to cover left-truncated data situations.^19–23 We also note that joint models for left-truncated data have been investigated for other types of longitudinal outcomes, for example, discrete longitudinal outcomes,²⁴ longitudinal counts, and ordinal data.^25–27 We use already-known Bayesian joint model methods as a tool to reconstruct the missing baseline information.

To the best of our knowledge, there are very few other approaches to attack the present problem: Sperrin and Buchan²⁸ suggested a two-stage approach while using age as the timescale. First, potentially time-varying variables, measured at one timepoint only, are regressed against time. As a second step, the residuals obtained from the first step are inserted in a proportional hazard or accelerated failure time model. The approach of Sperrin and Buchan²⁸ assumes time-constant residuals or errors, which is more restrictive than in the common joint model. Lee and Betensky²⁹ derived conditions under which, in addition to a continuous and fully observed time-varying covariate, the estimate of the regression coefficient is still consistent even if study entry is incorrectly used as time origin. Then, they derive conditions for the same to hold for a not fully observed time-varying covariate. Furthermore, while assuming a functional form for the time-varying biomarker, which in fact may only be measured at study entry, they provide methods for estimating the regression parameter.

Both these approaches make rather strong assumptions, and we are interested in scenarios where the covariate value at study entry may not be used to analyze baseline effects. Hence, we will use the longitudinal trajectories to inform on baseline marker values and will compare our proposal against the common approach of using study entry values in a Cox analysis. We will also compare our suggestion against a two-stage and a complete case analysis.

The remainder of the article is structured as follows: Section 2 introduces the statistical framework for evaluating the baseline effects of longitudinal covariates on a time-to-event outcome in the presence of delayed study entry. In Section 3, a simulation study assesses the validity of our proposal and compares it to the standard Cox model incorporating the study entry measurements as well as to the complete case analysis and a two-stage approach. The procedure is applied to the DIVE data to quantify the effect of the HbA1c level at OAD initiation on time to treatment failure (Section 4). A discussion is given in Section 5.

2. A Joint model incorporating baseline effects

Throughout the article, “baseline” is defined as the time origin $(t = 0)$ with regard to a pre-specified timescale of interest. In our motivating study example, the timescale is chosen as “time since first treatment initiation”. Let $N$ be the total number of individuals under study and $T_{i}$ the event time of individual $i$, $i \in \{1, \dots, N\}$. We assume that the observation of $T_{i}$ is restricted by a pair $(L_{i}, C_{i})$, assumed to be independent of $T_{i}$, such that only patients with $T_{i} > L_{i}$ enter the study and only patients under study with $T_{i} \leq C_{i}$ have an observed event.⁵ We also assume that the left truncation time $L_{i}$ is less than the right-censoring time $C_{i}$. Hence, the observed time-to-event information for each subject is given by the tuple $(T_{i} \land C_{i}, L_{i}, δ_{i})$. Here, $T_{i} \land C_{i} := \min (C_{i}, T_{i})$ and $δ_{i} = 1 (T_{i} \leq C_{i})$ is the event indicator, which equals one if $T_{i} \leq C_{i}$ and zero otherwise. Note that individual $i$ has a delayed study entry if $L_{i} > 0$ and $T_{i} > L_{i}$ (cf. the last individual in Figure 1). Let $V_{i}$ be a set of time-independent covariates collected at $L_{i}$. Following standard notations,¹² we further define $w_{i} = \{w_{i} (t_{i j}), j = 1, \dots, n_{i}\}$ as the observed trajectory of a time-dependent covariate (e.g. biomarker) measured at $n_{i}$ patient-specific timepoints $t_{i j}$. In the following, we only consider one longitudinal response, but our concept can be generalized to multiple longitudinal responses $w_{i} (t)$.
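The sampling scheme described above can be made concrete with a minimal sketch. The names `Observation` and `observe` are hypothetical and introduced only for illustration; the function maps a latent triple $(T_{i}, L_{i}, C_{i})$ to the observed tuple, or to nothing if the individual is left-truncated.

```python
from typing import NamedTuple, Optional


class Observation(NamedTuple):
    time: float    # observed time, min(T, C)
    entry: float   # left-truncation (study entry) time L
    event: int     # event indicator: 1 if T <= C, else 0
    delayed: bool  # True if L > 0, i.e. the baseline value was not measured


def observe(T: float, L: float, C: float) -> Optional[Observation]:
    """Map latent times (T, L, C) to the observed tuple.

    Returns None when the event happens at or before study entry,
    because such an individual never appears in the data.
    """
    if L >= C:
        raise ValueError("the setup assumes L < C")
    if T <= L:
        return None  # left-truncated: never sampled
    return Observation(min(T, C), L, int(T <= C), L > 0)
```

For example, `observe(3.0, 1.0, 5.0)` describes an individual with delayed entry at time 1 and an observed event at time 3, whereas `observe(0.5, 1.0, 5.0)` returns `None` because the event precedes study entry.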

2.1. Longitudinal submodel

Following Rizopoulos,¹² we assume a linear mixed-effects model for the individual covariate values fulfilling

\begin{aligned} \begin{cases} w_{i} (t) = m_{i} (t) + ϵ_{i} (t), & ϵ_{i} (t) \sim N (0, σ^{2}) \text{ for all } t \\ m_{i} (t) = X_{i}^{T} (t) β + Z_{i}^{T} (t) b_{i}, & b_{i} \sim N (0, D) \end{cases} \end{aligned}

(1)

where $m_{i} (t)$ is the true longitudinal value of individual $i$ at time $t$, $X_{i} (t)$ a pre-specified time-dependent row-vector of the design matrix for the fixed effects $β$, and $Z_{i} (t)$ the corresponding row-vector of the design matrix for the random effects $b_{i}$. The random effects are assumed to be normally distributed with mean zero and covariance matrix $D$ and to be independent of the error terms $ϵ_{i}$, which are also assumed to be normally distributed with mean zero and variance $σ^{2}$. The error terms account for potential measurement error in $w_{i} (t)$. Note that time-independent baseline covariates contained in $V_{i}$ (or a subset thereof) may also be part of $X_{i} (t)$. Furthermore, $X_{i} (t)$ can easily be adapted to account for different shapes, for example, a piecewise linear structure, of the trajectory of the longitudinal covariate. The latter will be exploited for modeling HbA1c trajectories in the data example.
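To illustrate how $X_{i} (t)$ can encode a piecewise linear shape, the following minimal sketch builds the design row and evaluates $m_{i} (t)$ for a random-intercept, random-slope specification. The function names, the breakpoints, and the numerical values in the usage note are illustrative assumptions, not part of the model definition.

```python
def design_row(t, breakpoints=()):
    """Fixed-effects design row for a (piecewise) linear trajectory.

    Each breakpoint bp adds the hinge term (t - bp) * 1(t > bp),
    which is the same as max(t - bp, 0).
    """
    row = [1.0, t]
    for bp in breakpoints:
        row.append(max(t - bp, 0.0))
    return row


def true_value(t, beta, b, breakpoints=()):
    """m_i(t) = X_i(t) beta + Z_i(t) b_i with Z_i(t) = (1, t)."""
    x = design_row(t, breakpoints)
    z = [1.0, t]
    fixed = sum(xk * bk for xk, bk in zip(x, beta))
    random_part = sum(zk * bk for zk, bk in zip(z, b))
    return fixed + random_part
```

With two breakpoints, `design_row(3.0, (1.5, 2.5))` gives `[1.0, 3.0, 1.5, 0.5]`. Evaluating `true_value` at $t = 0$ returns the quantity $m_{i} (0)$, which reduces to the fixed intercept plus the random intercept.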

2.2. Survival submodel

We suggest a Cox proportional hazards model for the individual event hazard of the form

\begin{aligned} α_{i} (t | m_{i} (0), V_{i}) d t & = P (T_{i} \in [t, t + d t) | T_{i} \geq t, m_{i} (0), V_{i}) \\ & = α_{0} (t) d t \cdot \exp (γ^{T} V_{i} + μ m_{i} (0)) \end{aligned}

(2)

with regression coefficients $γ$ for the time-independent baseline covariates and $μ$ quantifying the association between the true baseline value of the longitudinal covariate and the event of interest in terms of the log hazard ratio (HR). This is a special form of a survival submodel of a joint model with lagged effects with the lag chosen larger than the maximal observed time.¹² Note that model (2) only includes the baseline value of the longitudinal covariate but not its time-varying trajectory. This is because our aim is to study baseline effects and the longitudinal trajectories will only be used to inform on missing baseline information for patients with delayed entry.
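The connection to lagged effects can be spelled out as a short derivation. The lag symbol $c$ and the bound $t_{\max}$ (the maximal observed time) are notation introduced here for illustration, and the form of the lagged model is stated as we understand the standard parameterization, not as a definition taken from the text above.

```latex
\begin{aligned}
α_{i} (t) &= α_{0} (t) \cdot \exp (γ^{T} V_{i} + μ\, m_{i} (\max (t - c, 0))) && \text{(lagged effect with lag } c \text{)} \\
c > t_{\max} &\Rightarrow \max (t - c, 0) = 0 \text{ for all } t \leq t_{\max} \\
&\Rightarrow α_{i} (t) = α_{0} (t) \cdot \exp (γ^{T} V_{i} + μ\, m_{i} (0)) && \text{(model (2))}
\end{aligned}
```

Under this reading, choosing the lag beyond the follow-up window freezes the covariate at its baseline value for every observed time.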

Following Bayesian joint models, we assume a penalized B-spline to approximate the logarithm of the baseline hazard

\log (α_{0} (t)) = \sum_{k = 1}^{Q} α_{0, k} B_{d, k} (t, u)

(3)

where the sequence of knots $u = (u_{1}, \dots, u_{Q})$ is defined as an equally spaced partition of the timescale, $α_{0, k}$, $k = 1, \dots, Q$, are the spline coefficients, and $B_{d, k} (t, u)$ denotes the $k$th basis function of a B-spline with knots $u$. We follow the suggestion of Eilers and Marx³⁰ and use $Q = 20$ knots. One advantage of Bayesian penalized splines is that the choice of the number of knots is not critical because overfitting may be corrected by the penalty term.²⁵
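As a rough illustration of expansion (3), the sketch below evaluates B-spline basis functions with the Cox-de Boor recursion and combines them into a log baseline hazard. It is not the penalized fit itself: the penalty and the priors are omitted, and the clamped knot vector, the cubic degree, and the small number of basis functions are assumptions chosen to keep the example short.

```python
def bspline_basis(k, d, t, knots):
    """Value at t of the k-th (0-indexed) B-spline basis function of degree d."""
    if d == 0:
        if knots[k] <= t < knots[k + 1]:
            return 1.0
        # Close the last non-empty interval on the right so that the basis
        # is also defined at the upper boundary knot.
        if t == knots[-1] and knots[k] < knots[k + 1] == knots[-1]:
            return 1.0
        return 0.0
    left = right = 0.0
    span = knots[k + d] - knots[k]
    if span > 0:
        left = (t - knots[k]) / span * bspline_basis(k, d - 1, t, knots)
    span = knots[k + d + 1] - knots[k + 1]
    if span > 0:
        right = (knots[k + d + 1] - t) / span * bspline_basis(k + 1, d - 1, t, knots)
    return left + right


def log_baseline_hazard(t, coefs, degree, knots):
    """Linear combination of basis functions, as in expansion (3)."""
    return sum(c * bspline_basis(k, degree, t, knots) for k, c in enumerate(coefs))


# Illustrative cubic spline on [0, 10] with clamped boundary knots.
DEGREE = 3
KNOTS = [0.0] * 4 + [2.5, 5.0, 7.5] + [10.0] * 4
N_BASIS = len(KNOTS) - DEGREE - 1  # 7 basis functions
```

Because the basis functions sum to one on the knot range, setting all coefficients to a common value yields a constant log baseline hazard, which is a convenient sanity check.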

As explained earlier and investigated in more depth in the remainder of the paper, the novelty of the application of model (1) and model (2) is that the combination of both models is not used to estimate the effect of the time-dependent value $m_{i} (t)$ , typically with right-censored data, but utilizes the entire available longitudinal covariate information to recover missing baseline information $m_{i} (0)$ in left-truncated data, such that the baseline effect $μ$ can properly be estimated. In other words, our aim is a standard Cox analysis with baseline covariates, which, however, is unfeasible as a consequence of left truncation. The longitudinal character of the covariate is solely exploited to recover that missing baseline information.

2.3. Estimation

Let $θ = (θ_{S}^{T}, θ_{L}^{T}, θ_{R}^{T})^{T}$ denote the full parameter vector of the models (1) to (3). Here, $θ_{S} = (μ, γ, α_{0, k}, k = 1, \dots, Q)$ are the coefficients of the survival submodel, $θ_{L} = (β, σ)$ the parameters of the longitudinal submodel, and $θ_{R}$ the parameters of the random effects, that is, a vector of the components of $D$. The following likelihood arguments condition on further (known) baseline covariates $V_{i}$ and the pre-specified time-dependent row-vector $X_{i} (t)$. As usual, the likelihoods are partial in that they omit contributions from the censoring/truncation mechanisms.

Following the Bayesian framework of Piulachs et al.,²⁶ the overall joint likelihood conditioned on the random effects $b_{i}$ and the random parameter $θ$ is

\begin{aligned} p (\{T_{i} \land C_{i}, L_{i}, δ_{i}, w_{i}, i = 1, \dots, N\} | b_{i}, θ) = \prod_{i = 1}^{N} \left\{ \left[ \prod_{j = 1}^{n_{i}} p (w_{i} (t_{i j}) | b_{i}, θ) \right] \cdot \frac{p (T_{i} \land C_{i}, δ_{i} | b_{i}, θ)}{P (T_{i} > L_{i} | b_{i}, θ)} \right\} \end{aligned}

(4)

where the conditional density for the event times is

\begin{aligned} p (T_{i} \land C_{i}, δ_{i} | b_{i}, θ) = {} & {(α_{0} (T_{i} \land C_{i}) \cdot \exp (γ^{T} V_{i} + μ m_{i} (0)))}^{δ_{i}} \\ & \cdot \exp \left(- \int_{0}^{T_{i} \land C_{i}} α_{0} (s) \cdot \exp (γ^{T} V_{i} + μ m_{i} (0)) d s \right) \end{aligned}

(5)

and the conditional density of the longitudinal process is

\begin{aligned} p (w_{i} (t) | b_{i}, θ) = \frac{1}{(2 π σ^{2})^{n_{i} / 2}} \exp \left(\frac{- \| w_{i} (t) - X_{i} (t) β - Z_{i} (t) b_{i} \|^{2}}{2 σ^{2}} \right) \end{aligned}

with the Euclidean vector norm $\| x \| = \sqrt{\sum_{i = 1}^{n} x_{i}^{2}}$. Moreover, the denominator of equation (4) is given by

\begin{aligned} P (T_{i} > L_{i} | b_{i}, θ) = \exp \left(- \int_{0}^{L_{i}} α_{0} (s) \cdot \exp (γ^{T} V_{i} + μ m_{i} (0)) d s \right) \end{aligned}

The integral in the denominator and the integral in the exponential term of equation (5) do not have closed-form solutions. Therefore, numerical integration methods, for example, Gauss–Kronrod quadrature rules, are necessary to approximate these expressions.²³ Markov chain Monte Carlo methods enable the construction of an approximate random sample from the posterior distribution of $(θ, b_{i})$ conditional on the observed data, which is used for all inferences:

\begin{aligned} π (θ, b_{i} | \{T_{i} \land C_{i}, L_{i}, δ_{i}, w_{i}, i = 1, \dots, N\}) \propto {} & p (\{T_{i} \land C_{i}, L_{i}, δ_{i}, w_{i}, i = 1, \dots, N\} | θ, b_{i}) \\ & \cdot p (b_{i} | θ) \cdot π (θ) \end{aligned}

where $π (θ)$ is the prior distribution of $θ$ and the conditional density function of the random effects is

\begin{aligned} p (b_{i} | θ_{R}) = \frac{1}{(2 π)^{q / 2}} \frac{1}{\sqrt{\det D}} \exp \left(\frac{- b_{i}^{T} D^{- 1} b_{i}}{2} \right) \end{aligned}

where $q$ equals the dimension of the random effects vector. In the end, our main interest is in the posterior mean estimate of the association parameter $μ$ as the log HR estimate describing the impact of the baseline value on the event of interest.
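The effect of the truncation term can be seen in a small numerical sketch of the survival part of the likelihood for one individual, conditional on the random effects. On the log scale, dividing by $P (T_{i} > L_{i} | b_{i}, θ)$ means that the cumulative hazard is integrated from $L_{i}$ instead of from 0. The sketch uses a composite Simpson rule as a simple stand-in for the Gauss–Kronrod rules mentioned above; the function names are hypothetical, and the Weibull-type baseline hazard in the usage note is an assumption chosen because it has a closed form to compare against.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def survival_loglik(time, entry, event, lin_pred, baseline_hazard):
    """log p(T ∧ C, δ | b, θ) - log P(T > L | b, θ) for one individual.

    lin_pred stands for γ'V + μ m(0); the integral runs from the entry
    time L to the observed time because the truncation term cancels the
    cumulative hazard accumulated before study entry.
    """
    cum_hazard = math.exp(lin_pred) * simpson(baseline_hazard, entry, time)
    log_hazard = math.log(baseline_hazard(time)) + lin_pred
    return event * log_hazard - cum_hazard
```

With a baseline hazard of the form `lambda s: rho * s ** (rho - 1)`, the integral from $L_{i}$ to the observed time equals the difference of the two times raised to the power `rho`, so the numerical result can be checked against the exact value.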

In the following, we use standard priors to fit the models of the Bayesian approach.^23,26,31 In particular, normal priors are taken for the parameter of the fixed effects $β$ , the association parameter $μ$ , and the spline coefficients of the baseline hazard $α_{0, k}$ . An inverse Wishart prior with an identity scale matrix and two degrees of freedom is assumed for the covariance matrix of the random effects $D$ and for the variance parameter of the error terms of the longitudinal model $σ^{2}$ an inverse gamma prior is used.

Software to fit these joint models with left truncation is readily available in the JMBayes²³ package in the statistical software R. The baseline effects can be taken into account by the use of the lag option.

3. Simulation study

The following simulation study investigates the performance of our proposed approach under left-truncated scenarios. The scenarios are motivated by the study example and are chosen such that they differ in the percentage of left truncation, sample size, number of follow-up visits, and direction of the association between the longitudinal and the survival submodel. The R Code of the simulation study is available in the Supplemental material. Table 1 displays the settings of the seven simulation scenarios.

Table 1.
Simulation scenarios: For each scenario, 1000 datasets are generated.

Scenario	$N$	$μ$	$ρ$	Censoring	Study entry	Longitudinal process
(S1)	1000	‒0.150	1.952	Uniform (0,10); about 16%	after 0, before 4th visit; about 19%	$β = (7.203, - 0.175)$; $σ = 0.766$; up to ten visits; linear
(S2)	1000	‒0.256	1.952	Uniform (0,10); about 23%	after 0, before 10th visit; about 20%	$β = (7.203, - 0.175)$; $σ = 0.766$; up to ten visits; linear
(S3)	1000	‒0.256	1.952	Uniform (0,10); about 23%	after 0, before 10th visit; about 59%	$β = (7.203, - 0.175)$; $σ = 0.766$; up to ten visits; linear
(S4)	1000	‒0.256	1.952	Uniform (0,10); about 23%	after 0, before 10th visit; about 59%	$β = (7.203, - 0.175)$; $σ = 0.0766$; up to ten visits; linear
(S5)	800	‒0.256	0.952	Uniform (0,10); about 53%	after 0, before 10th visit; about 23%	$β = (7.203, - 0.175, 0.333, - 0.154)$; $σ = 0.766$; breakpoints 1.5 and 2.5; up to ten visits
(S6)	800	‒0.256	0.952	Uniform (0,10); about 52%	after 0, before 5th visit; about 54%	$β = (7.203, - 0.175, 0.333, - 0.154)$; $σ = 0.766$; breakpoints 1.5 and 2.5; up to five visits
(S7)	800	0.256	1.952	Uniform (0,0.5); about 56%	after 0, before 10th visit; about 48%	$β = (7.203, - 0.175, 0.333, - 0.154)$; $σ = 0.766$; breakpoints 0.2 and 0.4; up to ten visits

The survival time $T_{i}$ is generated from equation (2) by applying the inversion method.³² In all scenarios, the baseline hazard is assumed to be of Weibull form, that is, $α_{0} (t) = ρ \cdot t^{ρ - 1}$. Also recall that model (2) depends on the baseline covariate value, but not on the longitudinal trajectories.
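For this baseline hazard, the inversion method has a closed form: the survival function is $S (t | m_{i} (0)) = \exp (- t^{ρ} \exp (μ m_{i} (0)))$, and solving $S (T_{i}) = U$ for a uniform draw $U$ gives $T_{i} = (- \log U / \exp (μ m_{i} (0)))^{1 / ρ}$. The following is a minimal sketch under the assumption that no further covariates $V_{i}$ enter the hazard; the function names and the seed are illustrative and are not taken from the supplemental R code.

```python
import math
import random


def survival(t, m0, mu, rho):
    """S(t | m0) under the baseline hazard rho * t**(rho - 1) and model (2)."""
    return math.exp(-(t ** rho) * math.exp(mu * m0))


def event_time_from_uniform(u, m0, mu, rho):
    """Invert S(T | m0) = u for T, with u in (0, 1]."""
    return (-math.log(u) / math.exp(mu * m0)) ** (1.0 / rho)


def draw_event_time(m0, mu, rho, rng):
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1]
    # and the logarithm is always defined.
    return event_time_from_uniform(1.0 - rng.random(), m0, mu, rho)


rng = random.Random(2024)  # illustrative seed
example_times = [draw_event_time(7.2, -0.256, 1.952, rng) for _ in range(5)]
```

Plugging a simulated time back into `survival` recovers the uniform draw, which is a direct check that the inversion is implemented correctly.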

Scenarios (S1), (S2), (S3), and (S4) assume the linear longitudinal trajectory to be given by

\begin{aligned} w_{i} (t) = β_{0} + β_{1} t + b_{i 0} + b_{i 1} t + ϵ_{i} (t) \end{aligned}

Motivated by typical HbA1c trajectories,^9,10 the other scenarios assume a more complex shape of the longitudinal trajectory in terms of a piecewise linear function with two breakpoints, that is,

\begin{aligned} w_{i} (t) = {} & β_{0} + β_{1} t + β_{2} (t - bp1) 1 (t > bp1) + β_{3} (t - bp2) 1 (t > bp2) \\ & + b_{i 0} + b_{i 1} t + ϵ_{i} (t) \end{aligned}

where bp1 denotes the first breakpoint and bp2 the second one. In all scenarios, the random effects follow a bivariate normal distribution with mean zero and covariance matrix $D = \begin{pmatrix} 0.569 & 0.217 \\ 0.217 & 1.672 \end{pmatrix}$. Furthermore, the error terms $ϵ_{i}$ are normally distributed with mean 0 and standard deviation 0.766 in all scenarios except (S4), which considers a smaller standard deviation of 0.0766. The follow-up visit times are uniformly distributed between the time origin 0 and the event time $T_{i}$. In scenario (S6), only up to five follow-up visits are considered, whereas in the other scenarios up to 10 follow-up visits are planned. To induce left truncation, some of the observations do not enter at the time origin (first visit) but at one of their later follow-up visits. The entry time is chosen smaller than the time of the last visit to ensure at least two measurements per individual. The right-censoring times are generated from a uniform distribution.
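The visit and entry mechanism can be sketched as follows. This is a simplified reading of the description above and not the supplemental R code: the function names are hypothetical, every individual receives the full number of planned follow-up visits, and the entry visit is drawn uniformly among the eligible visits, whereas the scenarios restrict entry to before a scenario-specific visit.

```python
import random


def simulate_visits(event_time, max_followups, rng):
    """Visit at the time origin plus follow-up visits uniform on (0, event_time)."""
    followups = sorted(rng.uniform(0.0, event_time) for _ in range(max_followups))
    return [0.0] + followups


def apply_delayed_entry(visits, rng):
    """Let the individual enter at a later visit that is not the last one.

    Visits before the entry visit are discarded, so the baseline value is
    unobserved while at least two measurements remain.
    """
    if len(visits) < 3:
        raise ValueError("need at least three visits to delay entry")
    entry_index = rng.randrange(1, len(visits) - 1)
    return visits[entry_index], visits[entry_index:]


rng = random.Random(1)  # illustrative seed
visits = simulate_visits(4.0, 10, rng)
entry_time, kept_visits = apply_delayed_entry(visits, rng)
```

After `apply_delayed_entry`, the first retained visit coincides with the left-truncation time $L_{i}$, and the measurement at the time origin is no longer part of the individual's observed trajectory.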

Scenario (S1) is rather simple. With an HR based on the association parameter of $\exp (- 0.150) = 0.861$, it simulates the smallest impact of the baseline measurement of all scenarios. Furthermore, the individuals enter the study soon after the time origin, and only a small number of individuals are left-truncated. Scenarios (S2) and (S3) account for a greater association (HR $= \exp (- 0.256) = 0.774$), while scenario (S3) considers a greater amount of left truncation than (S2). Scenario (S4) is the same as (S3) except that it considers a smaller standard deviation of the error terms. In the scenarios with the linear trajectory, we simulate $N = 1000$ individuals, and in the more complex scenarios $N = 800$ to make the setup even more challenging. Scenarios (S5) and (S6) assume a piecewise linear longitudinal trajectory but the same association as (S2) and (S3). Compared with scenario (S5), scenario (S6) has fewer planned follow-up visits and a greater percentage of left-truncated observations. Scenario (S7) considers a positive association parameter resulting in an HR greater than one (HR $= \exp (0.256) = 1.292$). Due to the positive association, individuals quickly leave the study, either censored or with the event of interest. To account for this, a smaller maximum of the censoring distribution and smaller breakpoints than in the other scenarios were chosen. Scenario (S7) is the closest to the real data application in Section 4.

We simulate 1000 datasets of each scenario and set the confidence level to 95%. To each simulated dataset, we apply our proposed model (I) and a standard Cox proportional hazards model incorporating the observed covariate value of the time-dependent longitudinal covariate at study entry (II). The latter would model the individual event hazard as

α_{i} (t | m_{i} (L_{i}), V_{i}) = α_{0} (t) \cdot \exp (γ^{T} V_{i} + μ m_{i} (L_{i}))

(6)

For model (6), recall the critique that time $L_{i}$, that is, the entry time into the register on the patient’s timescale, will not have a particular meaning on the individual level if time $L_{i}$ simply reflects the start of data collection.

Furthermore, we consider two alternative approaches, a complete case analysis using only the observations entering at the time origin 0 (III), and a two-stage approach, first fitting a linear mixed-effects model and predicting the baseline value of the covariate for all individuals, and then inserting the prediction into the Cox proportional hazards model (IV). The rationale behind approach III is that we have assumed left truncation to be independent of the times-to-event. The motivation behind approach IV is that it will be easier to fit.

The results of the association parameter $μ$ are visualized in Figure 2 and the bias, relative bias, and mean squared error are displayed in Table 2. Note that we concentrate on reporting the results of the association parameter which is our main target parameter. The longitudinal model is solely used to recover the missing baseline information.

Figure 2.

Simulation results regarding the association parameter $μ$ : The boxplots are derived from 1000 study replications. The dashed line displays the true HR. (A) Displays scenario (S1), (B) the scenarios (S2) to (S6), and (C) scenario (S7). I: HR is estimated by means of the exponential of the posterior mean estimate of the Bayesian joint model. II: HR estimated using the standard Cox model (6). III: HR is estimated by a complete case analysis. IV: HR is estimated by a two-stage approach. Corresponding CPs of the credible intervals of the Bayesian approach and the confidence intervals of the Cox model are included. HR: hazard ratio; CP: coverage probability.

Our proposed model provides negligible bias and coverage probabilities (CPs) close to 90%. Compared to scenarios (S1) and (S2), the joint model approach performed slightly worse in scenarios (S3) and (S4) (relative bias of $-8.08\%$ and $-4.55\%$, respectively). The reason is the sparse information about the longitudinal trajectory at the beginning of follow-up as a consequence of the high percentage of left-truncated observations. Although bias and relative bias are larger in (S5) than in (S6), CP and mean squared error are larger in (S6), because a higher percentage of left truncation and fewer follow-up visits increase the variability. Furthermore, under a positive association (S7), individuals enter the study soon after the time origin but leave again rather fast. In addition, the high percentage of left truncation means less information about the longitudinal trajectory, resulting in a relative bias of $-6.51\%$. The standard Cox model approach incorporating the study entry measurement seriously underestimates the baseline effect in all scenarios (relative bias between $-53.89\%$ and $-80.56\%$).

The complete case analysis is also heavily biased (relative bias between $-34.38\%$ and $-51.60\%$) in all scenarios except (S4), where the standard deviation of the error terms is very small. The reason is that the complete case analysis relies on the observed baseline measurement, which is subject to measurement error, whereas the joint modeling approach not only recovers missing baseline information but also accounts for measurement error. In contrast, the joint model analysis and the two-stage approach are nearly unbiased. However, the two-stage analysis shows less bias as well as a greater CP than the joint model analysis. There is a certain body of literature investigating the possible bias of two-stage approaches, with somewhat mixed findings concerning the consequences of proceeding in two separate stages. Here, we stress that our simulation scenarios are in line with our aim of investigating the effect of baseline HbA1c. Hence, data were simulated with the baseline covariate, but not the longitudinal trajectory, included in the hazard function, which, in turn, may favor the two-stage analysis.
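The size of the complete case bias can be put into perspective with a back-of-the-envelope calculation. The sketch below uses the classical attenuation (regression dilution) factor from linear regression with additive measurement error, which is only a heuristic approximation for a Cox model and is not part of the simulation study itself. It assumes that the observed baseline value equals the true value $m_{i} (0)$ plus an error term, so that the between-individual variance of the true value is the random-intercept variance 0.569.

```python
def attenuation_factor(true_variance, error_sd):
    """Classical attenuation factor: var(true) / (var(true) + var(error))."""
    return true_variance / (true_variance + error_sd ** 2)


def approx_relative_bias(true_variance, error_sd):
    """Approximate relative bias (in percent) of the naive coefficient."""
    return 100.0 * (attenuation_factor(true_variance, error_sd) - 1.0)


RANDOM_INTERCEPT_VARIANCE = 0.569  # upper-left entry of the covariance matrix
large_error = approx_relative_bias(RANDOM_INTERCEPT_VARIANCE, 0.766)
small_error = approx_relative_bias(RANDOM_INTERCEPT_VARIANCE, 0.0766)
```

Under these assumptions, the approximate relative bias is about $-51\%$ for an error standard deviation of 0.766 and about $-1\%$ for 0.0766, which is of the same order as the complete case results for most scenarios in Table 2, including the near absence of bias in (S4).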

Table 2.

Simulation results regarding the association parameter $μ$ : Bias, relative bias and MSE.

	I: joint model approach			II: Cox model			III: CC analysis			IV: Two-stage approach
	bias	%bias	MSE	bias	%bias	MSE	bias	%bias	MSE	bias	%bias	MSE
S1	0.0049	‒3.27	0.0002	0.0844	‒56.28	0.0081	0.0774	‒51.60	0.0073	0.0014	‒0.92	0.0028
S2	0.0081	‒3.18	0.0003	0.1604	‒62.64	0.0266	0.1324	‒51.71	0.0190	0.0039	‒1.53	0.0033
S3	0.0207	‒8.08	0.0007	0.1918	‒74.93	0.0374	0.1315	‒51.35	0.0203	0.0131	‒5.10	0.0038
S4	0.0116	‒4.55	0.0002	0.1677	‒65.51	0.0289	0.0016	‒0.64	0.0061	0.0142	‒6.78	0.0054
S5	0.0173	‒6.75	0.0008	0.1880	‒73.42	0.0364	0.1259	‒49.17	0.0190	0.0121	‒4.72	0.0073
S6	0.0050	‒1.96	0.0009	0.2062	‒80.56	0.0432	0.0880	‒34.38	0.0163	‒0.0073	2.87	0.0091
S7	‒0.0167	‒6.51	0.0006	‒0.1380	‒53.89	0.0210	‒0.1277	‒49.88	0.0207	‒0.0174	‒6.78	0.0054

MSE: mean squared error; CC: complete case.
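For reference, the performance measures in Table 2 can be computed as in the following minimal sketch. The sign convention for the relative bias, namely the bias divided by the true value of $μ$, is an assumption inferred from the table entries: for (S1) and the joint model, $0.0049 / (- 0.150) \approx - 3.27\%$. The function name is hypothetical.

```python
def summarize_estimates(estimates, true_value):
    """Return (bias, relative bias in percent, mean squared error)."""
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    bias = mean_estimate - true_value
    relative_bias = 100.0 * bias / true_value
    mse = sum((est - true_value) ** 2 for est in estimates) / n
    return bias, relative_bias, mse
```

With a negative true value, an estimate that is shrunk toward zero has a positive bias but a negative relative bias, which explains the opposite signs of the two columns in most rows of Table 2.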

4. Real data application using the DIVE register

The data example is from a population-based study of patients diagnosed with type 2 diabetes and under first-line OAD medication.¹ One aim was to assess the association of prognostic baseline factors with the time to treatment failure. Treatment failure was defined as either the initiation of basal-supported oral therapy (BOT) or stopping first-line OAD, whichever comes first. The data source was the DIVE register, which is a German prospective, observational, multicenter diabetes register established in 2011.² Since the natural timescale of the analysis is “time since start of first OAD medication”, but data collection occurs in calendar time and with start of the DIVE register, the data are subject to left truncation (see Figure 1). The original study fitted a Cox proportional hazards model accounting for left truncation and incorporating longitudinal covariate values measured at study entry. However, it has been discussed that statements regarding baseline effects are questionable due to unmeasured baseline information for left-truncated individuals.^1,7

The proposed methodology extends these previous results by utilizing the entire available longitudinal information to estimate baseline effects at OAD initiation. The present study cohort focuses on the 16,719 individuals who have at least two observed longitudinal HbA1c measurements during follow-up, and follow-up is censored at year 20. The reasons are that neither a single observed HbA1c measurement per individual nor sparse measurements after year 20 are sufficient to adequately fit a linear mixed-effects model according to relation (1). A total of 69,363 longitudinal measurements were taken in this study cohort. In total, we observed 2023 patients initiating BOT and 5004 stopping OAD. About 62% of all patients entered the study cohort with delay.

The fitted univariate Cox model shows an 8.8% increased hazard of treatment failure for each unit increase in the HbA1c level at study entry (HR: 1.088, 95% confidence interval: [1.070, 1.106]). This is comparable to the result provided by the original study cohort (HR: 1.084, 95% confidence interval: [1.068, 1.102]) and in line with clinical expertise that higher HbA1c levels are associated with a higher risk of antidiabetic treatment failure.³⁵
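The reported HR can be translated back to the log-hazard scale on which the association parameter is defined. The sketch below assumes a Wald-type confidence interval that is symmetric on the log scale, which is an assumption about how the interval was computed; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def summarize_hr(hr, lower, upper, level=0.95):
    """Return (percent change in hazard, log HR, approximate standard error)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    percent_change = 100.0 * (hr - 1.0)
    log_hr = math.log(hr)
    std_error = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return percent_change, log_hr, std_error


percent_change, log_hr, std_error = summarize_hr(1.088, 1.070, 1.106)
```

For the study-entry analysis, this gives a log HR of roughly 0.084 with a standard error of roughly 0.008, corresponding to the 8.8% increase in hazard per unit of HbA1c.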

Now, moving to our proposed joint modeling framework, we assume a piecewise linear mixed-effects model with breakpoints at 2.5 and 4.5 years for the HbA1c trajectory:

$$\begin{aligned} \mathrm{HbA1c}_{i}(t) & = β_{0} + β_{1} t + β_{2} (t - 2.5) \mathbf{1}(t > 2.5) + β_{3} (t - 4.5) \mathbf{1}(t > 4.5) \\ & \quad + b_{i0} + b_{i1} t + ϵ_{i}(t) \end{aligned}$$

The choice of the breakpoints is based on the locally estimated scatterplot smoothing (LOESS) plot displayed in Figure 3. As the LOESS plot smoothes the scatterplot of the HbA1c level combining the closest 75% of the points, it is a curve and not a straight line. The use of a piecewise linear model is suggested in the literature.^9,10

Figure 3.

LOESS plot of the HbA1c trajectory (solid line) and trajectory obtained from estimated coefficients of the longitudinal (long.) submodel (dashed line). LOESS: locally estimated scatterplot smoothing; HbA1c: haemoglobin A1c; JM: Joint Model.

In contrast to the original Cox analysis, the joint model results in a more pronounced effect (HR: 1.292, 95% credible interval: [1.254, 1.337]). The effect estimated by the complete case analysis is similar to the original Cox analysis (HR: 1.074, 95% confidence interval: [1.047, 1.102]). The two-stage approach yields the most pronounced effect, exceeding even that of the joint model (HR: 1.405, 95% confidence interval: [1.366, 1.445]). One possible explanation of the smaller effect seen in the standard Cox analysis is that patients with delayed study entry have an HbA1c value upon study entry that is already regulated by medication, which attenuates the association. However, the fact that the complete case analysis found an effect more in line with the standard Cox analysis than with the joint model or the two-stage approach suggests a second possible explanation, namely the presence of measurement error.
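The four estimates above can be put side by side with a few lines of arithmetic. The sketch below is only an illustration: the hazard ratios are copied from the text, while the helper names are hypothetical. It relies on the proportional hazards form, in which the covariate enters log-linearly, so that the hazard ratio for an increase of several units is the per-unit hazard ratio raised to that power.

```python
import math

# Hazard ratios per unit increase in HbA1c, copied from the text above.
hazard_ratios = {
    "Cox, study-entry value": 1.088,
    "complete case": 1.074,
    "joint model": 1.292,
    "two-stage": 1.405,
}


def percent_increase(hr: float) -> float:
    """Relative increase of the hazard, in percent, per unit of the covariate."""
    return (hr - 1.0) * 100.0


def log_hazard_ratio(hr: float) -> float:
    """Regression coefficient on the log-hazard scale."""
    return math.log(hr)


def hr_for_increment(hr: float, units: float) -> float:
    """Hazard ratio for an increase of `units`, assuming a log-linear effect."""
    return hr ** units


for label, hr in hazard_ratios.items():
    print(
        f"{label:24s} HR={hr:.3f}  log-HR={log_hazard_ratio(hr):.3f}  "
        f"+{percent_increase(hr):.1f}% per unit  "
        f"HR for +2 units={hr_for_increment(hr, 2):.3f}"
    )
```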

Additionally, we considered the estimated longitudinal trajectory (see Figure 3, dashed line), with estimated fixed effects $\hat{β} = ({\hat{β}}_{0}, {\hat{β}}_{1}, {\hat{β}}_{2}, {\hat{β}}_{3}) = (7.203, - 0.175, 0.333, - 0.154)$. Although the estimated HbA1c trajectory is not as steep for later times as the LOESS plot, both curves in Figure 3 are in very good agreement for the baseline HbA1c levels, which is most important for our proposed framework.
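As a rough check of what these coefficients imply, the population-level mean trajectory of the piecewise linear model can be evaluated directly. The sketch below uses the reported fixed effects and the breakpoints at 2.5 and 4.5 years; the function names are hypothetical, and random effects and residual error are deliberately ignored.

```python
from typing import Tuple

# Estimated fixed effects and breakpoints as reported in the text.
BETA_HAT = (7.203, -0.175, 0.333, -0.154)
BREAKPOINTS = (2.5, 4.5)


def mean_hba1c(
    t: float,
    beta: Tuple[float, float, float, float] = BETA_HAT,
    knots: Tuple[float, float] = BREAKPOINTS,
) -> float:
    """Population mean HbA1c at t years after OAD initiation.

    (t - k) * 1(t > k) is written as max(t - k, 0).
    """
    b0, b1, b2, b3 = beta
    k1, k2 = knots
    return b0 + b1 * t + b2 * max(t - k1, 0.0) + b3 * max(t - k2, 0.0)


def segment_slopes(
    beta: Tuple[float, float, float, float] = BETA_HAT,
) -> Tuple[float, float, float]:
    """Slopes per year on [0, 2.5], (2.5, 4.5] and beyond 4.5 years."""
    _, b1, b2, b3 = beta
    return (b1, b1 + b2, b1 + b2 + b3)


print("slopes per segment:", [round(s, 3) for s in segment_slopes()])
for t in (0.0, 2.5, 4.5, 10.0):
    print(f"t = {t:4.1f} years: mean HbA1c = {mean_hba1c(t):.4f}")
```

Under these estimates the mean level falls during the first 2.5 years, rises until year 4.5 and is almost flat afterwards, which matches the remark that the fitted trajectory is less steep than the LOESS curve at later times.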

As sensitivity analyses (not shown), we also considered earlier breakpoints based on the literature and added another breakpoint at 10.5 years, but the estimates of the association parameter and of the intercept ${\hat{β}}_{0}$ remained comparable and only the slope parameters of the longitudinal trajectory changed.

In this analysis, we have interpreted time-to-event $T_{i}$ as patient $i$ ’s time to OAD treatment failure. However, there will be patients for whom OAD treatment does not fail.³³ Those patients alive at the end of data collection will be right-censored, while patients who have died without prior OAD failure have experienced a competing risk. In their discussion of the DIVE register, Bluhmki et al.⁷ explain that mortality information is hardly available in the DIVE register and may also have been masked as censoring. Unfortunately, this information is hardly recoverable in the German environment. We discuss accounting for mortality information in Section 5, assuming that competing risk information is provided by the database. For the present analysis, we note that we have deliberately refrained from probability statements and have focused on reporting hazard results. The reason is that censoring a competing risk will typically provide for a meaningful analysis on the event-specific hazard scale. Probability statements, however, would not only depend on the event-specific hazard of OAD failure but also depend on the competing event-specific hazard.

5. Discussion

We have proposed an innovative use of the joint model framework as a solution to a longstanding problem in observational studies with delayed entry: the estimation of baseline effects of longitudinal covariates associated with time-to-event outcomes when baseline information is unmeasured because of delayed study entry. Our approach takes advantage of established joint modeling techniques, but considering baseline effects of the longitudinal trajectory in combination with delayed study entry is beyond the standard applications of joint models, which target missing longitudinal covariate information for right-censored time-to-event data. The simulation found a satisfactory performance of our method under different specifications of left truncation and the longitudinal trajectory. In contrast, the often used Cox model accounting for left truncation but incorporating the respective measurement at study entry rather than at baseline leads to biased results when the interest is in the baseline effect. This has been discussed in the literature⁶ and was supported by the present simulation study. Although one would at first expect the complete case analysis to be unbiased provided that delayed study entry is entirely unrelated to the time-to-event process of interest, we found that the complete case analysis may also be biased in the presence of measurement error. These findings are in line with Crowther et al.,¹³ who, however, considered a non-left-truncated setting in which baseline covariates are known and measurement error was at the core of the investigation. Part of the bias of the Cox model incorporating the study entry measurements might, therefore, also be attributed to ignoring measurement error.

Furthermore, we found a two-stage approach, which first predicts the baseline values of the covariate and then includes the predictions in the Cox proportional hazards model, to be another way to deal with the problem. We found the two-stage approach to be unbiased in the simulations, one reason arguably being that a Cox model with baseline covariates only does not include the longitudinal trajectory modeled in the first stage. The theoretical disadvantage of this “naive” two-stage approach is that, unlike joint modeling, the variability of the first-stage prediction is not carried over to the second stage.³⁴ Furthermore, in scenarios with a stronger correlation between the longitudinal and the survival process, the joint modeling approach might be preferable to the two-stage approach.¹⁸

We illustrated the procedures with register-based diabetes data and found that higher HbA1c levels at baseline are associated with an increased risk of treatment failure. This result supports the findings of the original study, but both the joint model and the two-stage analyses found more pronounced effects than a Cox regression on study entry values and a complete case analysis. Possible reasons are that HbA1c values at later study entry are already medically regulated and that measurements are subject to error. However, a Cox regression would yield a biased estimate of the association even if all baseline HbA1c measurements were known, because measurement error is neglected during estimation.
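The first stage of the two-stage approach discussed above can be sketched as an empirical Bayes (BLUP) prediction of an individual's value at time zero from measurements taken after delayed entry. This is a simplified illustration, not the implementation used in the paper: it assumes a random intercept and random slope, treats the population mean function, the random-effects covariance entries `d00`, `d01`, `d11` and the residual variance `sigma2` as already estimated, and all names and numbers are hypothetical. The second stage, a Cox fit on the predicted baseline values, is not shown.

```python
from typing import Callable, Sequence


def predict_baseline(
    times: Sequence[float],
    values: Sequence[float],
    pop_mean: Callable[[float], float],
    d00: float,
    d01: float,
    d11: float,
    sigma2: float,
) -> float:
    """Predict pop_mean(0) + b_i0 for one individual.

    Uses b_hat = (Z'Z / sigma2 + D^{-1})^{-1} Z'r / sigma2, where Z has rows
    (1, t_ij) and r holds the residuals from the population mean.
    """
    resid = [y - pop_mean(t) for t, y in zip(times, values)]
    n = len(times)
    sum_t = sum(times)
    sum_tt = sum(t * t for t in times)
    zr0 = sum(resid) / sigma2
    zr1 = sum(t * r for t, r in zip(times, resid)) / sigma2

    # Inverse of the 2x2 random-effects covariance matrix D.
    det_d = d00 * d11 - d01 * d01
    i00, i01, i11 = d11 / det_d, -d01 / det_d, d00 / det_d

    # A = Z'Z / sigma2 + D^{-1}; solve A b = Z'r / sigma2 for the intercept.
    a00 = n / sigma2 + i00
    a01 = sum_t / sigma2 + i01
    a11 = sum_tt / sigma2 + i11
    det_a = a00 * a11 - a01 * a01
    b0_hat = (a11 * zr0 - a01 * zr1) / det_a
    return pop_mean(0.0) + b0_hat


# Hypothetical individual entering one year after OAD initiation.
mean_fn = lambda t: 7.0 - 0.1 * t
obs_times = [1.0, 2.0, 3.0]
obs_values = [7.5, 7.5, 7.5]
print(predict_baseline(obs_times, obs_values, mean_fn, 0.5, 0.0, 0.05, 0.25))
```

The prediction shrinks the individual's extrapolated intercept toward the population mean, more strongly when the random-effects variance is small relative to the residual variance. As noted above, plugging this prediction into a Cox model ignores its uncertainty.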

We used Bayesian joint models because joint models accounting for left truncation and lagged effects were readily implemented in the R package JMbayes²³ and had no problem dealing with the large DIVE dataset. One can also use the frequentist approach implemented in stjm¹⁹ in Stata. Simulations (not shown) showed similar performance of the frequentist and the Bayesian approaches.

The proposed methodology is not restricted to register-based diabetes trials but has potential in other fields of epidemiological research. Typical examples include aging studies,³⁵ fracture studies in health services research,³⁶ or infection control trials in healthcare epidemiology.³⁷ However, in studies in which all individuals are left-truncated, for example, in pregnancy outcome studies,³⁸ the proposed method has restrictions. Having a subset of patients with measurements at the time origin will be most useful for estimation of the longitudinal trajectory. Furthermore, under heavier left truncation and sparse information about the longitudinal covariate, the joint model approach was found to have increased relative bias, although it still performed better than the standard Cox approach (simulation scenario (S5)).

In the analysis of the DIVE data, we have already discussed the need to account for competing risks. If mortality information had been available, one option would have been a composite endpoint of OAD failure or death, whichever came first. Alternatively, a joint model for competing risks may be investigated.^12,15,31,39 For the DIVE data, a finer competing risks investigation would consider the endpoints “BOT initiation” and “OAD refusal prior to BOT”.^1,7 Finally, the question of a “baseline” effect is not necessarily restricted to the original time zero, but may also be investigated at a later time point $s > 0$ with respect to events after $s$. One option here would be a landmark analysis, taking time $s$ as the new time origin.
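The data preparation for such a landmark analysis can be sketched as follows. This is a hypothetical illustration: records are assumed to be triples (entry, exit, event) on the original timescale, the function name `landmark_dataset` is made up, and individuals who enter the study after $s$ are assumed to remain left-truncated on the shifted timescale.

```python
from typing import List, Tuple

# (entry_time, exit_time, event_indicator) in years since OAD initiation.
Record = Tuple[float, float, int]


def landmark_dataset(records: List[Record], s: float) -> List[Record]:
    """Shift the time origin to the landmark time s.

    Individuals whose event or censoring occurred at or before s are dropped.
    The remaining individuals are followed from max(entry - s, 0) to exit - s,
    so those entering after s are still left-truncated on the new timescale.
    """
    shifted = []
    for entry, exit_, event in records:
        if exit_ <= s:
            continue  # no longer at risk after the landmark
        shifted.append((max(entry - s, 0.0), exit_ - s, event))
    return shifted


records = [
    (0.0, 1.5, 1),  # failure before the landmark: excluded
    (0.5, 4.0, 0),  # under observation at s = 2: followed from new time zero
    (3.0, 6.0, 1),  # enters after s = 2: left-truncated at 1 on the new scale
]
print(landmark_dataset(records, s=2.0))
```

The covariate of interest would then be the (possibly predicted) value of the longitudinal trajectory at time $s$ rather than at the original time zero.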

Supplemental Material

sj-zip-1-smm-10.1177_09622802231163334 - Supplemental material for Modeling unmeasured baseline information in observational time-to-event data subject to delayed study entry

Supplemental material, sj-zip-1-smm-10.1177_09622802231163334 for Modeling unmeasured baseline information in observational time-to-event data subject to delayed study entry by Regina Stegherr, Jan Beyersmann, Peter Bramlage and Tobias Bluhmki in Statistical Methods in Medical Research

Footnotes

Acknowledgements

The authors thank an anonymous reviewer of this article, whose valuable suggestions greatly contributed to the improvement of the manuscript.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: RS and TB were partially supported by Grant [BE 4500/3-1] of the German Research Foundation (DFG).

ORCID iD

Regina Stegherr

Supplemental material

Supplemental material is available for this article online.

References

1. Danne, Bluhmki, Seufert, et al. Treatment intensification using long-acting insulin–predictors of future basal insulin supported oral therapy in the DIVE registry. BMC Endocr Disord 2015; 15: 54.

2. Danne, Kaltheuner, Koch, et al. ‘DIabetes Versorgungs-Evaluation’ (DIVE)–a national quality assurance initiative at physicians providing care for patients with diabetes. Dtsch Med Wochenschr 2013; 138: 934–939.

3. Wang, Brookmeyer, Jewell. Statistical models for prevalent cohort data. Biometrics 1993; 49: 1–11.

4. Keiding, Moeschberger. Independent delayed entry. In: Klein JP and Goel PK (eds) Survival analysis: State of the art. Dordrecht: Springer, 1992, pp. 309–326.

5. Aalen, Borgan, Gjessing. Survival and event history analysis: A process point of view. New York: Springer Science & Business Media, 2008.

6. Keiding, Knuiman. Letter to the editor: Survival analysis in natural history studies of disease. Stat Med 1990; 9: 1221–1222.

7. Bluhmki, Bramlage, Volk, et al. Time-to-event methodology improved statistical evaluation in register-based health services research. J Clin Epidemiol 2017; 82: 103–111.

8. Cnaan, Ryan. Survival analysis in natural history studies of disease. Stat Med 1989; 8: 1255–1268.

9. Lind, Pivodic, Cea-Soriano, et al. Changes in HbA1c and frequency of measuring HbA1c and adjusting glucose-lowering medications in the 10 years following diagnosis of type 2 diabetes: A population-based study in the UK. Diabetologia 2014; 57: 1586–1594.

10. Zinman, Wanner, Lachin, et al. Empagliflozin, cardiovascular outcomes, and mortality in type 2 diabetes. N Engl J Med 2015; 373: 2117–2128.

11. Ibrahim, Chu, Chen. Basic concepts and methods for joint models of longitudinal and survival data. J Clin Oncol 2010; 28: 2796.

12. Rizopoulos. Joint models for longitudinal and time-to-event data: With applications in R. Boca Raton: Chapman and Hall/CRC, 2012.

13. Crowther, Lambert, Abrams. Adjusting for measurement error in baseline prognostic biomarkers included in a time-to-event analysis: A joint modelling approach. BMC Med Res Methodol 2013; 13: 146.

14. Waldmann, Taylor-Robinson, Klein, et al. Boosting joint models for longitudinal and time-to-event data. Biom J 2017; 59: 1104–1121.

15. Elashoff. Joint modeling of longitudinal and time-to-event data. New York: CRC Press, 2017.

16. Rizopoulos, Ghosh. A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a time-to-event. Stat Med 2011; 30: 1366–1380.

17. Yuen, Mackinnon. Performance of joint modelling of time-to-event data with time-dependent predictors: An assessment based on transition to psychosis data. PeerJ 2016; 4: e2582.

18. Furgal AKC, Sen, Taylor JMG. Review and comparison of computational approaches for joint longitudinal and time-to-event models. Int Stat Rev 2019; 87: 393–418.

19. Crowther, Andersson TML, Lambert, et al. Joint modelling of longitudinal and survival data: Incorporating delayed entry and an assessment of model misspecification. Stat Med 2016; 35: 1193–1209.

20. Piccorelli, Schluchter. Jointly modeling the relationship between longitudinal and survival data subject to left truncation with applications to cystic fibrosis. Stat Med 2012; 31: 3931–3945.

21. Wang. Modeling left-truncated and right-censored survival data with longitudinal covariates. Ann Stat 2012; 40: 1465.

22. Armero, Forte, Perpiñán, et al. Bayesian joint modeling for assessing the progression of chronic kidney disease in children. Stat Meth Med Res 2018; 27: 298–311.

23. Rizopoulos. The R package JMbayes for fitting joint models for longitudinal and time-to-event data using MCMC. J Stat Softw 2016; 72: 1–45.

24. Van den Hout, Muniz-Terrera. Joint models for discrete longitudinal outcomes in ageing research. J R Stat Soc C 2016; 65: 167–186.

25. Piulachs, Andrinopoulou, Guillèn, et al. A Bayesian joint model for zero-inflated integers and left-truncated event times with a time-varying association: Applications to senior health care. Stat Med 2021; 40: 147–166.

26. Piulachs, Alemany Leira, Guillén, et al. Joint models for longitudinal counts and left-truncated time-to-event data with applications to health insurance. Sort 2017; 41: 347–372.

27. Armero, Forné, Rué, et al. Bayesian joint ordinal and survival modeling for breast cancer risk assessment. Stat Med 2016; 35: 5267–5282.

28. Sperrin, Buchan. Modelling time to event with observations made at arbitrary times. Stat Med 2013; 32: 99–109.

29. Lee, Betensky, Initiative ADN. Time-to-event data with time-varying biomarkers measured only at study entry, with applications to Alzheimer’s disease. Stat Med 2018; 37: 914–932.

30. Eilers PHC, Marx. Flexible smoothing with B-splines and penalties. Stat Sci 1996; 11: 89–121.

31. Andrinopoulou, Rizopoulos, Takkenberg, et al. Combined dynamic predictions using joint models of two longitudinal outcomes and competing risk data. Stat Meth Med Res 2017; 26: 1787–1801.

32. Bender, Augustin, Blettner. Generating survival times to simulate Cox proportional hazards models. Stat Med 2005; 24: 1713–1723.

33. van Mark, Tittel, Sziegoleit, et al. Type 2 diabetes in older patients: An analysis of the DPV and DIVE databases. Ther Adv Endocrinol Metab 2020; 11: 2042018820958296.

34. Sweeting, Thompson. Joint modelling of longitudinal and time-to-event data with application to predicting abdominal aortic aneurysm growth and rupture. Biom J 2011; 53: 750–763.

35. Holman, Thorne, Farmer, et al. Addition of biphasic, prandial, or basal insulin to oral therapy in type 2 diabetes. N Engl J Med 2007; 357: 1716–1730.

36. Bluhmki, Peter, Rapp, et al. Understanding mortality of femoral fractures following low-impact trauma in persons with and without care need. J Am Med Dir Assoc 2017; 18: 221–226.

37. Munoz-Price, Frencken, Tarima, et al. Handling time-dependent variables: Antibiotics and antibiotic resistance. Clin Infect Dis 2016; 62: 1558–1563.

38. Friedrich, Beyersmann, Winterfeld, et al. Nonparametric estimation of pregnancy outcome probabilities. Ann Appl Stat 2017; 11: 840–867.

39. Williamson, Kolamunnage-Dona, Philipson, et al. Joint modelling of longitudinal and competing risks data. Stat Med 2008; 27: 6426–6438.
