Abstract

The identification of biomarkers for disease onset in longitudinal studies necessitates precise estimation of the association between longitudinal markers and survival outcomes. Currently, methods for estimating these associations in the context of left-truncated and clustered survival outcomes are lacking. In this study, we propose a novel model tailored to this scenario and develop several estimation methods: last observation carried forward, regression calibration, and a two-stage likelihood approach for joint modeling of longitudinal and survival processes. Simulation results indicate that the last observation carried forward method performs well only with a dense grid and no marker measurement error. For less dense grids and low measurement error, regression calibration approaches are preferred. Joint modeling approaches outperform calibration methods in the presence of measurement error, although they may suffer from numerical instability. In cases of numerical instability, calibration methods might be a good alternative. We applied these methodologies to the TwinsUK data to estimate the effect of bone mineral density (BMD) as a longitudinal marker on fracture incidence in 766 elderly females, 138 of whom experienced a fracture. The survival model included a shared gamma-distributed frailty to account for correlation between the times to fracture of twin pairs. Estimates obtained using calibration and joint modeling approaches indicated a larger BMD effect compared to the last observation carried forward method, likely due to the irregular BMD measurement process and minimal measurement error. Overall, our methods offer valuable tools for modeling the effect of a longitudinal marker on survival outcomes in complex designs.

Keywords

Joint model, delayed entry, regression calibration, shared frailty model, longitudinal data, twins

1. Introduction

Modeling the relationship between a longitudinal marker and a time-to-event outcome is a popular research area. To date, most of this research has focused on modeling data from independent subjects (singletons), using time-in-study (follow-up time) as the primary time scale. For many diseases, however, age is the natural time scale, which results in left truncation of the time-to-event outcome. Additionally, longitudinal data collection often follows complex designs. Twin studies, as one of the most prominent sources of high-quality longitudinal data,^1,2 present unique challenges due to their clustered nature. This paper focuses on estimating the association between a longitudinal marker and a left-truncated survival outcome in twin studies. Currently, no method exists to handle this type of data.

A common approach to modeling the dependence of clustered event times is to introduce a cluster-specific random effect, which yields the shared frailty model.^3–6 The survival times are assumed to be conditionally independent given the shared (common) frailty. The gamma distribution is commonly used for the frailties because of its mathematical convenience: it typically produces a tractable marginal likelihood function for the parameters after integration. The literature on joint models for a longitudinal marker and a survival outcome in paired data is limited, and currently used models lack flexibility. Specifically, in the models of Ratcliffe et al.⁷ and Brilleman et al.,⁸ the correlation between the survival times of cluster members is modeled solely by the normally distributed shared effects of the survival and longitudinal outcomes. This model can be fitted using available statistical software (e.g., the VAJointSurv⁹ and INLAjoint¹⁰ packages in R, and gsem¹¹ and merlin¹² in Stata). In this paper, we propose a more general model that allows for additional correlation by including a gamma-distributed frailty term in the survival model. The R package frailtypack¹³ allows for additional correlation in the survival model. However, it can fit either a joint longitudinal and clustered survival model without delayed entry or a model for left-truncated clustered survival data with time-invariant markers.

Estimation of the parameters of joint models by maximum likelihood is time-consuming because the normally distributed shared random effects in the survival model must be integrated out numerically.¹⁴ Using a novel two-stage joint likelihood approach, we propose to jointly estimate the shared random effect of the longitudinal marker and the frailty term of the survival process, after plugging in the best linear unbiased predictions (BLUPs) of the individual-specific random effects of the longitudinal marker.

Furthermore, for estimation of the relationship between the longitudinal marker and the survival outcome, we compare the performance of this new joint modeling (JM) approach with two classical and simpler approaches adapted to clustered and left-truncated data, namely the last observation carried forward (LOCF) approach and regression calibration techniques. In the context of independent data, and under the strong assumption that the longitudinal process only jumps at observed time points and remains constant between two consecutive observation points, the LOCF approach has been proposed for modeling the relationship between the marker and the time-to-event outcome in singletons. This method can be easily extended to clustered data by using shared frailty models, which can be fitted using available statistical packages (e.g., the coxph function in the survival package¹⁵ and the frailtypack¹³ package in R). However, the adjustment for left truncation is notably more elaborate for clustered data than for the singleton case.

Methods have been developed for shared-frailty models, but are restricted to time-fixed covariates. Hence, there is no software available to estimate the effect of a longitudinal marker on a time-to-event outcome by the LOCF approach on clustered data with delayed entry. Here, we extend the method for time-fixed covariates to the context of longitudinal markers. Often, the assumption of constant values within observation intervals cannot be made. For such cases, regression calibration methods have been proposed for singletons.^16–22 These are two-step approaches consisting of first fitting a mixed model to obtain estimates of the longitudinal marker. These estimated values can then be used in a second step as if they had been observed in the LOCF-based approach. Here, we extend regression calibration methods to the context of frailty models with delayed entry, which requires a more complex first step involving a linear mixed model with extra random effects to model the correlation between the longitudinal outcomes of cluster members. Using an intensive simulation study, we provide specific recommendations for estimation methods, taking into account the underlying model, the time gaps between longitudinal observations, and the level of measurement error.

As a data example, we consider bone mineral density (BMD), repeatedly measured over time, and the risk of fracture for twins from the TwinsUK registry. In a previous paper, we estimated the probability of a fracture in the next time period given current age and BMD using shared frailty models.²³ For such a question, follow-up time is the natural underlying time variable. Here, we are interested in modeling the effect of BMD on fracture incidence. As with many other aging-related diseases, for such a model, age is the natural time scale, which, in combination with twins entering the registry at different ages (delayed entry), results in left truncation of the time-to-event outcome.

The contributions of this paper are threefold. Firstly, it presents a novel statistical model for the analysis of longitudinal and survival data in twins. Secondly, it introduces a new set of estimation methods, each offering increased flexibility and computational complexity, all adapted to deal with left-truncated survival data. Thirdly, it provides users with guidance based on simulation results, advising on the choice of the estimation method according to factors such as observation grid density and measurement error.

The rest of this paper is laid out as follows. In the second section, we formalize the problem and propose novel approaches. In the third section, we study the performance of the methods via simulations. In the fourth section, we present the results of the analysis of the TwinsUK dataset to estimate the effect of BMD on age-specific fracture incidence using our novel approaches. Lastly, the fifth section provides the discussion and conclusions.

2. Methods

2.1. Notation and model formulation

We are interested in modeling the relationship between a longitudinal marker and survival times in twin pairs. Let $N$ be the number of twin pairs. For twin $j$ of pair $i$ , let $T_{i j}$ be the random survival time of interest and $M_{i j} (t)$ be the longitudinal marker. Let $x_{i j}$ be a column vector of individual-specific time-invariant covariates for the survival time $T_{i j}$ and let $γ = (γ_{1}, γ_{2}, \dots, γ_{p})$ be a parameter vector for these time-invariant covariates. Let the frailty $v_{i}$ represent the unmeasured shared effects for twin pair $i$ that have an effect on their survival times $T_{i 1}$ and $T_{i 2}$ . We propose a gamma distribution with mean one and variance $θ$ for its distribution. Furthermore, not all $T_{i j}$ will be observed. So, define a second random variable $C_{i j}$ independent of $T_{i j}$ , which represents the censoring process.

Depending on the available amount of information on $M (t)$ , we will model it either nonparametrically or by using the following random effects model:

$$M_{ij}(t) = G_{ij}(t) + u_{i} \tag{1}$$

with $u_{i}$ independent, normally distributed random variables with zero mean and variance $σ_{u}^{2}$, which represent the deviation of the twin profile from the population profile $G_{ij}(t)$. When there are many observations, $G_{ij}(t)$ might be modeled with a flexible fixed and random effects structure. In this paper, the following linear mixed model for $G_{ij}(t)$ is considered:

$$G_{ij}(t) = β_{0} + β_{1} t + b_{ij0} + b_{ij1} t \tag{2}$$

where the vector $(b_{ij0}, b_{ij1})$ follows a normal distribution with zero mean and covariance structure $Σ_{b}$. Here, $b_{ij0}$ represents the deviation of twin $j$ of pair $i$ from the population mean $β_{0}$, and $b_{ij1}$ represents the deviation of the slope of time for twin $j$ of pair $i$ from the population slope $β_{1}$.

Typically, $M_{ij}(t)$ is observed only at specific time points with a (small) error. For twin $j$ of pair $i$, let the index $k$ run over the subject-specific grid of time points, $k \in \{0, \dots, K_{ij} - 1\}$, where $K_{ij}$ is the number of time points for this subject. Now, define the random variable $Y_{ijk}$ as a random perturbation of $M_{ij}(t_{k})$ at time point $t_{k}$. Thus, the following relationship between $Y_{ijk}$ and $M_{ij}(t_{k})$ is proposed:

$$Y_{ijk} = M_{ij}(t_{k}) + ε_{ijk}, \quad k = 0, \dots, K_{ij} - 1, \; j = 1, 2, \; i = 1, \dots, N \tag{3}$$

where the $ε_{ijk}$ are mutually independent, normally distributed random variables with zero mean and variance $σ_{e}^{2}$. Note that for $σ_{e}^{2} \to 0$, $Y_{ijk} \approx M_{ij}(t_{k})$.
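To make the data-generating structure of models (1) to (3) concrete, the following minimal sketch draws marker values for one twin pair using only the Python standard library. The function name, the argument names, and the parameterization of $Σ_{b}$ through two standard deviations and a correlation `rho_b` are illustrative assumptions and do not come from the paper.

```python
import math
import random


def simulate_pair_markers(times, beta0, beta1, sd_b0, sd_b1, rho_b, sd_u, sd_e, rng):
    """Draw noisy marker values Y_ijk for both twins of one pair.

    times: dict mapping twin index (1 or 2) to that twin's measurement ages.
    Returns (y, m): observed values Y_ijk and error-free values M_ij(t_k).
    """
    u = rng.gauss(0.0, sd_u)  # pair effect u_i, shared by both twins
    y, m = {}, {}
    for j, grid in times.items():
        # Correlated (b_ij0, b_ij1) from a hand-written 2x2 Cholesky factor
        # (assumed parameterization of Sigma_b).
        z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        b0 = sd_b0 * z0
        b1 = sd_b1 * (rho_b * z0 + math.sqrt(1.0 - rho_b ** 2) * z1)
        m[j] = [beta0 + beta1 * t + b0 + b1 * t + u for t in grid]  # models (1)-(2)
        y[j] = [value + rng.gauss(0.0, sd_e) for value in m[j]]     # model (3)
    return y, m


rng = random.Random(1)
y, m = simulate_pair_markers({1: [50.0, 52.0, 54.0], 2: [50.0, 53.0]},
                             beta0=1.0, beta1=0.01, sd_b0=0.1, sd_b1=0.01,
                             rho_b=0.2, sd_u=0.1, sd_e=0.05, rng=rng)
```

Setting `sd_e` to zero reproduces the limiting case in which $Y_{ijk}$ equals $M_{ij}(t_{k})$.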

Now, to model the relationship between $M (t)$ and $T$ , we need to consider two situations. Firstly, $M (t)$ is considered exogenous if its value at any given time is not influenced by past events. We can operationalize the concept of exogeneity in our context by assuming that $M (t)$ has a direct effect on $T$ and that there are no unobserved confounders.²⁴ For example, if $M (t)$ represents air pollution levels over time and $T$ denotes the onset of disease, then $M (t)$ is an exogenous variable. Secondly, $M (t)$ is considered endogenous when its current value is affected by past events. This dependence naturally arises through unobserved confounding. Consequently, we define a longitudinal marker $M (t)$ as endogenous if it affects $T$ while also being influenced by unobserved variables that affect both $M (t)$ and $T$ . Many biomarkers follow this model. For example, unobserved genetic factors influence the biomarker and the survival times of the two twins.

Firstly, we consider the case that $M(t)$ is an exogenous variable. The model is depicted in Figure 1(a). Since there are no unobserved confounders for the relationship between $M(t)$ and $T$, the following model can be proposed for the conditional hazard function $h_{ij}(t | v_{i}, x_{ij}, M_{ij}(t))$ given the frailty $v_{i}$, the time-invariant covariates $x_{ij}$, and the covariate trajectory $M_{ij}(t)$:

$$h_{ij}(t | v_{i}, x_{ij}, M_{ij}(t)) = h_{0}(t) v_{i} \exp\{γ x_{ij} + α M_{ij}(t)\} \tag{4}$$

where the parameter $α$ represents the effect of the variable $M_{ij}(t)$ on $T_{ij}$. The function $h_{0}(t)$ is the baseline hazard, that is, the hazard when $M_{ij}(t) = 0$ and $x_{ij} = 0$. In this paper, we consider parametric hazard functions with parameters $ξ$. For time points $t_{k}$ and small $σ_{e}^{2}$, model (4) reduces to

$$h_{ij}(t_{k} | v_{i}, x_{ij}, Y_{ijk}) = h_{0}(t_{k}) v_{i} \exp\{γ x_{ij} + α Y_{ijk}\}$$

Secondly, for $M(t)$ an endogenous variable, we assume that the random twin effect $u_{i}$, representing the shared effects for the two twins on the marker patterns $M_{ij}(t)$, is also shared with the random time variables $T_{ij}$. Examples of twin pair shared effects are genetic and environmental factors. In this model, the frailty $v_{i}$ models the additional covariance between $T_{i1}$ and $T_{i2}$, and $b_{ij} = (b_{ij0}, b_{ij1})$ models the correlation between $M_{ij}(t)$ and $M_{ij}(t')$ for person $j$ and $t \neq t'$. Figure 1(b) shows this model. The formula for this model is

$$\begin{aligned} M_{ij}(t) & = G_{ij}(t) + u_{i} = β_{0} + β_{1} t + b_{ij0} + b_{ij1} t + u_{i} \\ h_{ij}(t \mid u_{i}, v_{i}, x_{ij}) & = h_{0}(t) v_{i} \exp\{γ x_{ij} + α (G_{ij}(t) + u_{i})\} \end{aligned} \tag{5}$$

Note that by including the frailty $v_{i}$, this model is more general than the models of Ratcliffe et al.⁷ and Brilleman et al.⁸

Figure 1.

Model for the relationship between the marker profile $M (t)$ and the survival time $T$ for twins. (a) $M (t)$ exogenous and (b) $M (t)$ endogenous. When the variance of $ε_{i j k}$ is small, $M_{i j} (t_{k})$ can be replaced with $Y_{i j k}$ .

2.2. Estimation of parameters

Let $t_{ij}$ be the observed survival time on the age scale and $t_{ij0}$ be the age at enrollment in the study for person $j$ of twin pair $i$, $j = 1, 2$. An individual is only included in the study if $t_{ij} > t_{ij0}$. Note that if the enrollment times $t_{ij0}$ differ across subjects, we have delayed entry, and this needs to be accounted for when estimating the parameters. Furthermore, the observed time $t_{ij}$ of person $j$ of twin pair $i$ can be either a censoring time or a time at which the event of interest has occurred. Let $δ_{ij}$ be the event indicator, that is, $δ_{ij} = 1$ if $t_{ij}$ is an event time ($T_{ij} \leq C_{ij}$) and $δ_{ij} = 0$ otherwise ($T_{ij} > C_{ij}$). Furthermore, let $y_{ijk}$ be the observed value of the random variable $Y_{ijk}$, which is a perturbation of $M_{ij}(t_{k})$, with $k = 0, \dots, K_{ij} - 1$ indexing the grid of observed time points for person $j$ of twin pair $i$. We assume that $t_{K_{ij} - 1} \leq t_{ij}$. In the next paragraphs, we describe the estimation procedures for $M_{ij}(t)$ exogenous with a small error variance $σ_{e}^{2}$, and for $M_{ij}(t)$ endogenous.

2.2.1. Estimation of $α$ by LOCF

When it can be assumed that the marker does not vary between consecutive time points $t_{k}$ and $t_{k + 1}$ and the measurement error variance $σ_{e}^{2}$ is relatively small, we have $Y_{ijk} \approx M_{ij}(t_{k}) \approx M_{ij}(t)$ for $t_{k} \leq t < t_{k + 1}$ and can use $Y_{ijk}$ instead of $M_{ij}(t)$ in the survival model. The variation in a biomarker between consecutive time points depends on both the biomarker’s trajectory and the density of the observation grid, where denser grids make this assumption more plausible. A small measurement error is realistic for precise biomarkers such as body mass index or fasting glucose levels (which are more precise than non-fasting glucose). These biomarkers are measured using standardized protocols and instruments, ensuring high accuracy in data collection and minimizing residual error.

Assume that the longitudinal marker $M_{ij}(t)$ takes the value $y_{ijk}$ within the time interval $[t_{ijk}, t_{ij(k + 1)})$ for $k = 0, \dots, K_{ij} - 1$, with $t_{ijk}$ the $k$th time point at which the marker is observed for person $j$ of twin pair $i$, and define the end of the last interval as the observed survival time, $t_{ij K_{ij}} = t_{ij}$. Note that we typically do not have the marker value at $t_{ij}$. Then the model for the conditional hazard $h_{ij}(t | v_{i}, x_{ij}, y_{ij})$ for person $j$ of twin pair $i$ is given by²⁰

$$h_{ij}(t | v_{i}, x_{ij}, y_{ij}) = h_{0}(t) v_{i} \exp(γ x_{ij} + α y_{ijk}) \quad \text{for } t \in [t_{ijk}, t_{ij(k + 1)})$$

which is a standard Cox regression model with a time-dependent covariate.
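As an illustration of the LOCF data layout, the sketch below splits one subject's follow-up into intervals $[t_{ijk}, t_{ij(k+1)})$ that carry the last observed marker value, with the event indicator attached to the final interval. The function name and the tuple layout are assumptions made for this example; the source does not prescribe a data format.

```python
def locf_intervals(obs_times, obs_values, exit_time, event):
    """Split one subject's follow-up into LOCF rows.

    obs_times: sorted measurement ages, the first being the entry age,
               none later than exit_time.
    obs_values: marker values y_ijk observed at obs_times.
    Returns a list of (start, stop, marker, event_in_interval) tuples.
    """
    rows = []
    for k, (start, value) in enumerate(zip(obs_times, obs_values)):
        is_last = k == len(obs_times) - 1
        # The last interval ends at the observed survival time t_ij.
        stop = exit_time if is_last else obs_times[k + 1]
        # Only the last interval can end with the event.
        rows.append((start, stop, value, int(event and is_last)))
    return rows


rows = locf_intervals([60.0, 62.0, 65.0], [0.90, 0.85, 0.80], 67.5, True)
# rows == [(60.0, 62.0, 0.9, 0), (62.0, 65.0, 0.85, 0), (65.0, 67.5, 0.8, 1)]
```

Rows of this form correspond to the counting-process layout commonly used for time-dependent covariates in survival software.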

For this model, the contribution of the $i$th twin pair to the conditional likelihood function $L_{i}(α, γ, ξ | v_{i}, x_{ij})$ is therefore²⁵

$$\begin{aligned} L_{i}(α, γ, ξ | v_{i}, x_{ij}) = \prod_{j = 1}^{2} \prod_{k = 0}^{K_{ij} - 1} & [h_{0}(t_{ij(k + 1)}) v_{i} \exp(γ x_{ij} + α y_{ijk})]^{δ_{ijk}} \\ & \times \exp\{- [H_{0}(t_{ij(k + 1)}) - H_{0}(t_{ijk})] v_{i} e^{γ x_{ij} + α y_{ijk}}\} \end{aligned} \tag{6}$$

with $H_{0}(t)$ the cumulative baseline hazard function and $δ_{ijk}$ the indicator that person $j$ of twin pair $i$ experiences the event at the end of interval $k$.

To obtain the marginal likelihood, we have to integrate over the distribution of the frailty $v_{i}$.⁴ It is well known that, due to the delayed entry, the unobserved frailties do not represent a random sample. Large frailties are underrepresented because these correspond to subjects who are more likely to experience the event early. To obtain the correct distribution, we need to condition the frailty distribution on cluster $i$ being observed, that is, we need to use the updated gamma frailty distribution. Let $g_{θ}(v)$ be the density of the frailties $v$ in the population, with distribution function $G_{θ}$. Now, using the updated frailty distribution, the log of the marginal likelihood can be formulated as follows:

$$l(α, γ, ξ, θ) = \sum_{i = 1}^{N} \log \int_{v_{i}} L_{i}(α, γ, ξ | v_{i}, x_{ij}) \, d G_{θ}(v_{i} | t_{i1} > t_{i10}, \dots, t_{i n_{i}} > t_{i n_{i} 0}) \tag{7}$$

where $n_{i}$ is the number of members of cluster $i$ ($n_{i} = 2$ for a twin pair). Note that $g_{θ}(v_{i} | t_{i1} > t_{i10}, \dots, t_{i n_{i}} > t_{i n_{i} 0})$ equals the density of the gamma distribution $Γ(\frac{1}{θ}, \frac{1}{θ} + \sum_{j = 1}^{n_{i}} H_{0}(t_{ij0}) e^{γ x_{ij} + α y_{ij0}})$.²⁶ Further note that we assume that we observe only complete twin pairs. Maximum likelihood estimates of the parameters $α$, $ξ$, and $θ$ can be obtained by maximizing this log-likelihood function.
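Because the conditional likelihood (6) has the form $v_{i}^{d_{i}} C_{i} \exp(-v_{i} A_{i})$ and the updated frailty distribution is gamma, the integral in (7) can be written in closed form using the gamma integral. The sketch below evaluates the resulting log contribution of one pair. The function and argument names are illustrative assumptions, and the inputs (the summed log hazards of the events, the number of events $d_{i}$, and the accumulated hazards during follow-up and before entry) are assumed to have been computed beforehand.

```python
import math


def pair_marginal_loglik(log_hazard_sum, n_events, cum_hazard_followup,
                         cum_hazard_entry, theta):
    """Log of the integral in (7) for one pair under a gamma frailty.

    log_hazard_sum: sum over observed events of log[h0(t) exp(gamma x + alpha y)].
    n_events: number of events in the pair (0, 1, or 2).
    cum_hazard_followup: sum over intervals of
        [H0(stop) - H0(start)] * exp(gamma x + alpha y).
    cum_hazard_entry: sum over twins of H0(t_ij0) * exp(gamma x + alpha y_ij0).
    theta: frailty variance (frailty mean is one).
    """
    shape = 1.0 / theta
    rate = 1.0 / theta + cum_hazard_entry  # updated gamma frailty after truncation
    # Integral of v^d exp(-v A) against a Gamma(shape, rate) density:
    # rate^shape * Gamma(shape + d) / (Gamma(shape) * (rate + A)^(shape + d)).
    return (log_hazard_sum
            + shape * math.log(rate)
            + math.lgamma(shape + n_events) - math.lgamma(shape)
            - (shape + n_events) * math.log(rate + cum_hazard_followup))
```

Summing this quantity over pairs gives a log-likelihood that a general-purpose optimizer could maximize over the model parameters; the choice of optimizer is left open here.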

This approach is called LOCF since at event time $t$ for each individual still at risk the marker value $y_{i j k}$ with $k$ such that $t_{i j k} < t$ and $t_{i j (k + 1)} \geq t$ is used. When the assumption of constant marker values between the observed time points cannot be made, the estimate of $α$ might be biased, and a regression calibration method should be used.

2.2.2. Estimation of $α$ by regression calibration methods

When it cannot be assumed that the marker is constant between consecutive observed time points, we propose a two-step procedure in which first a model for $M(t)$ is fitted and then estimates of $M(t)$ on a dense grid are plugged into the survival model. Such a method is available for singletons but not yet for clustered data. To use this approach for twin data, we propose a linear mixed model for $M(t)$ instead of simple linear regression. Using the observations $y_{ijk}$ of the covariate, we can first fit model (1) with $G_{ij}$ given in (2) to the data to obtain the estimates ${\hat{β}}_{0}$, ${\hat{β}}_{1}$, ${\hat{Σ}}_{b}$, ${\hat{σ}}_{u}^{2}$, and ${\hat{σ}}_{e}^{2}$. Now, consider a dense grid $t_{0}, \dots, t_{L - 1}$ over the observed age range of the $N$ twin pairs. Then, for subject $j$ of twin pair $i$, define $L_{ij} = \{l : t_{ij0} \leq t_{l} < t_{ij}\}$, that is, all grid time points at which person $j$ of twin pair $i$ is under observation. Thus, for all $l \in L_{ij}$, we can obtain the following estimates:

$${\hat{M}}_{ij}(t_{l}) = {\hat{β}}_{0} + {\hat{β}}_{1} t_{l} + {\hat{b}}_{ij0} + {\hat{b}}_{ij1} t_{l} + {\hat{u}}_{i} \tag{8}$$

where ${\hat{b}}_{ij0}$, ${\hat{b}}_{ij1}$, and ${\hat{u}}_{i}$ are the BLUPs. Now, likelihood function (6) can be used to estimate $α$, with index $l$ instead of $k$ and $l$ running over $l \in L_{ij}$:

$$\begin{aligned} L_{i}(α, γ, ξ | v_{i}, x_{ij}) = \prod_{j = 1}^{2} \prod_{l \in L_{ij}} & [h_{0}(t_{l + 1}) v_{i} \exp(γ x_{ij} + α {\hat{M}}_{ij}(t_{l}))]^{δ_{ijl}} \\ & \times \exp\{- [H_{0}(t_{l + 1}) - H_{0}(t_{l})] v_{i} e^{γ x_{ij} + α {\hat{M}}_{ij}(t_{l})}\} \end{aligned} \tag{9}$$

To obtain the necessary estimates for equation (8), a model can be fitted using all data under the assumption that the marker profiles are not influenced by the dropout due to the occurrence of the event. This method is called ordinary regression calibration (ORC).^16–22 If this assumption cannot be made, a way to relax it is to fit a model for each time point $t_{l}$ using only the subjects who have entered the study by time point $t_{l}$, have not yet experienced the event, and are not yet censored (risk set regression calibration [RRC]). A drawback of this last method is that toward the end of the study, the number of subjects at risk for the event might be small, and the error in estimation of ${\hat{M}}_{ij} (t_{l})$ might be large. Thus, a small number of individuals may violate the assumption of small $σ_{e}^{2}$.
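The difference between ORC and RRC lies in which subjects contribute to the calibration model at each grid time. The following sketch computes, for each grid time $t_{l}$, the set of subjects satisfying $t_{ij0} \leq t_{l} < t_{ij}$, which is the selection RRC would use. The function names and the flat subject indexing are assumptions made for illustration; fitting the mixed model itself is not shown.

```python
def risk_set(grid_time, entry, exit_):
    """Indices of subjects under observation at grid_time: entry <= grid_time < exit."""
    return [s for s, (t0, t1) in enumerate(zip(entry, exit_)) if t0 <= grid_time < t1]


def risk_sets_on_grid(grid, entry, exit_):
    """Map each grid time t_l to the subjects whose data enter the RRC fit at t_l."""
    return {t: risk_set(t, entry, exit_) for t in grid}


entry = [50.0, 55.0, 60.0]   # entry ages t_ij0
exit_ = [58.0, 70.0, 62.0]   # observed survival times t_ij
sets = risk_sets_on_grid([52.0, 56.0, 61.0, 65.0], entry, exit_)
# sets == {52.0: [0], 56.0: [0, 1], 61.0: [1, 2], 65.0: [1]}
```

In this toy example, only one subject remains at the last grid time, which illustrates why late risk sets can lead to noisy calibration estimates.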

2.2.3. Estimation of $α$ by a joint likelihood approach

For $M(t)$ endogenous, that is, when $u_{i}$ represents unmeasured confounding linking $M(t)$ and $T$, the estimator $\hat{α}$ from the LOCF, ORC, or RRC approach might be biased. Maximizing the full likelihood function might be preferred. Unfortunately, this requires integration over the joint distribution of $u_{i}$, $b_{i10}$, $b_{i11}$, $b_{i20}$, and $b_{i21}$, and hence a computationally intensive numerical approximation of five-dimensional integrals in the survival model. However, if we assume that the model depicted in Figure 1(b) holds, that is, $Y_{i1}$ and $Y_{i2}$ contain all information on $b_{i10}$, $b_{i11}$, $b_{i20}$, and $b_{i21}$, we might plug in the BLUPs of $b_{i10}$, $b_{i11}$, $b_{i20}$, and $b_{i21}$ and only perform numerical integration over $u_{i}$. We call this novel approach two-stage JM. Specifically, just as in the regression calibration method, we first obtain estimates for $M_{ij}(t)$ on a dense grid of $t$, but then we compute

$${\hat{G}}_{ij}(t_{k}) = {\hat{β}}_{0} + {\hat{β}}_{1} t_{k} + {\hat{b}}_{ij0} + {\hat{b}}_{ij1} t_{k} \tag{10}$$

that is, we do not plug in the estimate for $u_{i}$. Next, we fit the following joint model:

$$\begin{aligned} Y_{ijk} & = {\hat{G}}_{ij}(t_{k}) + u_{i} + ε_{ijk} = {\hat{β}}_{0} + {\hat{β}}_{1} t_{k} + {\hat{b}}_{ij0} + {\hat{b}}_{ij1} t_{k} + u_{i} + ε_{ijk} \\ h_{ij}(t \mid u_{i}, v_{i}, x_{ij}) & = h_{0}(t) v_{i} \exp\{γ x_{ij} + α ({\hat{G}}_{ij}(t) + u_{i})\} \end{aligned} \tag{11}$$

Note that since this is a joint model, we do not need to assume that $σ_{e}^{2}$ is small. Thus, estimates of $β_{0}$, $β_{1}$, and $σ_{b}^{2}$ are obtained in the first step, while estimates of $α$, $σ_{u}^{2}$, $ξ$, and $σ_{e}^{2}$ are obtained by maximizing the following joint model likelihood in the second step:

$$\begin{aligned} ℓ_{p}(α, γ, σ_{u}^{2}, ξ, σ_{e}^{2}) = \sum_{i = 1}^{N} \log \int_{u_{i}} \Big[ & \int_{v_{i}} {\tilde{L}}_{i, T}(α, γ, ξ | u_{i}, v_{i}, x_{ij}) \, g(v_{i} | t_{i1} > t_{i10}, \dots, t_{i n_{i}} > t_{i n_{i} 0}) \, d v_{i} \\ & \times {\tilde{L}}_{i, M}(σ_{e}^{2} | u_{i}) \Big] f_{u}(u_{i}) \, d u_{i} \end{aligned} \tag{12}$$

which involves just one integral over $u_{i}$ to be solved numerically. Here,

$$\begin{aligned} {\tilde{L}}_{i, T}(α, γ, ξ | u_{i}, v_{i}, x_{ij}) = \prod_{j = 1}^{2} & [h_{0}(t_{ij}) v_{i} \exp(γ x_{ij} + α ({\hat{β}}_{0} + {\hat{β}}_{1} t_{ij} + {\hat{b}}_{ij0} + {\hat{b}}_{ij1} t_{ij} + u_{i}))]^{δ_{ij}} \\ & \times \exp\{- [H(t_{ij} | x_{ij}, {\hat{M}}_{ij}(t | u_{i})) - H(t_{ij0} | x_{ij}, {\hat{M}}_{ij}(t | u_{i}))]\} \end{aligned}$$

and

$${\tilde{L}}_{i, M}(σ_{e}^{2} | u_{i}) = \prod_{j = 1}^{2} \prod_{k = 0}^{K_{ij} - 1} \frac{1}{\sqrt{2 π σ_{e}^{2}}} \exp\left(- \frac{(y_{ijk} - {\hat{β}}_{0} - {\hat{β}}_{1} t_{k} - {\hat{b}}_{ij0} - {\hat{b}}_{ij1} t_{k} - u_{i})^{2}}{2 σ_{e}^{2}}\right)$$

where ${\hat{M}}_{ij}(t | u_{i}) = {\hat{G}}_{ij}(t) + u_{i}$ and $H$ denotes the cumulative hazard corresponding to the hazard in (11). Note that this approach assumes that all confounding in the relation between $T$ and $Y$, as well as the errors arising from using BLUPs, can be fully captured by $u_{i}$, hence at the twin pair level and not at the individual level.
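The remaining one-dimensional integral over $u_{i}$ is an expectation with respect to a normal density, which is commonly approximated by Gauss-Hermite quadrature. The sketch below uses a three-point rule purely for illustration; the source does not state which numerical integration rule was used, and a practical implementation would likely use more nodes or an adaptive rule. The function name is an assumption.

```python
import math

# Three-point Gauss-Hermite rule for integrals of exp(-x^2) * f(x).
_GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
_GH_WEIGHTS = (math.sqrt(math.pi) / 6.0,
               2.0 * math.sqrt(math.pi) / 3.0,
               math.sqrt(math.pi) / 6.0)


def normal_expectation(func, sd_u):
    """Approximate E[func(u)] for u ~ N(0, sd_u^2).

    Uses the substitution u = sqrt(2) * sd_u * x, which turns the normal
    density into the Gauss-Hermite weight exp(-x^2) / sqrt(pi).
    """
    total = sum(w * func(math.sqrt(2.0) * sd_u * x)
                for x, w in zip(_GH_NODES, _GH_WEIGHTS))
    return total / math.sqrt(math.pi)


# The three-point rule is exact for polynomials up to degree five,
# so the second moment of u is recovered up to rounding error.
second_moment = normal_expectation(lambda u: u * u, 0.1)
```

In the two-stage JM setting, `func` would be the bracketed term of (12) viewed as a function of $u_{i}$, with the inner integral over $v_{i}$ already evaluated.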

To summarize, for estimation of the effect of a longitudinal marker on a survival time in clusters subject to delayed entry, we have proposed four new methods, namely LOCF, ORC, RRC, and JM. The calibration methods ORC and RRC estimate the marker value for each individual on a grid of time points, either using one linear mixed model for all data (ORC) or using separate models for each time point (RRC). The two-stage method (JM) estimates only part of the parameters of the model for $M_{i j} (t)$ in the first stage and then applies a joint model to estimate the variance of the random effect linking the longitudinal process and the survival outcome, as well as the parameters of the survival model; it therefore (partly) takes into account the randomness of $M_{i j} (t)$ .

2.3. Simulation design

We evaluate and compare the bias of the presented estimators across the various approaches, considering both sparse and dense measurement grids and the presence or absence of measurement error. To do so, we simulate from a joint model for longitudinal and time-to-event data. Suppose that there are $G$ groups with $n_{i}$ individuals in the $i$th group, $i = 1, 2, \dots, G$; for twin pairs, $n_{i} = 2$ and $j = 1, 2$. Let $y_{ijk} = y_{ij}(t_{k})$ denote the response of subject $j$ in cluster $i$ at the $k$th of $n_{ij}$ measurement times. For simplicity, we do not consider a random slope for the longitudinal model. We simulate data from the following model:

$$\begin{aligned} y_{ijk} & = M_{ij}(t_{k}) + ε_{ijk} = 1 + b_{ij} + u_{i} + 0.01 t_{k} + ε_{ijk} \\ h_{ij}(t \mid b_{ij}, u_{i}, v_{i}) & = h_{0}(t) v_{i} \exp\{α (1 + b_{ij} + u_{i} + 0.01 t)\} \end{aligned} \tag{13}$$

where the random intercept $b_{ij} \sim N(0, σ_{b}^{2})$, the random cluster effect $u_{i} \sim N(0, σ_{u}^{2})$, and $ε_{ijk} \sim N(0, σ_{e}^{2})$. Assuming a Weibull baseline hazard $h_{0}(t) = λ ρ t^{ρ - 1}$, the cumulative hazard function is given by

$$H(t | M_{ij}(t), v_{i}) = \int_{0}^{t} λ ρ s^{ρ - 1} v_{i} \exp(α M_{ij}(s)) \, d s$$

Now, define the random variable $W$ as a function of the random variable $T$ as follows:

$$W = S(T | M_{ij}(T), v_{i}) = \exp[- H(T | M_{ij}(T), v_{i})]$$

The random variable $W$ is uniformly distributed on $[0, 1]$. To generate survival times $t_{i1}$ and $t_{i2}$, we sample a frailty $v_{i}$ from the gamma distribution and the random effects $b_{i1}$, $b_{i2}$, and $u_{i}$ from the normal distribution. Then we generate for each subject a value $w$ from the uniform distribution $U(0, 1)$. Next, we can obtain a random value $T = t$ for each subject by solving

$$- \log(w) = H(t | M_{ij}(t), v_{i}) = \int_{0}^{t} λ ρ s^{ρ - 1} v_{i} \exp(α (β_{0} + b_{ij} + u_{i} + β_{1} s)) \, d s \tag{14}$$

where numerical integration is used to find $t$.
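A minimal sketch of this inverse-transform step is given below, combining a midpoint rule for the cumulative hazard in (14) with bisection to solve for $t$. The specific numerical choices (midpoint rule, number of steps, bisection, and the search horizon `upper`) and all function names are assumptions for illustration; the source only states that numerical integration is used. Here `intercept` stands for $β_{0} + b_{ij} + u_{i}$ and `slope` for $β_{1}$.

```python
import math


def cumulative_hazard(t, lam, rho, frailty, alpha, intercept, slope, n_steps=2000):
    """Midpoint-rule approximation of H(t) in (14) for a Weibull baseline hazard."""
    if t <= 0.0:
        return 0.0
    h = t / n_steps
    total = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * h
        total += lam * rho * s ** (rho - 1.0) * math.exp(alpha * (intercept + slope * s))
    return frailty * total * h


def draw_survival_time(w, lam, rho, frailty, alpha, intercept, slope, upper=1e3):
    """Solve -log(w) = H(t) for t by bisection; H is increasing in t."""
    target = -math.log(w)
    if cumulative_hazard(upper, lam, rho, frailty, alpha, intercept, slope) < target:
        return math.inf  # event lies beyond the assumed search horizon
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cumulative_hazard(mid, lam, rho, frailty, alpha, intercept, slope) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With $α = 0$ the integral reduces to $λ v_{i} t^{ρ}$, so the solution can be checked against $t = (-\log(w) / (λ v_{i}))^{1/ρ}$.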

We assume that right censoring follows a uniform distribution $U [0, 15]$ . Next, we simulate individual-specific entry time points $t_{i j 0}$ as follows: a person has a probability of 0.5 to enter the study with a delay, that is, $t_{i j 0} > 0$ . If a person is delayed, we sample from the uniform distribution $U (0, 5)$ to obtain the truncation time $t_{i j 0}$ . We then generate follow-up times $t_{i j k}$ at points $k = 0, \dots, K_{i j} - 1$ for each individual such that $t_{i j 0} \leq t_{i j k}$ . Then, for each individual $j$ , we can compute $y_{i j k}$ at time points $t_{k}$ .
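The censoring and delayed-entry mechanism can be sketched as follows. The function names are hypothetical, and the rule that a subject is dropped when the observed time does not exceed the entry time follows the inclusion condition $t_{ij} > t_{ij0}$ stated in Section 2.2; keeping only complete pairs mirrors the analysis described in the text.

```python
import random


def observe_subject(event_time, rng, p_delay=0.5, max_delay=5.0, max_censor=15.0):
    """Apply delayed entry and uniform right censoring to one latent event time.

    Returns (entry, observed_time, event_indicator), or None when the subject
    is left-truncated (observed time does not exceed the entry time).
    """
    entry = rng.uniform(0.0, max_delay) if rng.random() < p_delay else 0.0
    censor = rng.uniform(0.0, max_censor)
    observed = min(event_time, censor)
    if observed <= entry:
        return None
    return entry, observed, int(event_time <= censor)


def observe_pair(event_times, rng):
    """Keep a pair only when both twins are observed (complete pairs)."""
    records = [observe_subject(t, rng) for t in event_times]
    return None if any(r is None for r in records) else records


rng = random.Random(2024)
pairs = [observe_pair((8.0, 12.0), rng) for _ in range(5)]
```

Pairs returned as `None` correspond to clusters removed after truncation, which is why the number of included clusters is smaller than the number of simulated clusters.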

The following parameter values are used: $β_{0} = 1, β_{1} = 0.01, σ_{b}^{2} = {0.1}^{2}, σ_{u}^{2} = {0.1}^{2}$ , and frailty variance $θ = 0.5$ . Finally, for the baseline hazard, we use a shape parameter $ρ$ of 2 and a scale parameter $λ$ of 0.001. The value of the measurement error standard deviation $σ_{ε}$ varies across the considered scenarios.

We report the relative bias (reBias) and standard deviation (SD) for estimation of parameters in 1000 Monte Carlo trials for LOCF, regression calibration, and joint model in the main text. Results for mean square error (MSE) and coverage probabilities (CPs) are given in the Appendix. All methods are fit on the same simulated datasets. Inc.G represents the number of clusters included in the analysis after truncation and after removal of singletons (i.e., we only consider fully observed clusters). For all models, we fit a Weibull proportional hazards model.
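For reference, the performance measures can be computed from the Monte Carlo estimates as sketched below. The formulas for the Monte Carlo standard errors are the standard ones for a mean and for an empirical standard deviation (the latter assuming approximately normal estimates); whether the authors used exactly these formulas is an assumption, and the function name is hypothetical.

```python
import math
import statistics


def summarize_estimates(estimates, true_value):
    """Relative bias, empirical SD, and their Monte Carlo standard errors."""
    n = len(estimates)
    sd = statistics.stdev(estimates)  # empirical SD across trials
    rel_bias = (statistics.fmean(estimates) - true_value) / true_value
    return {
        "reBias": rel_bias,
        "reBias_MCSE": sd / (math.sqrt(n) * abs(true_value)),
        "SD": sd,
        "SD_MCSE": sd / math.sqrt(2.0 * (n - 1)),
    }


summary = summarize_estimates([1.8, 2.0, 2.2, 2.4], true_value=2.0)
```

As a rough plausibility check, an empirical SD of 0.300 over 1000 trials with $α = 1$ gives an MCSE of about 0.0095 for the relative bias under these formulas.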

2.4. Simulation results

2.4.1. Scenario 1: Dense and no measurement error

For this scenario, the longitudinal measurements are taken with a regular gap of 2 for all individuals, that is, at regular time points $0, 2, 4, 6, \dots$ , and the measurement error standard deviation $σ_{ε}$ is equal to zero (no measurement error). Table 1 and Table 6 in the Appendix show the performance of LOCF and the “naive” method in estimating the parameters of the gamma shared frailty model in the presence of delayed entry and longitudinal markers. The naive method ignores delayed entry and appears to give biased results, especially for larger $α$ . The LOCF method performed well in this setting.

2.4.2. Scenario 2: Sparse and no measurement error

For this scenario, all individuals have three or fewer measurements (i.e., $n_{i j} \leq 3$ ) and there is no measurement error ( $σ_{ε} = 0$ ). Here, we compare the performance of LOCF, RRC, and ORC.

Table 1.

Simulation study to investigate the effect of the magnitude of the longitudinal marker effect $α$ .

				Naive		LOCF	
$α$	Par.	Inc.G	Events	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)
1	$α$	1637	578	$-$ 0.029 (0.010)	0.300 (0.007)	$-$ 0.022 (0.010)	0.301 (0.007)
	$θ$	1637	578	0.027 (0.011)	0.181 (0.004)	0.034 (0.011)	0.181 (0.004)
2	$α$	1584	1135	$-$ 0.031 (0.004)	0.242 (0.005)	$-$ 0.012 (0.004)	0.246 (0.006)
	$θ$	1584	1135	$-$ 0.004 (0.006)	0.093 (0.002)	0.012 (0.006)	0.095 (0.002)
3	$α$	1465	1633	$-$ 0.054 (0.003)	0.228 (0.005)	$-$ 0.016 (0.003)	0.242 (0.005)
	$θ$	1465	1633	$-$ 0.032 (0.004)	0.062 (0.001)	$-$ 0.002 (0.004)	0.063 (0.001)

LOCF: last observation carried forward; reBias: relative bias; MCSE: Monte Carlo standard error; SD: standard deviation.

Naive is set up for modeling longitudinal markers without adjusting for delayed entry.

The results are shown in Table 2 and in Table 7 in the Appendix. LOCF appeared to perform better for larger values of $α$ . This can be explained by the data generation mechanism. Note that all individuals have at most three measurements during the study (not taken at regular time points) and that we hold all parameters fixed except for the value of $α$ during the data generation. Generally, holding other settings constant, a larger $α$ results in shorter generated survival times, while a smaller $α$ results in longer generated survival times. Thus, we have less sparse measurements for larger $α$ . Both $α$ and the sparseness of measurements are changing in this case. Overall, the regression calibration approaches outperform the LOCF.

Table 2.

Simulation study to investigate the effect of the magnitude of the longitudinal marker effect $α$ .

				LOCF		RRC		ORC
$α$	Par.	Inc.G	Events	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)
1	$α$	1637	577	$-$ 0.306 (0.011)	0.342 (0.008)	$-$ 0.031 (0.012)	0.365 (0.008)	0.040 (0.015)	0.466 (0.010)
	$θ$	1637	577	$-$ 0.062 (0.010)	0.159 (0.004)	$-$ 0.018 (0.011)	0.178 (0.004)	$-$ 0.043 (0.011)	0.175 (0.004)
2	$α$	1584	1136	$-$ 0.187 (0.004)	0.252 (0.006)	$-$ 0.021 (0.005)	0.296 (0.007)	$-$ 0.054 (0.004)	0.228 (0.005)
	$θ$	1584	1136	$-$ 0.079 (0.006)	0.087 (0.002)	$-$ 0.019 (0.006)	0.099 (0.002)	$-$ 0.032 (0.004)	0.062 (0.001)
3	$α$	1464	1633	$-$ 0.140 (0.002)	0.231 (0.005)	$-$ 0.020 (0.003)	0.272 (0.006)	$-$ 0.037 (0.004)	0.361 (0.008)
	$θ$	1464	1633	$-$ 0.093 (0.004)	0.060 (0.001)	0.005 (0.004)	0.068 (0.002)	$-$ 0.066 (0.004)	0.066 (0.002)

LOCF: last observation carried forward; RRC: risk set regression calibration; ORC: ordinary regression calibration; reBias: relative bias; MCSE: Monte Carlo standard error; SD: standard deviation.

Table 3.

Simulation study to investigate the effect of the magnitude of the longitudinal marker effect $α$ .

				LOCF		RRC		ORC		JM
$α$	Par.	Inc.G	Events	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)
1	$α$	1637	577	$-$ 0.542 (0.008)	0.255 (0.006)	$-$ 0.010 (0.016)	0.511 (0.011)	0.002 (0.015)	0.488 (0.011)	0.002 (0.011)	0.337 (0.008)
	$θ$	1637	577	$-$ 0.034 (0.011)	0.172 (0.004)	0.085 (0.014)	0.221 (0.005)	$-$ 0.076 (0.012)	0.189 (0.004)	$-$ 0.013 (0.012)	0.175 (0.004)
2	$α$	1584	1136	$-$ 0.493 (0.003)	0.200 (0.004)	$-$ 0.009 (0.005)	0.345 (0.008)	$-$ 0.033 (0.006)	0.371 (0.008)	0.005 (0.005)	0.287 (0.007)
	$θ$	1584	1136	$-$ 0.077 (0.006)	0.092 (0.002)	0.014 (0.007)	0.103 (0.002)	$-$ 0.053 (0.007)	0.107 (0.002)	$-$ 0.003 (0.007)	0.093 (0.002)
3	$α$	1464	1633	$-$ 0.494 (0.002)	0.164 (0.004)	$-$ 0.035 (0.003)	0.306 (0.007)	$-$ 0.078 (0.004)	0.381 (0.009)	$-$ 0.010 (0.003)	0.287 (0.007)
	$θ$	1464	1633	$-$ 0.109 (0.004)	0.063 (0.001)	$-$ 0.006 (0.004)	0.067 (0.001)	$-$ 0.048 (0.005)	0.073 (0.002)	$-$ 0.022 (0.005)	0.062 (0.001)

LOCF: last observation carried forward; RRC: risk set regression calibration; ORC: ordinary regression calibration; JM: joint model; reBias: relative bias; MCSE: Monte Carlo standard error; SD: standard deviation.

We set $σ_{ε} = 0.1$ .

Table 4.

Simulation study to investigate the effect of the magnitude of measurement error $σ_{ε}^{2}$ .

				LOCF		RRC		ORC		JM
$σ_{ε}$	Par.	Inc.G	Events	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)	reBias (MCSE)	SD (MCSE)
0	$α$	1584	1136	$-$ 0.187 (0.004)	0.252 (0.006)	$-$ 0.021 (0.005)	0.296 (0.007)	$-$ 0.054 (0.004)	0.228 (0.005)
	$θ$	1584	1136	$-$ 0.079 (0.006)	0.087 (0.002)	$-$ 0.019 (0.006)	0.099 (0.002)	$-$ 0.032 (0.004)	0.062 (0.001)
0.1	$α$	1584	1136	$-$ 0.493 (0.003)	0.200 (0.004)	$-$ 0.009 (0.005)	0.345 (0.008)	$-$ 0.033 (0.006)	0.371 (0.008)	0.005 (0.005)	0.287 (0.007)
	$θ$	1584	1136	$-$ 0.077 (0.006)	0.092 (0.002)	0.014 (0.007)	0.103 (0.002)	$-$ 0.053 (0.007)	0.107 (0.002)	$-$ 0.003 (0.006)	0.093 (0.002)
0.2	$α$	1584	1136	$-$ 0.765 (0.002)	0.132 (0.003)	$-$ 0.035 (0.007)	0.421 (0.009)	$-$ 0.044 (0.007)	0.466 (0.010)	$-$ 0.013 (0.005)	0.324 (0.008)
	$θ$	1584	1136	$-$ 0.046 (0.006)	0.094 (0.002)	0.007 (0.007)	0.105 (0.002)	$-$ 0.034 (0.006)	0.098 (0.002)	0.005 (0.006)	0.097 (0.002)
0.3	$α$	1584	1136	$-$ 0.876 (0.001)	0.094 (0.002)	$-$ 0.094 (0.008)	0.476 (0.011)	$-$ 0.082 (0.009)	0.588 (0.013)	$-$ 0.005 (0.005)	0.307 (0.007)
	$θ$	1584	1136	$-$ 0.025 (0.006)	0.096 (0.002)	0.008 (0.007)	0.106 (0.002)	$-$ 0.012 (0.006)	0.100 (0.002)	$-$ 0.005 (0.006)	0.098 (0.002)
0.6	$α$	1584	1136	$-$ 0.965 (0.001)	0.048 (0.001)	$-$ 0.499 (0.009)	0.585 (0.013)	$-$ 0.097 (0.020)	1.270 (0.028)	0.002 (0.006)	0.331 (0.008)
	$θ$	1584	1136	$-$ 0.005 (0.006)	0.096 (0.002)	0.012 (0.007)	0.107 (0.002)	0.003 (0.006)	0.100 (0.002)	$-$ 0.005 (0.006)	0.098 (0.002)

LOCF: last observation carried forward; RRC: risk set regression calibration; ORC: ordinary regression calibration; JM: joint model; reBias: relative bias; MCSE: Monte Carlo standard error; SD: standard deviation.

We set $α = 2$ .

Table 5.

Parameter estimates assuming gamma frailty distribution and BMD as a longitudinal marker in the presence of delayed entry for only twin pairs (383 dizygotic twin pairs).

	LOCF	RRC	ORC	Joint model
Variable	Estimate (s.e.)	Estimate (s.e.)	Estimate (s.e.)	Estimate (s.e.)
$\exp (α)$	0.081 (0.072)	0.054 (0.053)	0.033 (0.039)	0.018 (0.022)
$θ$	0.280 (0.291)	0.222 (0.294)	0.294 (0.291)	0.805 (0.372)
$λ$	<0.001 (0.001)	<0.001 (0.002)	0.001 (0.008)	0.003 (0.018)
$ρ$	2.579 (1.099)	2.394 (1.178)	2.244 (1.095)	2.292 (1.072)
$σ_{u}$				0.040 (0.002)
$σ_{e}$				0.071 (0.001)

BMD: bone mineral density; LOCF: last observation carried forward; RRC: risk set regression calibration; ORC: ordinary regression calibration; s.e.: standard error.

The number of observed events is 138. ID-level random slope included in the model.

2.4.3. Scenario 3: Sparse with measurement error

The simulations for Scenarios 1 and 2 assume that $M (t)$ is measured without error. We now consider a scenario with measurement error. As in Scenario 2, all individuals have $\leq 3$ measurements during the study.

We perform simulations to investigate the effect of the magnitude of measurement error $σ_{ε}^{2}$ and of the longitudinal marker effect $α$ on the performance of the various estimators of $α$ and $θ$ .

Tables 3 and 4, along with Tables 8 and 9 in the Appendix, present the performance of the proposed methods (LOCF, RRC, ORC, and JM) in estimating the parameters of the gamma shared frailty model under delayed entry, for various values of $α$ and $σ_{ε}^{2}$ , respectively.

When using the LOCF approach, the effect $α$ of the longitudinal marker is estimated with a negative relative bias in all scenarios (i.e., for all considered magnitudes of $α$ and of the measurement error $σ_{ε}^{2}$ ). The largest negative bias of the LOCF estimator of $α$ is about 54% when $α = 1$ (Table 3) and about 96% when $σ_{ε} = 0.6$ (Table 4).
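The growth of the LOCF bias with $σ_{ε}$ is in line with the classical attenuation result for an error-prone covariate. The following is a heuristic from linear regression, not a result derived for the frailty model considered here, and the symbols $W$ , $β$ , $β^{*}$ , and $σ_{M}^{2}$ are introduced only for this illustration. Suppose the observed marker is $W = M + ε$ with $ε \sim N (0, σ_{ε}^{2})$ independent of $M$ , and $\mathrm{Var} (M) = σ_{M}^{2}$ . Regressing an outcome on $W$ instead of $M$ then targets

$$
β^{*} = β \, \frac{σ_{M}^{2}}{σ_{M}^{2} + σ_{ε}^{2}} ,
\qquad
\frac{β^{*} - β}{β} = - \frac{σ_{ε}^{2}}{σ_{M}^{2} + σ_{ε}^{2}} .
$$

The relative bias is negative and approaches $-1$ as $σ_{ε}$ grows. In hazard models this expression holds only approximately, and LOCF is additionally affected by the time elapsed since the last measurement.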

The two regression calibration approaches yield similar results, except when the magnitude of $α$ is large ( $α = 3$ ) or the measurement error is large ( $σ_{ε} = 0.6$ ). For a large effect of the longitudinal marker ( $α = 3$ ), RRC performs better than ORC. From Table 3, the largest bias in the RRC estimator of $α$ is about 3%, while the largest bias in the ORC estimator of $α$ is about 8%. For a large measurement error ( $σ_{ε} = 0.6$ ), ORC performs better than RRC. From Table 4, the largest bias of the RRC estimator of $α$ is about 50%, while the bias of the ORC estimator of $α$ is <10% in all considered scenarios.

The JM approach performs well in estimating the effect $α$ of the longitudinal marker for all magnitudes considered of the longitudinal marker effect $α$ and of the measurement error $σ_{ε}^{2}$ . From Table 3, the joint model estimator of $α$ has the largest bias of about 5% when $α = 1$ , while from Table 4, this estimator of $α$ has the largest bias of about 6% for large $σ_{ε}$ .

All the methods appear to estimate $θ$ with low bias for all considered magnitudes of the effect $α$ and of the measurement error $σ_{ε}^{2}$ (as shown in Table 3, the largest bias is about 10% for the LOCF estimator of $θ$ when $α = 3$ , about 8% for the calibration estimators of $θ$ when $α = 1$ , and about 3% for the JM estimator of $θ$ when $α = 3$ ). Results on the performance in estimating $σ_{u}$ when fitting the joint model are given in the Appendix (see Tables 10 and 11) for various values of the longitudinal marker effect $α$ and of the measurement error $σ_{ε}^{2}$ .

Overall, the calibration and the JM approaches yield less bias in the estimation of parameters as compared to the LOCF approach in all considered scenarios. The bias in the estimation of $α$ increases when the measurement error increases, with the exception of the JM estimator. Both LOCF and the two regression calibration methods result in consistent underestimation of $α$ , the effect of the longitudinal marker. This underestimation is more severe for the LOCF approach. It appears that in the considered scenarios, the ORC performs well in estimation of the effect of the longitudinal marker, even for large measurement error, as it yields notably low bias.

The joint model had convergence issues (about 10% of the simulated datasets did not yield standard error estimates for the parameter $α$ ). In terms of computational time, the joint model (JM) was the most time-consuming. For example, in Scenario 3 with $α = 2$ , JM required on average 190 minutes per simulation run. In contrast, the RRC, ORC, and LOCF approaches were substantially faster, taking approximately 1 min 40 s, 1 min 10 s, and 18 s, respectively.

3. Application: The effect of BMD on fracture incidence

We have access to longitudinal BMD observations and ages at fracture for female dizygotic twins aged 50 years and older from TwinsUK (https://twinsuk.ac.uk). In Muli et al.,²³ this dataset was analyzed using BMD as a time-fixed covariate to estimate the probability of a fracture in the next time period; that is, only the BMD at entry was used. BMD at entry appeared to be a statistically significant risk factor for fracture incidence. Here, we aim to estimate the relationship between BMD and fracture incidence over age. A joint model might be appropriate when genetic factors influence both the BMD trajectory over time and fracture incidence. On the other hand, the heritability of fracture incidence is lower, and the identified genetic loci also have an effect on BMD; hence, a model with only a direct effect of BMD on fracture incidence might be biologically plausible as well.²⁷ Since the twins enter the study at different ages, we need to use the approaches developed in this paper. For all approaches, we consider a Weibull baseline hazard and a gamma frailty distribution.

In this analysis, we consider BMD measurements after age 50 for the 766 individuals from 383 twin pairs (for results of the analysis of 383 twin pairs and 262 single twins see Table 12). For a sample of twins, their BMD profiles over age are given in Figure 2. We observe that indeed subjects enter the study at different ages and that for most subjects, BMD decreases with age. Furthermore, we notice that the time gaps between observed time points are quite large. The Kaplan–Meier curve taking into account delayed entry is depicted in Figure 3. It appeared that the probability of having a fracture before the age of 80 years is 0.3 to 0.4 in this cohort. Analysis of the longitudinal observed BMD showed that a model with a random intercept and slope at the individual level and a random intercept at the twin-level fitted the data well. All three random factors appeared to be statistically significant.

Figure 2.

Spaghetti plot for 50 participants (a subset of the data).

Figure 3.

Kaplan–Meier curve for fractures using age as time scale.
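A Kaplan–Meier estimate that accounts for delayed entry differs from the standard one only in the risk set: a subject counts as at risk at age $t$ only after entering the study. The sketch below illustrates this with a hypothetical helper `km_left_truncated`; it assumes every subject has an entry age strictly below the exit age and is not the code used for Figure 3.

```python
def km_left_truncated(entry, exit_, event):
    """Kaplan-Meier estimate with delayed entry (hypothetical helper).

    entry : age at study entry per subject
    exit_ : age at event or censoring per subject (entry < exit_ assumed)
    event : 1 if the event was observed at exit_, 0 if censored
    Returns a list of (age, survival probability) at the distinct event ages.
    """
    event_ages = sorted({t for t, d in zip(exit_, event) if d})
    surv = 1.0
    curve = []
    for t in event_ages:
        # At risk at age t: already entered and not yet exited before t.
        at_risk = sum(1 for a, b in zip(entry, exit_) if a < t <= b)
        n_events = sum(1 for b, d in zip(exit_, event) if d and b == t)
        surv *= 1.0 - n_events / at_risk
        curve.append((t, surv))
    return curve


# Four subjects entering at different ages.
curve = km_left_truncated(
    entry=[50, 50, 60, 65], exit_=[55, 70, 62, 80], event=[1, 0, 1, 1]
)
```

Because the risk sets at young ages contain only early entrants, the estimate is conditional on being event-free at the smallest entry age and can be unstable where few subjects are under observation.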

When modeling BMD over time, we used mixed models that include a random intercept and slope to model the correlation over time within a subject and a shared twin effect to model the correlation between twins. For ORC, we fit a single such mixed-effects model in the first stage, using all available data, to compute predicted covariate values for each subject at each age in the sample. For RRC, we split time into intervals: there are 30 unique ages at entry/event times (the minimum age at entry is 50 years, the minimum event time is 52 years, and the maximum event time is 85 years). For ages 51 and 52, we could only model a random twin effect, because individuals do not have repeated measurements at those time points. For the other unique ages at entry/event times, the same mixed model as for ORC was used. From these models, we compute the predicted value of the covariate for individuals still at risk at a specific event time. For ORC and RRC, the data are arranged in start–stop format and the shared frailty model is fitted by maximizing the likelihood function (7). For the JM approach, we used the mixed model of the first step of ORC to obtain estimates of the fixed parameters and the empirical Bayes estimates of the random intercepts and slopes. Then, model (11) is fitted using the likelihood function (12).
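The start–stop arrangement used in the second stage can be sketched as follows. The helper `start_stop_rows` and the callable `marker_at` are hypothetical; in particular, attaching the predicted marker value at the end of each interval is an assumption made for illustration, and the predictions themselves would come from the first-stage mixed model.

```python
def start_stop_rows(entry, exit_, event, cut_ages, marker_at):
    """Split one subject's follow-up (entry, exit_] at the given cut ages.

    cut_ages  : ages at which the time-varying covariate is updated,
                for example the unique event ages in the sample
    marker_at : callable returning the predicted marker value at an age
                (hypothetical stand-in for the calibration prediction)
    Returns rows (start, stop, marker, status); status is 1 only in the
    last interval of a subject with an observed event.
    """
    cuts = [t for t in sorted(cut_ages) if entry < t < exit_]
    bounds = [entry] + cuts + [exit_]
    rows = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        status = int(bool(event) and stop == exit_)
        rows.append((start, stop, marker_at(stop), status))
    return rows


# A subject entering at age 53 with a fracture at age 60, and a toy
# linear marker trajectory standing in for the fitted predictions.
rows = start_stop_rows(
    entry=53, exit_=60, event=1,
    cut_ages=[52, 55, 58, 60, 63],
    marker_at=lambda age: 1.0 - 0.01 * (age - 50),
)
```

The first interval starts at the entry age rather than at the time origin, which is how delayed entry enters the likelihood in this data layout.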

All approaches gave a highly significant effect of BMD on fracture incidence (Table 5) ( $p$ -values of $< 0.001$ ). RRC, ORC, and JM gave stronger (smaller) point estimates as compared to the LOCF. JM provided the strongest (smallest) effect estimate (0.018, s.e.: 0.022). Concerning the variance of the frailty ( $θ$ ), the estimates vary from 0.222 (RRC) to 0.805 (JM). The residual variance of the mixed model was quite small ( $σ_{e} = 0.071$ ).
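To aid interpretation of Table 5, the estimates of $\exp (α)$ can be rescaled to a smaller change in BMD, and the frailty variance can be translated into a within-pair dependence measure. The sketch below assumes that $\exp (α)$ is the hazard ratio per one-unit increase in BMD and that the gamma frailty has mean 1 and variance $θ$ , in which case Kendall's $τ$ equals $θ / (θ + 2)$ ; the function names and the choice of a 0.1-unit change are illustrative only.

```python
def hazard_ratio_for_change(hr_per_unit, delta):
    """Hazard ratio for a change of `delta` units, given the ratio per unit."""
    return hr_per_unit ** delta


def kendalls_tau_gamma(theta):
    """Kendall's tau implied by a shared gamma frailty with variance theta."""
    return theta / (theta + 2.0)


# Point estimates taken from Table 5.
hr_locf = hazard_ratio_for_change(0.081, 0.1)
hr_jm = hazard_ratio_for_change(0.018, 0.1)
tau_rrc = kendalls_tau_gamma(0.222)
tau_jm = kendalls_tau_gamma(0.805)
```

Under these assumptions, a 0.1-unit higher BMD corresponds to a hazard ratio of roughly 0.78 for LOCF and 0.67 for JM, and the frailty variance estimates correspond to a Kendall's $τ$ of roughly 0.10 (RRC) and 0.29 (JM). These conversions ignore the uncertainty in the estimates.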

In a separate analysis, we model the effect of BMD on fracture incidence in the monozygotic dataset, using BMD measurements taken after age 50. This analysis includes 288 monozygotic twin pairs and 188 monozygotic single twins (see Tables 13 and 14 in the Appendix).

4. Discussion

In this paper, we propose a novel joint model for a longitudinal marker and a survival outcome in twin studies. We developed four approaches (LOCF, ORC, RRC, and JM) to estimate the relationship between a longitudinal marker and a survival outcome, as well as the frailty variance, using data from twins who enter the study at different ages. Through simulations, we showed that LOCF performs well for a dense grid of observed marker values. ORC, RRC, and JM also perform well for a sparse grid. When the measurement error is large ( $σ_{ε} \geq 0.6$ ), ORC and RRC provided biased results, while JM still performed well.

Our novel JM approach can be interpreted as a hybrid between a full joint modeling approach and the regression calibration approach. The advantage of a full JM approach is that it captures all variation that might be present in the data. Unfortunately, it is typically computationally infeasible to fit this model to the data. In contrast, our JM approach is computationally feasible, while regression calibration methods such as ORC and RRC are even more computationally efficient. Therefore, depending on the type of longitudinal marker, the density of the observations over time, and the magnitude of the measurement error, the calibration methods might be preferred. Specifically, in our simulation study, LOCF appears to perform well with small measurement error and dense grids, which makes it the standard method for this scenario, as it is for singletons. ORC and RRC appear to be the preferred options for sparse grids with low measurement error. Low measurement error is a reasonable assumption in many practical settings. In studies in which rigorous data collection procedures are followed with standardized protocols, measurement error can be minimized. For example, in longitudinal studies measuring biomarkers such as blood glucose levels, where devices are regularly calibrated, the measurement error tends to be small. However, in studies with less controlled data collection processes, one should expect a larger measurement error, and in these cases, the joint model approach might be preferable. On the other hand, with a large measurement error, JM might show numerical instability. In such cases, ORC is an acceptable alternative.

The methods were applied to real data from the TwinsUK twin registry. Here, LOCF yielded a weaker estimate than the other methods. Indeed, the gaps between the observed time points are quite large; hence, the two-stage approach should perform better. For the JM approach, the estimate of the variance of the frailty was larger than for the calibration methods. The differences between the estimates of the effect of BMD and of the frailty variance across the approaches are not statistically significant. The standard errors for the effect of BMD on survival are similar, although they slightly increase for the more advanced methods (ORC and JM). This can be expected, since the more advanced methods better capture the randomness in the longitudinal marker. For a full joint modeling approach, these standard errors might increase even further. Unfortunately, it appears computationally infeasible to apply this method to this dataset. However, given the small differences in standard errors between the methods used, a full JM approach would likely yield the same conclusion.

Our method is relevant for other survival outcomes and other twin studies beyond TwinsUK. Using age as the underlying time variable is often more interpretable than arbitrary follow-up time and results in more parsimonious models. For example, Cirulli et al.¹ modeled the effect of metabolome health on cardiovascular events in TwinsUK. They used follow-up as the underlying time variable and age at entry as a covariate in a Cox proportional hazards model. For Danish twins, Tan et al.² modeled the effect of DNA methylation on mortality, adjusting for age at blood sampling. Also, when combining different studies in a meta-analysis, follow-up times across studies are typically not comparable, and using age would be more appropriate. Our methods allow for the use of age as the underlying time scale, providing a more appropriate approach for analyzing longitudinal markers in twin studies. Furthermore, our proposed methods could be applied to other types of clustered survival data, such as paired-organ data, for example glaucoma onset in eyes or hip replacement due to osteoarthritis.

We proposed a novel two-stage approach for fitting a joint model as an alternative to a full joint estimation of all the parameters. We expect that a full-likelihood approach is often computationally infeasible in applications. Our two-stage joint model seems to represent a good tradeoff between complexity and applicability in the presence of measurement error. We have chosen to jointly model the cluster-specific parameters and the random error term of the longitudinal marker process and to take the subject-specific random effects of the longitudinal marker process as independent of the time-to-event process. This assumption aligns with the idea that in twin studies, the shared unmeasured confounding between the biomarker and survival outcome is more likely to operate through the twin-level effect than through individual-specific effects, which justifies the focus on modeling $u_{i}$ while using BLUPs for the individual-specific deviations.

Several extensions of the methods can be considered, namely, modeling the recurrent fracture incidence and a more complex within-cluster structure to also include monozygotic twins. Additionally, we considered only complete twin pairs in our analysis, with results for including incomplete twin pairs provided in the Appendix. However, the likelihood function used is conditioned on both twins entering the study, which is violated when including single-twin members, as previously discussed in the context of time-fixed covariates.²⁸ Instead, one should condition on the entry of one twin, which complicates the likelihood function. Furthermore, to account for dropouts due to severe illness and death, competing risk models might be considered. Another extension is to use both the calendar and age scale simultaneously as proposed by Bower et al.²⁹ Lastly, we considered a Weibull baseline hazard in the simulation study and data application because it is commonly used as it is computationally attractive, and can take different shapes depending on the value of the shape parameter. It is straightforward to use more flexible baseline hazards, for example, by using splines.³⁰
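The flexibility of the Weibull baseline hazard mentioned above can be illustrated with a short sketch. It assumes the parameterization $h_{0} (t) = λ ρ t^{ρ - 1}$ , with $λ$ a scale and $ρ$ a shape parameter; this matches the symbols reported in Table 5 but is an assumption about the exact form used, and the function name is hypothetical.

```python
def weibull_hazard(t, lam, rho):
    """Weibull baseline hazard h0(t) = lam * rho * t**(rho - 1), for t > 0."""
    return lam * rho * t ** (rho - 1.0)


ages = [50.0, 60.0, 70.0, 80.0]

# rho > 1: hazard increases with age; rho = 1: constant; rho < 1: decreasing.
increasing = [weibull_hazard(t, lam=1e-5, rho=2.5) for t in ages]
constant = [weibull_hazard(t, lam=1e-5, rho=1.0) for t in ages]
decreasing = [weibull_hazard(t, lam=1e-5, rho=0.5) for t in ages]
```

A single shape parameter can only produce monotone hazards, which is the limitation that spline-based baseline hazards are meant to relax.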

In conclusion, the study of longitudinal markers in the presence of clustered survival data and delayed entry necessitates careful consideration of model choice. We have introduced a set of new models and estimation methods for this research area, along with specific recommendations based on the presumed underlying relationships, observation density, and error structures.

Footnotes

ORCID iDs

Annah Muli

Mar Rodriguez-Girondo

Jeanine Houwing-Duistermaat

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project has received funding from the European Union’s Horizon 2020 research and innovation programme, under H2020-MSCA-ITN grant agreement number 721815. TwinsUK is funded by the Wellcome Trust, Medical Research Council, European Union, Chronic Disease Research Foundation (CDRF), Zoe Global Ltd, and the National Institute for Health Research-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data and code availability statement

Data on information about fractures and BMD for the twins can be accessed through the TwinsUK data access committee. For information on access and how to apply, visit https://twinsuk.ac.uk. Code can be found on .

Appendix

Table 14.

Parameter estimates assuming gamma frailty distribution and BMD as a longitudinal marker in the presence of delayed entry for 288 monozygotic twin pairs and 178 monozygotic singletons.

	LOCF	RRC	ORC	Joint model
Variable	Estimate (s.e.)	Estimate (s.e.)	Estimate (s.e.)	Estimate (s.e.)
$\exp (α)$	0.040 (0.061)	0.037 (0.062)	0.032 (0.057)	0.024 (0.000)
$θ$	1.452 (0.741)	1.516 (0.775)	1.502 (0.784)	0.064 (0.000)
$λ$	<0.001 (0.000)	<0.001 (0.000)	<0.001 (0.000)	0.010 (0.000)
$ρ$	4.671 (1.499)	4.664 (1.577)	4.539 (1.615)	2.358 (0.004)
$σ_{u}$				0.077 (0.000)
$σ_{e}$				0.053 (0.000)

BMD: bone mineral density; LOCF: last observation carried forward; RRC: risk set regression calibration; ORC: ordinary regression calibration; s.e.: standard error.

The number of observed events is 91.

References

1. Cirulli, Guo, Swisher, et al. Profound perturbation of the metabolome in obesity is associated with health risk. Cell Metab 2019; 29: 488–500.

2. Tan, Sørensen, et al. Age patterns of intra-pair DNA methylation discordance in twins: sex difference in epigenomic instability and implication on survival. Aging Cell 2021; 20: e13460.

3. Clayton. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 1978; 65: 141–151.

4. Duchateau, Janssen. The frailty model. New York: Springer Verlag, 2008.

5. Hougaard, Harvald, Holm. Measuring the similarities between the lifetimes of adult Danish twins born between 1881–1930. J Am Stat Assoc 1992; 87: 17–24.

6. Hougaard. Analysis of multivariate survival data. New York: Springer, 2000.

7. Ratcliffe, Guo, Ten Have. Joint modeling of longitudinal and survival data via a common frailty. Biometrics 2004; 60: 892–899.

8. Brilleman, Crowther, Moreno-Betancur, et al. Joint longitudinal and time-to-event models for multilevel hierarchical data. Stat Methods Med Res 2019; 28: 3502–3515.

9. Christoffersen. VAJointSurv: joint models for longitudinal and survival data using variational approximations. R package version 1.0. Available from: https://cran.r-project.org/package=VAJointSurv, 2023.

10. Rustand, van Niekerk, Krainski, et al. Joint modeling of multivariate longitudinal and survival outcomes with the R package INLAjoint. arXiv preprint arXiv:2402.08335, 2024.

11. Yildirim, Karasoy. gsem: A Stata command for parametric joint modelling of longitudinal and accelerated failure time models. Comput Methods Programs Biomed 2020; 196: 105612.

12. Crowther. merlin: A unified modeling framework for data analysis and methods development in Stata. Stata J 2020; 20: 763–784.

13. Rondeau, Mazroui, Gonzalez. frailtypack: An R package for the analysis of correlated survival data with frailty models using penalized likelihood estimation or parametrical estimation. J Stat Softw 2012; 47: 1–28.

14. Rizopoulos. Joint models for longitudinal and time-to-event data: With applications in R. Boca Raton, FL: CRC Press, 2012.

15. Therneau. A package for survival analysis in R. R package version 3.5-7. Available from: https://CRAN.R-project.org/package=survival (2023).

16. Bycott, Taylor. A comparison of smoothing techniques for CD4 data measured with error in a time-dependent Cox proportional hazards model. Stat Med 1998; 17: 2061–2077.

17. Dafni, Tsiatis. Evaluating surrogate markers of clinical outcome when measured with error. Biometrics 1998; 54: 1445–1462.

18. Ngwa, Cabral, Cheng, et al. Revisiting methods for modeling longitudinal and survival data: Framingham Heart Study. BMC Med Res Methodol 2021; 21: 1–2.

19. Self, Pawitan. Modeling a marker of disease progression and onset of disease. In: AIDS Epidemiology: Methodological Issues. Boston, MA: Birkhäuser, 1992, pp.231–255.

20. Sweeting, Thompson. Joint modelling of longitudinal and time-to-event data with application to predicting abdominal aortic aneurysm growth and rupture. Biom J 2011; 53: 750–763.

21. Tsiatis, Degruttola, Wulfsohn. Modeling the relationship of survival to longitudinal data measured with error. Applications to survival and CD4 counts in patients with AIDS. J Am Stat Assoc 1995; 90: 27–37.

22. Lin, Taylor. Semiparametric modeling of longitudinal measurements and time-to-event data: a two-stage regression calibration approach. Biometrics 2008; 64: 1238–1246.

23. Muli, Gusnanto, Houwing-Duistermaat. Use of shared gamma frailty model in analysis of survival data in twins. Theor Biol Forum 2021; 114: 45–58.

24. Zohoori, Savitz. Econometric approaches to epidemiologic data: relating endogeneity and unobserved heterogeneity to confounding. Ann Epidemiol 1997; 7: 251–257.

25. Klein, Moeschberger. Survival analysis: Techniques for censored and truncated data. New York: Springer, 2003.

26. Jensen, Brookmeyer, Aaby, et al. Shared frailty model for left-truncated multivariate survival data. Copenhagen: Department of Biostatistics, University of Copenhagen, 2004.

27. Trajanoska, Morris, Oei, et al. Assessment of the genetic and clinical determinants of fracture risk: genome wide association and mendelian randomisation study. BMJ 2018; 362: k3225.

28. Rodríguez-Girondo, Deelen, Slagboom, et al. Survival analysis with delayed entry in selected families with application to human longevity. Stat Methods Med Res 2018; 27: 933–954.

29. Bower, Andersson, Crowther, et al. Flexible parametric survival analysis with multiple timescales: estimation and implementation using stmt. Stata J 2022; 22: 679–701.

30. Muli. Advances in shared frailty models with application to twin data. Doctoral dissertation, University of Leeds, 2023.

Modeling the effect of longitudinal markers on left-truncated time-to-event outcomes in twin studies
