Using joint models for longitudinal and time-to-event data to investigate the causal effect of salvage therapy after prostatectomy

Abstract

Prostate cancer patients who undergo prostatectomy are closely monitored for recurrence and metastasis using routine prostate-specific antigen measurements. When prostate-specific antigen levels rise, salvage therapies are recommended in order to decrease the risk of metastasis. However, due to the side effects of these therapies and to avoid over-treatment, it is important to understand for which patients and when these salvage therapies should be initiated. In this work, we use the University of Michigan Prostatectomy Registry Data to tackle this question. Due to the observational nature of these data, we face the challenge that prostate-specific antigen is simultaneously a time-varying confounder and an intermediate variable for salvage therapy. We define different causal salvage therapy effects, each conditional on a different specification of the longitudinal prostate-specific antigen history. We then illustrate how these effects can be estimated using the framework of joint models for longitudinal and time-to-event data. All proposed methodology is implemented in the freely available R package JMbayes2.

Keywords

Decision making, observational data, time-varying confounding, time-varying treatment, shared parameter model, survival analysis

1. Introduction

Many prostate cancer (PCa) patients undergo surgical removal of the prostate gland (radical prostatectomy) as their initial treatment after diagnosis. Even though this surgical procedure is generally successful, the risk of recurrence and metastasis remains. For this reason, urologists closely monitor the prostate-specific antigen (PSA) levels of these patients. PSA can be easily measured from a blood sample, and a persistent rise in the PSA values suggests the cancer may be regrowing, although it is generally not yet detectable on imaging. After the initial surgery, PSA levels drop to near zero; however, PSA may rise again for some patients, leading the treating physicians to recommend salvage therapy to reduce their risk of metastasis. Salvage therapy typically consists of radiotherapy with or without androgen deprivation therapy (ADT), although in some cases ADT alone may be utilized. After salvage therapy, PSA levels nearly always drop, sometimes substantially, but typically rise again if metastasis is going to occur. Because of the serious side effects of the aforementioned salvage therapies and to avoid over-treatment, it is critical to understand which patients are most likely to benefit from these treatments and when they should be initiated. To our knowledge, there are no randomized trials that are designed explicitly to address this question. However, there are clinical trials that compare different types of salvage therapy.¹ In this article, we do not consider the possibility that different salvage therapies have different effects.

We will use the University of Michigan Prostatectomy Registry (UMP) Data to tackle this question. This database includes 3634 PCa patients who underwent a radical prostatectomy during the period 1996–2013. Of those patients, 271 (7.5%) received salvage therapy, 102 (2.8%) developed metastasis, and 209 (5.8%) died without metastases. Of these 209 patients, 190 died before salvage therapy, and 19 died after salvage therapy without developing metastasis. For the patients who received salvage therapy, the median PSA just prior to the initiation of the therapy was 0.7 ng/mL (min: 0.0009, Q1: 0.4, Q3: 2.06, max: 266 ng/mL), and 55 had a metastasis. For the patients who did not receive salvage therapy or experience metastases, the median PSA at their last measurement was 0.001 ng/mL (min: 0.0009, Q1: 0.001, Q3: 0.09, max: 11.7 ng/mL). Since death due to prostate cancer is extremely rare without prior metastases, these deaths are considered as due to other causes. Supplemental Figures 1 and 2 show the cumulative incidence functions for metastasis and death and for initiation of salvage therapy and death, respectively. The median follow-up time was 6.4 years (min: 0.1, Q1: 3.3, Q3: 10.9, max: 17 years). This dataset has been previously described by Beesley et al.² The detection of metastasis usually requires imaging. Although imaging is noninvasive, it is not routinely performed without an indication, and the decision to undergo imaging would usually be based on whether the patients exhibited symptoms or whether the PSA levels were consistently rising and had attained higher values. There is variation in the schedule of when PSA measurements are taken after the initial surgery, but an average schedule might be every three months for the first year, every six months for the next two years, and annually thereafter. PSA measurements are also collected after salvage therapy is initiated.

We aim to utilize the longitudinal PSA measurements and baseline information (Gleason score, T-stage of the tumors, age, race, and comorbidities) to quantify the effect of salvage therapy on reducing or delaying metastases and aid the decision-making process of urologists. Given the observational nature of the UMP data, we face several challenges in achieving our goal. First, the longitudinal PSA process affects both metastasis and the assignment of salvage therapy, and, therefore, it is a time-varying confounder. At the same time, the salvage therapy affects future PSA measurements, and hence the longitudinal PSA process is also an intermediate variable. In addition, patients who received salvage therapy are more closely monitored, and they will tend to have more PSA measurements and more checks for metastasis than patients who did not. Finally, death from other causes is a competing risk for metastasis and must be accounted for appropriately.

Nowadays, there is a considerable body of literature on methods developed to estimate causal treatment effects in settings with time-varying treatments when there exists confounding by time-dependent covariates affected by earlier treatments. While matching methods³,⁴ and joint modeling approaches⁵,⁶ have been developed, most of this literature has focused on marginal structural models, dynamic treatment regimes, and related methods. Excellent overviews of these methods are given by Hernán and Robins⁷ and Tsiatis et al.⁸ The advantage of these methods is that they are primarily non-parametric, making very few assumptions for the time-varying confounders. However, most of them require specifying a single model for all patients that relates the decision to give treatment to past confounder values. This is especially challenging in our setting because doctors decide when to initiate therapy on different grounds. For example, one physician may use just the last observed value of PSA. In contrast, another may treat patients who showed a sudden increase or decrease in the biomarker over the previous two or three measurements. An alternative approach to estimating a causal effect is through direct modeling of the outcome process.⁵,⁹ We adopt this approach, which avoids having to model the treatment initiation process, and employ the framework of joint models for longitudinal and time-to-event data.¹⁰ In particular, we postulate a linear mixed-effects model for the longitudinal PSA levels that explicitly accounts for the change in the subject-specific trajectories after initiating salvage therapy. For the instantaneous risk of metastasis, we specify a relative risk model that includes the time-varying salvage therapy and PSA effects, and for the hazard of death, we only include the time-varying salvage therapy effect.
Based on the postulated joint model, and for each patient at risk at time $t$ , we predict the cumulative risk of metastasis under the two treatment scenarios, initiating salvage therapy at $t$ or continuing without it. Then, we obtain the causal salvage therapy effect by suitably averaging these predicted cumulative risks over appropriate groups of subjects. The benefit of our approach is that it does not require defining a model for initiating treatment. In this context, we have performed a simulation study to investigate the performance of joint models in settings with time-varying confounding.

The rest of the article is organized as follows. Section 2 presents the definition of causal salvage therapy effects we wish to estimate and the assumptions we make to identify these effects from observational data. Section 3 presents the modeling framework we use to analyze the data, and Section 4 shows the procedure to estimate the salvage therapy effects and derive their variance. In Section 5, we apply our proposed methods in the UMP dataset. Finally, Section 6 includes the results of a simulation study, and Section 7 concludes the article.

2. Salvage therapy effects

2.1. Definitions

We let $T^{m}$ denote the time to metastasis and $T^{d}$ the time to death. PSA measurements during follow-up are denoted as $Y (t_{j})$, taken at time points $t_{j}$ ($j = 1, \dots, J$), and $Y (t) = \{Y (t_{j}) = y (t_{j}); 0 \leq t_{j} \leq t, j = 1, \dots, J\}$ denotes the available history of PSA measurements up to time $t$. We denote the time salvage therapy was initiated by $S$, with $N (t) = I (t \geq S)$ the salvage therapy indicator, and $N (t) = \{N (t_{j}); 0 \leq t_{j} < t, j = 1, \dots, J\}$ denotes the history of the salvage therapy process. Without loss of generality, we presume that the urologists’ consideration to initiate salvage therapy took place at the same $J$ time points at which the patients provided PSA measurements. Baseline covariate information is denoted as $X = \{X_{q} = x_{q}, q = 1, \dots, Q\}$.

To quantify the causal effect of salvage therapy, we will use the framework of counterfactual outcomes. Namely, we would like to decide if salvage therapy should be initiated at the follow-up time $t$ by comparing the counterfactual cumulative risks of metastasis under the two regimes in the medically relevant time interval $(t, t + Δ t]$ conditional on survival, no prior salvage therapy and no metastasis up to $t$. In particular, we let $[T^{m (1)}, F^{(1)} (v, t) ∣ T^{m} > t, T^{d} > t, Y (t), N (t) = 0, X]$ denote the joint conditional distribution of $T^{m (1)}$ and $F^{(1)} (v, t)$ for all $v > t$, where $F^{(1)} (v, t) = \{Y (t_{j}); t < t_{j} \leq v, j = 1, \dots, J''\}$ denotes the $J''$ future PSA measurements after $t$ and up to the horizon time $v$, if salvage therapy was initiated at $t$ given the available PSA information up to $t$ and the fact that neither metastasis nor death has occurred up to this time. Analogously, we let $[T^{m (0)}, F^{(0)} (v, t) ∣ T^{m} > t, T^{d} > t, Y (t), N (t) = 0, X]$ denote the joint conditional distribution of $T^{m (0)}$ and $F^{(0)} (v, t)$ for all $v > t$, if salvage therapy will not be initiated in the interval $(t, v]$.

The marginal effect over all possible longitudinal histories up to time $t$ can be defined as

\begin{aligned} {ST}^{M} (t + Δ t, t) & = \int [\Pr \{T^{m (1)} \leq t + Δ t ∣ T^{m} > t, T^{d} > t, Y (t), X\} \\ & \quad - \Pr \{T^{m (0)} \leq t + Δ t ∣ T^{m} > t, T^{d} > t, Y (t), X\}] \prod_{j = 1}^{J'} d G \{Y (t_{j}), X ∣ Y (t_{j - 1})\} \\ & = \Pr \{T^{m (1)} \leq t + Δ t ∣ T^{m} > t, T^{d} > t\} - \Pr \{T^{m (0)} \leq t + Δ t ∣ T^{m} > t, T^{d} > t\} \end{aligned}

(1)

where $G (\cdot)$ denotes the cumulative joint distribution function for $Y$ and $X$, and $J'$ denotes the number of longitudinal measurements up to $t$. In this and the following expressions, and for simplicity of exposition, we have omitted that we condition on salvage not being initiated up to $t$, that is, $N (t) = 0$. Also, we have not included the term $\prod_{j = 1}^{J'} p \{N (t_{j}) ∣ Y_{i} (t_{j}), N (t_{j - 1}), X\}$ because, as we will explain in Section 3.2, this can be ignored under our modeling approach. Even though this type of marginal effect is often used in the context of causal inference, it may be of less interest to urologists because they would typically decide to initiate salvage therapy for patients with elevated PSA. A more clinically interesting setting is to restrict to specific PSA values, that is, to condition on the longitudinal history $Y_{i} (t)$ of a specific subject $i$:

\begin{aligned} {ST}^{C} (t + Δ t, t) & = \Pr \{T_{i}^{m (1)} \leq t + Δ t ∣ T_{i}^{m} > t, T_{i}^{d} > t, Y_{i} (t), X_{i}\} \\ & \quad - \Pr \{T_{i}^{m (0)} \leq t + Δ t ∣ T_{i}^{m} > t, T_{i}^{d} > t, Y_{i} (t), X_{i}\} \end{aligned}

(2)

This effect is conditional/individualized in the sense that it is for the group of patients with baseline variables $X_{i}$ and longitudinal history $Y_{i} (t)$, that is, the same PSA values as the patient for whom the doctor must decide whether to initiate salvage therapy. Following the discussion by Taylor et al.,⁶ ${ST}^{C} (t + Δ t, t)$ is a conditional causal effect relevant for subject-specific treatment decisions. The difference between the two salvage therapy effects presented above is a bias versus variance tradeoff. In particular, the marginal effect (1) marginalizes over a bigger group of patients and will have a smaller variance than the other effects; however, as explained above, it will also be less relevant for the practicing urologist. The conditional effect (2) is the one most relevant to the doctor, but it will have a larger variance because it is based on very little data. A compromise between (1) and (2) is to quantify the salvage therapy effect for patients who had PSA levels above a threshold value $c$ at their last visit, that is, $Y^{*} (t) = \{Y (t) : Y (t) > c\}$:

\begin{aligned} {ST}^{M C} (t + Δ t, t) & = \int [\Pr \{T^{m (1)} \leq t + Δ t ∣ T^{m} > t, T^{d} > t, Y^{*} (t)\} \\ & \quad - \Pr \{T^{m (0)} \leq t + Δ t ∣ T^{m} > t, T^{d} > t, Y^{*} (t)\}] \prod_{j = 1}^{J'} d \tilde{G} \{Y (t_{j}), X ∣ Y (t_{j - 1})\} \end{aligned}

(3)

where $\tilde{G} (\cdot)$ denotes the cumulative distribution function for the truncated distribution of $Y$ and $X$ satisfying the condition $Y^{*} (t)$. This effect marginalizes over all possible longitudinal histories of PSA that end up having PSA greater than $c$ at time $t$. The effect (3) offers a compromise by considering a more relevant group of patients, but smaller than the group considered in (1) (i.e. it will have a bigger variance than (1)). We should note that by changing the specification of $Y^{*} (t)$, alternative salvage therapy (ST) effects may be defined. For example, we could define the ST effect for patients with elevated PSA levels in the last $K$ months (e.g. $K = 6$).

2.2. Assumptions

Because of the observational nature of the University of Michigan Prostatectomy Data, we will need a set of assumptions to identify and unbiasedly estimate the salvage therapy effects defined in the previous section. In particular, we will make the standard assumptions for causal inference with observational data, time-varying confounding, and intermediate variables.

- Consistency: The observed outcomes equal the counterfactual outcomes for the actually assigned treatment.
- Sequential exchangeability: The counterfactual outcomes are independent of the assigned treatment conditionally on the history of PSA measurements, the history of salvage therapy treatments, and baseline covariates, that is, $T^{m (a)} ⫫ N (t) ∣ Y (t), N (t), X$ and $F^{(a)} (v) ⫫ N (t) ∣ Y (t), N (t), X, v > t$, where $a \in \{0, 1\}$. Note that we assume here that the decision to start salvage therapy at time $t$ can only causally affect future PSA measurements $Y (v)$, with $v > t$, but not $Y (t)$.

The different salvage therapies are sufficiently well-defined in our motivating dataset not to cause any issues with the consistency assumption. The sequential exchangeability assumption also seems to be satisfied in our setting because, as mentioned earlier, urologists typically decide to initiate salvage therapy based on the history of PSA values and possibly other factors such as age and comorbidities (captured in $X$). Because of the parametric nature of our modeling approach (presented in the following section), the positivity assumption is not required to identify and estimate the three causal effects we introduced above. However, for the marginal effect (1), we extrapolate beyond the support of the data because the urologists will not prescribe salvage therapy for patients with a zero PSA value. Nonetheless, because this effect is of little practical interest, we are not concerned about this extrapolation.

3. Modeling

3.1. Sub-models specification

We will use the framework of joint models to associate the risk of metastasis with the longitudinal PSA measurements and account for the risk of death. Joint models will account for the endogenous nature of the PSA process, and under the set of assumptions presented in Section 2.2 will provide valid predictions of the risk of metastasis. We will use the subscript $i$ ($i = 1, \dots, n$) to denote the subject in all the random variables introduced above. Also, we will use $T_{i} = \min (T_{i}^{m}, T_{i}^{d}, C_{i})$ to denote the observed event times with $C_{i}$ denoting the censoring time, and $δ_{i} \in \{0, 1, 2\}$, where “0” is for censoring, “1” for metastasis, and “2” for death. The deaths we consider as competing events are the ones that occurred before metastasis; deaths and PSA measurements after metastasis are ignored. As explained in the introduction, we expect a drop in PSA levels after the initiation of salvage therapy. Following previous literature in modeling PSA profiles for prostate cancer patients, we specify a linear mixed-effects model for the logarithm of PSA with a change point in the subject-specific trajectories after the initiation of salvage therapy:

\begin{aligned} y_{i} (t) = \begin{cases} η_{i} (t) + ε_{i} (t) = x_{i} (t)^{⊤} β + z_{i} (t)^{⊤} b_{i} + ε_{i} (t), & t < S_{i} \\ {\tilde{η}}_{i} (t) + ε_{i} (t) = η_{i} (t) + \{{\tilde{x}}_{i} (\tilde{t})^{⊤} \tilde{β} + {\tilde{z}}_{i} (\tilde{t})^{⊤} {\tilde{b}}_{i}\} + ε_{i} (t), & t \geq S_{i} \end{cases} \end{aligned}

(4)

where $y_{i} (t) = \log \{{PSA}_{i} (t) + 1\}$, and the design vectors $x_{i}$ and $z_{i}$ for the fixed effects $β$ and random effects $b_{i}$, respectively, describe the subject-specific PSA evolutions before salvage therapy. Analogously, the design vectors ${\tilde{x}}_{i}$ and ${\tilde{z}}_{i}$ for the fixed effects $\tilde{β}$ and random effects ${\tilde{b}}_{i}$, respectively, describe the change in the subject-specific PSA evolutions after salvage therapy. The latter design vectors use the relative time variable $\tilde{t} = t - S_{i}$. The covariates present in the four design vectors defined above are assumed to be a subset of $X_{i}$. The random effects are the model component that allows for different PSA profiles before and after salvage therapy per patient. The distributional assumptions for the random effects and the error terms are $u_{i} = (b_{i}^{⊤}, {\tilde{b}}_{i}^{⊤})^{⊤} \sim N (0, Ω)$, $ε_{i} (t) \sim N (0, σ^{2})$, and $cov \{u_{i}, ε_{i} (t)\} = 0$. The variance–covariance matrix $Ω$ is assumed completely unstructured.
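To make the change-point structure of (4) concrete, the following Python sketch simulates noisy log-PSA values for a single patient. It is a simplified, hypothetical instance of the model rather than the specification used in the analysis: the design vectors are assumed to contain only an intercept and a linear slope (in $t$ before salvage therapy and in $\tilde{t} = t - S_{i}$ afterwards), the random effects are passed in directly instead of being drawn from $N (0, Ω)$, and the function name `simulate_log_psa` and all numerical values are illustrative choices.

```python
import random


def simulate_log_psa(times, s_i, beta, b_i, beta_tilde, b_tilde_i, sigma, rng):
    """Simulate y_i(t) = log(PSA_i(t) + 1) from a simplified version of model (4).

    Assumed (hypothetical) design: intercept + slope in t before salvage,
    plus an additional intercept + slope in (t - S_i) once t >= S_i.
    """
    ys = []
    for t in times:
        # Pre-salvage linear predictor eta_i(t): fixed plus random effects.
        eta = (beta[0] + b_i[0]) + (beta[1] + b_i[1]) * t
        if t >= s_i:
            # Post-salvage change, expressed in relative time t - S_i.
            t_rel = t - s_i
            eta += (beta_tilde[0] + b_tilde_i[0]) + (beta_tilde[1] + b_tilde_i[1]) * t_rel
        # Independent measurement error epsilon_i(t) ~ N(0, sigma^2).
        ys.append(eta + rng.gauss(0.0, sigma))
    return ys


# Illustrative call with made-up parameter values.
rng = random.Random(2024)
trajectory = simulate_log_psa(
    times=[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0],
    s_i=2.0,                   # salvage therapy initiated at year 2
    beta=(0.05, 0.40),         # fixed intercept and slope before salvage
    b_i=(0.02, 0.10),          # patient-specific deviations
    beta_tilde=(-0.60, -0.30), # fixed drop and slope change after salvage
    b_tilde_i=(0.05, 0.02),
    sigma=0.1,
    rng=rng,
)
```

With a negative post-salvage intercept, the simulated trajectory rises until $S_{i}$ and then drops, which is the qualitative behavior the change point is meant to capture.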

Metastasis and death are considered competing risks, and we specify a different hazard model for each one. For metastasis, we postulate the model:

\begin{aligned} h_{i}^{m} (t) & = \lim_{ϵ \to 0} ϵ^{- 1} \Pr \{t \leq T_{i}^{m} < t + ϵ ∣ T_{i}^{m} > t, T_{i}^{d} > t, H_{i} (t), N_{i} (t), X_{i}\} \\ & = h_{0}^{m} (t) \exp (ψ_{m}^{⊤} w_{i} + γ_{m} N_{i} (t) + α_{m}^{⊤} [\{1 - N_{i} (t)\} \times f \{H_{i} (t)\}] \\ & \quad + ξ_{m}^{⊤} [N_{i} (t) \times g \{H_{i} (t)\}]) \end{aligned}

(5)

where $h_{i}^{m} (t)$ is the metastasis-specific hazard function for patient $i$, and $h_{0}^{m} (t)$ is the baseline hazard. In our model, the logarithm of $h_{0}^{m} (t)$ is modeled using penalized B-splines, that is, $\log h_{0}^{m} (t) = ψ_{h_{m}, 0} + \sum_{q = 1}^{Q} ψ_{h_{m}, q} B_{q} (t)$, where $B_{q} (t)$ denotes the $q$-th basis function of a B-spline with knots $v_{1}, \dots, v_{Q + 1}$ and $ψ_{h_{m}}$ the vector of spline coefficients. The design vector $w_{i}$ with the corresponding coefficients vector $ψ_{m}$ is for the baseline covariates (subset of $X_{i}$) relevant to the risk of metastasis. The term $H_{i} (t) = \{η_{i} (s); 0 \leq s < \min (S_{i}, t)\} \cup \{{\tilde{η}}_{i} (s); S_{i} \leq s < t\}$ denotes the history of the subject-specific linear predictor and $N_{i} (t)$ is defined as in Section 2.1. The vector functions $f (\cdot)$ and $g (\cdot)$ determine the functional form for the dependence of the hazard on the PSA evolutions before and after salvage therapy, for example, $f \{η_{i} (t)\} = [η_{i} (t), d η_{i} (t) / d t]^{⊤}$ or $f \{η_{i} (t)\} = [η_{i} (t), η_{i} (t) - η_{i} (t - 1)]^{⊤}$. From this model, we can compute the hazard ratio for salvage therapy for subject $i$, that is,

HR (t) = \exp (γ_{m} + ξ_{m}^{⊤} g \{{\tilde{η}}_{i} (t)\} - α_{m}^{⊤} f \{η_{i} (t)\}), \quad t > S_{i}

(6)

where in this expression $η_{i} (t)$ represents the expected value of $y_{i} (t)$ as if subject $i$ had not received salvage therapy at time $S_{i}$. We note that the metastasis model (5) could be extended in several ways, for example, by including interaction terms between $w_{i}$ and $N_{i} (t)$ or by allowing $ψ_{m}$ to depend on $t$.

We presume that the PSA is not associated with the hazard of death:

\begin{aligned} h_{i}^{d} (t) & = \lim_{ϵ \to 0} ϵ^{- 1} \Pr \{t \leq T_{i}^{d} < t + ϵ ∣ T_{i}^{m} > t, T_{i}^{d} > t, N_{i} (t), X_{i}\} \\ & = h_{0}^{d} (t) \exp \{ψ_{d}^{⊤} w_{i} + γ_{d} N_{i} (t)\} \end{aligned}

where $h_{i}^{d} (t)$ is the death-specific hazard function for patient $i$, and $h_{0}^{d} (t)$ is the baseline hazard that is again modeled using penalized B-splines with an associated spline coefficients vector $ψ_{h_{d}}$. Likewise, interaction terms between $w_{i}$ and $N_{i} (t)$ could also be included.
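The relation between the metastasis hazard (5) and the hazard ratio (6) can be checked numerically. The sketch below is a minimal illustration in Python, assuming that the baseline hazard value, the baseline-covariate contribution $ψ_{m}^{⊤} w_{i}$, and the values of $f \{η_{i} (t)\}$ and $g \{{\tilde{η}}_{i} (t)\}$ at a given time have already been computed; the function names and all numbers are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def metastasis_hazard(h0, lin_baseline, n_t, gamma_m, alpha_m, f_val, xi_m, g_val):
    """Evaluate model (5) at a single time point.

    h0           : baseline hazard h_0^m(t) at that time (assumed given)
    lin_baseline : psi_m' w_i, the baseline-covariate contribution
    n_t          : N_i(t), 1 after salvage therapy has started and 0 before
    f_val, g_val : f{eta_i(t)} (no salvage) and g{eta~_i(t)} (after salvage)
    """
    log_h = math.log(h0) + lin_baseline + gamma_m * n_t
    log_h += (1 - n_t) * dot(alpha_m, f_val) + n_t * dot(xi_m, g_val)
    return math.exp(log_h)


def hazard_ratio(gamma_m, alpha_m, f_val, xi_m, g_val):
    """Evaluate the hazard ratio (6) for salvage versus no salvage therapy."""
    return math.exp(gamma_m + dot(xi_m, g_val) - dot(alpha_m, f_val))


# Illustrative values: f and g each return (current value, slope).
f_val = [1.2, 0.3]   # log-PSA value and slope had salvage not been given
g_val = [0.4, -0.1]  # log-PSA value and slope after salvage
args = dict(gamma_m=-0.7, alpha_m=[0.8, 1.5], f_val=f_val, xi_m=[0.6, 1.0], g_val=g_val)

h_treated = metastasis_hazard(h0=0.02, lin_baseline=0.5, n_t=1, **args)
h_untreated = metastasis_hazard(h0=0.02, lin_baseline=0.5, n_t=0, **args)
hr = hazard_ratio(**args)  # equals h_treated / h_untreated
```

The baseline hazard and the baseline covariates cancel in the ratio, so the hazard ratio depends on the patient only through the PSA trajectories with and without salvage therapy.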

3.2. Estimation

The longitudinal and event time processes are linked via the random effects to define their joint distribution. We will use a Bayesian approach to fit the postulated joint model. Inference proceeds via the posterior distribution of the parameters $\{u_{i}, θ; i = 1, \dots, n\}$ given the data $D = \{T_{i}, δ_{i}, Y_{i}; i = 1, \dots, n\}$, where $θ$ denotes the vector of model parameters. We start by formulating the joint posterior of $\{u_{i}, θ; i = 1, \dots, n\}$ and $θ_{N}$, with $θ_{N}$ denoting the parameter vector for the distribution of the salvage therapy assignment process. Using telescoping, we get

\begin{aligned} p (θ, u, θ_{N} ∣ T, δ, Y, N) \propto & \prod_{i = 1}^{n} \prod_{j = 1}^{n_{i}} p \{Y_{i} (t_{i j}), T_{i}, δ_{i} ∣ Y_{i} (t_{i, j - 1}), N_{i} (t_{i, j - 1}), X_{i}, u_{i}, θ\} \\ & \times \prod_{j = 1}^{n_{i}} p \{N_{i} (t_{i j}) ∣ Y_{i} (t_{i, j - 1}), N_{i} (t_{i, j - 1}), Y_{i} (t_{i j}), T_{i}, δ_{i}, X_{i}, u_{i}, θ_{N}\} \\ & \times p (u_{i} ∣ θ) \times p (θ) \times p (θ_{N}) \end{aligned}

where $T$, $δ$, $Y$, and $N$ denote the vectors of the event times, event indicators, longitudinal measurements, and treatment decisions, respectively, $Y_{i} (t_{i, 0}) = N_{i} (t_{i, 0}) = \emptyset$, and $p (\cdot)$ denotes an appropriate probability density or probability mass function. Under sequential exchangeability, we have that

\begin{aligned} & p \{N_{i} (t_{i j}) ∣ Y_{i} (t_{i, j - 1}), N_{i} (t_{i, j - 1}), F_{i}^{(a)} (v_{i j}, t_{i j}), T_{i}^{(a)}, δ_{i}^{(a)}, X_{i}, u_{i}, θ_{N}\} \\ & \quad = p \{N_{i} (t_{i j}) ∣ Y_{i} (t_{i, j - 1}), N_{i} (t_{i, j - 1}), X_{i}, θ_{N}\} \end{aligned}

where $F_{i}^{(a)} (v_{i j}, t_{i j})$ denotes the future counterfactual PSA measurements for $v_{i j} > t_{i j}$ and $\{T_{i}^{(a)}, δ_{i}^{(a)}\}$ denotes the counterfactual event times and event indicators. Hence, assuming that the parameters $\{u_{i}, θ; i = 1, \dots, n\}$ and $θ_{N}$ are functionally independent, inference for the posterior distribution of the counterfactual outcomes $\{θ, u ∣ T^{(a)}, δ^{(a)}, Y^{(a)}, N^{(a)}\}$ can be obtained using the first term (i.e. the observed data model) and ignoring the second term. We should note that this decomposition allows the salvage therapy distribution to depend on a set of instrumental variables ${I V}_{i}$ that determine who gets therapy but are unrelated to the outcomes and the random effects (i.e. the $I V$s are not a part of $X$). An example could be the urologist’s preferences on when to start treatment. As noted in Section 1, different physicians may use the PSA history in different ways when deciding to whom they will give the therapy. Under this setting, we obtain:

\begin{aligned} p (θ, u ∣ T, δ, Y, N) \propto & \prod_{i = 1}^{n} \Bigl\{\prod_{j} \frac{1}{\sqrt{2 π σ^{2}}} \exp \Bigl(- \frac{1}{2 σ^{2}} \{y_{i j} - μ_{i j} (θ, u_{i}, N_{i})\}^{2}\Bigr)\Bigr\} \\ & \times \det (2 π Ω)^{- 1 / 2} \exp \Bigl(- \frac{1}{2} u_{i}^{⊤} Ω^{- 1} u_{i}\Bigr) \\ & \times \{h_{i}^{m} (T_{i}; θ, u_{i}, N_{i})\}^{I (δ_{i} = 1)} \{h_{i}^{d} (T_{i}; θ, N_{i})\}^{I (δ_{i} = 2)} \\ & \times \exp \Bigl(- \int_{0}^{T_{i}} \{h_{i}^{m} (s; θ, u_{i}, N_{i}) + h_{i}^{d} (s; θ, N_{i})\} d s\Bigr) \\ & \times p (θ) \end{aligned}

where $μ_{i j} (θ, u_{i}, N_{i})$ denotes the mean of the linear mixed model (4), and $\det (A)$ is the determinant of matrix $A$. We use standard priors for $θ$, that is, normal priors for all regression coefficients $(β, \tilde{β}, ψ_{h_{m}}, ψ_{m}, γ_{m}, α_{m}, ξ_{m}, ψ_{h_{d}}, ψ_{d}, γ_{d})$, inverse-Gamma priors for $σ^{2}$ and the diagonal elements of $Ω$, and the LKJ prior for the correlation matrix of the random effects.¹¹ To ensure smoothness of the baseline hazard functions $h_{0}^{m} (t)$ and $h_{0}^{d} (t)$, we postulate a “penalized” prior distribution for the regression coefficients $ψ_{h_{m}}$ and $ψ_{h_{d}}$ (we only show the formulation for the former):

p (ψ_{h_{m}} ∣ τ_{m}) \propto τ_{m}^{ρ (K) / 2} \exp \Bigl(- \frac{τ_{m}}{2} ψ_{h_{m}}^{⊤} K ψ_{h_{m}}\Bigr)

where $τ_{m}$ is the smoothing parameter that takes a $Gamma (5, 0.05)$ hyper-prior in order to ensure a proper posterior for $ψ_{h_{m}}$, $K = Δ_{r}^{⊤} Δ_{r}$, where $Δ_{r}$ denotes the $r$-th difference penalty matrix, and $ρ (K)$ denotes the rank of $K$.

We use a Markov chain Monte Carlo (MCMC) approach to obtain samples from the posterior distribution for all model parameters and the random effects. This algorithm is implemented in the R package JMbayes2¹² that we used to fit the model and calculate the salvage therapy effects.
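The penalized prior for the spline coefficients can be illustrated with a small Python sketch that builds the $r$-th order difference matrix $Δ_{r}$ and evaluates the log of the prior kernel $τ_{m}^{ρ (K) / 2} \exp (- τ_{m} ψ_{h_{m}}^{⊤} K ψ_{h_{m}} / 2)$. This is not the JMbayes2 implementation; the function names are hypothetical, the vector `psi` stands for the penalized coefficients, and the sketch uses the facts that $ψ^{⊤} K ψ = \| Δ_{r} ψ \|^{2}$ and that $K$ has rank equal to the number of coefficients minus $r$.

```python
import math


def difference_matrix(n_coef, r):
    """r-th order difference matrix Delta_r with shape (n_coef - r, n_coef)."""
    # Start from the identity and difference consecutive rows r times.
    d = [[1.0 if i == j else 0.0 for j in range(n_coef)] for i in range(n_coef)]
    for _ in range(r):
        d = [[d[i + 1][j] - d[i][j] for j in range(n_coef)] for i in range(len(d) - 1)]
    return d


def penalty_quadratic_form(psi, r):
    """psi' K psi with K = Delta_r' Delta_r, computed as ||Delta_r psi||^2."""
    delta = difference_matrix(len(psi), r)
    return sum(sum(row[j] * psi[j] for j in range(len(psi))) ** 2 for row in delta)


def log_penalized_prior(psi, tau, r):
    """Log of the prior kernel, up to an additive constant."""
    rank_k = len(psi) - r  # rank of K for an r-th order difference penalty
    return 0.5 * rank_k * math.log(tau) - 0.5 * tau * penalty_quadratic_form(psi, r)


# A linear coefficient sequence is not penalized by second-order differences,
# whereas a single spike is.
smooth = penalty_quadratic_form([1.0, 2.0, 3.0, 4.0, 5.0], r=2)
wiggly = penalty_quadratic_form([0.0, 0.0, 1.0, 0.0, 0.0], r=2)
```

Larger values of the smoothing parameter put more prior mass on coefficient vectors with small $r$-th order differences, which is what yields a smooth log baseline hazard.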

4. Salvage therapy effects estimation

4.1. Estimates

We start with the estimation of the conditional effect (2) that we calculate for a specific patient with longitudinal history $Y_{i} (t)$ and covariates $X_{i}$ . Both terms in the definition of this effect are posterior predictive distributions, which under the postulated joint model are written as

\begin{aligned} & \Pr \{T_{i}^{(a)} \leq t + Δ t, δ_{i}^{(a)} = 1 ∣ T_{i} > t, Y_{i} (t), X_{i}\} \\ & \quad = \int \int \Pr \{T_{i}^{(a)} \leq t + Δ t, δ_{i}^{(a)} = 1 ∣ T_{i} > t, u_{i}, X_{i}, θ\} \\ & \qquad \times p \{u_{i} ∣ T_{i} > t, Y_{i} (t), X_{i}, θ\} \, p (θ ∣ D) \, d u_{i} \, d θ \end{aligned}

(7)

where $a \in \{0, 1\}$, and the term $p (θ ∣ D)$ denotes the posterior distribution for the model parameters $θ$. Again for simplicity of exposition, we have omitted the conditioning on $N (t) = 0$. We should note that these risk predictions are marginalized over both the parameters and the random effects. The first term in the integrand is written as:

\Pr \{T_{i}^{(a)} \leq t + Δ t, δ_{i}^{(a)} = 1 ∣ T_{i} > t, u_{i}, X_{i}, θ\} = A / B

where

A = \int_{t}^{t + Δ t} h_{i}^{m (a)} (v) \exp \Bigl(- \int_{t}^{v} \{h_{i}^{m (a)} (s) + h_{i}^{d (a)} (s)\} d s - \int_{0}^{t} \{h_{i}^{m (0)} (s) + h_{i}^{d (0)} (s)\} d s\Bigr) d v

and

B = \exp \Bigl(- \int_{0}^{t} \{h_{i}^{m (0)} (s) + h_{i}^{d (0)} (s)\} d s\Bigr)

with $h_{i}^{m (1)} (t) = h_{0}^{m} (t) \exp (ψ_{m}^{⊤} w_{i} + γ_{m} + ξ_{m}^{⊤} g \{{\tilde{η}}_{i} (t)\})$ and $h_{i}^{m (0)} (t) = h_{0}^{m} (t) \exp (ψ_{m}^{⊤} w_{i} + α_{m}^{⊤} f \{η_{i} (t)\})$. The counterfactual hazard functions for death $h_{i}^{d (1)} (t)$ and $h_{i}^{d (0)} (t)$ are defined analogously. In particular, $h_{i}^{d (1)} (t) = h_{0}^{d} (t) \exp (ψ_{d}^{⊤} w_{i} + γ_{d})$ and $h_{i}^{d (0)} (t) = h_{0}^{d} (t) \exp (ψ_{d}^{⊤} w_{i})$. In the specification of the conditional distribution of the random effects given the observed information $\{T_{i} > t, Y_{i} (t), X_{i}\}$, it is assumed that salvage therapy has not been initiated by time $t$. Combining the integral equations presented above, we can estimate (7) using the following Monte-Carlo scheme:

- S1: We sample $\breve{θ}^{(l)}$ from the MCMC sample of the posterior distribution $[θ ∣ D]$.
- S2: We sample $\breve{u}_{i}^{(l)}$ from the posterior distribution $[u_{i} ∣ T_{i} > t, Y_{i} (t), X_{i}, \breve{θ}^{(l)}]$. Because this distribution cannot be written in closed form, we sample from it using the Metropolis-Hastings algorithm.
- S3: We calculate the term $π_{i}^{(l)} (t + Δ t ∣ t, a) = \Pr \{T_{i}^{(a)} \leq t + Δ t, δ_{i}^{(a)} = 1 ∣ T_{i} > t, \breve{u}_{i}^{(l)}, X_{i}, \breve{θ}^{(l)}\}$. The integrals in the definition of the overall survival functions are approximated numerically using the 15-point Gauss-Kronrod quadrature rule.
- S4: We repeat Steps S1–S3 $L$ times, and we take as an estimate of ${ST}_{i}^{C} (t + Δ t, t)$ the mean over the Monte-Carlo samples, that is,

{\hat{ST}}_{i}^{C} (t + Δ t, t) = \frac{1}{L} \sum_{l = 1}^{L} \{π_{i}^{(l)} (t + Δ t ∣ t, a = 1) - π_{i}^{(l)} (t + Δ t ∣ t, a = 0)\}

(8)
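Steps S3 and S4 can be sketched in Python as follows. The sketch assumes that, for each Monte-Carlo draw, the four counterfactual hazard functions have already been constructed as ordinary functions of time (in practice they depend on the sampled parameters and random effects from S1 and S2, which are not reproduced here). The ratio $A / B$ is evaluated in its simplified form $\int_{t}^{t + Δ t} h_{i}^{m (a)} (v) \exp [- \int_{t}^{v} \{h_{i}^{m (a)} (s) + h_{i}^{d (a)} (s)\} d s] d v$, and a composite trapezoid rule is used as a simple stand-in for the 15-point Gauss-Kronrod rule mentioned in S3. The names `cumulative_risk` and `st_conditional` and the dictionary keys are hypothetical.

```python
import math


def cumulative_risk(h_m, h_d, t, dt, n_grid=2000):
    """Risk of metastasis in (t, t + dt] given event-free at t, i.e. A / B.

    h_m, h_d : callables returning the metastasis and death hazards at time v
    The inner and outer integrals are both approximated by the trapezoid rule.
    """
    step = dt / n_grid
    cum_haz = 0.0                    # integral of h_m + h_d from t to v
    prev_total = h_m(t) + h_d(t)
    prev_integrand = h_m(t)          # h_m(v) * exp(-cum_haz) evaluated at v = t
    risk = 0.0
    for k in range(1, n_grid + 1):
        v = t + k * step
        total = h_m(v) + h_d(v)
        cum_haz += 0.5 * (prev_total + total) * step
        integrand = h_m(v) * math.exp(-cum_haz)
        risk += 0.5 * (prev_integrand + integrand) * step
        prev_total, prev_integrand = total, integrand
    return risk


def st_conditional(draws, t, dt):
    """Monte-Carlo estimate (8): mean over draws of pi(a = 1) - pi(a = 0).

    draws : list of dicts with hazard callables under salvage ("h_m1", "h_d1")
            and under no salvage ("h_m0", "h_d0"), one dict per draw.
    """
    diffs = [
        cumulative_risk(d["h_m1"], d["h_d1"], t, dt)
        - cumulative_risk(d["h_m0"], d["h_d0"], t, dt)
        for d in draws
    ]
    return sum(diffs) / len(diffs)


# Illustrative draws with constant hazards (made-up values).
draws = [
    {"h_m1": lambda v: 0.05, "h_d1": lambda v: 0.05,
     "h_m0": lambda v: 0.10, "h_d0": lambda v: 0.05},
    {"h_m1": lambda v: 0.04, "h_d1": lambda v: 0.06,
     "h_m0": lambda v: 0.12, "h_d0": lambda v: 0.06},
]
effect = st_conditional(draws, t=1.0, dt=3.0)
```

A negative value of `effect` corresponds to a lower predicted cumulative risk of metastasis under salvage therapy than without it over the interval $(t, t + Δ t]$.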

The estimation of the marginal effects (1) and (3) proceeds by averaging the conditional effects over the respective groups of patients in the sample. In particular, for (1), we define $R (t)$ to denote the subset of patients at risk at time $t$. For each patient in $R (t)$, we calculate ${\hat{ST}}_{i}^{C} (t + Δ t, t)$. Then, we obtain the estimator

{\hat{ST}}^{M} (t + Δ t, t) = n_{r}^{- 1} \sum_{i : i \in R (t)} {\hat{ST}}_{i}^{C} (t + Δ t, t)

where $n_{r}$ denotes the number of subjects in $R (t)$. For (3), the summation is over $R^{*} (t)$ that denotes the subset of patients who are at risk at time $t$ and have longitudinal histories that satisfy the definition of $Y^{*} (t)$ ($n_{r^{*}}$ is defined analogously).
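The averaging step can be sketched in Python as follows. The data layout is a hypothetical one chosen for illustration: each patient is a dictionary holding the observed event time, the salvage therapy time (or `None`), the last PSA value recorded up to $t$, and a precomputed conditional estimate ${\hat{ST}}_{i}^{C} (t + Δ t, t)$. The risk set $R (t)$ is taken to be the patients who are event-free and have not started salvage therapy by $t$, and $R^{*} (t)$ additionally requires the last PSA value to exceed the threshold $c$.

```python
def marginal_effects(patients, t, c):
    """Average precomputed conditional effects over R(t) and R*(t).

    Returns the pair (ST^M estimate, ST^MC estimate).
    """
    # R(t): event-free at t and salvage therapy not yet initiated.
    at_risk = [
        p for p in patients
        if p["event_time"] > t
        and (p["salvage_time"] is None or p["salvage_time"] > t)
    ]
    # R*(t): subset of R(t) whose last PSA value exceeds the threshold c.
    at_risk_high_psa = [p for p in at_risk if p["last_psa"] > c]
    if not at_risk or not at_risk_high_psa:
        raise ValueError("empty risk set; the effect cannot be estimated")
    st_m = sum(p["st_c"] for p in at_risk) / len(at_risk)
    st_mc = sum(p["st_c"] for p in at_risk_high_psa) / len(at_risk_high_psa)
    return st_m, st_mc


# Illustrative input with made-up values.
patients = [
    {"event_time": 5.0, "salvage_time": None, "last_psa": 0.05, "st_c": -0.02},
    {"event_time": 6.0, "salvage_time": None, "last_psa": 0.80, "st_c": -0.10},
    {"event_time": 1.0, "salvage_time": None, "last_psa": 0.90, "st_c": -0.30},
    {"event_time": 7.0, "salvage_time": 1.5, "last_psa": 1.20, "st_c": -0.20},
]
st_m, st_mc = marginal_effects(patients, t=2.0, c=0.4)
```

In this toy input only the first two patients are in $R (t)$ for $t = 2$, and only the second one has a last PSA value above the threshold, so the two estimates differ.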

4.2. Variance

To derive the variance of ${\hat{ST}}^{M} (t + Δ t, t)$ and ${\hat{ST}}^{M C} (t + Δ t, t)$, we need to take into account that they are a function of both the parameters $θ$ and the data $D$ (in the definition of $D$ here we also include $\{X_{i}, i = 1, \dots, n_{r}\}$). The variance of both estimators can be derived using similar arguments, and here we will only show how to calculate the variance of the former. To make the dependence on the data $D$ more explicit, we write the estimator of the marginal salvage therapy effect as

{\hat{ST}}^{M} (t + Δ t, t; D) = E_{θ ∣ D} \{{ST}^{M} (t + Δ t, t; θ, D)\}

To account for both the variability in the parameters and the sampling variability, our target variance is

{var}_{D} \{{\hat{ST}}^{M} (t + Δ t, t; θ, D)\} = {var}_{D} [E_{θ ∣ D} \{{ST}^{M} (t + Δ t, t; θ, D)\}]

(9)

We will adapt the approach of Antonelli et al.¹³ to obtain an estimate of (9). More specifically, we let $\{D^{(1)}, \dots, D^{(M)}\}$ denote $M$ datasets sampled with replacement from $D$. For each of these datasets, we calculate

{\hat{ST}}^{M} (t + Δ t, t; θ, D^{(m)}) = E_{θ ∣ D} \{{ST}^{M} (t + Δ t, t; θ, D^{(m)})\}, \quad m = 1, \dots, M

Note that the expectation is taken with respect to the posterior distribution of the parameters using the original data $D$ (i.e. we do not refit the model for each dataset $D^{(m)}$). Calculating the sample variance of these $M$ estimates, we obtain

{var}_{D^{(m)}} [E_{θ ∣ D} \{{ST}^{M} (t + Δ t, t; θ, D^{(m)})\}]

(10)

Even though (10) resembles our target variance, it ignores the variability in the posterior due to the different samples $\{D^{(1)}, \dots, D^{(M)}\}$. Hence, to get correct inferences, we add a correction term:

\begin{aligned} {\hat{var}}_{D} [E_{θ ∣ D} \{{ST}^{M} (t + Δ t, t; θ, D)\}] & = {var}_{D^{(m)}} [E_{θ ∣ D} \{{ST}^{M} (t + Δ t, t; θ, D^{(m)})\}] \\ & \quad + {var}_{θ ∣ D} \{{ST}^{M} (t + Δ t, t; θ, D)\} \end{aligned}

(11)
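The two terms in (11) can be illustrated with a minimal Python sketch. It is not the JMbayes2 implementation: `toy_effect` is a hypothetical stand-in for ${ST}^{M} (t + Δ t, t; θ, D)$, and the "posterior draws" and "patients" are simulated numbers chosen only to make the example run.

```python
import random
import statistics


def toy_effect(theta, data):
    """Hypothetical stand-in for ST^M(t + dt, t; theta, D): a patient-averaged effect."""
    return statistics.fmean([theta * x for x in data])


def corrected_variance(theta_draws, data, effect, n_boot=100, seed=1):
    """Return (bootstrap term, posterior term, their sum), mirroring (10) and (11)."""
    rng = random.Random(seed)
    n = len(data)
    boot_means = []
    for _ in range(n_boot):
        # Resample patients with replacement; the posterior draws are reused (no refit).
        resampled = [data[rng.randrange(n)] for _ in range(n)]
        # Posterior mean of the effect evaluated on the resampled dataset.
        boot_means.append(statistics.fmean([effect(th, resampled) for th in theta_draws]))
    sampling_var = statistics.variance(boot_means)
    # Variance of the effect over the posterior draws, evaluated on the original data.
    posterior_var = statistics.variance([effect(th, data) for th in theta_draws])
    return sampling_var, posterior_var, sampling_var + posterior_var


rng = random.Random(0)
data = [rng.gauss(1.0, 0.5) for _ in range(50)]            # fake patient-level inputs
theta_draws = [rng.gauss(-0.1, 0.02) for _ in range(100)]  # fake posterior sample
sampling_var, posterior_var, total_var = corrected_variance(theta_draws, data, toy_effect)
print(sampling_var, posterior_var, total_var)
```

The key design point carried over from the text is that the same posterior sample is reused for every resampled dataset, so the cost is that of re-evaluating the effect, not of refitting the model.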

To derive the variance of ${\hat{ST}}_{i}^{C} (t + Δ t, t)$, we need to account for the sampling variability in $Y_{i} (t)$, that is, the PSA values patient $i$ would have shown if we “cloned” him. To account for this variability, we cannot use the same idea as for ${\hat{ST}}^{M} (t + Δ t, t)$ above because we cannot obtain samples with replacement from $Y_{i} (t)$. As an alternative, we employ a parametric bootstrap approach; that is, we create different versions of the history $\{Y_{i}^{(1)} (t), \dots, Y_{i}^{(M)} (t)\}$ using the following Monte Carlo scheme:

We sample ${\ddot{θ}}^{(l)}$ from the MCMC sample of the posterior distribution $[θ ∣ D]$.

We sample ${\ddot{u}}_{i}^{(l)}$ from the posterior distribution $[u_{i} ∣ T_{i} > t, Y_{i} (t), X_{i}, {\ddot{θ}}^{(l)}]$.

We simulate $Y_{i}^{(m)} (t)$ using independent draws from $[y_{i} ∣ {\ddot{u}}_{i}^{(l)}, X_{i}, {\ddot{θ}}^{(l)}]$, that is, from the linear mixed model (4) with $t < S_{i}$.
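The three steps above can be sketched in Python for a deliberately simplified case. The sketch assumes a random-intercept linear model, $y_{i j} = β_{0} + β_{1} t_{i j} + u_{i} + ε_{i j}$, for which the posterior of $u_{i}$ is available in closed form, and it ignores the conditioning on $T_{i} > t$ and on covariates. The function name, the parameter tuples, and the data are all hypothetical.

```python
import random


def simulate_histories(theta_draws, times, y_obs, n_rep=100, seed=1):
    """Create n_rep simulated versions of one patient's longitudinal history."""
    rng = random.Random(seed)
    histories = []
    for _ in range(n_rep):
        # Step 1: draw one parameter vector from the stored posterior sample.
        beta0, beta1, sd_u, sd_e = rng.choice(theta_draws)
        # Step 2: draw the random intercept from its conditional posterior
        # (conjugate normal; the survival information T_i > t is ignored here).
        resid_sum = sum(y - beta0 - beta1 * t for t, y in zip(times, y_obs))
        post_var = 1.0 / (1.0 / sd_u**2 + len(times) / sd_e**2)
        post_mean = post_var * resid_sum / sd_e**2
        u = rng.gauss(post_mean, post_var**0.5)
        # Step 3: simulate a new history at the observed visit times.
        histories.append([beta0 + beta1 * t + u + rng.gauss(0.0, sd_e) for t in times])
    return histories


# Hypothetical posterior draws (beta0, beta1, sd_u, sd_e) and one patient's data.
theta_draws = [(0.10, 0.05, 0.30, 0.20), (0.12, 0.04, 0.35, 0.25), (0.09, 0.06, 0.28, 0.22)]
times = [0.5, 1.0, 2.0, 3.5]
y_obs = [0.20, 0.25, 0.40, 0.55]
histories = simulate_histories(theta_draws, times, y_obs)
print(histories[0])
```

In the full joint model, step 2 has no closed form and would itself require sampling, which is where the nested Monte Carlo cost arises.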

Subsequently, we use the same procedure as for the variance of ${\hat{ST}}^{M} (t + Δ t, t; D)$. Namely, along the lines of (11), we use

\begin{aligned} {\hat{var}}_{Y_{i}} [E_{θ ∣ Y_{i}} \{{ST}^{C} (t + Δ t, t; θ, Y_{i} (t))\}] & = {var}_{Y_{i}^{(m)}} [E_{θ ∣ Y_{i}} \{{ST}^{C} (t + Δ t, t; θ, Y_{i}^{(m)} (t))\}] \\ & \quad + {var}_{θ ∣ Y_{i}} \{{ST}^{C} (t + Δ t, t; θ, Y_{i} (t))\} \end{aligned}

Again, the first term accounts for the variability in $Y_{i} (t)$ and the second for the parameters’ variability. The calculation of the first term entails obtaining the Monte Carlo estimate (8) for each realization $Y_{i}^{(m)} (t)$ (i.e. we have nested Monte Carlo schemes). Because of its parametric nature, this estimator relies more heavily on the model’s formulation.

An example of how the package JMbayes2 can be used to estimate these causal effects and their variances is given at the following URL: https://drizopoulos.github.io/JMbayes2/articles/Causal_Effects.html. The running time for this example on a laptop with an Intel(R) Core(TM) i9-10885H CPU @ 2.40 GHz with eight physical cores and 32.0 GB RAM, running Microsoft Windows 11 Pro, is 40 s.

5. University of Michigan Prostatectomy Data – Analysis

We return to the UMP dataset, which we will use to estimate the salvage therapy effects introduced in Section 2. More information on the dataset and on the pre-processing we applied is given in Supplemental Section 1.1. In the original version of the dataset, some patients received salvage therapy multiple times. For those patients, we have only considered the first time they received salvage therapy and have used in the analysis only the PSA measurements taken up to 1.5 years after salvage therapy.

The joint model we fitted to the final dataset had the following specification. For the PSA trajectories, we use a modified version of the generic model (4), that is,

\log \{{PSA}_{i} (t_{i j}) + 1\} = \begin{cases} η_{i} (t_{i j}) + ε_{i} (t_{i j}) = β_{0 i} + \sum_{k = 1}^{8} β_{k i} B_{k} (t_{i j}, v) + x_{\text{base}, i}^{⊤} λ + ε_{i} (t_{i j}), & t_{i j} \leq S_{i} \\ {\tilde{η}}_{i} (t_{i j}) + ε_{i} (t_{i j}) = η_{i} (t_{i j}) + {\tilde{β}}_{0 i} + {\tilde{β}}_{1 i} (t_{i j} - S_{i}) + x_{\text{base}, i}^{⊤} \tilde{λ} + ε_{i} (t_{i j}), & t_{i j} > S_{i} \end{cases}

The subject-specific coefficients $β_{0 i}, \dots, β_{8 i}$, ${\tilde{β}}_{0 i}$, and ${\tilde{β}}_{1 i}$ are decomposed into a fixed and a random part. The combined random-effects vector $b_{i} = (b_{0 i}, \dots, b_{8 i}, {\tilde{b}}_{0 i}, {\tilde{b}}_{1 i})^{⊤}$ is assumed to follow a multivariate normal distribution with mean zero and covariance matrix $Ω$, and is independent of the error terms $ε_{i j}$ that follow a normal distribution with mean zero and variance $σ^{2}$. In both branches of the model, and via the terms $x_{\text{base}, i}^{⊤} λ$ and $x_{\text{base}, i}^{⊤} \tilde{λ}$, we control for the baseline covariates age, baseline PSA, Gleason score, and the Charlson comorbidity index. After prostatectomy, most patients will exhibit close to zero PSA levels; however, for some patients, at some point, PSA will start to rapidly increase, triggering the urologists to initiate salvage therapy. The time PSA will start increasing differs per patient, requiring a flexible model to capture these evolutions. Hence, we postulate a nonlinear model with a cubic B-spline for the time variable, with internal knots placed at $v = \{0.5, 1, 3, 5, 7, 9\}$ years after prostatectomy and boundary knots placed at 0 and 17.1 years after prostatectomy (i.e. the terms $B_{k} (t_{i j}, v)$). After salvage, we assume that the PSA will drop by ${\tilde{β}}_{0 i}$ and then have a modified linear increase before metastasis. We allow for a nonlinear PSA profile during follow-up with a change in the linear slope after salvage because of the limited available information in the data for $t > S_{i}$. Before salvage, patients had a median of six PSA measurements (IQR = 6), whereas, after salvage, the median of available PSA measurements was three (IQR = 0). We should note that our model does not account for undetectable PSA levels; in the dataset these were set to zero. A better approach would be to account for the limit of detection in the formulation of the likelihood of the longitudinal submodel by using the cumulative distribution function instead of the probability density function for these values.
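The limit-of-detection idea mentioned at the end of the previous paragraph can be sketched as follows. This is an illustration under a normal error model on the transformed scale, not part of the fitted model; the function names and the `detected` flag are hypothetical.

```python
import math


def normal_logpdf(y, mean, sd):
    """Log density of a normal distribution."""
    z = (y - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(y, mean, sd):
    """Log of P(Y <= y) for a normal distribution, floored to avoid log(0)."""
    z = (y - mean) / sd
    prob = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return math.log(max(prob, 1e-300))


def loglik_contribution(value, mean, sd, detected):
    """Density for a quantified measurement; P(Y <= limit) for an undetectable one.

    For undetectable measurements, `value` is the detection limit on the
    modelled scale, e.g. log(limit + 1).
    """
    if detected:
        return normal_logpdf(value, mean, sd)
    return normal_logcdf(value, mean, sd)


quantified = loglik_contribution(0.0, 0.0, 1.0, detected=True)
censored = loglik_contribution(0.0, 0.0, 1.0, detected=False)
print(quantified, censored)
```

With this formulation an undetectable value contributes the probability mass below the limit, instead of being treated as an exact observation of zero.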

For the hazard of metastasis, we postulate the relative risk models

h_{i}^{m} (t) = h_{0}^{m} (t) \exp \{ψ_{m}^{⊤} w_{m i} + A (t)\}

where for the time-varying component $A (t)$ we consider the following versions

\begin{aligned}
M_{1} : A (t) & = \begin{cases} α_{m 1} η_{i} (t), & t \leq S_{i} \\ γ_{m 1} (t - S_{i}) + γ_{m 2} \{(t - S_{i}) \times {basePSA}_{i}\} + ξ_{m 1} {\tilde{η}}_{i} (t), & t > S_{i} \end{cases} \\
M_{2} : A (t) & = \begin{cases} α_{m 1} η_{i} (t) + α_{m 2} \frac{d η_{i} (t)}{d t}, & t \leq S_{i} \\ γ_{m 1} (t - S_{i}) + γ_{m 2} \{(t - S_{i}) \times {basePSA}_{i}\} + ξ_{m 1} {\tilde{η}}_{i} (t), & t > S_{i} \end{cases} \\
M_{3} : A (t) & = \begin{cases} α_{m 1} η_{i} (t) + α_{m 2} \frac{d η_{i} (t)}{d t}, & t \leq S_{i} \\ γ_{m 1} (t - S_{i}) + γ_{m 2} \{(t - S_{i}) \times {basePSA}_{i}\} + ξ_{m 2} {\tilde{β}}_{0 i} + ξ_{m 3} {\tilde{β}}_{1 i}, & t > S_{i} \end{cases} \\
M_{4} : A (t) & = \begin{cases} α_{m 1} η_{i} (t) + α_{m 3} \{\int_{0}^{t} η_{i} (v) d v / t\}, & t \leq S_{i} \\ γ_{m 1} (t - S_{i}) + γ_{m 2} \{(t - S_{i}) \times {basePSA}_{i}\} + ξ_{m 2} {\tilde{β}}_{0 i} + ξ_{m 3} {\tilde{β}}_{1 i}, & t > S_{i} \end{cases}
\end{aligned}

Coefficients $α_{m 1}$, $α_{m 2}$, and $α_{m 3}$ quantify the association between the current value, current slope, and average $\log (PSA + 1)$ in the period before salvage and the hazard of metastasis, respectively. Coefficient $ξ_{m 1}$ quantifies the association between the current value of $\log (PSA + 1)$ after salvage and the instantaneous risk of metastasis; coefficients $ξ_{m 2}$ and $ξ_{m 3}$ quantify the associations of the drop in PSA just after salvage and of the change in the linear slope of $\log (PSA + 1)$ after salvage, respectively, with the instantaneous risk of metastasis. Also, the coefficient $γ_{m 1}$ quantifies the association between the length of the period the patient was on salvage therapy and the hazard of metastasis, and coefficient $γ_{m 2}$ the interaction between the time a patient has been on salvage and baseline PSA. The baseline covariates part $ψ_{m}^{⊤} w_{m i}$ includes the effects of age, baseline PSA, the Gleason score, and the Charlson comorbidity index. The baseline hazard is approximated with B-splines, as explained earlier. For the hazard of death, we assume the model

h_{i}^{d} (t) = h_{0}^{d} (t) \exp \{ψ_{d 1} (Age - 50) + γ_{d 1} N_{i} (t)\}

The prior distributions we assumed can be found in the Supplemental Material. Samples from the posterior distribution of the model parameters of these four joint models were obtained with the MCMC algorithm provided in the R package JMbayes2, using three parallel chains started from different initial values. Each chain ran for 55,000 iterations, discarding the first 5000 iterations as burn-in and applying a thinning of 10 iterations. Hence, each chain contained 5000 realizations of all model parameters. The priors used for the various parameters were the same as in Section 3.2. Trace plots and the potential scale reduction factor ($\hat{R}$) showed satisfactory convergence of the MCMC algorithm. In particular, for all model parameters the $\hat{R}$-values were smaller than 1.09. We have also performed a sensitivity analysis for the choice of the prior distributions. We found that for almost all parameters the choice of the prior had little influence on the results. The only parameters for which the impact was greater were $τ_{m}$ and $τ_{d}$, which control the shape of the baseline hazard functions for metastasis and death, respectively.
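As a small illustration of the MCMC bookkeeping described above, the following Python sketch checks the number of retained draws per chain and computes the classic Gelman-Rubin form of the potential scale reduction factor for a scalar parameter. The exact variant reported by JMbayes2 may differ; the function names and the simulated chains are hypothetical.

```python
import random
import statistics


def retained_draws(n_iter, burn_in, thin):
    """Number of draws kept per chain after burn-in and thinning."""
    return (n_iter - burn_in) // thin


def potential_scale_reduction(chains):
    """Classic Gelman-Rubin R-hat for equal-length chains of one scalar parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


kept = retained_draws(55_000, 5_000, 10)

rng = random.Random(42)
# Three chains targeting the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Same chains, but the third one is stuck around a different value.
shifted = [mixed[0], mixed[1], [x + 3.0 for x in mixed[2]]]

rhat_mixed = potential_scale_reduction(mixed)
rhat_shifted = potential_scale_reduction(shifted)
print(kept, rhat_mixed, rhat_shifted)
```

Values close to 1 indicate that the between-chain variability is small relative to the within-chain variability, which is the pattern the text reports for all model parameters.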

We first evaluate the model fit before focusing on the results. All fitted joint models showed a similar fit to the longitudinal data; therefore, we only show the results from model $M_{1}$. Supplemental Figures 3 to 5 show the observed data and fitted longitudinal trajectories for 48 selected patients from the dataset. These patients have been selected to showcase different profiles seen in the data, including patients who did and did not initiate salvage therapy. We observe from these figures that the postulated longitudinal submodel provides a good fit to the observed data. As previously observed,¹⁴ this is an important aspect in joint models. In addition, using residual plots we assessed the normality and homoscedasticity assumptions for the error terms; these plots did not indicate any violations of these assumptions. Table 1 shows the deviance information criterion (DIC), the Watanabe-Akaike information criterion (WAIC), and the log pseudo marginal likelihood (LPML) for the four fitted joint models.

Table 1.
Deviance information criterion (DIC), Watanabe-Akaike information criterion (WAIC), and log pseudo marginal likelihood (LPML) for the fitted joint models $M_{1}$ to $M_{4}$.

Model	DIC	WAIC	LPML
$M_{4}$	23505.49	12902244.27	−130702.53
$M_{3}$	25920.73	17954385.81	−177836.90
$M_{1}$	43554.78	33007334.39	−263741.71
$M_{2}$	91596.92	44668135.08	−270099.58

According to both WAIC and DIC, Model $M_{4}$ provides better predictions compared to the other three models.

For the event process, we show in Supplemental Table 2 the posterior means and the corresponding 95% credible intervals for the coefficients of the relative risk models for metastasis and death. All models suggest that before initiating salvage, high PSA levels at time $t$, high PSA velocity at $t$, and high average PSA from baseline to $t$ translate to a greater risk of metastasis at $t$. Models $M_{1}$ and $M_{2}$ indicate that after salvage has been initiated, high PSA levels and high PSA velocity at time $t$ translate to a higher risk of metastasis. Models $M_{3}$ and $M_{4}$ suggest that the greater the drop in PSA levels after salvage initiation, the lower the risk of metastasis afterward, and the greater the PSA velocity after salvage, the higher the risk of metastasis. For the longitudinal process, the corresponding coefficients are shown in Supplemental Table 3. We observe that the baseline PSA, the Gleason score, and the Charlson comorbidity index seem to be associated with the level of post-surgery PSA. There is no indication that the effect of these confounders changes after the initiation of salvage therapy.

We now turn our focus to estimating causal salvage therapy effects from the fitted joint models. To contrast the different effect types, we show the conditional effect (2) for Patients 490 and 327, the marginal-conditional effect (3) that averages over the group of patients who at their last visit had a PSA value > 0.5 ng/mL, and the marginal effect (1) that averages over all patients irrespective of their PSA values. These effects have been calculated at $t = 5, 7, 9$, and $13$ years after the initial surgery and for a medically relevant time window of $Δ t = 2$ years. The 95% confidence intervals are calculated using the variance estimate presented in Section 4.2. The number of patients based on which the marginal and marginal-conditional effects have been calculated is shown in Supplemental Table 1. Figures 1 to 4 present these results.

Figure 1.

Salvage therapy effects for follow-up times $t = 5, 7, 9$, and $13$ years and $Δ t = 2$ under model $M_{1}$. For Patients 490 and 327, conditional causal effects are shown. The marginal-conditional causal effect is for patients who had a prostate-specific antigen (PSA) value greater than or equal to 0.5 ng/mL at their last visit. The marginal effect is for all patients at risk at the corresponding $t$.

Figure 2.

Salvage therapy effects for follow-up times $t = 5, 7, 9$, and $13$ years and $Δ t = 2$ under model $M_{2}$. For Patients 490 and 327, conditional causal effects are shown. The marginal-conditional causal effect is for patients who had a prostate-specific antigen (PSA) value greater than or equal to 0.5 ng/mL at their last visit. The marginal effect is for all patients at risk at the corresponding $t$.

Figure 3.

Salvage therapy effects for follow-up times $t = 5, 7, 9$, and $13$ years and $Δ t = 2$ under model $M_{3}$. For Patients 490 and 327, conditional causal effects are shown. The marginal-conditional causal effect is for patients who had a prostate-specific antigen (PSA) value greater than or equal to 0.5 ng/mL at their last visit. The marginal effect is for all patients at risk at the corresponding $t$.

Figure 4.

Salvage therapy effects for follow-up times $t = 5, 7, 9$, and $13$ years and $Δ t = 2$ under model $M_{4}$. For Patients 490 and 327, conditional causal effects are shown. The marginal-conditional causal effect is for patients who had a prostate-specific antigen (PSA) value greater than or equal to 0.5 ng/mL at their last visit. The marginal effect is for all patients at risk at the corresponding $t$.

The estimated values of the conditional and marginal-conditional salvage therapy effects tend to be negative, as expected, since salvage therapy is considered to be beneficial, but they are also small in magnitude. They are small because the probability of metastasis in the next two years is itself small, even in the absence of salvage therapy, so the room for improvement is also small. The results for the salvage therapy effects across models align with the points raised in Section 2.1. In particular, we observe that the marginal effects are the smallest in magnitude, followed by the marginal-conditional and conditional effects. The same ordering is observed for the variance of these effects, with the marginal effects having the smallest variance and the conditional effects the largest. Also, we observe that the conditional effects are more adaptable to the shape of the PSA profile. For example, while the marginal and marginal-conditional effects become smaller in magnitude at later follow-up times, the conditional effect for Patient 490 increases in size because this subject shows a steeply increasing PSA trajectory. This also happens for Patient 327 but to a lesser degree because his PSA profile is less steep. The variance of the conditional effects also increases with increasing PSA values. This reflects the fact that, because the majority of patients in the sample showed stable PSA profiles close to zero, the model is less “certain” about the shape of increasing PSA trajectories.

In this section, we estimated the marginal-conditional effect at time $t$ by averaging the causal effects of individuals whose last PSA was > 0.5 ng/mL. A different marginal-conditional effect would have been obtained if we had used different criteria for whom to average over. For example, we could have used a range of PSA, say 0.5–4.0 ng/mL. The criteria could have also included requirements such as a Gleason score of at least 8 and an age < 75. We might also exclude from the set any patient whose last PSA was not current, for example, measured more than 2 years ago. Restrictions such as these could make the causal estimates more targeted to an individual patient, but at the expense of greater variance.

6. Simulation

We performed a simulation study to assess the performance of the approaches presented in the previous sections. Note that because the causal effects (1)–(3) entail most of the model parameters in their definition, it is very challenging to simulate data with specific values for the causal effects. This prevents us from setting up a simulation to assess whether specific values for the causal effects can be unbiasedly estimated. However, the key assumption behind our approach is that joint models can be unbiasedly estimated under time-varying confounding and do not require a model for the treatment initiation process. This property of joint models has not been established before to our knowledge. Therefore, we focus here on validating the finite sample performance of joint models when the decision to initiate salvage therapy heavily depends on past PSA values. By letting the decision to perform salvage therapy depend on previous PSA values, we thus mimic a causal relationship between the decision to perform salvage therapy and the history of PSA. Furthermore, to reduce the computational burden, we opted for a simplified setting with no additional covariates and linear time evolutions. Also, when simulating PSA values, we have not imposed the constraint that they should be positive.

More specifically, we assumed 1000 subjects and randomly selected follow-up visits $t_{i j}$ from a uniform distribution between 0 and 20. To mimic the causal relationship between salvage therapy and PSA history, the timing of salvage therapy was assumed to depend on the value of PSA. Specifically, if PSA was smaller than 2 ng/mL, the probability of receiving salvage at that visit time was 0.01; when PSA was between 2 and 4 ng/mL, the probability of receiving treatment was 0.5; and for PSA > 4 ng/mL, the probability was set to 0.9. If salvage therapy was not given according to the model described previously, the subject was assumed not to undergo salvage therapy but to still be at risk for metastasis or death. More details for the settings of the simulation study are given in Supplemental Section 2. We then simulated 300 datasets under three different scenarios for the association structure between features of the longitudinal PSA values and the hazard of metastasis. In Scenario 1, we considered an association with the current value of PSA; in Scenario 2, the current value and the current slope before salvage therapy and only the current value after salvage therapy; and in Scenario 3, the current value and cumulative effect before salvage therapy and the current value after salvage therapy. We assumed no association between the PSA history and the hazard of death. The detailed model specification and parameter values are provided in the Supplemental Material, along with visualizations of the simulated data in Supplemental Figures 6 to 8.
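The PSA-dependent treatment assignment described above can be sketched in Python as follows. The probabilities are those stated in the text; the handling of PSA values exactly equal to 2 or 4 ng/mL, the function names, and the example visit data are assumptions made only for illustration.

```python
import random


def salvage_probability(psa):
    """Probability of starting salvage therapy at a visit, given the PSA value (ng/mL)."""
    if psa < 2.0:
        return 0.01
    if psa <= 4.0:  # boundary values are assigned to the middle band (an assumption)
        return 0.5
    return 0.9


def first_salvage_time(visit_times, psa_values, rng):
    """Return the first visit time at which salvage is initiated, or None if never."""
    for t, psa in zip(visit_times, psa_values):
        if rng.random() < salvage_probability(psa):
            return t
    return None


rng = random.Random(2024)
visit_times = [1.2, 3.5, 6.0, 9.8, 14.1]   # hypothetical visit times
psa_values = [0.1, 0.4, 2.6, 4.8, 7.5]     # hypothetical PSA values at those visits
salvage_time = first_salvage_time(visit_times, psa_values, rng)

# Empirical check of the assignment rule in the highest PSA band.
share_treated = sum(rng.random() < salvage_probability(5.0) for _ in range(10_000)) / 10_000
print(salvage_time, share_treated)
```

Because the assignment depends only on the observed PSA values, the treatment initiation process is ignorable given the PSA history, which is the setting the simulation is designed to examine.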

The simulation study results are presented in Table 2.

Table 2.
Mean, bias, and root mean square error (RMSE) for the regression coefficients from the three different simulation scenarios.

	Scenario 1			Scenario 2			Scenario 3
Parameter	Mean	Bias	RMSE	Mean	Bias	RMSE	Mean	Bias	RMSE
$β_{0}$	−1.745	−0.005	−0.036	−1.745	−0.005	−0.036	−1.745	−0.005	−0.036
$β_{1}$	−0.038	−0.004	−0.012	−0.039	−0.005	−0.012	−0.038	−0.005	−0.012
${\tilde{β}}_{0}$	−5.840	−0.000	−0.065	−5.838	−0.002	−0.066	−5.838	−0.002	−0.065
${\tilde{β}}_{1}$	−0.194	−0.012	−0.016	−0.194	−0.013	−0.017	−0.194	−0.013	−0.017
$γ_{d 1}$	−0.250	−0.000	−0.223	−0.246	−0.004	−0.233	−0.238	−0.012	−0.265
$γ_{m 1}$	−0.266	−0.016	−0.150	−0.264	−0.014	−0.150	−0.262	−0.012	−0.150
$α_{m 1}$	−0.810	−0.035	−0.096	−0.808	−0.037	−0.127	−0.780	−0.065	−0.165
$α_{m 2}$	–	–	–	−1.048	−0.152	−1.816	–	–	–
$α_{m 3}$	–	–	–	–	–	–	−0.558	−0.058	−0.267
$ξ_{m 1}$	−0.603	−0.008	−0.027	−0.605	−0.010	−0.027	−0.604	−0.009	−0.026

A detailed discussion of these results is given in Supplemental Section 2.2. In general, the results suggest that joint models can unbiasedly estimate the parameters of the joint distribution of the longitudinal and event time outcomes in the presence of time-varying confounding. Hence, we expect that well-specified joint models can be used to estimate causal effects for time-varying treatments.
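For reference, the summary measures reported in Table 2 can be computed from the estimates obtained across simulated datasets along the following lines. This is a generic sketch; the sign convention for the bias (mean estimate minus true value), the function name, and the example numbers are assumptions and are not taken from the simulation study.

```python
import math
import statistics


def summarize(estimates, truth):
    """Return the mean estimate, the bias, and the root mean square error."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - truth
    rmse = math.sqrt(statistics.fmean([(e - truth) ** 2 for e in estimates]))
    return mean_est, bias, rmse


# Hypothetical estimates of one coefficient from four simulated datasets.
estimates = [0.9, 1.1, 1.0, 1.2]
mean_est, bias, rmse = summarize(estimates, truth=1.0)
print(mean_est, bias, rmse)
```

The RMSE combines the bias and the variability of the estimates, so it is never smaller than the absolute bias.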

7. Discussion

In this article, we have showcased how causal effects for time-varying treatments (or exposures) can be estimated using the framework of joint models for longitudinal and time-to-event data. These models account for time-varying confounding without requiring an explicit specification of a model for the probability of receiving treatment conditional on the history of longitudinal confounders and past treatments. The causal effects (1)–(3) are in the flavor of the parametric G-formula and, by conditioning on different specifications of the longitudinal PSA history, correspond to different targets of inference. In our model, we specified a linear mixed-effects model to link the longitudinal PSA measurements and the risk of metastasis. An alternative approach is to postulate a mixed model with nonlinear ordinary differential equations and link the PSA kinetics to the risk of metastasis.¹⁵

Our approach relies on parametric assumptions and is expected to use the available data efficiently. However, the estimated causal effects will be biased if these assumptions are seriously violated. To minimize the chance of biased estimated effects, it is advisable to thoroughly check the model’s assumptions using residual plots and to evaluate the model’s fit with figures such as Supplemental Figures 3 to 5. As we have seen in Section 3.2, an advantage of our full likelihood approach is that we do not need to model the salvage therapy initiation process. This property of our approach also holds for all other processes that may depend on the observed PSA history. For example, if the treating urologists decide to change the visiting process (i.e. when patients should come back for their next PSA test), and if this decision is solely based on the past observed PSA history of the patient (and possibly covariates), then our modeling approach will still provide valid results without requiring a model for this process. The same consideration also holds for the censoring process, that is, our approach allows censoring to depend in a complex manner on past longitudinal measurements and does not require a model to derive censoring weights. In particular, we feel that there are two general routes to follow for deriving causal effects from observational data: either to make no assumptions for the measurement process but be required to derive weighting models for the other competing processes, or to make stronger assumptions for the measurement model and no assumptions for the other competing mechanisms. We have selected here the latter approach.

In this article, we have focused on the effect of salvage therapy at a fixed time $t$ for the individual patient or for a group of patients with similar PSA histories. In the former case, we envisage using our approach in the context of shared decision-making for an individual patient. Namely, the doctor and the patient can assess the current risk of metastasis and how much this can be lowered by starting salvage therapy. This should be weighed against the potential side effects of salvage therapy. In the context of a group of patients with similar PSA histories, our approach can be used to estimate the causal effect of different policies. For example, one policy might start salvage therapy when PSA first goes above 1.0 ng/mL. In contrast, another policy could delay the start of salvage therapy until PSA first goes above 4.0 ng/mL. A micro-simulation approach could then be used in which the parameter estimates from the joint model allow us to simulate realistic data under both these scenarios. Then, the causal effect would be defined as the difference in the incidence rate of metastasis.

As mentioned in Section 5, we have elected only to consider the first time patients received salvage therapy. However, we should stress that this is not a limitation of our proposed modeling framework. Namely, the model could be adjusted to specify the longitudinal subject-specific PSA profiles under multiple salvage therapy interventions. However, we have not done this in our analysis because few patients had received salvage therapy more than once. Hence, there was insufficient information in the data to estimate the changes in the PSA profiles after these interventions. In addition, our model could also be extended to account for deaths after metastasis by specifying a multi-state process model with a third hazard equation for the transition from metastasis to death.

Supplemental Material

Supplemental material, sj-pdf-1-smm-10.1177_09622802241239003 for Using joint models for longitudinal and time-to-event data to investigate the causal effect of salvage therapy after prostatectomy by Dimitris Rizopoulos, Jeremy MG Taylor, Grigorios Papageorgiou and Todd M Morgan in Statistical Methods in Medical Research

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors thank the NIH CISNET Prostate Award CA253910 for financial support.

Supplemental material

Supplemental material for this article is available online. Supplemental Figures 1 to 8 and Supplemental Tables 1 to 3 are available with this article as Supplemental Material.

References

1. Spratt, Dess, Zumsteg, et al. A systematic review and framework for the use of hormone therapy with salvage radiation therapy for recurrent prostate cancer. Eur Urol 2018; 73: 156–165.

2. Beesley, Morgan, Spratt, et al. Individual and population comparisons of surgery and radiotherapy outcomes in prostate cancer using Bayesian multistate models. JAMA Network Open 2019; 2: e187765.

3. Schaubel, Wolfe, Port. A sequential stratification method for estimating the effect of a time-dependent experimental treatment in observational studies. Biometrics 2006; 62: 910–917.

4. Schaubel, Wolfe, Sima, et al. Estimating the effect of a time-dependent treatment by levels of an internal time-dependent covariate: application to the contrast between liver wait-list and posttransplant mortality. J Am Stat Assoc 2009; 104: 49–59.

5. Kennedy, Taylor JMG, Schaubel, et al. The effect of salvage therapy on survival in a longitudinal study with treatment by indication. Stat Med 2010; 29: 2569–2580.

6. Taylor JMG, Shen, Kennedy, et al. Comparison of methods for estimating the effect of salvage therapy in prostate cancer when treatment is given by indication. Stat Med 2014; 33: 257–274.

7. Hernán, Robins. Causal Inference: What If. Boca Raton: Chapman & Hall/CRC, 2020.

8. Tsiatis, Davidian, Holloway, et al. Dynamic Treatment Regimes: Statistical Methods for Precision Medicine. Boca Raton: Chapman & Hall/CRC, 2020.

9. Müller, Wahed, et al. Bayesian nonparametric estimation for dynamic treatment regimes with sequential transition times. J Am Stat Assoc 2016; 111: 921–950.

10. Rizopoulos. Joint Models for Longitudinal and Time-to-Event Data, With Applications in R. Boca Raton: Chapman & Hall/CRC, 2012.

11. Lewandowski, Kurowicka, Joe. Generating random correlation matrices based on vines and extended onion method. J Multivar Anal 2009; 100: 1989–2001.

12. Rizopoulos, Papageorgiou, Afonso. JMbayes2: Extended Joint Models for Longitudinal and Time-to-Event Data, 2023. http://CRAN.R-project.org/package=JMbayes2, R package version 0.4-5.

13. Antonelli, Papadogeorgou, Dominici. Causal inference in high dimensions: a marriage between Bayesian modeling and good frequentist properties. Biometrics 2022; 78: 100–114.

14. Ferrer, Putter, Proust-Lima. Individual dynamic predictions using landmarking and joint modelling: validation of estimators and robustness assessment. Stat Methods Med Res 2019; 28: 3649–3666.

15. Desmée, Mentré, Veyrat-Follet, et al. Using the SAEM algorithm for mechanistic joint models characterizing the relationship between nonlinear PSA kinetics and survival in prostate cancer patients. Biometrics 2017; 73: 305–312.
