Sage Journals: Discover world-class research

Abstract

Competing risks models can involve more than one time scale. A relevant example is the study of mortality after a cancer diagnosis, where time since diagnosis but also age may jointly determine the hazards of death due to different causes. Multiple time scales have rarely been explored in the context of competing events. Here, we propose a model in which the cause-specific hazards vary smoothly over two times scales. It is estimated by two-dimensional $P$ -splines, exploiting the equivalence between hazard smoothing and Poisson regression. The data are arranged on a grid so that we can make use of generalised linear array models for efficient computations. The R-package TwoTimeScales implements the model. As a motivating example we analyse mortality after diagnosis of breast cancer and we distinguish between death due to breast cancer and all other causes of death. The time scales are age and time since diagnosis. We use data from the Surveillance, Epidemiology and End Results (SEER) program. In the SEER data, age at diagnosis is provided with a last open-ended category, leading to coarsely grouped data. We use the two-dimensional penalised composite link model to ungroup the data before applying the competing risks model with two time scales.

Keywords

Cause-specific hazards two-dimensional smoothing penalised composite link model cancer mortality

1. Introduction

Competing risks describe the situation where individuals are at risk of experiencing one of several types of events.¹ The prototype of a competing risks model is the study of cause-specific mortality. Such models are used in clinical studies of cancer to analyse mortality from cancer as well as mortality due to other causes as competing events.

The building blocks in a competing risks analysis are the cause-specific hazards (CSHs). They are defined as the instantaneous risk of experiencing an event of a specific type at time $t$ , given no event (of any type) has happened before $t$ . From them the overall survival function, that is the probability of no event up to time $t$ , and the cumulative incidence functions, the probability of an event of given type before $t$ , can be derived.

Time is a key quantity in any survival analysis, and it can be recorded on several time scales. For example, after a cancer diagnosis, the risk of death may be studied over time since diagnosis, over age, which is time since birth, or over time since treatment. All time scales progress at the same speed and differ only in their origin. However, each time scale is a proxy for a specific mechanism linked to the event of interest.² In the study of mortality after a cancer diagnosis, time since diagnosis measures the cumulative adverse effect of the cancer. Additionally, as individuals age, they become more frail and their capacity of resisting comorbidity deteriorates.

Usually, CSHs are defined for the same single time scale, and little research has been done on how to handle multiple time scales in competing risks models. Existing studies focus on the choice of the time scale to use when modeling competing causes of deaths after a cancer diagnosis. Cancer mortality is clearly a function of time since diagnosis, while other causes of death are more naturally modeled over age. Whenever the CSHs of death for other causes are solely modeled over the time since diagnosis, bias may be introduced in the estimates of the cumulative incidences for cancer mortality.³ Lee et al.⁴ propose a model where two competing causes are modeled on two different time scales and then combined under one of the two scales to estimate the cumulative incidence functions.

Carollo et al.⁵ introduced a model for a single event in which the hazard varies smoothly over two time scales simultaneously and is estimated by tensor products of $P$ -splines.⁶ The model allows great flexibility for the hazard shape, while being simple to estimate. Here, we develop this model further for a competing risks setting. Each CSH varies over the two time scales, and estimation again is achieved by two-dimensional $P$ -splines smoothing. Therefrom, we calculate the cumulative incidence functions for each cause.

This study is motivated by the analysis of mortality after a breast cancer diagnosis. Breast cancer is the most common cancer diagnosed in women worldwide,⁷ and the risk of being diagnosed with it increases over age, but varies by race and ethnic group,⁸ with peaks of diagnosis after age 60 for white women and in the late 40s for non-white women. Survival probabilities after a breast cancer diagnosis are generally very high, with more than 80% of women with breast cancer being alive 10 years after diagnosis,⁹ leading to many women living with a history of breast cancer for many years. Several studies have shown that breast cancer survivors are at higher risk of dying from cardiovascular disease (CVD),^10–12 and there is evidence that this increased risk appears among survivors of breast cancer several years after diagnosis.¹¹ Increased CVD mortality rates among breast cancer survivors appear to be likely linked to adverse effects of therapy and to common risk factors.¹²

Mortality rates of women with breast cancer depend on age, race and cancer subtypes,^13,14 however, the relationship between time since diagnosis, age, age at diagnosis and mortality is not yet clear. Most studies of women with breast cancer find that breast cancer mortality increases after age 70 so that older women are at higher risk of dying because of the cancer.^13,15,16 However, older women with breast cancer often die of causes other than cancer, with CVDs being the most common cause of non-cancer death.¹² Other studies have found that younger ages at diagnosis of breast cancer are associated with higher overall mortality^14,17 and that women diagnosed before age 35 had higher risk of death because of cancer than older women.¹⁸ Although most studies acknowledge the presence of several time scales, they do not account for them explicitly. For example, age at diagnosis is often grouped in broad and arbitrarily-cut categories, and interaction with time since diagnosis is often neglected. Shedding light on the way mortality rates of breast cancer vary with these two time scales, while properly accounting for competing causes, is both of empirical and of methodological relevance.

We analyse mortality data of women with breast cancer from the Surveillance, Epidemiology and End Results (SEER) registers¹⁹ by time since cancer diagnosis and age. We focus on postmenopausal women (age at diagnosis 50 and older) and distinguish between deaths due to breast cancer and all other causes of death. In the SEER data, age at diagnosis is recorded in single years of age up to age 89 and one last open-ended interval for all individuals older than 89 (age at diagnosis 90+). Such grouping is common in data on disease incidence and/or mortality, usually to prevent identification of the patients or to provide a condensed depiction of the data. In this paper we employ the two-dimensional penalised composite link model (PCLM)²⁰ to ungroup the final age interval.

The remainder of this article is structured as follows: in Section 2 we first set the notation for the CSHs with two time scales and review the estimation procedure. Then we introduce calculation of cumulative CSHs, overall survival probabilities and finally cumulative incidence functions with two time scales, together with their standard errors. Section 3 is dedicated to the motivating example. A discussion concludes the paper.

2. A competing risks model with two time scales

Consider individuals that are diagnosed with cancer. Once a patient is diagnosed with the disease, he or she is at risk of dying from cancer (event of interest) or because of causes other than cancer (competing event). The risk of dying from cancer actually sets in at disease onset, however, since this onset is rarely known, time since diagnosis mainly is used as duration of the disease and we will also follow this practice.

For both competing events two time scales are relevant: the first is the age of the individual, which we indicate with $t$ . The second time scale is time since diagnosis of the cancer, which we denote with $s$ . Because age measures the time since birth, we have that $t > s$ for each individual. Both $t$ and $s$ move at the same speed, that is, an increment of one unit (e.g. 1 day or 1 month) on the $t$ -scale corresponds to the same increment on the $s$ -scale.

We consider two competing events: death from breast cancer ( $ℓ = 1$ ) and death from other causes ( $ℓ = 2$ ). The CSHs for event type $ℓ \in {1, 2}$ over the two time scales $t$ and $s$ are defined as

λ_{ℓ} (t, s) = lim_{ν ↓ 0} \frac{P (event of type ℓ \in {t + ν, s + ν} | no event of any type before (t, s))}{ν}

(1)

Here

t > s

, so the two-dimensional hazards

λ_{ℓ} (t, s)

are only defined in the lower half-open triangle of

R_{+}^{2}

. They give the instantaneous risk of dying because of cause

ℓ

for an individual who is alive at age

t

and

s

years after receiving the diagnosis of cancer.

The two time scales differ in their origin, so in the case of cancer patients $t_{0} > 0$ is the age when the individual is diagnosed with cancer, at $s = 0$ . This age is fixed for each individual but differs among patients. In a Lexis diagram with axes $t$ (age) and $s$ (time since diagnosis) individuals move along diagonal lines from $(t_{0}, 0)$ to $(t_{0} + v, v)$ until they leave the risk set (due to event of any kind or censoring). The individual trajectories start at individually different points $(t_{0}, 0)$ .

The same information can also be portrayed in the $(u, s)$ -plane, where $u$ denotes the age at diagnosis and $s$ , as before, is the time since diagnosis. Here individual trajectories run vertically from $(u, s = 0)$ to $(u, s = v)$ . We can view the CSHs equivalently as two-dimensional functions of $u$ and $s$ , ${\overset{˘}{λ}}_{ℓ} (u, s)$ , where

{\overset{˘}{λ}}_{ℓ} (u = t - s, s) \equiv λ_{ℓ} (t, s)

(2)

(see Carollo et al.⁵). The

{\overset{˘}{λ}}_{ℓ} (u, s)

are defined over the full positive quadrant

R_{+}^{2}

From the CSHs $λ_{ℓ} (t, s)$ or ${\overset{˘}{λ}}_{ℓ} (u, s)$ , respectively, we obtain the cumulated CSHs, the overall survival probability and the cumulative incidence functions. In the next section we describe estimation of the CSHs. Thereafter, we provide expressions for the quantities that are obtained from them.

2.1. Cause-specific hazards: Model and estimation

We will model (and estimate) the CSHs ${\overset{˘}{λ}}_{ℓ} (u, s)$ over the $(u, s)$ -plane. Back-transformation to the $(t, s)$ -coordinates is straightforward using equation (2). The only assumption that we are going to make about the CSHs is that they vary smoothly over $u$ and $s$ (and hence over $t$ and $s$ ). This will be achieved by two-dimensional $P$ -spline smoothing (tensor products of $B$ -splines with difference penalties on the rows and columns of coefficients, see Eilers and Marx²¹). The approach was introduced by Carollo et al.⁵ for a single event type and we will extend it here to the competing risks setting.

The $(u, s)$ -plane is divided into $n_{u} \times n_{s}$ small bins of size $h_{u}$ and $h_{s}$ , respectively (rectangles if $h_{u} \neq h_{s}$ and squares otherwise), covering the range of observed values of $u$ and $s$ . The bin widths $h_{u}$ and $h_{s}$ will be chosen relatively narrow and hence the number of bins $n_{u}$ and $n_{s}$ can be relatively large. Within each bin the number of events of type $ℓ$ , denoted by $y_{j k}^{ℓ}$ , and the total time at risk $r_{j k}$ are determined ( $j = 1, \dots, n_{u}$ ; $k = 1, \dots, n_{s}$ ), summed over all individuals $i, i = 1, \dots, n$ . (In the following the superscript $ℓ$ , like in $y_{j k}^{ℓ}$ , indicates the cause and not a power.)

Each individual $i$ contributes to $y_{j k}^{ℓ}$ only in bins with a single index $j$ , namely, where its age at diagnosis $u_{i}$ is located. If the individual experiences an event of type $ℓ$ , then it contributes 1 to the event count $y_{j k}^{ℓ}$ in the bin $(j, k)$ , in which its exit time (since diagnosis) $s_{i}$ falls. In all other bins, and in all bins of the other cause it contributes 0. For an individual that is right-censored (leaves the risk set without having experienced any of the events) all contributions to the $y_{j k}^{ℓ}$ equal zero.

The contribution of an individual’s at-risk time per bin is, again, only in those cells where the first index $j$ is determined by the individual age-at-diagnosis $u_{i}$ , and the contributions are as long as the individual is under observation in the respective bin. Exit from the risk set is caused by experiencing an event of any type or by right-censoring. Late entry can be accommodated in a straightforward way in that an individual contributes only to those bins, where it is in the risk set (and not automatically from $s = 0$ in $r_{j 1}$ ). In our application all individuals are followed-up from diagnosis onward, so there is no late entry. The $r_{j k}$ are the sum over all individual exposure times in the respective bin.

The $y_{j k}^{ℓ}$ can be thought of as realisations of Poisson variates^22,23 with means

{\overset{˘}{μ}}_{j k}^{ℓ} = r_{j k} {\overset{˘}{λ}}_{j k}^{ℓ} = r_{j k} \exp ({\overset{˘}{η}}_{j k}^{ℓ})

(3)

where the

{\overset{˘}{λ}}_{j k}^{ℓ}

represent the CSHs

{\overset{˘}{λ}}_{ℓ} (u, s)

evaluated at the center of bin

(j, k)

. The

{\overset{˘}{η}}_{j k}^{ℓ}

are the corresponding values of the log-hazard,

{\overset{˘}{η}}_{j k}^{ℓ} = \ln {\overset{˘}{λ}}_{j k}^{ℓ}

Because of the same-sized bins, event counts and at-risk times are on a regular grid and are naturally arranged as $n_{u} \times n_{s}$ matrices $Y_{ℓ} = [y_{j k}^{ℓ}]$ and $R = [r_{j k}]$ . Correspondingly, we denote $M_{ℓ} = [{\overset{˘}{μ}}_{j k}^{ℓ}]$ and $E_{ℓ} = [{\overset{˘}{η}}_{j k}^{ℓ}]$ , so that equation (3) can be written more concisely in matrix form as

Y_{ℓ} \sim Poisson (M_{ℓ}) with M_{ℓ} = R ⊙ \exp (E_{ℓ})

(4)

Here

⊙

denotes element-wise multiplication.

To obtain smooth CSH surfaces we model the log-hazards ${\overset{˘}{η}}_{ℓ} (u, s) = \ln {\overset{˘}{λ}}_{ℓ} (u, s)$ via tensor products of $B$ -splines. Modeling the log-hazard will automatically produce positive estimates for the CSHs.

The tensor products are formed from two marginal $B$ -splines bases along the $u$ - and $s$ -axis, with $c_{u}$ and $c_{s}$ elements, respectively. We choose cubic $B$ -splines along both axes. The two basis matrices are denoted by $B_{u}$ , which is $n_{u} \times c_{u}$ , and by $B_{s}$ , which is $n_{s} \times c_{s}$ . The $c_{u} c_{s}$ regression coefficients are denoted by $α_{l m}^{ℓ}$ ( $l = 1, \dots, c_{u}$ ; $m = 1, \dots, c_{s}$ ), and the log-hazards, evaluated at the bin midpoints, are

{\overset{˘}{η}}_{j k}^{ℓ} = \sum_{l = 1}^{c_{u}} \sum_{m = 1}^{c_{s}} b_{j l}^{u} b_{k m}^{s} α_{l m}^{ℓ}

(5)

If we arrange the coefficients in the

c_{u} \times c_{s}

matrix

A_{ℓ} = [α_{l m}^{ℓ}]

, then the linear predictor in (5) can be expressed in matrix form as

E_{ℓ} = B_{u} A_{ℓ} B_{s}^{^{T}}

The numbers of basis functions $c_{u}$ and $c_{s}$ will be large enough to allow sufficient flexibility, but difference penalties on the coefficients, both along the rows and the columns of $A_{ℓ}$ , will prevent over-fitting. The penalty on the coefficients is constructed from two matrices $D_{u}$ and $D_{s}$ that form differences of order $d$ (usually $d = 1$ or $d = 2$ ) of neighboring elements in the columns of a matrix and it is controlled by two smoothing parameters $ϱ_{u}$ and $ϱ_{s}$ to allow anisotropic smoothing (that is, different amount of smoothing along the $u$ - and the $s$ -axis):

pen (ϱ_{u}, ϱ_{s}) = ϱ_{u} | | D_{u} A_{ℓ} | |_{F}^{2} + ϱ_{s} | | A_{ℓ} D_{s}^{^{T}} | |_{F}^{2}

(6)

(The squared Frobenius norm

| | . | |_{F}^{2}

is the sum of all squared elements of a matrix.)

The objective function to be minimised is the sum of the Poisson deviance resulting from (4) and the above penalty (6)

dev (M_{ℓ}; Y_{ℓ}) + pen (ϱ_{u}, ϱ_{s}) = 2 \sum_{j = 1}^{n_{u}} \sum_{k = 1}^{n_{s}} (y_{j k}^{ℓ} \ln (y_{j k}^{ℓ} / {\overset{˘}{μ}}_{j k}^{ℓ}) - (y_{j k}^{ℓ} - {\overset{˘}{μ}}_{j k}^{ℓ})) + pen (ϱ_{u}, ϱ_{s})

(7)

which leads to normal equations that can be solved, for given

ϱ_{u}

and

ϱ_{s}

, in a penalised Poisson iterative weighted least squares (IWLS) scheme (in compact notation;

\otimes

denotes the Kronecker product):

[(B_{s} \otimes B_{u})^{^{T}} {\tilde{W}}_{ℓ} (B_{s} \otimes B_{u}) + P] α_{ℓ} = (B_{s} \otimes B_{u})^{^{T}} {\tilde{W}}_{ℓ} {\tilde{z}}_{ℓ}

(8)

In the penalty matrix

P = ϱ_{u} (I_{s} \otimes D_{u}^{^{T}} D_{u}) + ϱ_{s} (D_{s}^{^{T}} D_{s} \otimes I_{u})

the

I_{u}

and

I_{s}

are identity matrices of appropriate dimension. The

c_{u} c_{s} \times 1

vector

α_{ℓ}

holds the elements of

A_{ℓ}

, and the

n_{u} n_{s} \times n_{u} n_{s}

matrix

{\tilde{W}}_{ℓ}

is a diagonal matrix of weights (for the Poisson case

{\tilde{W}}_{ℓ} = diag ({\tilde{μ}}_{ℓ})

). The

{\tilde{z}}_{ℓ}

is the usual working variable

{\tilde{z}}_{ℓ} = {\tilde{η}}_{ℓ} + (y_{ℓ} - {\tilde{μ}}_{ℓ}) / {\tilde{μ}}_{ℓ}

. The tilde indicates the current value in the iteration.

The parameters $ϱ_{u}$ and $ϱ_{s}$ control the smoothness of the estimated CSHs. Their optimal values are selected by minimising AIC (Akaike Information Criterion²⁴) or BIC (Bayesian Information Criterion²⁵). Both criteria balance fidelity of the estimated model (indicated by ${\hat{M}}_{ℓ}$ and ${\hat{W}}_{ℓ}$ below) to the data, as measured by the deviance, and model complexity:

\begin{aligned} AIC (ϱ_{u}, ϱ_{s}) & = dev ({\hat{M}}_{ℓ}; Y_{ℓ}) + 2 ED \end{aligned}

(9)

\begin{aligned} BIC (ϱ_{u}, ϱ_{s}) & = dev ({\hat{M}}_{ℓ}; Y_{ℓ}) + \ln (n_{bin}) ED \end{aligned}

(10)

The effective dimension ED of the estimated model is obtained as trace of the hat matrix

ED = trace {B (B^{^{T}} {\hat{W}}_{ℓ} B + P)^{- 1} B^{^{T}} {\hat{W}}_{ℓ}} = trace {(B^{^{T}} {\hat{W}}_{ℓ} B + P)^{- 1} B^{^{T}} {\hat{W}}_{ℓ} B}

where

B = B_{s} \otimes B_{u}

and

n_{bin}

is the number of Poisson variates entering in model (4). AIC penalises model complexity less strongly and has been found to undersmooth, particularly for large sample sizes.^6,26 We determine the optimal values for

ϱ_{u}

and

ϱ_{s}

by numerically minimising

BIC (ϱ_{u}, ϱ_{s})

The calculations required in (8) can be done very efficiently by employing generalised linear array methods,²⁷ which is possible due to the regular grid of bins underlying $Y_{ℓ}$ , $R$ and $E_{ℓ}$ .

Once the estimated coefficients ${\hat{A}}_{ℓ} = [{\hat{α}}_{l m}^{ℓ}]$ are obtained, we can evaluate the estimated ${\hat{\overset{˘}{η}}}_{ℓ} (u, s)$ at arbitrary points $(\dot{u}, \dot{s})$ in the range of the two bases:

{\hat{\overset{˘}{η}}}_{ℓ} (\dot{u}, \dot{s}) = B_{u} (\dot{u}) {\hat{A}}_{ℓ} B_{s} (\dot{s})^{^{T}} = \sum_{l = 1}^{c_{u}} \sum_{m = 1}^{c_{s}} b_{l}^{u} (\dot{u}) b_{m}^{s} (\dot{s}) {\hat{α}}_{l m}^{ℓ}

(11)

The CSHs are obtained by exponentiation of the linear predictors (11)

{\hat{\overset{˘}{λ}}}_{ℓ} (\dot{u}, \dot{s}) = \exp {{\hat{\overset{˘}{η}}}_{ℓ} (\dot{u}, \dot{s})}

(12)

Despite the popularity of penalised splines in practical applications, the derivation of their asymptotic properties had been lagging somewhat behind. Building on earlier work,²⁸ Xiao²⁹ presents $L_{2}$ and $L_{\infty}$ convergence rates of different types of penalised splines, including $P$ -splines as they are used in this paper ( $B$ -spline basis, difference penalty on the coefficients). Local asymptotic bias and variance are given as well. Xiao and Nan³⁰ present further results on minimax rate optimality of penalised splines.

2.2. From cause-specific hazards to cumulative incidence functions

To obtain the overall survival function and the two cause-specific cumulative incidence functions, we first have to obtain the cumulative CSHs. In the model with two time scales this can be done in either of the two specifications, see equation (2):

Λ_{ℓ} (t, s) = \int_{0}^{s} λ_{ℓ} (t = u + v, v) d v or {\overset{˘}{Λ}}_{ℓ} (u, s) = \int_{0}^{s} {\overset{˘}{λ}}_{ℓ} (u, v) d v

(13)

Note that the integrand on left hand side of equation (13) represents the simultaneous time increment in

t

and

s

, and recall that

u = t - s

and hence

t = u + s

, see equation (2).

The overall survival function correspondingly is

S (t, s) = \exp {- \sum_{ℓ = 1}^{2} Λ_{ℓ} (t, s)} or \overset{˘}{S} (u, s) = \exp {- \sum_{ℓ = 1}^{2} {\overset{˘}{Λ}}_{ℓ} (u, s)}

(14)

The survival functions above give, in the specific example, the probability of not having died by age

t

and time since diagnosis

s

, that is

S (t, s)

, which is equivalent to surviving more than

s

time units after diagnosis, when the disease was diagnosed at age

u = t - s

, that is

\overset{˘}{S} (u, s)

Finally, the two cumulative incidence functions (CIFs) are

\begin{aligned} I_{ℓ} (t, s) & = \int_{0}^{s} λ_{ℓ} (t = u + v, v) S (t = u + v, v) d v \\ or \\ {\overset{˘}{I}}_{ℓ} (u, s) & = \int_{0}^{s} {\overset{˘}{λ}}_{ℓ} (u, v) \overset{˘}{S} (u, v) d v \end{aligned}

(15)

and give the probability of dying from cancer (

ℓ = 1

) or because of other causes other (

ℓ = 2

) before age

t

and time since diagnosis

s

, or respectively, by

s

time units after diagnosis that was made at age

u = t - s

Integration in (13) and (15) is over one dimension only and we determine all integrals by numerical quadrature. Since we can evaluate the estimated CSHs on an arbitrarily fine grid, see (11), approximation errors will be minimal, even with a simple rectangle rule, as long as we choose a narrow enough grid width $Δ$ . Hence we have, for the $(u, s)$ -specification,

{\hat{\overset{˘}{Λ}}}_{ℓ} (u, s) \approx \sum_{k = 0}^{K (s)} {\hat{\overset{˘}{λ}}}_{ℓ} (u, k Δ) Δ

(16)

where

K (s)

is the largest integer such that

K (s) Δ < s

. Similarly, for the CIFs we obtain

{\hat{\overset{˘}{I}}}_{ℓ} (u, s) \approx \sum_{k = 0}^{K (s)} {\hat{\overset{˘}{λ}}}_{ℓ} (u, k Δ) \hat{\overset{˘}{S}} (u, k Δ) Δ

(17)

2.3. Standard errors

The variance-covariance matrix of the coefficients ${\hat{α}}_{ℓ}$ , $ℓ = 1, 2$ , which can be derived from Bayesian arguments, is given for the Poisson case by

Σ_{ℓ} = (B {\hat{W}}_{ℓ} B^{^{T}} + P_{ℓ})^{- 1}

(18)

The derivation as well as a comparison to the alternative sandwich estimator can be found in Appendix F of Eilers and Marx.²¹

The standard error of the linear predictor ${\hat{\overset{˘}{η}}}_{ℓ} (\dot{u}, \dot{s})$ , evaluated at an arbitrary point $(\dot{u}, \dot{s})$ within the range of the two marginal $B$ -spline bases, is

s.e. ({\hat{\overset{˘}{η}}}_{ℓ} (\dot{u}, \dot{s})) = {(B (\dot{u}, \dot{s}) Σ_{ℓ} B (\dot{u}, \dot{s})^{^{T}})}^{1 / 2}

(19)

Here

B (\dot{u}, \dot{s}) = B_{s} (\dot{s}) \otimes B_{u} (\dot{u})

is the model matrix evaluated at this point. Therefrom we obtain the standard errors for the CSHs via the delta-method as

s.e. ({\hat{\overset{˘}{λ}}}_{ℓ} (\dot{u}, \dot{s})) = s.e. (\exp {{\hat{\overset{˘}{η}}}_{ℓ} (\dot{u}, \dot{s})}) = {\hat{\overset{˘}{λ}}}_{ℓ} (\dot{u}, \dot{s}) s.e. ({\hat{\overset{˘}{η}}}_{ℓ} (\dot{u}, \dot{s}))

(20)

The standard errors for the cumulative incidence functions (CIFs) are obtained via nonparametric bootstrap. To corroborate the sampling distribution of the CIFs, we performed a simulation study that is presented in the Supplementary Material. Based on the results, confidence intervals for the CIFs were determined as Normal-based intervals with variances obtained from the empirical variances of the bootstrap estimates. Details about the bootstrap implementation for the SEER data will be discussed in Section 3.3.

3. Mortality of women with breast cancer

3.1. The SEER data

The Surveillance, Epidemiology and End Results (SEER) program from the National Cancer Institute collects and publishes data on cancer incidence and survival covering about 48% of the population of the USA.³¹ In this study we use data from the incidence SEER-17 database (November 2022 submission³²), that includes all cancer diagnoses up to 2020. We accessed the registers on 09/08/2023 and used the case listing functionality of SEER*Stat³³ to extract all cases of malignant breast cancer (primary site codes C50.0-C50.9) diagnosed to women between 2010 and 2015, with maximum follow-up to the end of 2020.

We focus on postmenopausal women and hence selected only women who were at least 50 years old when the cancer was diagnosed. Year of diagnosis was restricted to 2010–2015 to limit potential period effects, such as improvements in diagnosis and treatment efficacy that may affect survival probabilities. The potential maximum length of follow-up was several years for all women in the sample (although of different length, depending on the specific year of diagnosis).

Only individuals for whom the cancer was the first malignant tumor ever recorded in the SEER registry were included in the analysis. Furthermore, only cases for which the cancer was confirmed microscopically or by a positive test/marker study were included, while cases where the cancer was detected only by autopsy or only reported on the death certificate were not considered.

Mortality of breast cancer patients is known to depend on race/ethnicity,³⁴ independent of socioeconomic factors. SEER provides a variable that combines information on race and ethnicity and we distinguish between white women, black women and women of race/ethnicity other than white or black in our analysis. Also the cancer sub-type influences the survival chances,^13,14 and the sub-types are categorised according to their hormone receptor status (HR+ or HR-) and to the status of the human epidermal growth factor 2 receptor (HER2+ or HER2 $-$ ).³⁵Different combinations of the receptor statuses can be grouped further, and we follow the distinction between the Luminal A sub-type (HR+ and HER2-) and all other sub-types of breast cancers. Luminal A is the most common and least aggressive sub-type of breast cancer and it is linked to higher survival probabilities.³⁵

Cases for which no information on the cancer sub-type was available (7.8%) or where race/ethnicity information was unknown (0.4%) were not included in the analysis. SEER also provides information on whether chemotherapy was performed, but does not distinguish between cases where chemotherapy was not given and those where this information is missing. The final size of the dataset is 202,242 women. Table 1 gives a summary breakdown by race/ethnicity, cancer sub-type and chemotherapy status.

Table 1.
Sample size in each group identified by race/ethnicity, breast cancer subtype and chemotherapy status.

Race/ethnicity White Black Other race/ethnicity

Cancer subtype Luminal A Other Luminal A Other Luminal A Other

Chemotherapy 30,245 25,878 4,259 5,372 3,580 3,489

Unknown/No Chemo 95,424 11,701 8,704 2,016 10,076 1,498

Race/ethnicity	White	Black	Other race/ethnicity
Chemotherapy	30,245	25,878	4,259	5,372	3,580	3,489
Unknown/No Chemo	95,424	11,701	8,704	2,016	10,076	1,498

The key variables for our analysis are age at diagnosis, time since diagnosis, vital status at the end of the follow-up and, in case of death, whether death was attributable to the cancer or to any other cause of death. Time since diagnosis is available in the registry as survival months, that is the number of months from diagnosis to either death or right-censoring.

In the raw data, end of follow-up is fixed at the 31st of December 2020. However, since the COVID-19 pandemic in 2020 likely had a noticeable impact on cancer patients, we decided to limit the follow-up in our analyses to the end of 2019. For individuals who died before the end of 2019 no changes were required. For those individuals who were alive at the end of 2020 we subtracted 12 months of survival time and they were still right-censored at the end of 2019. Finally, individuals that died sometime in 2020 were right-censored at the end of 2019 and their survival times adjusted by 6 months, assuming that deaths occurred approximately 6 months into 2020, as is standard practice when the exact month of last follow-up is not available (see Supplementary Figure 1 for a graphical representation of this correction).

The different combinations of race/ethnicity, cancer sub-type and chemotherapy indicator define 12 groups of interest for our analysis. The sizes of the subgroups allow separate analyses so we estimate all models separately and compare the results among different groups. In the following we will present a subset of the analyses and refer to the supplementary material for more results.

3.2. Ungrouping ages at diagnosis with the PCLM

Age at diagnosis is provided in the SEER registry as single years of age, up to age 89, and a last open-ended interval 90+. The distribution of age at diagnosis is shown in Figure 1, for the whole sample. Two features are clearly visible: first, the grouping of ages at diagnosis greater or equal to 90 (depicted as $[90, 100)$ , assuming that only very few, if any, diagnoses will be made beyond), and second the small peak of the distribution at age 65. The latter feature can be explained by the enrollment in the Medicare program, which starts at age 65 and has been found to be connected to an increase in diagnoses of the most common cancers in the US.³⁶

The hazard model with two time scales that was introduced in Section 2.1 employs data that are binned in relatively narrow intervals of same size, resulting in data on a regular grid. The gridded structure allows efficient computations by using GLAM algorithms. To preserve this data structure, we will disaggregate the final age-group using the penalised composite link model (PCLM), the details can be found in Appendix A.1. Of the twelve subgroups (see Table 1) only eleven had observations with diagnoses made at age 90 or older, so the PCLM had only been applied in those subgroups. (Time since diagnosis is given in months for all individuals, also for those in this last age-interval, so only one time dimension is affected by coarse grouping.)

Before we ungroup the ages at diagnosis in the interval $[90, 100)$ , we first need to define the bins for the finer grid from which the CSHs will be estimated by the methods described in Section 2.1. For all models we chose bins of length $h_{u} = 1$ year for the age at diagnosis $u$ and $h_{s} = 0.5$ year for the time since diagnosis $s$ , which is given in months. This leads to 50 bins along the $u$ -axis and $21$ along the $s$ -axis.

The cause-specific event counts, and consequently the number of individuals at risk, for ages at diagnosis in the final wider interval are ungrouped using the PCLM as described in Appendix A.1 and the model specification is the same for all groups. The marginal $B$ -spline matrices entering in $B = B_{s} \otimes B_{u}$ in equation (23) are of dimension $50 \times 16$ for $B_{u}$ and of dimension $21 \times 10$ for $B_{s}$ . The composition matrices are built as explained in (22), $C_{u}$ is of dimension $41 \times 50$ and $C_{s}$ is an identity matrix of dimension $21 \times 21$ . The number of parameters in $θ^{ℓ}$ to be estimated in the PCLM are $16 \cdot 10 = 160$ . Finally, the optimal smoothing parameters ${\hat{ϕ}}_{u}$ and ${\hat{ϕ}}_{s}$ are selected, separately for each group, as those that produce the model with minimum AIC. This is done by grid search using values of $\log_{10} (ϕ_{u})$ and $\log_{10} (ϕ_{s})$ ranging from $- 1$ to $2$ .

Figure 2 illustrates this approach for white women with luminal A subtype and no chemotherapy. The raw counts of breast cancer deaths are plotted in the left panel, the last age-interval is 10 years wide and corresponds to ages $[90, 100)$ . In the right panel, the counts that were ungrouped to single ages are plotted over ten bins of width 1.

Figure 1.

Distribution of age at diagnosis of breast cancer from the SEER data.

Figure 2.

(Left) Histogram of deaths due to breast cancer over age at diagnosis (with ages $\geq 90$ grouped) and time since diagnosis. (Right) Data ungrouped by the PCLM.

After having obtained estimates for the ungrouped event counts and exposure times, we replace the grouped data in $[90, 100)$ with the estimated quantities and we use the so obtained matrices $Y_{ℓ}$ and $R$ to estimate the CSHs and derived quantities as described in the next section.

3.3. Estimating the cause-specific hazards, cumulative incidence functions and overall survival probabilities

To illustrate the approach we present here the results for white women diagnosed with breast cancer of type other than Luminal A, both for patients who received chemotherapy and who did not. The distribution of the ages at diagnosis is given in Figure 3. Noticeable grouping of ages 90+ is discernible for women that received no chemotherapy.

Figure 3.

White women with cancer type other than Luminal A, age at diagnosis.

The matrices of event counts $Y_{ℓ}$ ( $ℓ = 1$ for death due to breast cancer, $ℓ = 2$ for other causes) and the matrix $R$ of exposure times are all of dimension $n_{u} \times n_{s} = 50 \times 21$ . To estimate the CSHs we chose 16 cubic $B$ -splines on the $u$ -axis and 10 cubic $B$ -splines on the $s$ -axis, see equation (5). Penalty order was $d = 2$ . The optimal smoothing parameters $ϱ_{u}$ and $ϱ_{s}$ were determined by numerical optimisation of BIC, see equation (9). All computations were performed using the R-package TwoTimeScales.³⁷

Table 2 summarises the optimal smoothing parameters and the resulting effective dimensions for the two groups and the two causes of death, respectively. The estimated CSHs, on log₁₀-scale, are shown in Figure 4.

Figure 4.

Cause-specific hazards for white women, other cancer types (non-Luminal A) who received chemotherapy (bottom row) and who did not (top row). Effective dimensions are given in Table 2.

Table 2.

White women with other subtypes: optimal smoothing parameters (minimal BIC) and resulting effective dimensions (ED) for cause-specific hazards.

	Death due to breast cancer			Other causes of death
	$\log_{10} ϱ_{u}$	$\log_{10} ϱ_{s}$	ED	$\log_{10} ϱ_{u}$	$\log_{10} ϱ_{s}$	ED
No chemotherapy	2.5	1.7	7.2	7.5	6.9	4.0
Chemotherapy	2.5	0.2	12.3	3.3	6.5	4.2

For women who did not receive chemotherapy the hazard of dying of breast cancer declines with time since diagnosis and it is generally higher if it was diagnosed at a higher age. If women did receive chemotherapy then, for age at diagnosis up to about age 75, the CSH for death due to breast cancer first increases and peaks about two years after diagnosis and declines thereafter. The different complexity of the hazard patterns is also reflected in the effective dimensions of the estimated hazard surfaces. The hazard of dying from other causes is, for both treatment groups, much smoother and reveals the well known exponential increase (Gompertz hazard) for ages 50+, however, the hazard level is lower for patients receiving chemotherapy but the increase in mortality over time is stronger for them.

The penalty allows to extend the estimated hazards into areas that are not supported by data. For example, in the specific application presented here, we know that for women who received chemotherapy and were diagnosed between age 90 and 100 the longest observed time since diagnosis was $s = 6.75$ years. Hence the top right area of the CSHs (above $s = 6.75$ and for $90 \leq u < 100$ ) is extrapolated beyond the observed data.

It is already evident from Figure 4 that application of chemotherapy leads to a more complex change in the hazards than a proportional shift, but to investigate the differences further Figure 5 shows the hazard ratios of the two CSHs (for women who received chemotherapy relative to those who did not).

Figure 5.

Ratio of the cause-specific hazards in Figure 4; chemotherapy versus no chemotherapy.

From the CSHs the overall survival function and the cumulative incidence functions can be derived, see Section 2.2. In Figure 6 we compare the cumulative incidence of death due to breast cancer and due to other causes for the treatment groups. The different lines correspond to different ages at diagnosis and result from transsecting the surfaces that are presented in Appendix A.2, Figure A.2 vertically. The overall survival probabilities for the two treatment groups can also be found there (Figure A.1).

Figure 6.

Cumulative incidence functions (CIFs) for selected ages at diagnosis over time since diagnosis. The complete surfaces of the CIFs can be found in Appendix A.2.

The gap of the cumulative incidence between the treatment groups widens for ages at diagnosis above 70, but while for breast cancer deaths the group that received chemotherapy shows higher values of the CIF, for the other causes of death the order reverses. For younger ages at diagnosis and time since diagnosis up to about 3 years the CIFs reflect the unimodal shape of the CSH in the chemotherapy group.

To obtain uncertainty estimates for the CIFs we perform a nonparametric bootstrap. We draw 500 bootstrap samples, and for each sample we first ungroup the last age interval. We fit the CSH models to the ungrouped data and finally we obtain estimates of the CIFs for each bootstrap sample.

We present the standard errors of the CIFs that were obtained as the empirical standard deviation over the 500 bootstrap samples in Figure A.3, comparing women with different therapy regimes. The two treatment groups are of quite different sizes (25,878 who received chemotherapy vs. 11,701 who did not) and this, together with the distribution of observations across the $(u, s)$ -plane, has an impact on the uncertainty of the estimates.

Additional to the graphical displays of the estimates, which present the results in a comprehensive way but may be tedious to read, tabular summaries will be a useful supplement. The companion R-package also provides a function that allows to present estimates of overall survival and cumulative incidences, including bootstrap confidence intervals, in tabular form. An example is provided in Tables A.1 and A.2 in Appendix A.2.

4. Discussion

In this article we present a method to estimate a competing risks model where the CSHs and hence also the derived quantities in the model vary over two time scales. The model is estimated by two-dimensional $P$ -splines and the only assumption made is that the hazards for the competing events vary smoothly over the two time scales. For estimation the event counts and the exposure times are binned in regular narrow two-dimensional intervals so that the well-known correspondence between hazard estimation and Poisson regression can be exploited. Computations can be performed very efficiently by applying GLAM algorithms. From the CSHs one can obtain overall survival probabilities and cause-specific incidence functions by simple numerical integration. As the estimated $P$ -splines can be evaluated on arbitrarily fine grids the approximation error will be minor, even if simple integration techniques are used. The model is implemented in the R-package TwoTimeScales, which is available on CRAN.³⁷

The results of the proposed model are presented in several figures throughout the manuscript, and all graphs were produced using plotting functions in the R-package. While those figures provide compact illustrations of the estimates, numerical summaries may also be desirable. Hence the R-package also includes a function that produces tabular summaries of the estimated overall survival and cumulative incidence functions at specified values of $u$ and $s$ . Examples were provided in the analysis of the SEER data.

The GLAM approach rests upon data that are binned on a regular grid so that the array structure can be utilised in the computations. In practice we may encounter event times that are grouped in a wide interval for one or both time axes at the upper end of the time scale. The SEER data that we used in this paper provide such an example. To preserve the gridded structure of the data for estimating the CSHs we used the penalised composite link model to ungroup the final wide intervals. Again only smoothness of the underlying distributions is assumed and bivariate $P$ -splines are employed. In this way the ungrouped data that enter the estimation procedure will be more smooth than the (unobserved) original data. Although the numbers in the final interval are rather low, concerns may arise that this could influence the smoothing parameters chosen by minimising BIC in the estimation step. As a sensitivity analysis, we restricted the data to known ages at diagnosis, that is ages 50 to 89, to avoid the ungrouping step and then estimated the model from these data. The results of the comparison can be found in the Supplementary Material and demonstrate that the ungrouping by the PCLM and subsequent estimation alters the results only marginally.

The penalty allows to extend the estimates of the CSHs, and hence also the derived quantities, into areas not supported by data. The SEER analysis provided an example for women who were diagnosed with cancer at very high ages and for whom the hazards could be extrapolated beyond the longest life spans in the data. However, extrapolation should be done with caution, like with any other method. Examples of both unproblematic and more questionable extrapolation results are given in Carollo et al.⁵

We analyse mortality after diagnosis of breast cancer and we distinguish between deaths due to breast cancer and all other causes of death. We consider two time scales, age and time since diagnosis. Extensions of the model to more than two competing events, analysed over the same time scales, is unproblematic. It is theoretically possible to consider more than two time scales, but it would imply greater computational complexity and is a topic for future research.

In the application we considered race/ethnicity, tumor type and chemotherapy status and analysed subgroups separately which, in this case, was feasible due to the large sample size of the SEER registers. This allowed, on the one hand, to use the PCLM for ungrouping as this model operates on aggregated count data. It also allowed to identify deviations from proportional hazards for the two treatment groups that were particularly pronounced for the hazard of death due to breast cancer. Nevertheless, for more covariates and less comprehensive data sets hazard regression models will be relevant tools for analysis, also for CSHs that vary along two time scales. For exactly observed event times proportional hazards regression has been developed already⁵ and is also implemented in the R-package TwoTimeScales.³⁷ The extension of this model to individual interval-censored data will be future research.

Supplemental Material

sj-pdf-1-smm-10.1177_09622802251367443 - Supplemental material for Competing risks models with two time scales

Supplemental material, sj-pdf-1-smm-10.1177_09622802251367443 for Competing risks models with two time scales by Angela Carollo, Hein Putter, Paul HC Eilers and Jutta Gampe in Statistical Methods in Medical Research

Footnotes

Data availability statement

The data that support the findings of this study are available from SEER () but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The authors received no financial support for the research, authorship and/or publication of this article.

ORCID iDs

Angela Carollo

Hein Putter

Paul HC Eilers

Jutta Gampe

Supplemental material

Supplemental material for this article is available online.

Appendix A

References

Putter

Fiocco

Geskus

. Tutorial in biostatistics: competing risks and multi-state models. Stat Med 2007; 26: 2389–2430.

Berzuini

Clayton

. Bayesian analysis of survival on multiple time scales. Stat Med 1994; 13: 823–838.

Skourlis

Crowther

Andersson

TML

, et al. On the choice of timescale for other cause mortality in a competing risk setting using flexible parametric survival models. Biomet J 2022; 64: 1161–1177.

Lee

Gouskova

Feuer

, et al. On the choice of time scales in competing risks predictions. Biostatistics 2016; 18: 15–31.

Carollo

Eilers

PHC

Putter

, et al. Smooth hazards with multiple time scales. Stat Med 2025; 44: e10297.

Currie

Durban

Eilers

PHC

. Smoothing and forecasting mortality rates. Stat Modell 2004; 4: 279–298.

Arnold

Morgan

Rumgay

, et al. Current and future burden of breast cancer: global statistics for 2020 and 2040. The Breast 2022; 66: 15–23.

Stapleton

Oseni

Bababekov

, et al. Race/Ethnicity and age distribution of breast cancer diagnosis in the united states. JAMA Surg 2018; 153: 594.

Surveillance Research Program NCI. SEER*Explorer: an interactive website for SEER cancer statistics [Internet]. Available from: https://seer.cancer.gov/statistics-network/explorer/., 2023.

10.

Colzani

Liljegren

Johansson

ALV

, et al. Prognosis of patients with breast cancer: causes of death and effects of time since diagnosis, age, and tumor characteristics. J Clin Oncol 2011; 29: 4014–4021.

11.

Bradshaw

Stevens

Khankari

, et al. Cardiovascular disease mortality among breast cancer survivors. Epidemiology 2016; 27: 6–13.

12.

Mehta

Watson

Barac

, et al. Cardiovascular disease and breast cancer: where these entities intersect: a scientific statement from the American Heart Association. Circulation 2018; 137: e30–e66.

13.

Brandt

Garne

Tengrup

, et al. Age at diagnosis in relation to survival following breast cancer: a cohort study. World J Surg Oncol 2015; 13: 33.

14.

Jayasekara

MacInnis

Chamberlain

, et al. Mortality after breast cancer as a function of time since diagnosis by estrogen receptor status and age at diagnosis. Int J Cancer 2019; 145: 3207–3217.

15.

San Miguel

Gomez

Murphy

, et al. Age-related differences in breast cancer mortality according to race/ethnicity, insurance, and socioeconomic status. BMC Cancer 2020; 20: 228–236.

16.

Freedman

Keating

Lin

, et al. Breast cancer-specific survival by age: worse outcomes for the oldest patients. Cancer 2018; 124: 2184–2191.

17.

Partridge

Hughes

Warner

, et al. Subtype-dependent relationship between young age at diagnosis and breast cancer survival. J Clin Oncol 2016; 34: 3308–3314.

18.

Narod

Iqbal

Giannakeas

, et al. Breast cancer mortality after a diagnosis of ductal carcinoma in situ. JAMA Oncol 2015; 1: 888.

19.

National Cancer Institute N. Surveillance, Epidemiology, and End Results Program. Overview of the SEER Program. Online, 2023. https://seer.cancer.gov/about/overview.html.

20.

Rizzi

Halekoh

Thinggaard

, et al. How to estimate mortality trends from grouped vital statistics. Int J Epidemiol 2018; 48: 571–582.

21.

Eilers

PHC

Marx

. Practical Smoothing: The Joys of P-Splines. Cambridge: Cambridge University Press, 2021. ISBN 1108482953.

22.

Holford

. The analysis of rates and of survivorship using log-linear models. Biometrics 1980; 36: 299–305.

23.

Laird

Olivier

. Covariance analysis of censored survival data using log-linear analysis techniques. J Am Stat Assoc 1981; 76: 231–240.

24.

Akaike

. A new look at the statistical model identification. IEEE Trans Automat Contr 1974; 19: 716–723.

25.

Schwarz

. Estimating the dimension of a model. Ann Stat 1978; 6: 461–464.

26.

Camarda

. MortalitySmooth: an R package for smoothing Poisson counts with P-splines. J Stat Softw 2012; 50: 1–24.

27.

Currie

Durban

Eilers

PHC

. Generalized linear array models with applications to multidimensional smoothing. J R Stat Soc: Ser B (Stat Methodol) 2006; 68: 259–280.

28.

Wang

Shen

Ruppert

. On the asymptotics of penalized spline smoothing. Electron J Stat 2011; 5: 1–17.

29.

Xiao

. Asymptotic theory of penalized splines. Electron J Stat 2019; 13: 747–794.

30.

Xiao

Nan

. Uniform convergence of penalized splines. STAT 2020; 9: 1–11.

31.

National Cancer Institute. Surveillance, Epidemiology, and End Results Program, 2023. https://seer.cancer.gov/.

32.

Surveillance, Epidemiology, and End Results (SEER) Program. SEER*Stat Database: Incidence—SEER Research Data, 17 Registries, Nov 2022 Sub (2000-2020) , 2023.

33.

Surveillance Research Program NCI. SEER*Stat software (www.seer.cancer.gov/seerstat) version 8.4.2.

34.

Campbell

. Breast cancer-race, ethnicity, and survival: a literature review. Breast Cancer Res Treat 2002; 74: 187–192.

35.

Howlader

Cronin

Kurian

, et al. Differences in breast cancer survival by molecular subtypes in the United States. Cancer Epidemiol Biomarkers Prev 2018; 27: 619–626.

36.

Patel

Berry

, et al. Cancer diagnoses and survival rise as 65-year-olds become Medicare-eligible. Cancer 2021; 127: 2302–2310.

37.

Carollo

Eilers

PHC

Gampe

. TwoTimeScales: analysis of event data with two time scales. R-package version 1.0.0 2024; https://CRAN.R-project.org/package=TwoTimeScales.

38.

Eilers

PHC

. Ill-posed problems with counts, the composite link model and penalized likelihood. Stat Modell 2007; 7: 239–254.

39.

Lambert

Eilers

PHC

. Bayesian density estimation from grouped continuous data. Comput Stat Data Anal 2009; 53: 1388–1399.

40.

Rizzi

Gampe

Eilers

PHC

. Efficient estimation of smooth distributions from coarsely grouped data. Am J Epidemiol 2015; 182: 138–147.

41.

Bishop

YMM

Fienberg

Holland

. Discrete Multivariate Analysis: Theory and Practice. New York: The MIT Press, 1975.

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.

0.00 MB

2.62 MB

Race/ethnicity	White		Black		Other race/ethnicity
Cancer subtype	Luminal A	Other	Luminal A	Other	Luminal A	Other
Chemotherapy	30,245	25,878	4,259	5,372	3,580	3,489
Unknown/No Chemo	95,424	11,701	8,704	2,016	10,076	1,498