Abstract
In clinical trials, evaluating the accuracy of risk scores (markers) derived from prognostic models for predicting survival outcomes is of major concern. The time-dependent receiver operating characteristic curve and the corresponding area under the receiver operating characteristic curve are appealing measures of predictive accuracy. Several estimation methods have been proposed in the context of classical right-censored data, which assume that the event times of individuals are independent. In many applications, however, this assumption may not hold, for example when individuals belong to clusters or experience recurrent events. Estimates may be biased if this correlation is not taken into account. This paper aims to fill this gap by introducing an estimation method for the time-dependent receiver operating characteristic curve and the corresponding area under the curve for right-censored data that takes the correlated nature of the data into account. In the proposed method, the unknown status of censored subjects is imputed using conditional survival functions given the marker and frailty of the subjects. An extensive simulation study is conducted to evaluate the finite sample performance of the proposed method. Finally, the proposed method is illustrated using two real-world examples on lung cancer and kidney disease.
Introduction
Time-to-event data arise when interest lies in the time from a certain origin to the occurrence of a specific event of interest. For example, time-to-event is often used as an endpoint in clinical trials, where it refers to the time between randomization and the occurrence of an event of interest. The event can be, for example, death due to a certain disease, progression, treatment failure, or the recurrence of a disease. In such follow-up studies, the event of interest is not necessarily experienced by all study participants by the end of the study, so the actual event times for some subjects are unknown. This loss of information on time-to-event is known as censoring, which may occur when a subject withdraws from the study, is lost to follow-up, or the study ends before the event has occurred. Survival analysis, or more generally time-to-event analysis, is the standard tool for analyzing event time data while taking censoring into account.
In time-to-event analysis, a prognostic model, which is a type of clinical prediction model, is an important tool for evaluating the association between one or more risk factors (biomarkers) and the outcome of interest (time-to-event) and for predicting the risk that an individual develops a particular state of health or experiences a future outcome. Such models play an increasingly important role in assisting health professionals in making clinical decisions and improving the health outcomes of patients. To handle censored time-to-event data, regression models have been developed in the field of survival analysis. The most widely used regression model for time-to-event data is the proportional hazards model, which was introduced by Cox in 1972 and is commonly known as the Cox regression model. 1 The risk score (marker) derived from these models should, however, be evaluated for its accuracy before it is used to predict future outcomes and to inform clinical decisions. In this regard, the receiver operating characteristic (ROC) curve and the associated area under the ROC curve (AUC) are commonly used to assess predictive accuracy. Traditional ROC curve analyses typically assume that the event status of a subject is fixed and known, whereas in prognostic studies such as time-to-event analysis the disease status of a subject can change over time. In such situations, the event status is time-dependent and is defined with respect to a specific fixed time point of interest. This gives rise to concepts such as time-dependent sensitivity, time-dependent specificity, and consequently the time-dependent ROC curve and AUC. Several researchers, such as Heagerty et al., 2 Heagerty and Zheng, 3 Etzioni et al., 4 and others, have proposed various definitions and extensions that adapt classical methodologies to time-to-event data in prognostic studies.
In the literature, several time-dependent ROC curve and AUC estimation methods that account for various censoring mechanisms have been proposed; see, for example, Heagerty et al., 2 Heagerty and Zheng, 3 Li et al., 5 Blanche et al., 6 Martínez-Camblor et al., 7 Martínez-Camblor and Pardo-Fernández, 8 Beyene et al., 9 Díaz-Coto, 10 Wu and Cook, 11 Beyene and El Ghouch, 12,13 and the references given in these papers.
One of the basic assumptions common to the aforementioned methods is that the survival times of different subjects are independent of each other given the observed values of the covariates. In many practical applications, however, this assumption may not hold, since not all relevant covariates can be observed. In many clinical trials, for example, event times of different subjects may be clustered or correlated because of common features such as genetic traits or shared environmental factors, or because of repeated events. In this situation, unobserved characteristics are shared between observations from the same cluster, so event times of subjects from the same cluster are presumed to be correlated. When survival analysis is performed without taking this correlation into account, parameter estimates and their standard errors can be incorrect. 14,15 For the analysis of correlated time-to-event data, a number of models widely referred to as frailty models have been proposed (see, for example, references 14,16–21), and these models have been studied extensively (see, for example, references 22–26). Although frailty models have been an active area of research for several decades and many applications have been published, almost no literature exists on evaluating the predictive ability of these models, specifically on estimating the time-dependent ROC curve and its corresponding AUC in a way that accounts for the correlated nature of the data.
In this paper, we propose a new time-dependent ROC curve and time-dependent AUC estimation method for right-censored time-to-event data that takes the correlated nature of the data into account. The proposed method generalizes the time-dependent ROC curve estimator introduced by Beyene and El Ghouch, 12 which assumes that individual event times are independent. As in Beyene and El Ghouch, 12 the unknown event status of censored individuals is imputed using a conditional survival function; in our approach, however, the conditioning is on both the marker and the frailty of the subjects. This conditional survival function can be estimated using a frailty model, such as a parametric frailty model.
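To make the imputation idea concrete, the following Python sketch computes weighted true and false positive rates and a trapezoidal AUC at a single prediction time t. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `cond_surv(s, i)` callable (standing in for an estimated conditional survival function given the marker and frailty of subject i), and the toy data are all hypothetical. A subject censored before t is assigned the probability 1 - S(t)/S(y) of having experienced the event by t, which follows from conditioning on survival beyond the censoring time y.

```python
import numpy as np

def case_weights(y, delta, t, cond_surv):
    """Probability that each subject is a case (event by time t).

    y: observed times; delta: 1 = event observed, 0 = censored.
    cond_surv(s, i): hypothetical callable returning an estimate of
    S(s | marker_i, frailty_i) for subject i.
    """
    w = np.zeros(len(y))
    for i in range(len(y)):
        if y[i] > t:
            w[i] = 0.0      # still under observation at t: control
        elif delta[i] == 1:
            w[i] = 1.0      # event observed by t: case
        else:
            # censored before t: P(T <= t | T > y_i, marker_i, frailty_i)
            w[i] = 1.0 - cond_surv(t, i) / cond_surv(y[i], i)
    return w

def roc_points(marker, w):
    """Weighted FPR and TPR at every observed marker threshold."""
    cuts = np.sort(np.unique(marker))[::-1]   # thresholds from high to low
    tpr = np.array([np.sum(w * (marker > c)) for c in cuts]) / np.sum(w)
    fpr = np.array([np.sum((1 - w) * (marker > c)) for c in cuts]) / np.sum(1 - w)
    # add the end points (0, 0) and (1, 1) of the curve
    return np.r_[0.0, fpr, 1.0], np.r_[0.0, tpr, 1.0]

def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy usage with made-up data and an assumed exponential-type survival
# function; the frailty is fixed at 1 here purely for illustration.
marker = np.array([0.2, 1.5, -0.3, 2.1, 0.9, -1.0])
y = np.array([4.0, 1.0, 6.0, 0.5, 2.0, 7.0])
delta = np.array([1, 1, 0, 1, 0, 0])
surv = lambda s, i: np.exp(-0.2 * np.exp(marker[i]) * s)

w = case_weights(y, delta, t=3.0, cond_surv=surv)
fpr, tpr = roc_points(marker, w)
auc = trapezoid_auc(fpr, tpr)
```

In this sketch the weights play the role of the imputed event status, so subjects with a known status contribute 0 or 1 and only those censored before t contribute a fractional value.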
The rest of this article is organized as follows. The following section introduces some important notation and definitions and describes the proposed estimators for the time-dependent ROC curve and its associated AUC in the presence of correlation. In the “Simulation” section, the finite sample performance of the proposed method is evaluated through a simulation study. The practical use of the proposed method is illustrated with real-data applications in the “Real data analysis” section. Finally, some remarks and discussion are presented in the “Discussion” section.
Methods
In this section, we first introduce some important notations and definitions, followed by the estimator for the time-dependent ROC curve and its associated summary measure.
Notations and definitions
Let us consider that there are
A frailty model generalizes the classical survival model by allowing within-cluster correlations. According to this model, observations share a common frailty, resulting in correlated outcomes within the group. Assuming a proportional hazards frailty model, the hazard can be written as
Suppose we have a quantitative (bio)marker, denoted by
The time-dependent ROC curve (
In this section, we first derive theoretical formulas for the time-dependent TPR, FPR and the time-dependent ROC curve defined above which is the basis for our proposal. Assume that the event time
Using the same approach, the theoretical formula for the time-dependent ROC curve given in equation (4) can be written as
The empirical estimator for the TPR (6) and FPR (7), respectively, can be obtained as follows
The empirical estimator for
The estimate of the time-dependent AUC (5) can be obtained by approximating the integral using a numerical integration method, i.e.
Simulation
In this section, we conduct an extensive simulation study with various scenarios to investigate the finite sample performance of the proposed time-dependent ROC curve and time-dependent AUC estimators. In addition, the performance of the proposed method is compared with that of the ROC curve and AUC estimators proposed by Beyene and El Ghouch, 12 hereafter referred to as the Beran method, which were developed for classical uncorrelated survival data. The Beran method is implemented in
Data generating process
To perform the simulations, we first need to generate the data. The following procedure is used to generate survival times from the frailty model. Given the frailty, the failure times are assumed to be independent and to follow a proportional hazards model. The survival times were generated from a frailty model given by
In our simulation, the covariate

True time-dependent ROC curves and the corresponding AUC values of Scenario I (top row) and Scenario II (bottom row) computed at prediction time Q1 (solid line), Q2 (dashed line) and Q3 (dotted line). ROC: receiver operating characteristic; AUC: area under the ROC curve.
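As an illustration of the kind of data generating process described in this subsection, the sketch below draws clustered survival times from a shared frailty proportional hazards model by inverse transform sampling. It is a hedged example rather than the simulation design used in the paper: the gamma frailty with mean 1 and variance `theta`, the Weibull baseline with cumulative hazard `lam * t**rho`, the standard normal covariate used as the marker, the exponential censoring distribution, and all default parameter values are assumptions chosen only to make the example runnable.

```python
import numpy as np

def simulate_clustered_survival(n_clusters, cluster_size, beta=1.0, lam=0.1,
                                rho=1.5, theta=0.5, cens_rate=0.05, seed=0):
    """Simulate right-censored clustered survival data (illustrative only).

    Conditional on the cluster frailty u, the survival function is assumed
    to be S(t | x, u) = exp(-u * lam * t**rho * exp(beta * x)).
    """
    rng = np.random.default_rng(seed)
    n = n_clusters * cluster_size
    cluster = np.repeat(np.arange(n_clusters), cluster_size)

    # one gamma frailty per cluster (mean 1, variance theta), shared by members
    u = rng.gamma(shape=1.0 / theta, scale=theta, size=n_clusters)[cluster]
    x = rng.normal(size=n)                    # covariate used as the marker

    # inverse transform: solve S(T | x, u) = V for T, with V uniform on (0, 1]
    v = 1.0 - rng.uniform(size=n)
    t_event = (-np.log(v) / (lam * u * np.exp(beta * x))) ** (1.0 / rho)

    # independent right censoring (assumed exponential)
    c = rng.exponential(scale=1.0 / cens_rate, size=n)
    return {
        "cluster": cluster,
        "marker": x,
        "time": np.minimum(t_event, c),          # observed time
        "status": (t_event <= c).astype(int),    # 1 = event, 0 = censored
        "frailty": u,
    }

data = simulate_clustered_survival(n_clusters=50, cluster_size=4)
censoring_fraction = 1.0 - data["status"].mean()
```

Given the frailty, the event times within a cluster are generated independently, which mirrors the conditional independence assumption stated above; the censoring fraction can be tuned through the hypothetical `cens_rate` argument.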
The simulations are conducted by generating
In this section, we compare the performance of the proposed empirical (non-smoothed) method under various data generating processes. Under the first scenario, we considered a Weibull survival frailty model with
The estimation of the parameters for this model was done using the
The mean integrated bias (MIB) and mean integrated squared error (MISE) of the time-dependent ROC curve were computed to evaluate the finite sample performance, and for the time-dependent AUC the percent bias (%Bias) and mean squared error (MSE) were used as performance measures. Since the behavior of the estimator depends on various aspects of the data, its performance was examined considering different sample sizes (
Results of time-dependent ROC curve estimator
The MIB and the MISE of both the proposed empirical time-dependent ROC estimator and the empirical Beran method for the first and second scenarios obtained with sample sizes
Scenario I: MIB and MISE of the proposed empirical time-dependent ROC estimator with Weibull and Log-normal frailty, and of the Beran method, computed for different sample sizes, cluster sizes, right censoring rates (% cen), and correlation values.
Scenario II: MIB and MISE of the proposed empirical time-dependent ROC estimator and of the Beran method.
MIB: mean integrated bias; MISE: mean integrated squared error; ROC: receiver operating characteristic.
The empirical percent bias (%Bias) and MSE of both the proposed empirical time-dependent AUC estimator and the empirical Beran method for the first and second scenarios are presented in Tables 3 and 4, respectively. The general conclusion of these results is consistent with the ROC curve findings of the previous simulations. The AUC estimator shows good performance with small bias and MSE. To study how well the proposed empirical estimator performs under misspecification of the frailty distribution, the data were generated under a gamma frailty distribution for both scenarios, and the estimation was done under a log-normal frailty distribution. The results presented in Tables 3 and 4 show that the AUC estimates obtained with the true frailty distribution and with the misspecified frailty distribution are similar, with slightly better performance from the latter. The results presented in Table 4 show that the proposed method also performs well, with small bias and MSE, when the estimation is done with a misspecified baseline survival function. Furthermore, as expected, the MSE decreases as the sample size increases. In contrast, the MSE in general increases with both the censoring percentage and the cluster size.
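The performance measures reported in these simulations can be computed from Monte Carlo replicates along the following lines. This is a sketch based on the usual definitions (percent bias relative to the true value, mean squared error, and bias and squared error of the ROC curve integrated over a grid of false positive rates); the paper's exact formulas are not reproduced here, and the function names and the grid-based trapezoidal integration are assumptions.

```python
import numpy as np

def percent_bias_and_mse(estimates, truth):
    """%Bias and MSE of replicate AUC estimates around the true AUC."""
    est = np.asarray(estimates, dtype=float)
    pct_bias = 100.0 * (est.mean() - truth) / truth
    mse = float(np.mean((est - truth) ** 2))
    return float(pct_bias), mse

def mib_and_mise(roc_hat, roc_true, grid):
    """Mean integrated bias and mean integrated squared error.

    roc_hat: array of shape (replicates, len(grid)) holding the estimated
             ROC curve of each replicate evaluated on `grid`.
    roc_true: true ROC curve evaluated on the same grid.
    """
    diff = np.asarray(roc_hat, dtype=float) - np.asarray(roc_true, dtype=float)
    width = np.diff(grid)

    def integrate(f):
        # trapezoidal rule along the grid, one integral per replicate
        return np.sum(width * (f[..., 1:] + f[..., :-1]) / 2.0, axis=-1)

    return float(integrate(diff).mean()), float(integrate(diff ** 2).mean())

# Toy check with made-up numbers
grid = np.linspace(0.0, 1.0, 11)
roc_true = grid
roc_hat = np.vstack([grid + 0.1, grid - 0.1])
mib, mise = mib_and_mise(roc_hat, roc_true, grid)
pct_bias, mse = percent_bias_and_mse([0.7, 0.9], truth=0.8)
```

A negative %Bias or MIB under these definitions indicates that the estimator underestimates the true quantity on average.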
Scenario I: %Bias (MSE) for the proposed empirical time-dependent AUC estimator with Weibull and Log-normal frailty, and for the Beran method, computed for different sample sizes, cluster sizes, right censoring rates (% cen), and correlation values.
Scenario II: %Bias (MSE) for the proposed empirical time-dependent AUC estimator and for the Beran method.
MSE: mean squared error; AUC: area under the ROC curve.
Real data analysis
In this section, two real-world examples are provided to illustrate the proposed time-dependent ROC curve and associated time-dependent AUC estimation methods. The first data set is the lung cancer data from the North Central Cancer Treatment Group (NCCTG). The second data set concerns infections in kidney patients. In the following subsections, we provide details about the data sets and the results of our analyses.
NCCTG lung cancer data
Our analysis in this section uses the popular NCCTG lung cancer data, which contains 228 patients, of whom 63 are right-censored (i.e. patients left the study before experiencing the event of interest). The NCCTG lung cancer data was collected from 18 different health institutions (clusters). The number of subjects per institution ranges from 2 to 36. In this data set, patients from the same institution (cluster) are likely to have correlated event times because they share the same facility. This correlation among event times within the same cluster should be considered in the analysis to ensure that the results are valid and accurate. The NCCTG data set records survival times together with some important predictor variables such as sex (Male = 1 and Female = 2), age (in years), ph.ecog (Eastern Cooperative Oncology Group (ECOG) performance status assessed by the physician, on a scale ranging from 0 (asymptomatic) to 5 (dead)), and pat.karno (Karnofsky performance status, assessed by the patient). Originally, Loprinzi et al. 32 analyzed this data in an attempt to determine whether descriptive information gathered from a patient-completed questionnaire could provide prognostic information independently of that previously gathered from the patient’s physician. This data set is available in the
In this analysis, we estimated the time-dependent ROC curves

Estimated time-dependent ROC curves obtained using the proposed empirical estimator with gamma frailty (solid lines), log-normal frailty (dashed lines), and Beran estimator (dotted lines) for the NCCTG lung cancer data with
Estimated time-dependent AUC values, based on the proposed empirical estimator with gamma and log-normal frailty, and Beran estimator obtained from the North Central Cancer Treatment Group (NCCTG) lung cancer data, with
The kidney data is another data set that is often used to illustrate frailty models. It concerns the recurrence of infection in kidney patients who use portable dialysis equipment. In these patients, recurrent infection is the main complication, and it occurs at the point where the catheter is inserted. When an infection occurs, the catheter is removed and is reinserted after the infection has been treated successfully. In some cases, the catheter may be removed for reasons other than infection, in which case the observation is censored. A total of 38 patients with kidney disease were followed for time to recurrence of infection. Each patient has exactly 2 observations. The data contains three covariates: sex (1 = Male, 2 = Female), age (in years), and disease (disease type with 4 levels: “GN”, “AN”, “PKD”, and “Other”). This data set was originally analyzed and presented in McGilchrist and Aisbett 18 and it is available in
To evaluate the predictive accuracy of

Estimated time-dependent receiver operating characteristic (ROC) curves obtained using the proposed empirical estimator with gamma frailty (solid lines), log-normal frailty (dashed lines), and Beran estimator (dotted lines) for the kidney data with
Discussion
In this article, we proposed and investigated a time-dependent ROC curve and corresponding AUC estimation method for correlated right-censored survival data. This method is a generalization of the time-dependent ROC curve estimator introduced by Beyene and El Ghouch, 12 which assumes that individual event times are independent. As in Beyene and El Ghouch, 12 the unknown event status of censored individuals is imputed using a conditional survival function; in the proposed estimator, however, the conditioning is on both the marker and the frailty of the subjects. To estimate the conditional survival function needed for determining the unknown event status of censored individuals, a parametric frailty model was considered.
An extensive simulation study with two different scenarios was conducted to evaluate the finite sample performance of the proposed empirical (non-smoothed) time-dependent ROC curve and the corresponding AUC estimation method. A comparison was also made between the proposed method and the existing naïve estimator that was proposed for right-censored data with independent event times. This existing method uses the popular Beran approach to estimate the unknown conditional survival function. Based on the results, the proposed ROC curve and AUC estimators have better finite sample performance, with smaller MIB and MISE, than the existing Beran method. In addition, the Beran approach tends to underestimate the ROC curve and the corresponding AUC, as its biases are consistently negative in all considered scenarios. Moreover, misspecifying the frailty distribution in the proposed estimator generally has minimal effect, since the results for gamma frailty and log-normal frailty are very similar, with the former showing slightly better performance. The method also performed well, with small MIB and MISE, when the baseline was misspecified.
Estimated time-dependent area under the ROC curve (AUC) values, based on the proposed empirical estimator with gamma and log-normal frailty, and Beran estimator obtained from the kidney data with
The proposed method was applied to the NCCTG lung cancer and kidney data sets. For both data sets, the continuous risk score (marker) is obtained as
To conclude, we showed that the proposed time-dependent ROC curve and corresponding AUC estimation method, which takes the correlated nature of event times into account, has better finite sample performance than the existing Beran approach, which does not acknowledge the presence of correlation. Therefore, we recommend using the proposed estimator for correlated event time data.
R functions that implement the proposed method are available from the corresponding author and will be published as an open-source R package.
Footnotes
Acknowledgments
The authors would like to thank the two anonymous reviewers and the editor for their comments and suggestions, which substantially improved the quality of this manuscript.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is partially based upon research supported by the South Africa National Research Foundation (NRF) and South Africa Medical Research Council (SAMRC) (South Africa DST-NRF-SAMRC SARChI Research Chair in Biostatistics, Grant number 114613). Opinions expressed and conclusions arrived at are those of the author and are not necessarily to be attributed to the NRF and SAMRC.
