Abstract
Analyzing the large-scale survival data from the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program may help guide the management of cancer. Detecting and characterizing the time-varying effects of factors collected at the time of diagnosis could reveal important and useful patterns. However, fitting a time-varying effect model by maximizing the partial likelihood with such large-scale survival data is not feasible with most existing software. Moreover, estimating time-varying coefficients using spline-based approaches requires a moderate number of knots, which may lead to unstable estimation and overfitting. Adding a penalty term mitigates these issues and greatly aids estimation. Selecting the penalty smoothing parameter is difficult in this time-varying setting: traditional approaches such as the Akaike information criterion do not work, while cross-validation methods carry a heavy computational burden and lead to unstable selections. We propose modified information criteria to determine the smoothing parameter and a parallelized Newton-based algorithm for estimation. We conduct simulations to evaluate the performance of the proposed method. We find that penalization with the smoothing parameter chosen by a modified information criterion is effective at reducing the mean squared error of the estimated time-varying coefficients. Compared with a number of alternatives, confidence intervals based on variance estimates derived from Bayesian considerations have the best coverage rates. We apply the method to SEER head-and-neck, colon, prostate, and pancreatic cancer data and detect the time-varying nature of various risk factors.
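For orientation, the setup summarized in the abstract can be sketched in generic notation. The abstract does not give the exact formulation, so everything below is an assumption for illustration: the symbols $\theta$, $B_k$, $S$, and $\mu$, the B-spline basis, and the quadratic form of the penalty are the standard penalized-spline choices, not necessarily the authors' own.

```latex
% Illustrative sketch only: notation and penalty form are assumed, not taken from the article.
% Cox model with time-varying coefficients, each expanded in K spline basis functions B_k(t):
\[
  \lambda(t \mid X_i) = \lambda_0(t)\exp\Bigl\{\sum_{p=1}^{P} X_{ip}\,\beta_p(t)\Bigr\},
  \qquad
  \beta_p(t) = \sum_{k=1}^{K} \theta_{pk}\,B_k(t).
\]
% Log partial likelihood, with event indicator \delta_i, observed time T_i, and risk set R(T_i):
\[
  \ell(\theta) = \sum_{i=1}^{n} \delta_i
  \Bigl[ X_i^{\top}\beta(T_i) - \log \sum_{j \in R(T_i)} \exp\{X_j^{\top}\beta(T_i)\} \Bigr].
\]
% Penalized objective with smoothing parameter \mu \ge 0 and an assumed penalty matrix S:
\[
  \ell_{\mu}(\theta) = \ell(\theta) - \frac{\mu}{2}\,\theta^{\top} S\,\theta.
\]
% One Newton step on the penalized objective:
\[
  \theta^{(m+1)} = \theta^{(m)}
  + \bigl\{-\nabla^{2}\ell(\theta^{(m)}) + \mu S\bigr\}^{-1}
    \bigl\{\nabla\ell(\theta^{(m)}) - \mu S\,\theta^{(m)}\bigr\}.
\]
```

In this sketch, $\mu = 0$ recovers the unpenalized spline fit, and larger $\mu$ shrinks each $\beta_p(t)$ toward a smoother curve. The information criteria proposed in the article address how to choose $\mu$.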
