Abstract
The lymph node ratio is a well-established prognostic factor in colorectal cancer and is used as a categorized predictor variable in several studies. However, the cut-off points and the number of categories considered differ considerably across the literature. Motivated by the need to obtain the best categorization of the lymph node ratio as a predictor of mortality in colorectal cancer patients, we propose a method to select the best number of categories for a continuous variable in a logistic regression framework. To this end, we propose a bootstrap-based hypothesis test, together with a new algorithm for estimating the optimal location of the cut-off points, called BackAddFor, which is an updated version of the previously proposed AddFor algorithm. The performance of the hypothesis test was evaluated in a simulation study under different scenarios, yielding type I error rates close to the nominal levels and good power whenever a meaningful difference in predictive ability existed. Finally, the proposed methodology was applied to the CCR-CARESS study, where the lymph node ratio was included as a predictor of five-year mortality, resulting in the selection of three categories.
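To make the ingredients of this problem concrete, the following is a minimal sketch, not the BackAddFor algorithm or the hypothesis test proposed in the article. It assumes that discrimination is measured by the AUC, replaces the cut-off search with a plain exhaustive search over a small candidate grid, and uses simulated data. All function names (`categorize`, `fitted_probs`, `auc`, `best_cuts`, `bootstrap_auc_gain`) are hypothetical. It relies on the fact that a logistic model whose only predictor is a dummy-coded categorical variable has fitted probabilities equal to the observed event rate in each category.

```python
from itertools import combinations

import numpy as np


def categorize(x, cuts):
    """Map each value of x to a category index 0..len(cuts); values equal to a cut go up."""
    return np.searchsorted(np.sort(np.asarray(cuts, dtype=float)), x, side="right")


def fitted_probs(x, y, cuts):
    """Fitted probabilities of a logistic model with the categorized x as sole predictor.

    For such a model the maximum likelihood fit is the event rate per category.
    """
    cat = categorize(x, cuts)
    rates = np.full(len(cuts) + 1, y.mean())  # fallback for empty categories
    for c in range(len(cuts) + 1):
        mask = cat == c
        if mask.any():
            rates[c] = y[mask].mean()
    return rates[cat]


def auc(y, p):
    """Empirical AUC (Mann-Whitney form), counting ties as one half."""
    diff = p[y == 1][:, None] - p[y == 0][None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


def best_cuts(x, y, k, grid):
    """Exhaustive search (an assumption, not BackAddFor) for k cut-offs maximizing AUC."""
    return max((auc(y, fitted_probs(x, y, c)), c) for c in combinations(grid, k))


def bootstrap_auc_gain(x, y, k, grid, n_boot=200, seed=0):
    """Percentile interval for the in-sample AUC gain of k+1 over k cut-offs."""
    rng = np.random.default_rng(seed)
    gains = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(x), len(x))
        xb, yb = x[idx], y[idx]
        if yb.min() == yb.max():  # skip resamples containing a single outcome class
            continue
        gains.append(best_cuts(xb, yb, k + 1, grid)[0] - best_cuts(xb, yb, k, grid)[0])
    return np.percentile(gains, [2.5, 97.5])


# Simulated example: a ratio in [0, 1] whose true risk has three levels.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 300)
y = rng.binomial(1, np.where(x < 0.3, 0.1, np.where(x < 0.7, 0.4, 0.8)))
grid = np.quantile(x, np.linspace(0.1, 0.9, 9))  # candidate cut-off locations

auc1, cuts1 = best_cuts(x, y, 1, grid)  # two categories
auc2, cuts2 = best_cuts(x, y, 2, grid)  # three categories
lo, hi = bootstrap_auc_gain(x, y, 1, grid, n_boot=50)
```

In this sketch the in-sample AUC never decreases when a cut-off is added, so the naive interval for the gain cannot fall below zero and cannot by itself justify an extra category. This is one reason a test calibrated under the null hypothesis, such as the bootstrap-based test described above, is needed to decide whether the gain is meaningful.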
