Abstract
Purpose
This study aimed to construct prediction models for the risk factors and prognostic factors of distant metastasis in T1-T3 stage rectal cancer. To this end, we conducted a population-based retrospective cohort study.
Methods
Data on 7872 patients diagnosed with rectal cancer between 2004 and 2020 were obtained from the Surveillance, Epidemiology, and End Results (SEER) database, of whom 746 had distant metastases at diagnosis. Independent risk factors for distant metastasis of rectal cancer were determined using univariate and multivariate logistic regression analyses. Cox proportional hazards regression analyses identified the independent prognostic factors for distant metastases of rectal cancer. All patients were randomly assigned to the training and internal validation groups in a 7:3 ratio. Furthermore, we retrospectively collected clinical data from 226 patients who had both rectal cancer and distant metastases between 2012 and 2024 at the Weifang Hospital of Traditional Chinese Medicine. We used calibration curves, decision curve analysis (DCA), the C-index, and the area under the curve (AUC) to assess the discrimination and predictive accuracy of the models.
Results
The multivariate logistic regression analysis identified race, tumor grade, T stage, N stage, radiotherapy, chemotherapy, surgery, tumor size, and histological subtype as risk factors for distant metastases in rectal cancer, with AUC values for both training and validation sets exceeding 0.8. Cox regression analysis identified age, sex, tumor size, surgery, chemotherapy, and radiotherapy as independent prognostic factors for rectal cancer with distant metastasis. In the prognostic model, the C-index was 0.687 (95% CI: 0.6615-0.7125) in the training cohort, 0.692 (95% CI: 0.6508-0.7332) in the internal validation cohort, and 0.704 (95% CI: 0.6785-0.7295) in the external validation cohort.
Conclusion
Our nomograms can estimate the risk of distant metastasis and predict the 1-, 2-, and 3-year prognosis of patients with rectal cancer and distant metastases, providing valuable guidance for future clinical work.
Plain language summary
Purpose: This study aimed to construct prediction models for the risk factors and prognostic factors of distant metastasis in T1-T3 stage rectal cancer. Methods: Data on patients diagnosed with rectal cancer between 2004 and 2020 were obtained from the Surveillance, Epidemiology, and End Results database. Independent risk factors for distant metastasis of rectal cancer were determined using univariate and multivariate logistic regression analyses. Cox proportional hazards regression analyses identified the independent prognostic factors for distant metastases of rectal cancer. All patients were randomly assigned to the training and internal validation groups in a 7:3 ratio. Furthermore, as part of the validation cohort, we retrospectively collected clinical data from 226 patients who had both rectal cancer and distant metastases between 2012 and 2024 at the Weifang Hospital of Traditional Chinese Medicine. We used calibration curves, decision curve analysis (DCA), the C-index, and the area under the curve (AUC) to assess the discrimination and predictive accuracy of the models. Results: The multivariate logistic regression analysis showed that race, tumor grade, T stage, N stage, radiotherapy, chemotherapy, surgery, tumor size, and histological subtype were among the risk factors for distant metastases in rectal cancer, and the AUC values for both the training and validation sets in the risk model were greater than 0.8. Cox regression analysis identified age, sex, tumor size, surgery, chemotherapy, and radiotherapy as independent prognostic factors for rectal cancer with distant metastasis. In the prognostic model, the C-index was 0.687 (95% CI: 0.6615-0.7125) in the training cohort, 0.692 (95% CI: 0.6508-0.7332) in the internal validation cohort, and 0.704 (95% CI: 0.6785-0.7295) in the external validation cohort. Conclusion: Our nomograms can estimate the risk of distant metastasis and predict the 1-, 2-, and 3-year prognosis of patients with metastatic rectal cancer, providing valuable guidance for future clinical work.
Introduction
The metastatic tumor cell (“seed”) and organ microenvironment (“soil”) hypothesis has been widely used to explain the metastatic spread of tumors, and tumor-stroma interactions can now be studied at the molecular level.1,2 Specific tumor cells may have a propensity to metastasize to particular target organs; the most common metastatic target organs are the bone, brain, liver, and lungs.3 The incidence of rectal cancer has been increasing year after year, owing in part to its insidious onset and the lack of clinically specific indicators, making it the second most common cause of cancer deaths.4,5 In recent years, the advent of novel targeted agents and immunotherapies has provided new insights into the treatment of this population, reducing the incidence of rectal cancer and improving prognosis. Nonetheless, patients with stage I–II rectal cancer have a 5-year survival probability of 88%-95%, whereas those with metastatic rectal cancer have a survival span of 3-5 years, and approximately 60% of patients with metastatic colorectal cancer die within 1-2 years.6 Tumor metastasis is the primary cause of cancer-related mortality, and most current treatments have not yet been able to provide a lasting therapeutic effect.7
We focused on cancer metastasis because 90% of cancer-related deaths are caused by metastasis rather than the primary tumor.8 Metastatic disease poses significant therapeutic challenges, especially as peritoneal spread reduces the quality of life and worsens outcomes. Tumor metastasis is a complex process. Through gene regulation, tumor cells spread from their original location and progressively develop into new tumors, producing intractable consequences and diminishing therapeutic efficacy.9 The primary metastatic pathway of colorectal cancer is lymphatic metastasis, and Fan et al found that FAP-α in tumor cells induces extracellular matrix remodeling and establishes an immunosuppressive environment through the recruitment of regulatory T cells, which promotes lymph node metastasis of rectal cancer.10 Furthermore, the liver is the most frequent site of metastasis in patients who develop metastases following primary resection of rectal cancer.11,12 If occult metastases are present at the time of resection, intrahepatic recurrence in the future liver remnant (FLR) can occur because these lesions cannot be detected using standard imaging. All liver tumors must be removed while leaving enough FLR for liver regeneration.13,14 Early diagnosis is necessary for patients at high risk of distant metastasis from rectal cancer, because it helps improve their prognosis.15 Therefore, practical tools are urgently required to guide the effective clinical diagnosis and treatment of such patients.
A nomogram can provide predictive models based on numerical estimates for cancer prognosis and recurrence.16-18 Its concise and clear presentation can help clinicians make decisions and provide guidance.19 Several cancers are influenced by multiple clinical factors, and TNM staging systems cannot provide personalized guidance. Therefore, nomograms have advantages over TNM staging systems and are gradually becoming the new standard.20,21 To the best of our knowledge, no model has predicted the overall survival (OS) of patients with rectal cancer with distant metastases. Therefore, we collected and screened a large amount of clinical data on rectal cancer combined with distant metastases from the Surveillance, Epidemiology, and End Results (SEER) database and used the clinical data of patients from Weifang Hospital of Traditional Chinese Medicine for external validation. By retrospectively analyzing these data, we constructed and validated a risk factor model and 1-, 2-, and 3-year prognostic prediction models for such patients.
Materials and Methods
Patients
This was a population-based retrospective cohort study. Data on 7872 patients diagnosed with rectal cancer between 2004 and 2020 were obtained from the Surveillance, Epidemiology, and End Results database, of whom 746 had distant metastases at diagnosis. Additionally, we retrospectively collected data on 226 patients with rectal cancer combined with distant metastases, diagnosed at the Weifang Hospital of Traditional Chinese Medicine from 2012 to 2024, as an external validation cohort. The inclusion criteria were as follows: (1) a diagnosis of primary rectal cancer with a definite site of metastasis between 2004 and 2020; (2) a T stage of T1-T3; (3) a reporting source that was neither an autopsy nor a death certificate; (4) complete survival and follow-up data; and (5) complete demographic and clinicopathological information, such as TNM stage and tumor size. The screening flowchart is shown in Figure 1, which depicts 7872 patients with rectal cancer, including 746 patients who had distant metastases.
Figure 1. A flowchart showing patient screening details.
Our study methods strictly followed the research criteria published for the SEER database. This study complied with the Declaration of Helsinki and was approved by the Medical Research Ethics Committee of Weifang Traditional Chinese Medicine Hospital (approval number: 2024YX152). In accordance with national legislation and institutional requirements, we obtained verbal informed consent from the patients. Informed consent is a core principle that ensures that the rights and interests of the subject (patient) are respected and protected. Using a combination of the electronic medical records system and telephone follow-up visits, we documented the patients’ OS for use in further data analysis and evaluation. The exclusion criteria were as follows: registration based only on autopsy or death certificates; zero or unknown months of survival; and unclear or missing relevant clinical data (e.g., TNM stage, pathological staging, tumor size, ethnicity, and degree of differentiation). The reporting of this study conforms to the TRIPOD guidelines.22
Data Collection
The following variables were collected to clarify the risk factors for developing distant metastases in rectal cancer: age at diagnosis, race, sex, histological subtype, tumor grade, primary site, T and N stage, surgery, radiotherapy, chemotherapy, tumor size, and occurrence of metastases. Age was analyzed as a continuous variable with original data. We selected 65 years as the age cut-off point because it is an important criterion for delineating the elderly population and is widely used in policy making and healthcare planning. In this study, the numbers of patients aged <65 and ≥65 years were approximately equal, which ensured a comparable and balanced study and also helped us to identify those who require more frequent medical monitoring and intervention. Race was categorized as American Indian/Alaska Native, Asian or Pacific Islander, Black, or White. Measurements of tumor size were based on imaging findings (e.g., CT and MRI) and pathological assessment. According to the exclusion criteria, a total of 55684 patients were excluded: those registered only on the basis of autopsy or death certificates (205 cases), those with zero or unknown months of survival (1980 cases), and those with ambiguous or missing relevant clinical data (e.g., TNM staging, pathological staging, tumor size, ethnicity, and degree of differentiation). Patients in the diagnostic group with distant metastases comprised the training and validation groups of the prognostic cohorts. A nomogram was created for each cohort using the training set patients and validated with the matching validation set patients to identify the risk factors for patients with rectal cancer with distant metastases. Fundamental clinical characteristics were chosen for this investigation. In addition, survival analysis was performed to explore prognostic factors in patients with rectal cancer combined with distant metastases.
Statistical Analyses
All variables were first analyzed using univariate logistic regression, and variables significantly associated with distant metastasis were then entered into multivariate logistic regression to construct a prediction model for the occurrence of distant metastasis in patients with rectal cancer. We analyzed the survival of the 746 patients who developed metastases to determine the prognostic factors. These patients were randomly assigned to a training group (n = 522) and an internal validation group (n = 224) in a 7:3 ratio. The grouping method was simple random grouping, and the randomization was computer-generated using the random() function in R. To identify independent prognostic factors for rectal cancer with distant metastases, we ran univariate Cox proportional hazards regression analysis for all included variables and entered variables with P < 0.05 into “glm” in R for multivariate Cox proportional hazards regression analyses. Furthermore, two nomograms were developed to estimate the risk and OS of rectal cancer with distant metastases, based on the risk factors and independent prognostic indicators. Continuous data were compared using the independent-samples t test or the Mann-Whitney U test, and categorical data were compared using the chi-square test or Fisher’s exact test. In addition, time-dependent ROC curves of the nomogram and of all independent prognostic variables at 12, 24, and 36 months were generated, and the corresponding time-dependent AUCs were used to show discrimination. Calibration curves and DCA at 12, 24, and 36 months were plotted to evaluate the nomogram. According to the median risk score, all rectal cancer patients with distant metastasis were divided into high- and low-risk groups. Kaplan–Meier (K–M) survival curves with the log-rank test were used to show the difference in OS between the two groups.
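As an illustration only, the 7:3 simple random grouping described above can be sketched as follows. The authors performed the randomization in R; this Python sketch is not their code, and the function name `split_7_3`, the placeholder patient IDs, and the fixed seed are assumptions introduced here so that the split is reproducible.

```python
import random


def split_7_3(patient_ids, seed=2024):
    """Shuffle a copy of the IDs and cut it at 70% (hypothetical helper)."""
    rng = random.Random(seed)          # fixed seed: an assumption, for reproducibility
    shuffled = list(patient_ids)       # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * 0.7)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 746 patients with distant metastases.
ids = list(range(746))
train_ids, valid_ids = split_7_3(ids)
print(len(train_ids), len(valid_ids))  # 522 224
```

With 746 patients, rounding 70% gives 522 training and 224 validation patients, which matches the group sizes reported in the text.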
The discrimination of the nomograms was assessed using the C-index and ROC curves, and their calibration was assessed using calibration curves. Decision curve analysis (DCA) is a method that facilitates decision making regarding test selection and use,23 and it allows comparison of the nomogram with other models. Decision curves offer advantages over strictly quantitative performance metrics such as the area under the receiver operating characteristic curve.24 The R software (version 4.3.2; https://www.r-project.org/) was used for all statistical analyses.
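To make the quantity plotted on a decision curve concrete, the following minimal sketch computes net benefit at a single threshold probability, using the standard definition TP/n - (FP/n) x pt/(1 - pt). It is not the authors' R analysis; the function names and the ten toy outcomes and predicted probabilities are invented for illustration and are not study data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)   # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (assumed, not from the study): 1 = distant metastasis, 0 = none.
y_true = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.3, 0.55, 0.35, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}")
```

A decision curve repeats this calculation over a range of thresholds; a model is useful at thresholds where its net benefit exceeds both the treat-all and treat-none (net benefit of zero) strategies.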
Results
Baseline Clinical Characteristics of the Patients
Baseline Clinical Characteristics of Patients Diagnosed With Rectal Cancer.
Independent Risk Factors of Rectal Cancer With Distant Metastases
Univariate and Multivariate Logistic Analyses of Rectal Cancer With Distant Metastasis.
Diagnostic Nomogram Modeling and Validation
Using independent predictors from the multivariate logistic regression, we created a nomogram model for predicting the risk of distant metastases in rectal cancer (Figure 2A). The nomogram AUC values for both training and validation sets exceeded 0.8, indicating good discriminative properties (Figures 2B, E). Strong agreement was observed in the calibration curves between the predictions and observations (Figures 2C, F). In addition, the DCA suggested that our nomogram could be used as a tool for diagnosing distant metastases in patients with rectal cancer (Figures 2D, G). We also generated ROC curves of the nomogram vs TN staging (Supplemental Figure 1(A)–(B)); the results show that our prediction model has a good diagnostic effect. To further validate the applicability of the model, we retrospectively collected data on 226 patients with rectal cancer combined with distant metastases at the Weifang Hospital of Traditional Chinese Medicine from 2012 to 2024 to form an expanded test set for our study. The nomogram 1-, 2-, and 3-year AUCs of the expanded test set were 0.612, 0.590, and 0.610, respectively (Supplemental Figure 5), and the calibration and DCA curves for each independent factor showed good predictability.
Figure 2. Construction and validation of a diagnostic nomogram. (A) Diagnostic risk model for assessing the distant metastasis of rectal cancer. (B) Area under the ROC curve for the training set. (C) Nomogram calibration plot for the training set. (D) Decision curve analysis of the training set. (E) Area under the ROC curve for the validation set. (F) Nomogram calibration plot for the validation set. (G) Decision curve analysis of the validation set.
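For readers less familiar with the AUC values reported above, the sketch below shows one common way to compute it: as the proportion of (metastasis, no metastasis) patient pairs in which the patient with metastasis received the higher predicted risk, with ties counted as one half. This is an illustrative Python sketch rather than the authors' R workflow; `pairwise_auc` and the toy values are hypothetical and are not study data.

```python
def pairwise_auc(y_true, y_prob):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                score += 1.0        # correctly ranked pair
            elif pos == neg:
                score += 0.5        # tie counts as half
    return score / (len(positives) * len(negatives))


# Toy data (assumed): 1 = distant metastasis, 0 = none.
y_true = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.3, 0.55, 0.35, 0.05]

auc = pairwise_auc(y_true, y_prob)
print(f"AUC = {auc:.3f}")
```

In this toy example, 22 of the 24 possible pairs are ranked correctly. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.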
Analysis of Prognostic Factors in Patients with Rectal Cancer and Distant Metastases
Unifactorial and Multifactorial Cox Regression Analyses of Rectal Cancer With Distant Metastasis.
Development and Validation of Prognostic Nomogram
We established a nomogram for predicting survival in patients with rectal cancer and distant metastases (Figure 3) based on six factors: age, sex, tumor size, surgery, chemotherapy, and radiotherapy. The calibration curves showed good agreement between the nomogram-predicted OS and the actual 1-, 2-, and 3-year outcomes in both the training (Supplemental Figure 2(A)–(C)) and validation sets (Supplemental Figure 2(D)–(F)). The DCA curves showed that the nomogram could be applied in clinical practice (Supplemental Figure 3(A)–(F)). Furthermore, the ROC analysis indicated that the nomogram was discriminative for predicting OS in patients with rectal cancer combined with distant metastases, with AUCs of 0.743, 0.750, and 0.736 at 12, 24, and 36 months, respectively, in the training set (Supplemental Figure 4(A)) and 0.828, 0.774, and 0.744 in the validation set (Supplemental Figure 4(B)). The C-index (Supplemental Table 2) was 0.687 (95% CI: 0.6615-0.7125) in the training cohort, 0.692 (95% CI: 0.6508-0.7332) in the internal validation cohort, and 0.704 (95% CI: 0.6785-0.7295) in the external validation cohort. The K-M curves showed that patients in the high-risk group had significantly worse OS than those in the low-risk group (Supplemental Figure 4(C) and (D)).
Figure 3. Nomogram for predicting 1-, 2-, and 3-year overall survival in patients with metastatic rectal cancer.
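The C-index values reported above summarize how well the predicted risk orders patients by survival time. The minimal sketch below implements Harrell's concordance index for right-censored data under simple assumptions (pairs are usable only when the patient with the shorter follow-up had an observed event; tied risk scores count as one half). It is not the authors' R code, and the function name and the five toy patients are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs where higher risk means shorter survival."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue                      # a censored patient cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:       # patient i died before patient j's last follow-up
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / usable


# Toy data (assumed): follow-up in months, event 1 = death, 0 = censored.
times = [5, 10, 12, 20, 30]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.5, 0.6, 0.2]

c_index = harrell_c_index(times, events, risk)
print(f"C-index = {c_index:.3f}")
```

Here six of the seven usable pairs are concordant. As with the AUC, a value of 0.5 indicates no discrimination, so the reported values of about 0.69 to 0.70 indicate moderate discrimination.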
Discussion
Studies on the development of risk factor and prognostic models for rectal cancer with distant metastases are lacking. Therefore, using a large number of clinical samples from the SEER database, we developed risk factor and prognostic prediction models. Although previous studies have made progress in constructing prognostic models, limitations remain, such as the lack of external validation, insufficient sample size, and incomplete selection of risk factors. In response to these limitations, we made corresponding improvements. We combined traditional clinicopathological factors with treatment options for a comprehensive analysis. Among these, patients with adenocarcinoma, no surgery, no radiotherapy, completed chemotherapy, tumor size of 50-100 mm, T3 stage, N1 stage, tumor stage IV, and Black race were at greater risk of distant metastasis. Patients who were older, were male, had larger tumors, or did not receive surgery, chemotherapy, or radiotherapy had poorer OS. However, the association between metastasis and age is still unclear. Two potential explanations have been proposed, one pertaining to the immune system and the other to the mechanical characteristics of the tissue.25 With aging, immune cells become less functional and less able to recognize and remove tumor cells, providing an opportunity for tumor cells to spread. The tumor microenvironment (TME) and the pre-metastatic niche play key roles in driving the reprogramming of cancer cells towards a metastatic progression state. The extracellular matrix (ECM) is a highly dynamic microenvironmental component of the metastatic cascade that changes and evolves during aging.7 Interestingly, although race and T and N stages are independent risk factors for distant metastases in patients with rectal cancer, their impact on prognosis is not significant.
Surgery, radiotherapy, and chemotherapy were included as predictor variables because they are important factors that may affect patient survival and disease progression. These treatments do not directly cause distant metastases, but they can influence the ability of a tumor to grow, invade, and metastasize, thus indirectly affecting a patient’s prognosis. For example, genetic mutations and adaptive changes in tumor cells after radiotherapy may lead to treatment failure and tumor recurrence. Population studies and evidence suggest that chemotherapy may provide a survival advantage and improve quality of life for patients with rectal cancer and distant metastases.26 Based on our analysis, chemotherapy is essential to improve the prognosis of patients with rectal cancer and distant metastases. Of the samples in the SEER database, 3214 (40.8%) did not receive chemotherapy and 4658 (59.2%) did. However, chemotherapeutic agents may produce additional toxicities (e.g., neurotoxicity), and not all patients will benefit from them. Zhang et al found that radiotherapy leads to an increased risk of metastasis.27 Therefore, it is important to consider each patient’s specific clinical situation when deciding whether to use adjuvant therapy for rectal cancer.28 In addition, Cox regression analysis showed that radiotherapy as an adjuvant treatment had the same benefit as chemotherapy. Notably, shorter courses of radiotherapy have been found to reduce treatment duration and toxicity, thereby increasing treatment completion rates in all patients.29
Although some previous retrospective studies have suggested that patients with distant metastases do not benefit from palliative surgery, and that surgical resection may even facilitate the progression of systemic metastases, our study yielded the opposite result.30,31 This difference may be explained by the fact that our study included a broader group of patients, including those who were traditionally considered unsuitable for surgery but who may benefit from surgery thanks to new treatment strategies. Enhanced postoperative management (e.g., pain control, nutritional support, and rehabilitation exercises) similarly improved patients’ postoperative survival. The differences in AUC values across time points may be due to the different characteristics of survival data in different time periods. For example, short-term prediction models (e.g., 1-year survival prediction models) may better reflect rapid disease progression, and factors such as the surgical recovery period, psychological changes, and complications may have a significant impact on survival time in the short term. Long-term prediction models (e.g., 3-year survival prediction models) may better reflect factors such as the overall health of the patient, the effectiveness of the treatment regimen, and the potential risk of recurrence.
Currently, most studies have assessed only one aspect of risk factors and prognosis prediction. As far as we know, we are the first to establish a comprehensive retrospective study on predicting the risk of distant metastases and prognosis in patients with rectal cancer. This visualized predictive model may help physicians determine targeted treatment strategies.
A study by Han et al found that adenocarcinoma, gender, tumor size, poor grading, T1 stage, and N1/N2 stage were significantly associated with the development of metastasis,32,33 which is similar to our findings. Therefore, clinicians should be aware of these risk factors in patients with rectal cancer and should recommend timely treatment to these potentially at-risk patients. There are persistent racial and ethnic disparities in the causes of death from rectal cancer. In addition, nomogram modeling suggests that tumor size is an independent risk factor for distant metastases in patients with rectal cancer. De et al showed that in the rodent tumor model KHT-C, tumor oxygenation decreased with increasing tumor size, and hypoxic tumors exhibited a higher risk of metastasis. The hypoxic environment triggered by changes in tumor size may play a pivotal role in the metastatic ability of human tumors.34 In a study on breast cancer, Yang et al similarly found that the larger the tumor, the worse the prognosis.35 Further, our research clinically validated that surgery, chemotherapy, and radiotherapy could improve the survival prognosis of patients with metastasis.
Our study has the following strengths. First, we comprehensively analyzed the risk factors and prognostic predictions of distant metastases in patients with rectal cancer and included external validation data from an Asian population, which expanded the diversity and generalizability of the validation and helped to assess the performance and reliability of the model more comprehensively. Second, the independence of the validation dataset supports the objectivity and accuracy of the assessment results, making the model more convincing.
This study had certain limitations. First, specific data on chemotherapy drugs and doses were not available in the SEER database, so we were unable to explore the impact of chemotherapy further. In addition, we excluded a large amount of data with missing or ambiguous information, which increased the risk of bias in the results to some extent. Second, we had to exclude biomarkers such as CEA and CA19-9 because information on their specific parameters was lacking in the pre-2010 SEER database. Purely clinical and therapeutic variables may not be sufficient for evaluating tumor prognosis, and combining a variety of biomarkers, laboratory indices, and clinical characteristics is expected to provide a more accurate and valuable basis for determining tumor prognosis.32 With the progress of medicine, the application of immune and targeted therapies has gradually become widespread and has changed the prognosis of patients to a certain extent. Looking ahead, clinical practice urgently needs better prediction models to guide and support medical decision-making.
Conclusion
In summary, using univariate and multivariate logistic regression analyses, we identified independent risk factors for the development of distant metastases in patients with rectal cancer. Subsequently, through univariate and multivariate Cox regression analyses, we determined the factors related to disease prognosis. These findings can help doctors determine the prognosis of patients more accurately and support the development of personalized treatment plans. Our nomograms can also provide an important reference for clinical trial screening and patient stratification.
Supplemental Material
Supplemental Material for Evaluation of Risk Factors, and Development and Validation of Prognostic Prediction Models for Distant Metastasis in Patients With Rectal Cancer: A Study Based on the SEER Database and a Chinese Population by Huiru Zhang, Haojun Wang, Yan Yao, Lijuan Liu, Fubin Feng, Huayao Li, and Changgang Sun in Cancer Control.
Footnotes
Acknowledgments
Author Contributions
HRZ and HJW conceived and designed the study. HJW, YY, and LJL performed the literature search. HRZ, YY, and HYL generated the figures and tables. HRZ, FBF, and HYL analyzed the data. HRZ and HJW wrote the manuscript, and HRZ critically reviewed the manuscript. CGS supervised the research. All authors have read and approved the manuscript.
Declaration of Conflicting Interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the National Natural Science Foundation of China (82174222) and Shandong Province Natural Science Foundation (ZR2021LZY015).
Ethical Statement
Supplemental Material
Supplemental material for this article is available online.
References
