Abstract
Objectives
The objective is to determine the optimal minimum lymph node examination number for right colon cancer (RCC) patients.
Methods
We comprehensively analysed Surveillance, Epidemiology and End Results database data from 2004 to 2016 to determine the 13-year trend in the number of lymph nodes examined among 108,703 left colon cancer and 165,937 RCC patients. The 133,137 RCC patients eligible for inclusion were used to determine the optimal minimum number of lymph nodes to examine. We used restricted cubic splines to analyse the dose-response relationship between the number of lymph nodes examined and prognosis. X-tile analysis and decision trees were used to determine the optimal cutoff for the number of lymph nodes based on the survival outcomes of patients with RCC. The Kaplan–Meier method and the Cox model were used to estimate overall survival and identify independent prognostic factors, and a prediction model was constructed. The C-index, calibration curve, net reclassification improvement and integrated discrimination improvement were used to determine the predictive performance of the model, and decision curve analysis was used to evaluate the benefits.
Results
Lymph node examinations were common among colon cancer patients over the 13-year study period. It is generally agreed that at least 12 lymph nodes must be examined to ensure proper dissection and accurate staging of RCC; however, the optimal number of lymph nodes to be examined is controversial. The dose-response relationship indicated that 12 was not the optimal minimum number of lymph nodes for RCC patients. X-tile and survival decision-tree analysis indicated that 20 nodes was the optimal number. Survival analysis indicated that <20 nodes examined was a risk factor for poor prognosis, and the classification performance was superior for 20 nodes compared to 12 nodes.
Conclusion
The lymph node examination standard for RCC patients should be revised. Our research suggests that a 20-node minimum may be more suitable for RCC patients.
Introduction
Colorectal cancer (CRC) is the third most common cancer worldwide and the second most common cause of cancer-specific mortality. 1 According to estimates from the National Cancer Institute, CRC accounted for approximately 8% of all cancers in 2017, and the age-standardised CRC incidence increased by 9.5% between 1990 and 2017. 2 CRC imposes a huge burden on patients and healthcare systems worldwide.
Colon cancer includes a range of disease types. Considering the distal transverse colon as the boundary, tumours originating in the distal third of the transverse, descending and sigmoid colon are considered left colon cancers (LCCs), while right colon cancer (RCC) tumours originate in the caecum, ascending and proximal two-thirds of the transverse colon. 3 Differences in histological features, clinical manifestations and patient prognoses between LCC and RCC have been widely reported.4,5 Regarding morphology, RCC tumours are flatter and less likely to be detected using colonoscopy in the early stages; therefore, diagnosed patients often have more advanced disease and larger tumours. 6 Tumours in patients with RCC also tend to be poorly differentiated. 7 These factors may contribute to worse prognoses for RCC patients, as supported by the findings of recent epidemiological studies.8,9 Evaluating RCC separately will have positive implications for the application of individualised treatments to cancer patients.
Examining a sufficient number of lymph nodes will benefit the survival of patients with CRC. Increasingly extensive lymph node examinations can help to accurately determine the stage of a tumour by detecting the presence of lymph node metastasis.10,11 Moreover, previous studies have indicated that examining more lymph nodes is directly associated with improved patient survival rates and that more accurate staging and adjuvant therapy are not the only methods to improve outcomes.12,13
The optimal number of lymph nodes to be examined is controversial. It is widely agreed that at least 12 lymph nodes must be examined to ensure proper dissection and accurate staging, as determined at the 1990 World Congress of Gastroenterology. 14 The American Society of Clinical Oncology and the National Comprehensive Cancer Network have issued guidelines recommending the assessment of at least 12 lymph nodes in clinical practice. 15 As understanding of colon tumours has improved, LCC and RCC have been recognised as 2 different types of tumours, with RCC patients having worse prognoses. However, a large-scale population study to determine whether examining 12 lymph nodes is sufficient for RCC patients has not been performed.
We used data from the Surveillance, Epidemiology and End Results (SEER) database to better define this problem by analysing changes in the number of lymph nodes examined in RCC patients from 2004 to 2016 and further determining the optimal minimum number of nodes examined. We also analysed whether our optimal number of examinations could be an independent risk factor for the prognosis of patients with RCC. We then developed and validated our RCC prediction model based on the new number of lymph nodes examined to improve individualised tumour treatments.
Methods
Data Collection and Patient Selection
The SEER database is the definitive source of cancer statistics in the United States and provides this information in an effort to reduce the burden of cancer in the United States population. It is supported and regularly maintained by the Surveillance Study Programme of the Division of Cancer Control and Population Sciences. 16 The SEER database covers 34.6% of the population of the United States and contains data from population-based cancer registries, including demographics, primary tumour site, tumour morphology, stage at diagnosis and treatments, and it tracks the vital status of patients. All data analysed in the present study were obtained from the SEER database; because these data were de-identified, we did not require approval from an ethics review board or informed consent from the patients. We received permission to access the SEER research data (reference number 13944, November 2019).
From the SEER data for 2004 to 2016, we identified eligible patients using the following International Classification of Diseases for Oncology codes: C180, C182, C183 and C184 for RCC patients, and C185, C186, C187, C199 and C209 for LCC patients. All included patients had been diagnosed with primary colorectal adenocarcinoma and had undergone radical surgery, including partial colectomy, subtotal colectomy/hemicolectomy, total colectomy, total proctocolectomy and colectomy or coloproctectomy with resection of contiguous organs. Patients with missing surgical or lymph-node-number information were excluded. Notably, LCC patient data were used only to display the trend in the number of lymph nodes examined in CRC patients from 2004 to 2016 and not for more detailed statistical analyses. Patients with RCC were excluded from the dose-response relationship and survival analyses if they had missing data on survival time, demographics, histology, American Joint Committee on Cancer (AJCC) stage, differentiation grade or tumour size. Applying these criteria resulted in the inclusion of 274 640 patients in the study. The dose-response relationship and survival were analysed using 133 137 patients with RCC (Figure 1).
Figure 1. Flow chart for patient selection.
Statistical Analysis
Basic patient information was described using counts and percentages, and graphs were used to present trends in the mean number of lymph nodes examined for different types of colon cancer. RCC patient death was defined as the endpoint of this study. The Kaplan–Meier method was used to calculate overall survival (OS), and log-rank tests were used to compare survival differences between the groups. Multivariate regression was performed using Cox proportional hazards models, adjusted for confounding variables, to analyse the relationship between the number of lymph nodes examined and patient prognosis. The restricted cubic spline (RCS) method was used to determine the dose-response relationship between the number of lymph nodes examined and OS, with knots fixed at the 25th, 50th and 75th percentiles of the number of lymph nodes examined in all patients with RCC. Nonparametric testing within the RCS model was used to assess this dose-response relationship. X-tile software and survival decision trees were used to determine the optimal cutoff for the number of lymph nodes to be examined in RCC patients. The principle and algorithm of X-tile software have been reported previously.17-19 ANOVA was used to compare the mean number of positive lymph nodes between different groups.
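The spline terms entering such a dose-response model can be illustrated with a short Python sketch. It uses the truncated-power form of the restricted cubic spline basis, which is linear beyond the outer knots. The knot values (10, 16 and 23 nodes) are hypothetical placeholders standing in for the 25th, 50th and 75th percentiles, which are not reported here, and the function name `rcs_basis` is introduced only for illustration. The analysis in this study was run in Stata, so this is an assumption-laden sketch rather than the code that was used.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots, using the
    truncated-power form that is linear beyond the outer knots.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def pos3(u):
        # (u)_+ cubed: zero to the left of the knot
        return max(u, 0.0) ** 3

    d = t[-1] - t[-2]
    terms = [float(x)]
    for j in range(k - 2):
        terms.append(
            pos3(x - t[j])
            - pos3(x - t[-2]) * (t[-1] - t[j]) / d
            + pos3(x - t[-1]) * (t[-2] - t[j]) / d
        )
    return terms


# Hypothetical knots standing in for the 25th, 50th and 75th percentiles.
KNOTS = (10, 16, 23)

for nodes in (5, 12, 20, 30):
    print(nodes, rcs_basis(nodes, KNOTS))
```

With 3 knots the basis has one linear and one nonlinear column; both would enter the Cox model as covariates alongside the adjustment variables.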
The survival decision-tree algorithm was implemented using the rpart package in R. Based on the new cutoff for the number of lymph nodes, all RCC patients were divided into training and validation cohorts at a 7:3 ratio. The Cox proportional hazards model was used to identify factors associated with OS and to construct a nomogram predicting 3-, 5- and 8-year survival rates in patients with RCC. The prediction model was calibrated using 500 bootstrapping iterations for both the training (internal) and validation (external) datasets. The C-index was used to quantify the predictive power of our model, that is, the agreement between the predicted and observed outcomes. Calibration curves comparing the observed and predicted survival rates were used to evaluate how well the 3-, 5- and 8-year OS nomograms were calibrated. Decision curve analysis (DCA) was used to evaluate the clinical value of our new prediction model. Finally, the new prediction model was compared with the AJCC staging system, and the integrated discrimination improvement (IDI) and net reclassification improvement (NRI) were calculated to determine the accuracy improvements of the new prediction model.
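To make the C-index concrete, the following Python sketch computes Harrell's concordance index on made-up data. It is a simplified version: pairs with tied survival times are skipped, and tied risk scores count as half-concordant. The function name and the toy values are assumptions for illustration and are not taken from the study, where the calculations were performed in R.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair is comparable when the subject with the shorter follow-up
    time had an observed death (event == 1). Tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk died earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied risk scores
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (months, death indicator, model risk score); not study data.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.6, 0.7, 0.1]

print(harrell_c_index(toy_times, toy_events, toy_risk))  # 0.8
```

In the toy data, 4 of the 5 comparable pairs are ordered correctly, giving 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.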
All statistical tests were two-sided, and statistical significance was set at P < .05. Descriptive statistics, Kaplan–Meier curves, Cox regression, nomograms, C-index, calibration plotting, DCA curves, NRI and IDI were calculated using R software (version 3.5.1). Dose-response relationships were plotted using the Stata software (version 15.1).
Results
We identified 274 640 CRC patients in the SEER database: 108 703 with LCC and 165 937 with RCC. Among all CRC patients from 2004 to 2016, the proportions of patients with ≥12 and ≥20 lymph nodes examined gradually increased, while the proportion of patients with ≥1 lymph node examined remained almost unchanged. The proportions of RCC patients with ≥1, ≥12 and ≥20 lymph nodes examined were all higher than those of patients with LCC over the same period. Among RCC patients, the proportion with ≥12 lymph nodes examined increased the most, from 57.3% in 2004 to 88.9% in 2016, followed by the proportion with ≥20 lymph nodes examined, which rose from 22.5% to 44.1% (Figure 2). Figure 3 provides more details of the trend in the mean number of lymph nodes examined for different types of CRC patients over time. The figure shows that the number of nodes examined in RCC patients increased at a faster rate, and the mean was higher than that for LCC patients in each year. In both LCC and RCC patients, the number of lymph nodes examined increased fastest between 2004 and 2009 and more slowly thereafter.
Figure 2. Percentage of patients with lymph node excision (≥1, ≥12, ≥20 nodes) by year. (A) Percentage of colorectal cancer (CRC) patients with lymph node excision (≥1, ≥12 and ≥20 nodes) by year. (B) Percentage of left colon cancer (LCC) patients with lymph node excision (≥1, ≥12 and ≥20 nodes) by year. (C) Percentage of right colon cancer (RCC) patients with lymph node excision (≥1, ≥12 and ≥20 nodes) by year.
Figure 3. Mean number of lymph nodes excised by year of diagnosis in all patients and those who underwent a lymph node excision. (A) Mean number of lymph nodes excised by year of diagnosis in CRC patients and those who underwent a lymph node excision. (B) Mean number of lymph nodes excised by year of diagnosis in LCC patients and those who underwent a lymph node excision. (C) Mean number of lymph nodes excised by year of diagnosis in RCC patients and those who underwent a lymph node excision.

Baseline Characteristics of RCC Patients
Characteristics of RCC patients in the SEER database, 2004-2016.
Dose-Response Relationship Between Lymph Node Examinations and OS
After adjusting for age, sex, race, AJCC stage, differentiation grade, histology and tumour size, the RCS model indicated a negative correlation between the number of lymph nodes examined and mortality risk, meaning that RCC patients who had more lymph nodes examined had a better prognosis. Using the criterion previously applied to CRC patients (≥12 lymph nodes examined) as the reference value, we observed that 12 was not the optimal minimum in RCC patients, since the mortality risk continued to decline rapidly as the number of lymph nodes examined increased. When the number of lymph nodes examined was >20 (OR = .75, 95% CI = .74-.76), the mortality risk decreased more slowly, meaning that 20 lymph nodes examined was an inflection point in the dose-response relationship (Figure 4).
Figure 4. Dose-response relationship between the number of lymph nodes examined and risk of death.
Analysis of the Optimal Minimum Node Count
The analysis of the dose-response relationship suggested that the optimal minimum number of lymph nodes to be examined in RCC patients is >12. We used X-tile analysis and survival decision trees to determine values more accurately by exploring the cutoff value for OS predictions based on every possible number of lymph nodes examined.
Both the X-tile analysis and the survival decision trees indicated that 20 was the optimal minimum number of nodes to be examined (Figure 5). Combined with the results for the dose-response relationship, we used this optimal lymph node count as a prognostic factor for RCC patients in the subsequent analysis.
Figure 5. Identification of the optimal cut-off point of lymph node count for RCC patients. (A) Result based on the X-tile software. (B) Result of the decision-tree algorithm.
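The idea of exploring every candidate cutoff can be sketched in Python as a scan that splits patients at each candidate node count and keeps the split with the largest two-group log-rank chi-square. This is an illustrative simplification in the spirit of cut-point searches. It is not the X-tile implementation or the rpart algorithm, it applies no correction for multiple testing, and the function names and toy values are assumptions rather than study data.

```python
def logrank_chi2(times, events, in_group):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)                          # at risk
        n1 = sum(1 for u, g in zip(times, in_group) if u >= t and g)  # at risk in group
        d = sum(1 for u, e in zip(times, events) if u == t and e)     # deaths at t
        d1 = sum(1 for u, e, g in zip(times, events, in_group)
                 if u == t and e and g)                               # deaths in group
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_cutoff(nodes, times, events, candidates):
    """Return (cutoff, chi2) for the split `nodes >= cutoff` with the largest chi2."""
    scored = []
    for c in candidates:
        group = [x >= c for x in nodes]
        if all(group) or not any(group):
            continue  # a split must leave patients on both sides
        scored.append((c, logrank_chi2(times, events, group)))
    if not scored:
        raise ValueError("no candidate splits the cohort")
    return max(scored, key=lambda s: s[1])


# Toy cohort (nodes examined, months of follow-up, death indicator); not study data.
toy_nodes = [5, 8, 11, 14, 17, 19, 20, 22, 25, 28, 31, 35]
toy_times = [6, 9, 7, 10, 8, 11, 40, 45, 38, 50, 42, 48]
toy_events = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1]

print(best_cutoff(toy_nodes, toy_times, toy_events, range(8, 32)))
```

On small samples the selected cutoff is unstable, which is one reason to check a data-driven cutoff against an independent line of evidence such as the dose-response curve.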
Effect of the 20-Node Measure on OS of RCC Patients in Different AJCC Stages
The patients were divided into 2 groups based on the 20-node cutoff. Across all AJCC stages, the 3-, 5- and 8-year survival rates were 62.7%, 51.8% and 39.8%, respectively, for patients with <20 lymph nodes examined, and 71.6%, 61.5% and 50.4%, respectively, for patients with >20 lymph nodes examined. The survival curves for patients at different AJCC stages indicated that the 3-, 5- and 8-year survival rates differed the most in patients with AJCC stage III (10.1%), II (11.8%) and II (13.2%), respectively. The survival curves also suggested that patients in AJCC stages II and III had a greater chance of survival based on our suggested number of examined lymph nodes (Figure 6).
Figure 6. Prognostic impact of the 20-node measure on overall survival (OS) for RCC patients with different AJCC stages. (A) Survival curve of patients with all AJCC stages. (B) Survival curve of patients with AJCC stage I. (C) Survival curve of patients with AJCC stage II. (D) Survival curve of patients with AJCC stage III. (E) Survival curve of patients with AJCC stage IV.
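Survival rates at fixed horizons such as 3, 5 and 8 years are read off the Kaplan–Meier curve. The Python sketch below shows the product-limit calculation on a small made-up cohort. The function names and the toy follow-up data are illustrative assumptions and do not reproduce the figures reported above.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, S(t)) pairs."""
    curve = []
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve, horizon):
    """Estimated survival probability at `horizon` (same unit as times)."""
    s = 1.0
    for t, value in curve:
        if t > horizon:
            break
        s = value
    return s


# Toy cohort in months; event 1 = death, 0 = censored. Not study data.
toy_times = [10, 20, 30, 40, 50, 70, 100]
toy_events = [1, 0, 1, 1, 0, 1, 0]

curve = kaplan_meier(toy_times, toy_events)
for years in (3, 5, 8):
    print(years, round(survival_at(curve, 12 * years), 3))
```

Running the same two functions separately on the <20-node and ≥20-node groups would give the paired survival rates that the log-rank test then compares.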
The 20-Node Measurement Was Associated With Tumour Stage and Number of Positive Lymph Nodes
Change of stage in different groups of lymph nodes examined.

Mean number of positive nodes in different groups.
Multivariate Analyses for OS
Multivariate survival analysis based on 20 nodes of lymph node examination.
Multivariate survival analysis based on 12 nodes of lymph node examination.
Constructing a Nomogram From the Training Cohort
The training cohort data were used to construct the prediction model. Independent prognostic factors associated with OS in RCC patients, identified using multivariate Cox regression, were used to construct the nomogram. To use the nomogram, a vertical line is drawn from each variable to the points axis to obtain its score, the scores of all variables are added to obtain a total score, and a vertical line is drawn down from the total score to obtain the OS rates at 3, 5 and 8 years (Figure 8).
Figure 8. Nomogram predicting 3-, 5- and 8-year survival.
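The arithmetic behind reading a nomogram can be sketched in Python. Each factor's Cox coefficient is rescaled so that the strongest factor is worth 100 points, the points are summed, and the total is mapped back to a survival probability through S(t) = S0(t)^exp(lp). All coefficients, baseline survival values and factor names below are hypothetical placeholders, not the values fitted in this study, and the sketch assumes binary risk factors with positive coefficients.

```python
import math

# Hypothetical log hazard ratios for binary risk factors (all positive).
# These are NOT the coefficients fitted in this study.
COEFS = {"age_over_65": 0.60, "stage_III": 0.90, "nodes_under_20": 0.25}

# Hypothetical baseline survival S0(t) at 3, 5 and 8 years.
BASELINE_SURVIVAL = {3: 0.85, 5: 0.78, 8: 0.68}


def factor_points(coefs):
    """Scale each factor so that the strongest one is worth 100 points."""
    largest = max(coefs.values())
    return {name: 100.0 * beta / largest for name, beta in coefs.items()}


def total_points(patient, coefs):
    """Sum the points of the risk factors present in `patient` (a set of names)."""
    points = factor_points(coefs)
    return sum(points[name] for name in patient)


def predicted_survival(patient, coefs, baseline, years):
    """Convert total points to the linear predictor, then to S(t) = S0(t) ** exp(lp)."""
    lp = total_points(patient, coefs) * max(coefs.values()) / 100.0
    return baseline[years] ** math.exp(lp)


patient = {"stage_III", "nodes_under_20"}
print(round(total_points(patient, COEFS), 1))
for years in (3, 5, 8):
    print(years, round(predicted_survival(patient, COEFS, BASELINE_SURVIVAL, years), 3))
```

A patient with none of the listed factors scores 0 points and is assigned the baseline survival, which is the reference profile in this toy setup.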
Evaluating the Nomogram Using the Validation Cohort
The C-index values were .719 and .707 in the training and validation cohorts, respectively, indicating that the model had good discriminative ability. The calibration curves verified the consistency between the observed and predicted values. As displayed in Figure 9, the predicted probabilities of OS at 3, 5 and 8 years lay almost on the standard line, indicating that the model was well calibrated in both cohorts. The NRI and IDI are more sensitive indicators for comparing the prediction accuracy of our model with that of the AJCC staging model. The 3-, 5- and 8-year NRIs were .34 (95% CI = .33-.35), .33 (95% CI = .32-.34) and .32 (95% CI = .31-.33), respectively, in the training cohort, and .34 (95% CI = .32-.36), .33 (95% CI = .31-.35) and .33 (95% CI = .31-.35) in the validation cohort. The IDI values for 3-, 5- and 8-year OS were .046, .055 and .063 (P < .001), respectively, in the training cohort and .048, .057 and .065 (P < .001) in the validation cohort. These findings indicate that our model has a significant advantage in predicting the 3-, 5- and 8-year OS rates in patients with RCC. Finally, a DCA curve was constructed to assess the clinical effectiveness of our model. Figure 10 displays the net benefit rates for patients in both cohorts and shows that our prediction model significantly outperforms the AJCC staging model in predicting patient survival at 3, 5 and 8 years.
Figure 9. Calibration curves for the nomogram. (A) Calibration curves for 3-year survival of the training cohort. (B) Calibration curves for 3-year survival of the validation cohort. (C) Calibration curves for 5-year survival of the training cohort. (D) Calibration curves for 5-year survival of the validation cohort. (E) Calibration curves for 8-year survival of the training cohort. (F) Calibration curves for 8-year survival of the validation cohort.
Figure 10. DCA curves for the nomogram. (A) DCA curve for 3-year survival of the training cohort. (B) DCA curve for 3-year survival of the validation cohort. (C) DCA curve for 5-year survival of the training cohort. (D) DCA curve for 5-year survival of the validation cohort. (E) DCA curve for 8-year survival of the training cohort. (F) DCA curve for 8-year survival of the validation cohort.
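The net benefit plotted in a decision curve can be illustrated with a brief Python sketch. For a threshold probability pt, net benefit is TP/n − (FP/n) × pt/(1 − pt), and it is compared with the "treat all" strategy. The sketch treats death by a fixed horizon as a binary outcome and ignores censoring, which is a simplification of the survival-based DCA used in this study. The function names and toy values are assumptions for illustration.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    tp = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y)
    fp = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and not y)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Net benefit of acting on every patient regardless of predicted risk."""
    prevalence = sum(1 for y in outcome if y) / len(outcome)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy predicted risks of death by a fixed horizon and observed outcomes; not study data.
toy_risk = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
toy_outcome = [1, 1, 0, 1, 0, 0, 0, 1]

for pt in (0.2, 0.5, 0.8):
    print(pt,
          round(net_benefit(toy_risk, toy_outcome, pt), 3),
          round(net_benefit_treat_all(toy_outcome, pt), 3))
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the "treat all" line and zero, the "treat none" line.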

Discussion
The number of examined lymph nodes has long been a concern as a prognostic risk factor for colon cancer, and many patients with this condition have benefited from the guidelines for 12 lymph nodes being examined.20-22 Recent deeper research on colon cancer reported lateral differences, with LCC and RCC being considered 2 different types of solid tumours.4,5 RCC tends to have a worse prognosis,23,24 suggesting that the needs of RCC patients may not be met by examining only 12 nodes. Large-scale population data must be analysed to verify this.
The mean number of lymph nodes examined in RCC patients continuously increased between 2004 and 2016, and the proportion of patients who underwent 12-node examinations also increased. This encouraging situation is probably related to the development of clinical practice guidelines that significantly assist in improving patient outcomes. However, the difference between LCC and RCC suggests that 12 nodes are not the optimal minimum number of nodes to be examined for RCC patients. This optimal number can be determined using various methods. The RCS model determines inflection points by analysing the dose-response relationship between the number of lymph nodes examined and the prognosis and then calculates the optimal minimum value. 25 X-tile software and survival decision trees were used to group patients according to the number of nodes.26,27 When the survival curves of the 2 groups differed the most, the corresponding node was the optimal minimum value.28,29 These methods have been widely used in previous studies to calculate the optimal minimums in epidemiological data.15,30,31 Previous studies have suggested that RCC patients should have more lymph nodes examined to improve their prognosis. A cohort study of the Polish population reported that the total number of lymph nodes examined was significantly higher in RCC than in LCC patients (11.7 ± 6.0 vs 8.3 ± 5.0, mean ± SD). 32 Another SEER-based study suggested that more lymph nodes were examined in RCC than in LCC patients and used the mean values to group patients based on independent prognostic factors. 33 To confirm the conclusions of previous studies, our study proposed a more accurate method for determining the optimal minimum number of nodes to examine that would be as beneficial as possible to patient survival.
While examining more lymph nodes can improve the prognosis of patients, the reasons for its association with survival have not been specifically explained. 34 Our study found that the number of lymph nodes examined was associated with the AJCC stage, N stage and the number of positive lymph nodes. We hypothesised that adequate lymph node examination can more accurately confirm the number of positive lymph nodes, allowing patients to be classified correctly and treated with appropriate strategies, whereas inadequate lymph node examination may compromise patient survival. In addition, correctly classifying patients as lymph node negative or positive would improve the accuracy of staging, thereby improving targeted treatments and adjuvant therapies. 35 Patients whose positive lymph nodes are detected and operated on in time have a lower risk of lymph node metastasis and may have a better prognosis.34,36
The present findings did not prove that there is a causal relationship between lymph node examination and OS, but they do represent strong circumstantial evidence that 12-node examinations are insufficient for RCC patients. Some studies have also found that the immune status of patients may affect the number of lymph nodes examined, because large lymph node excisions may negatively affect the immune status, while increasing the number of lymph nodes examined may be ineffective in improving survival in patients with metastases. 14 This is a limitation of our conclusion, but it can still reasonably be concluded that examining an adequate number of lymph nodes contributes to improved survival rates in RCC patients over a large population.
We analysed survival differences among patients at different AJCC stages and constructed a simple assessment tool to enable efficient clinical prediction and guide overall clinical outcomes. According to the survival analysis results, patients with AJCC stages II and III derived the greatest survival benefit from the 20-node examination. Our prediction model, which included the 20-node measure as an independent predictor, effectively predicted patient survival and had higher net benefits than the AJCC staging system. Another advantage is that the prediction model we developed is based on a large population, and the results are stable and reliable. However, for patients with AJCC stage IV, examining more lymph nodes had less effect on prognosis, meaning that patients with recurrence and distant metastasis benefited less. Although the number of lymph nodes examined in RCC patients is increasing each year, only 38.0% of patients in the study population underwent 20-node examinations, and 61.3% of patients did not receive sufficient lymph node examinations in 2016. These patients are the primary audience for our conclusions and prediction model and should be primary targets for the next phases of RCC treatment and care.
This study utilised the SEER database, the definitive source of cancer statistics in North America, but it also has some limitations. First, this was a retrospective study, and recall bias could not be avoided. Second, since the SEER database does not include detailed surgical information, it was impossible to differentiate between laparotomy and laparoscopic surgery, which may affect the number of lymph nodes examined. Third, since our study subjects were all North Americans, more studies are needed to verify the generalisability of our conclusions.
Conclusions
Our study showed that examining 12 lymph nodes was insufficient for patients with RCC. We determined 20 nodes to be the optimal minimum number for patients with RCC, and an adequate number of lymph node examinations is an important factor in patient prognosis. In addition, the predictive tools we developed will help clinicians make rational clinical decisions, thus benefiting patients.
Footnotes
Acknowledgments
The authors thank SEER program staff for providing open access to the database.
Author Contributions
WW and DL analysed the data and wrote the manuscript. WM, FX, SZ and DH acquired and analysed the data. HY and JL designed the study and participated in data analysis and interpretation. All authors approved the paper.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the National Social Science Foundation of China (No.16BGL183).
Ethics Statement
There are no human subjects in this article and informed consent is not applicable.
