Abstract
Background
The aim of this retrospective study was to construct and clinically apply a nomogram for cancer-specific survival (CSS) in patients diagnosed with base-of-tongue squamous cell carcinoma (BOTSCC) to predict their survival prognosis.
Methods
We identified 8448 patients diagnosed with BOTSCC during 2004–2015 in the Surveillance, Epidemiology, and End Results (SEER) database and randomly assigned 70% of them to a training cohort and 30% to a validation cohort. We used backward stepwise regression in the Cox model to select variables, and multivariate Cox regression was then applied to the selected variables to identify independent predictors. The new survival model was compared with the American Joint Committee on Cancer (AJCC) prognosis model using the following metrics: calibration curve, time-dependent area under the receiver operating characteristic curve (AUC), concordance index (C-index), integrated discrimination improvement (IDI), decision-curve analysis (DCA), and net reclassification improvement (NRI).
Results
A nomogram was established for predicting the CSS probability in patients with BOTSCC. Sex, race, age at diagnosis, marital status, radiotherapy status, chemotherapy status, TNM AJCC stage, surgery status, tumor size, and months from diagnosis to treatment were selected through multivariate Cox regression as independent predictors of CSS. Calibration plots indicated that the model had satisfactory calibration ability. The AUC, C-index, IDI, DCA, and NRI results indicated that the nomogram predicted prognosis more accurately than did the AJCC system alone.
Conclusion
We analyzed the data on patients with BOTSCC in the SEER database and identified the relevant factors affecting their survival. These factors were used to construct a new nomogram that gives clinical staff a more visual prediction model for the 3-, 5-, and 8-year probabilities of CSS for patients newly diagnosed with BOTSCC, thereby aiding clinical decision making.
Keywords
Background
The base of the tongue (BOT) refers to the back one-third of the tongue. The area of the BOT houses the lingual tonsil and extends inferiorly to its termination at the level of the vallecula. Squamous cell carcinoma (SCC) in the BOT is a deeply infiltrative, aggressive tumor with a 75% incidence of lymph node metastases at the time of presentation. 1 One-third of SCCs of the tongue are BOTSCC. When the disease progresses to T4 or spreads to nearby tissues, these tumors can often still be distinguished by their origin at the BOT, and their survival probability differs from that of squamous carcinomas with different origins. 2 In addition, tonsil squamous cell carcinoma (TSCC) has a much lower risk of contralateral nodal spread than BOTSCC, 3 which is why BOTSCC is separated from SCC of adjacent tissues including the tonsil, oropharynx, and retromolar trigone. The present study focused on analyzing BOTSCC alone.
Male sex is a known risk factor for BOTSCC. Most patients with BOTSCC first present at an age of at least 60 years. 4 BOTSCC is primarily managed surgically, 5 with options including electrocautery, transoral laser microsurgery using a microscope and CO2 laser, and transoral robotic surgery. 6 Radiotherapy also plays an important role, mostly in advanced or unresectable disease and in poor surgical candidates. It also plays a role as an adjuvant to surgery at the early stages (T1 or T2) of tongue carcinoma.2,7–13
The TNM cancer staging classification system published by the American Joint Committee on Cancer (AJCC) is important for assessing overall survival, planning treatment, and estimating the recurrence rate. 14 Prognostic nomogram models can additionally provide individualized survival estimates for patients in clinical practice.
We selected patients with BOTSCC from the Surveillance, Epidemiology, and End Results (SEER) database and used a series of methods to construct a nomogram for predicting the probability of cancer-specific survival (CSS) at 3, 5, and 8 years for patients with BOTSCC. A nomogram visually combines the scores of all relevant factors, with the sum of the scores corresponding to the predicted survival rate, thereby helping clinical staff to make treatment decisions and determine prognoses. Since no nomogram has been established to target patients with BOTSCC, the present study was performed to construct an overall nomogram for patients with BOTSCC on the basis of pathological and demographic factors.
Materials and Methods
Data Source
We downloaded the data of individuals with BOTSCC using SEER*Stat software (version 8.4.0.1).15,16 All of the patients with BOTSCC in the SEER database, which was developed by the National Cancer Institute to carry out comprehensive national clinical research, were extracted using the following methods17,18: The principal site of BOTSCC was chosen by applying the ICD-O-3 code “C01.9-base of tongue, NOS.” Age at diagnosis, race, sex, and marital status were chosen as the demographic features. The following variables were measured at diagnosis and included in model development: TNM AJCC stage, radiotherapy status, surgery status, chemotherapy status, tumor size, and months from diagnosis to treatment. We chose the TNM AJCC stage on the basis of the sixth edition of the Derived AJCC Stage Group. We classified radiotherapy status into “Yes” (meaning one of five situations occurred: radiation, beam radiation, NOS approach or source unknown, combination of beam and isotope or implant, and radioactive implant) and “No” (referring to one of four scenarios: none/unknown, recommended, refused, and unknown). We divided the chemotherapy and surgery statuses into “Yes” and “No/Unknown.” Tumor size was categorized into the following 3 groups: <2, 2–4, and >4 cm. The endpoint of the present study was death from BOTSCC.
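The recoding of tumor size and treatment status described above can be illustrated with a minimal Python sketch. The function names, the raw "yes" label, and the choice to place exactly 2 cm and 4 cm in the middle group are assumptions made for illustration; the text does not state how boundary values were handled, and the analyses in this study were run in R.

```python
def tumor_size_group(size_cm: float) -> str:
    """Map a tumor size in centimetres to one of the three groups in the text.

    Assumption: sizes of exactly 2 cm and 4 cm fall in the "2-4 cm" group.
    """
    if size_cm < 0:
        raise ValueError("tumor size must be non-negative")
    if size_cm < 2:
        return "<2 cm"
    if size_cm <= 4:
        return "2-4 cm"
    return ">4 cm"


def treatment_status(raw: str) -> str:
    """Collapse a raw chemotherapy or surgery field into 'Yes' or 'No/Unknown'.

    Assumption: the raw field contains the literal string 'yes' (any case)
    when treatment was given; every other value is treated as No/Unknown.
    """
    return "Yes" if raw.strip().lower() == "yes" else "No/Unknown"
```

Collapsing "No" and "Unknown" keeps the model simple but hides any difference between untreated patients and patients with missing records.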
Data Selection Criteria
This study initially extracted 10 934 patients diagnosed with BOTSCC during 2004–2015 from the SEER database. We used the International Classification of Diseases for Oncology, third edition (ICD-O-3) histology code 8070/3 (SCC, NOS) and the primary site code C01.9 (base of tongue, NOS) as the main inclusion criteria. After applying the exclusion criteria of (1) unknown or not-applicable tumor size, (2) AJCC stage MX, TX, T0, or NX, (3) age older than 100 years, or (4) no record of months from diagnosis to treatment, 8448 patients with BOTSCC were finally selected.
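The four exclusion criteria can be expressed as a simple record filter. The sketch below is illustrative only: the field names (`tumor_size_cm`, `t_stage`, and so on) are hypothetical and do not correspond to actual SEER*Stat variable names, and missing values are assumed to be encoded as `None`.

```python
from typing import Optional, TypedDict


class Record(TypedDict):
    # Hypothetical field names, not actual SEER*Stat variables.
    tumor_size_cm: Optional[float]       # None = unknown or not applicable
    t_stage: str
    n_stage: str
    m_stage: str
    age: int
    months_to_treatment: Optional[int]   # None = not recorded


# Stage values listed as exclusion criteria in the text.
EXCLUDED_STAGES = {"MX", "TX", "T0", "NX"}


def is_eligible(rec: Record) -> bool:
    """Return True if the record passes all four exclusion criteria."""
    if rec["tumor_size_cm"] is None:                       # criterion 1
        return False
    if {rec["t_stage"], rec["n_stage"], rec["m_stage"]} & EXCLUDED_STAGES:
        return False                                       # criterion 2
    if rec["age"] > 100:                                   # criterion 3
        return False
    if rec["months_to_treatment"] is None:                 # criterion 4
        return False
    return True
```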
Statistical Analysis
All of the qualifying BOTSCC cases were randomly divided into the training (n = 5913, 70% of all cases) and validation (n = 2535, 30%) cohorts for nomogram development and validation. Figure 1 details the data filtering and sorting procedures.
Figure 1. Data selection flowchart.
Age at diagnosis and months from diagnosis to treatment were not normally distributed and were therefore represented as medians and interquartile ranges. The nomogram was constructed as follows: First, we conducted univariable analyses with CSS at 3, 5, and 8 years as the principal endpoints. Second, factors significantly associated with survival in the univariable analyses (P < .05) were entered into the multivariable Cox regression analysis. A final multivariable model was derived through backward model selection. A nomogram was constructed based on the results of the multivariable analyses.
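A random 70/30 split of this kind can be sketched in a few lines of Python. The seed value and the function name are arbitrary choices for illustration, and the actual assignment in this study was performed in R, so the resulting membership would differ; only the cohort sizes are reproduced.

```python
import random


def split_cohorts(n_cases: int, train_fraction: float = 0.7, seed: int = 2023):
    """Randomly split case indices into training and validation cohorts.

    The seed is an arbitrary illustrative value that makes the split repeatable.
    """
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    n_train = int(n_cases * train_fraction)  # rounds down to a whole case
    return indices[:n_train], indices[n_train:]


train_idx, valid_idx = split_cohorts(8448)
print(len(train_idx), len(valid_idx))  # 5913 2535
```

Fixing the seed means the same patients land in the same cohort on every run, which keeps the validation results reproducible.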
We validated the nomogram by applying calibration curves, concordance index (C-index), time-dependent area under the receiver operating characteristic (ROC) curve (AUC), and decision-curve analysis (DCA). The AUC and C-index values range from .5 to 1.0, with .5 denoting random chance and 1.0 denoting perfect discrimination. Calibration was assessed using calibration plots based on 500 bootstrap resamples. DCA was developed to assess the clinical value of a predictive model in informing decision making within the healthcare setting. By weighing the potential benefits against the harms, DCA helps determine whether using a prediction model would yield a net positive outcome; papers lacking DCA cannot adequately address questions about the clinical value of a predictive model. 19 To make the comparisons more comprehensive and accurate, we also employed 2 relatively new metrics, namely, net reclassification improvement (NRI) and integrated discrimination improvement (IDI).
R software (version 8.4.0.1; http://www.r-project.org) was used for all of the statistical analyses, and a 2-sided P < .05 was considered significant.
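To make the C-index concrete, the following is a minimal Python sketch of a Harrell-type concordance calculation for right-censored data. It is a simplified illustration rather than the routine used in this study: pairs with tied survival times are skipped, the function name is hypothetical, and the quadratic loop is only suitable for small examples.

```python
def concordance_index(times, events, risks):
    """Harrell-type C-index for right-censored survival data.

    times  : follow-up time for each patient
    events : 1 if the event (cancer-specific death) was observed, 0 if censored
    risks  : predicted risk score (higher = worse predicted prognosis)

    A pair is comparable when the patient with the strictly shorter time had
    an observed event. Tied risk scores count as half-concordant. Pairs with
    tied times are skipped (a simplifying assumption).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A model whose risk scores order patients exactly by survival time yields 1.0, while a model that assigns every patient the same score yields .5.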
Ethical Review
SEER data are freely available for research once the data use agreement has been signed. Since the SEER database does not contain any data that can identify specific patients, informed consent did not need to be obtained from patients before using their data.
Results
Characterization of Included Cases
Baseline of Clinicopathological and Therapeutic Factors in Training and Validation Cohorts.
Screening of Prognostic Factors for CSS and Nomogram Establishment
Multivariable Cox Regression Analysis.
Figure 2 displays the nomogram we developed based on the results of the multivariable Cox regression analysis for predicting CSS probability in patients with BOTSCC at 3, 5, and 8 years. The nomogram indicates that the M stage made the greatest prognostic contribution to CSS probability in BOTSCC, followed by age at diagnosis, sex, race, TNM AJCC stage, marital status, radiotherapy status, surgery status, chemotherapy status, tumor size, and months from diagnosis to treatment. Clinicians can predict CSS probability at 3, 5, and 8 years by scoring the prognostic factors for each individual patient.
Figure 2. Nomogram for predicting CSS at 3, 5, and 8 years in patients with BOTSCC. Mari, marital status; Surg, surgery status; Rad, radiotherapy status; Chemo, chemotherapy status; Monthtt, months from diagnosis to treatment.
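The scoring logic behind a Cox-based nomogram can be sketched as follows. Each factor level is assigned points proportional to its Cox coefficient, scaled so that the factor with the widest coefficient range spans 0 to 100 points, and the predicted survival follows from the linear predictor as S(t) = S0(t)^exp(lp). All numbers below are hypothetical placeholders except the female-sex hazard ratio (1.264) reported in this article; the sketch includes only three factors and does not reproduce the fitted model or Figure 2.

```python
import math

# Hypothetical log hazard ratios (Cox coefficients), for illustration only.
# Only the female-sex hazard ratio of 1.264 comes from the article.
COEFFICIENTS = {
    "sex": {"male": 0.0, "female": math.log(1.264)},
    "tumor_size": {"<2 cm": 0.0, "2-4 cm": 0.30, ">4 cm": 0.60},  # hypothetical
    "surgery": {"yes": 0.0, "no/unknown": 0.45},                  # hypothetical
}

# Hypothetical baseline CSS probabilities S0(t) for a patient at all
# reference levels, indexed by years after diagnosis.
BASELINE_SURVIVAL = {3: 0.85, 5: 0.80, 8: 0.75}


def points_per_unit() -> float:
    """Points per unit of linear predictor; the widest factor spans 100 points."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in COEFFICIENTS.values())
    return 100.0 / widest


def total_points(patient: dict) -> float:
    """Sum the nomogram points over all factors for one patient."""
    scale = points_per_unit()
    return sum(
        (COEFFICIENTS[f][patient[f]] - min(COEFFICIENTS[f].values())) * scale
        for f in COEFFICIENTS
    )


def css_probability(patient: dict, years: int) -> float:
    """Predicted CSS probability: S(t) = S0(t) ** exp(linear predictor)."""
    lp = sum(COEFFICIENTS[f][patient[f]] for f in COEFFICIENTS)
    return BASELINE_SURVIVAL[years] ** math.exp(lp)
```

Reading a printed nomogram is the graphical equivalent of these two functions: the clinician sums the points for each factor and then reads the survival probability off the total-points axis.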
Nomogram Comparison and Evaluation
After constructing the nomogram, we validated it using a series of metrics in both the validation and training cohorts. C-index is a dependable metric for assessing the discrimination of a nomogram. The C-indexes for the AJCC staging system were .535 and .526 in the validation and training cohorts, respectively, while those for the nomogram were considerably higher at .712 and .695. We calculated ROC curves to assess the discriminative ability of the nomogram for CSS predictions at 3, 5, and 8 years. In the training cohort, the AUCs for the nomogram were .708, .699, and .702, respectively, and those for the AJCC staging system were .533, .534, and .528. In the validation cohort, the corresponding AUCs were .739, .740, and .733 for the nomogram and .545, .541, and .545 for the AJCC staging system (Figure 3).
Figure 3. ROC curves of the AJCC stage and the nomogram for predicting 3-year (A and D), 5-year (B and E), and 8-year (C and F) CSS probability. (A–C) show the training cohort and (D–F) show the validation cohort.
In the training cohort, the NRI values for the probabilities of 3-, 5-, and 8-year CSS were .620 (95% confidence interval [CI] = .561–.673), .587 (95% CI = .536–.643), and .585 (95% CI = .549–.651), respectively. In the validation cohort, the corresponding NRI values were .669 (95% CI = .603–.742), .721 (95% CI = .625–.796), and .679 (95% CI = .587–.765), respectively. The corresponding IDI values for 3-, 5-, and 8-year CSS were .110, .121, and .129 in the training cohort and .112, .127, and .136 in the validation cohort (P < .001). Because all of the IDI and NRI values were above 0, the predictive performance of our new model was better than that of the AJCC model.
Calibration plots for CSS probabilities at 3, 5, and 8 years revealed that the predicted line was very close to the 45-degree line, suggesting good consistency between nomogram predictions and real observations in both cohorts (Figure 4).
Figure 4. Calibration plots for predicting CSS at 3, 5, and 8 years. (A–C) denote the training cohort and (D–F) denote the validation cohort.
Using DCA to assess clinical usefulness revealed that the novel nomogram model produced a greater net benefit for CSS probabilities at 3, 5, and 8 years than did the conventional AJCC staging system, demonstrating the high clinical utility of the CSS nomogram (Figure 5).
Figure 5. Decision curve analysis (DCA). (A–C) DCA of the 3-, 5-, and 8-year CSS nomogram in the training cohort; (D–F) DCA of the 3-, 5-, and 8-year CSS nomogram in the validation cohort.
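The definitions behind IDI and NRI can be illustrated with a simplified Python sketch for a binary outcome at a fixed time point. This version ignores censoring and uses the category-free (continuous) form of NRI, so it is an assumption-laden illustration of the concepts and not the survival-adapted estimator, bootstrap CI, or software used in this study; the function names are hypothetical.

```python
from statistics import mean


def idi(events, p_old, p_new):
    """Integrated discrimination improvement for a binary outcome.

    IDI = (mean new risk in events - mean new risk in non-events)
        - (mean old risk in events - mean old risk in non-events)
    """
    ev = [i for i, e in enumerate(events) if e == 1]
    ne = [i for i, e in enumerate(events) if e == 0]
    slope_new = mean([p_new[i] for i in ev]) - mean([p_new[i] for i in ne])
    slope_old = mean([p_old[i] for i in ev]) - mean([p_old[i] for i in ne])
    return slope_new - slope_old


def continuous_nri(events, p_old, p_new):
    """Category-free NRI: net upward movement of predicted risk in events
    plus net downward movement in non-events."""
    def net_up(idx):
        up = sum(1 for i in idx if p_new[i] > p_old[i])
        down = sum(1 for i in idx if p_new[i] < p_old[i])
        return (up - down) / len(idx)

    ev = [i for i, e in enumerate(events) if e == 1]
    ne = [i for i, e in enumerate(events) if e == 0]
    return net_up(ev) - net_up(ne)
```

Both quantities are 0 when the new model offers no improvement over the old one, and positive values favor the new model.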
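The net benefit plotted in a decision curve can be computed as NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability at which a clinician would act. The Python sketch below shows this for a binary outcome and for the "treat all" reference strategy. It ignores censoring, which a survival-based DCA would need to handle, and the function names are hypothetical.

```python
def net_benefit(events, predicted_risk, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(events)
    true_pos = sum(1 for e, p in zip(events, predicted_risk) if p >= threshold and e == 1)
    false_pos = sum(1 for e, p in zip(events, predicted_risk) if p >= threshold and e == 0)
    # False positives are down-weighted by the odds of the threshold.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(events, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(events) / len(events)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

The "treat none" strategy has a net benefit of 0 by definition, so a model is clinically useful at a given threshold only if its net benefit exceeds both 0 and the treat-all value.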
Discussion
The BOT and the tonsil have often been combined in research because they are located close to each other posterior to the circumvallate papillae. 20 Due to the survival differences between cancer of the BOT and other sites, such as tonsil, retromolar trigone, and oropharynx, it is meaningful to conduct a separate analysis specifically for BOTSCC2,9,21–23; a study based on the statistics derived from the National Cancer Database also supported this view. 20 A well-supported and comprehensive study on BOTSCC prognoses has been lacking.
A specific nomogram for BOTSCC appears to be needed because staging according to tumor pathology alone, that is, AJCC staging, is inadequate for predicting prognosis. This study established a nomogram by extracting sufficient samples from the large-sample SEER database to address that limitation. We evaluated the model in a separate validation cohort to obtain a more accurate assessment of its performance, and the nomogram format makes the model more intuitive and easier for clinicians to apply. The influencing factors obtained from the multivariate Cox regression, namely age at diagnosis, sex, marital status, race, TNM AJCC stage, chemotherapy status, radiotherapy status, surgery status, tumor size, and time from diagnosis to treatment, also represent reference information for use in research into BOTSCC.
Our study indicated that sex, a previously identified risk factor for BOTSCC, was a prognostic element in the novel model13,24; notably, this study identified female sex as a risk factor for poorer survival (HR = 1.264, P < .001). This outcome is consistent with the increasing smoking trend among females. Our study also found for the first time that the months from diagnosis to treatment is a significant predictor affecting BOTSCC prognosis. This result can allow clinical staff to provide more information to their patients about the changes in the prognosis of the disease and its severity, improve the efficiency of doctor–patient communication, and indirectly improve the compliance of the patients. Figure 2 illustrates all of the associated factors and the degree to which they influence the probabilities of CSS at 3, 5, and 8 years in patients with BOTSCC, as derived from our Cox regression model.
We compared the nomogram model with the AJCC model by applying the AUC, C-index, IDI, and NRI values of both cohorts. 10 Both the AUC and C-index of the nomogram were superior to those of the AJCC staging system, illustrating that our new model provided better discrimination than the AJCC staging system alone in both cohorts (Figure 3). The NRI results indicated that the nomogram model was better at reclassifying the risk probability, with NRIs for 3-, 5-, and 8-year survival probabilities of 62.0%, 58.7%, and 58.5%, respectively, in the training cohort and 66.9%, 72.1%, and 67.9% in the validation cohort (P < .001). IDI was used to compare the predictive abilities of the nomogram and the AJCC model. The results indicated that our new model provided 11.0%, 12.1%, 12.9%, 11.3%, 12.7%, and 13.6% (P < .001) improvements in those survival probabilities, respectively.
The calibration curves were used to assess the consistency of the models (Figure 4). The 45-degree line is the criterion line, and the predicted values (solid lines) in the calibration plots closely matched the actual values (dotted lines) in both cohorts. This implies that the model we constructed can predict the probabilities of CSS at 3, 5, and 8 years with good consistency, which supports the reliability of the nomogram and its use in clinical applications.
DCA is a methodology for examining the clinical utility of a nomogram. Figure 5 displays the DCA curves associated with the nomogram in the training (Figure 5(A)–(C)) and validation (Figure 5(D)–(F)) cohorts, which indicated that the novel nomogram model has a positive net benefit.
However, our research had some unavoidable limitations. First, the study had a retrospective design. Second, the factors in this study were very broadly classified, such that the results could not distinguish the effects of the various treatment options; for instance, “no” and “unknown” were combined into one class, which is not ideal, and regarding surgery status, the different surgical sites and their corresponding surgical approaches can cause different outcomes25,26 but were all integrated into the “yes” category. Third, several potentially significant factors were not included in this study, such as the presence of HPV infection3,27–29 or lifestyle factors such as smoking history,30,31 making our new model insufficiently comprehensive.
Conclusion
The present multivariate Cox regression analysis found that sex, age at diagnosis, marital status, race, chemotherapy status, TNM AJCC stage, radiotherapy status, surgery status, tumor size, and months from diagnosis to treatment were independent predictive factors for CSS in individuals with BOTSCC. We constructed a nomogram for CSS probability in patients with BOTSCC at 3, 5, and 8 years. Validation showed that the nomogram had good discriminatory power and accurate calibration.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Guangdong Provincial Key Laboratory of Traditional Chinese Medicine Informatization (2021B1212040007).
