Abstract
Purpose:
To develop a nomogram to predict overall survival (OS) in patients aged 18 to 59 years with nasopharyngeal carcinoma (NPC) and to assess its clinical value.
Methods:
In total, 1334 records of NPC patients from 2010 to 2015 were retrieved from the Surveillance, Epidemiology, and End Results database. Univariate and multivariate Cox analyses were used to identify independent risk factors affecting OS. A nomogram based on the Cox model was constructed to predict OS for patients with NPC at 3, 5, and 8 years. Nomogram performance was validated using the concordance index (C-index), receiver operating characteristic curve, calibration curve, and decision curve analysis (DCA).
Results:
Age, sex, race, marital status, histological type, tumor size, American Joint Committee on Cancer (AJCC) stage, and radiotherapy were independent risk factors. The C-index of the nomogram was 0.69 [95% confidence interval (CI): 0.68-0.71] for the training set, and the C-index of the AJCC stage was 0.63 (95% CI: 0.62-0.65), both statistically significant (P < .01). The area under the curve for the nomogram at 3, 5, and 8 years (0.755, 0.729, and 0.729, respectively) was higher than that of the AJCC stage (0.667, 0.646, and 0.646, respectively), indicating better predictive accuracy. The calibration curves revealed a high degree of agreement between observed and predicted survival. DCA showed that the nomogram had better clinical utility than the AJCC stage.
Conclusion:
The nomogram is a novel predictor of survival for patients aged 18 to 59 years with nasopharyngeal carcinoma.
Introduction
Nasopharyngeal carcinoma (NPC) is a rare epithelial cancer arising from the mucosal lining of the nasopharynx, with a unique epidemiological and geographical distribution. 1 The age of onset of NPC varies across regions. In areas with low incidence rates, the occurrence of NPC typically increases with age. The age range of 15 to 24 years represents the first peak period of NPC incidence. 2 Between the ages of 25 and 39 years, there is a notable rise in incidence and mortality among NPC patients. 3 Conversely, in regions where NPC is endemic, the highest incidence is observed in the age group of 50 to 60 years. 4
Despite these variations, there is limited in-depth research on young patients with nasopharyngeal cancer, and clear clinical guidelines for this specific population are lacking. Similarly, individual prognostic factors for patients aged 18 to 59 with NPC have not been adequately studied.
Predictive modeling of risk factors for nasopharyngeal cancer plays a crucial role in facilitating early diagnosis and optimizing treatment options. However, existing predictive models primarily focus on the general population 5 or the elderly population, 6 with no specific models developed for individuals aged 18 to 59. The development of a predictive model tailored to this age group would not only identify key prognostic factors but also enable individualized prognosis prediction for each patient.
It is worth noting that NPC accounts for a significant proportion (44.4%) of head and neck squamous cell carcinoma in patients under 30 years of age. 7 Although the prognosis for this disease is generally better in the young population, these patients often present at an advanced stage with a higher incidence of complications, significantly impacting their quality of life. A study conducted by Zhu et al 3 further highlighted that young patients tend to have more advanced-stage disease than older individuals, emphasizing the importance of developing models and conducting research focused on this population.
In this study, we used the widely recognized Surveillance, Epidemiology, and End Results (SEER) database to develop a predictive model specifically for individuals aged 18 to 59 years. The model aims to enable early identification of factors associated with a poor prognosis and to provide accurate predictions to guide clinical decision-making.
Materials and Methods
The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
Demographic, Clinical Characteristics
This retrospective study is based on the SEER database (http://seer.cancer.gov/), which is one of the most representative large oncology registry databases in North America. SEER is a national cancer database in the United States that includes 97% of tumor categories and covers approximately 30% of the total US population. 8 The database has been rigorously studied and validated, so it is considered the gold standard for cancer registration (evidence level: III, analytical review). The clinical information for all cancer patients is publicly available from the SEER database. The privacy of patients is fully protected, and their personal information cannot be identified. Full data access and usage have been authorized by the SEER registry (authorization number: 11227-Nov2021).
Using SEER*Stat software (version 8.4.0; National Cancer Institute, Bethesda, MD, USA), patients with NPC registered between 2010 and 2015 were carefully reviewed [site recode International Classification of Disease for Oncology, 3rd Edition (ICD-O-3)/World Health Organization (WHO) 2008 of “nasopharynx” and behavior recode ICD-O-3 of “malignant”]. Demographic, clinicopathological, and treatment variables including age, sex, race, marital status, histological type, tumor size, tumor stage (T stage), node stage (N stage), metastatic stage (M stage), AJCC stage (TNM stage), radiotherapy, and chemotherapy were retrieved from the SEER database. The primary endpoint was OS, which was defined as the time interval from diagnosis of NPC to death from any cause or the last follow-up.
Predictor Selection by the LASSO
According to the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis guideline, 9 all patients included in the study were randomly assigned in a 7:3 ratio to the training set and the validation set. The goal was to identify potential risk factors for OS. In the training set, a least absolute shrinkage and selection operator (LASSO) regression analysis was performed, including all predictors. This analysis identified 12 variables with non-zero coefficients at the minimum lambda value. It is important to note that the AJCC staging system is derived from the TNM staging system, which includes tumor stage, node stage, and metastasis stage. To avoid issues related to multicollinearity, only the AJCC stage was included as a predictor in this study.
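The 7:3 random assignment described above can be illustrated with a minimal sketch. It is not the procedure used in this study, which relied on the caret package in R; the function name `split_train_validation` and the seed value are assumptions chosen for illustration.

```python
import random

def split_train_validation(patient_ids, train_fraction=0.7, seed=2021):
    """Randomly partition patient identifiers into training and validation sets."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

# 1334 is the cohort size reported in this study; the identifiers are placeholders.
train_ids, validation_ids = split_train_validation(range(1334))
print(len(train_ids), len(validation_ids))
```

A plain shuffle of 1334 records gives 934 and 400 patients, close to but not identical to the 936 and 398 reported in the Results. Partitioning routines that sample within outcome strata may shift the counts slightly.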
Construction and Validation of the Nomogram
Firstly, univariate Cox regression analyses were conducted on the LASSO candidate predictors using the training set. This analysis aimed to identify statistically significant candidate predictors (P < .05). Secondly, the candidate predictors that showed statistical significance in the univariate analysis were included in the multivariate Cox regression analysis. This multivariate analysis aimed to further identify the independent risk factors for overall survival (OS) in patients. Based on the results of the multivariate Cox regression analysis, a nomogram was constructed to predict OS at 3-, 5-, and 8-year intervals for patients with NPC.
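To make the link between a fitted Cox model and a nomogram concrete, the sketch below converts log hazard ratios into a 0 to 100 point scale and applies the Cox relation S(t | x) = S0(t)^exp(linear predictor). All coefficients, the predictor subset, and the baseline survival value are hypothetical and are not the values fitted in this study.

```python
import math

# Hypothetical log hazard ratios (NOT the fitted values from this study).
# The reference level of each predictor has coefficient 0.
COEFFICIENTS = {
    "ajcc_stage": {"I": 0.0, "II": 0.3, "III": 0.6, "IV": 1.2},
    "radiotherapy": {"yes": 0.0, "no": 0.9},
    "sex": {"female": 0.0, "male": 0.3},
}

def points_table(coefficients):
    """Scale each level's contribution so the widest predictor spans 0-100 points."""
    ranges = {name: max(levels.values()) - min(levels.values())
              for name, levels in coefficients.items()}
    widest = max(ranges.values())
    return {
        name: {level: 100.0 * (beta - min(levels.values())) / widest
               for level, beta in levels.items()}
        for name, levels in coefficients.items()
    }

def linear_predictor(patient, coefficients):
    """Sum the log hazard ratios of the patient's levels."""
    return sum(coefficients[name][level] for name, level in patient.items())

def survival_probability(patient, coefficients, baseline_survival):
    """Cox model: S(t | x) = S0(t) ** exp(linear predictor)."""
    return baseline_survival ** math.exp(linear_predictor(patient, coefficients))

patient = {"ajcc_stage": "IV", "radiotherapy": "no", "sex": "male"}
table = points_table(COEFFICIENTS)
total_points = sum(table[name][level] for name, level in patient.items())
# 0.90 is an assumed survival probability at the chosen horizon
# for a patient at every reference level.
predicted = survival_probability(patient, COEFFICIENTS, baseline_survival=0.90)
print(round(total_points, 1), round(predicted, 3))
```

Reading a printed nomogram performs the same two steps by hand: points are summed across predictors, and the total is mapped to a survival probability on the bottom axes.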
The discrimination and calibration of the prediction model were evaluated using metrics such as the concordance index (C-index), the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, and the calibration curve. A bootstrapping method with 1000 resamples was employed to reduce the risk of overfitting. The clinical utility of the nomogram was assessed through decision curve analysis (DCA).
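As an illustration of the discrimination metric, the following sketch computes Harrell's C-index for right-censored data on a small invented data set. It is a simplified pairwise implementation that assumes no tied follow-up times, and it omits the bootstrap resampling step; the toy values are not study data.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair is comparable when the patient with the shorter follow-up time
    had an event. It is concordant when that patient also has the higher
    predicted risk. Ties in risk count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy data: follow-up in months, event indicator (1 = death, 0 = censored),
# and a predicted risk score where higher means worse prognosis.
times = [5, 10, 15, 20, 25]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.6, 0.3, 0.1]
c_index = concordance_index(times, events, risk_scores)
print(c_index)  # 7 of 8 comparable pairs are concordant -> 0.875
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of patients by risk.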
Statistical Analysis
Categorical variables were presented as numbers and percentages for each group. The clinical characteristics were evaluated using the Chi-square test. Cox proportional hazard regression was conducted for both univariate and multivariable analyses to assess the independent variables influencing OS, yielding hazard ratios (HRs) and 95% confidence intervals (CIs).
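The relation between a Cox regression coefficient and the reported HR with its 95% CI can be sketched as follows, assuming the usual Wald interval on the log scale. The coefficient and standard error are invented for illustration and do not come from Table 2.

```python
import math

def hazard_ratio_with_ci(beta, standard_error, z=1.96):
    """Return (HR, lower, upper) from a Cox coefficient and its standard error."""
    hr = math.exp(beta)
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return hr, lower, upper

# Hypothetical coefficient and standard error
hr, lower, upper = hazard_ratio_with_ci(beta=0.405, standard_error=0.15)
print(f"HR {hr:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to a two-sided P-value below .05 for that coefficient under the same Wald approximation.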
The data were randomly grouped using the caret package. LASSO analysis was performed using the glmnet package. Based on the results of the multivariate analysis, a nomogram was developed using the rms package. The survivalROC package was used to draw the ROC curves. All statistical analyses were conducted using R software version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria). Two-tailed tests were used to calculate the P-values, and a P-value <.05 was considered statistically significant.
Results
Demographic, Clinical Characteristics
Initially, a total of 2659 patients diagnosed with primary NPC were identified from the SEER database. After undergoing strict screening procedures, 1473 patients were excluded, as illustrated in Figure 1. Ultimately, 1334 patients diagnosed with NPC between 2010 and 2015 from the SEER database were included in our study, with 936 patients allocated to the training set of the predictive model and 398 patients assigned to the validation set. The baseline characteristics of the patients are summarized in Table 1. Of all patients included, 479 (35.9%) had died by the end of the follow-up period, while 855 (64.1%) were alive. The age of the entire cohort ranged from 18 to 59 years, with a median age of 38 to 39 years.

Study design flow chart.
Demographic and clinical characteristics of the training set and the validation set.
Abbreviations: DNKC, differentiated non-keratinizing carcinoma; KSCC, keratinizing squamous cell carcinoma; UNKC, undifferentiated non-keratinizing carcinoma.
Candidate of Model Predictor
To mitigate the risk of overfitting and enhance model simplicity, LASSO regression analysis was employed, which applies a penalty to the absolute values of the coefficients (Figure 2).
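The effect of penalizing the absolute values of the coefficients can be illustrated with the soft-thresholding operator, which is the shrinkage step at the core of coordinate-descent LASSO solvers. This is only a sketch of the principle under simplifying assumptions, not the penalized Cox fit performed with glmnet; the variable names and coefficient values are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink a coefficient toward zero by `penalty`; small ones become exactly zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

# Hypothetical unpenalized coefficients, for illustration only
unpenalized = {"stage": 0.80, "tumor_size": 0.35, "noise_variable": 0.05}
for penalty in (0.0, 0.1, 0.5):
    shrunk = {name: round(soft_threshold(beta, penalty), 2)
              for name, beta in unpenalized.items()}
    print(penalty, shrunk)
```

As the penalty grows, weak predictors are removed first, which is why the number of non-zero coefficients falls along the lambda path.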

After initial screening by univariate analysis, OS variables were selected using the LASSO method. (A) Trend lines of model coefficients for the 12 variables of OS. (B) Tuning parameter λ (lambda) in the LASSO model. OS, overall survival; LASSO, least absolute shrinkage and selection operator.
Construction and Validation of the Nomogram
The univariate Cox regression analysis showed that age, sex, race, marital status, tumor size, radiotherapy, chemotherapy, histological type, and AJCC stage were significantly associated with OS. Moreover, the multivariate Cox regression analysis confirmed that age, sex, race, marital status, tumor size, radiotherapy, histological type, and AJCC stage were independent risk factors for OS. The results of the univariate and multivariate Cox analyses for OS are presented in Table 2. Based on the findings of the multivariate Cox regression analysis, we constructed a nomogram to predict OS at 3-, 5-, and 8-year intervals (Figure 3A).
Univariate and multivariate Cox regression analysis of OS in patients with NPC.
Abbreviations: CI, confidence interval; HR, hazard ratio; NPC, nasopharyngeal carcinoma; OS, overall survival.

Nomogram for young patients with nasopharyngeal carcinoma and risk stratification for 3-, 5-, and 8-year overall survival.
To assess the accuracy of the risk prediction model, several methods were used to validate it in both the training and validation sets. The C-index of the nomogram was 0.69 (95% CI: 0.68-0.71) for the training set, while the C-index of the AJCC stage was 0.63 (95% CI: 0.62-0.65). Both values were statistically significant (P < .01), and the higher C-index indicates that the nomogram distinguishes high-risk from low-risk patients better than the AJCC stage alone. Figure 4 depicts the ROC curves for 3-, 5-, and 8-year OS predictions, comparing the nomogram with the AJCC stage. The corresponding AUC values for the nomogram were 0.755, 0.729, and 0.729, while for the AJCC stage, they were 0.667, 0.646, and 0.646, respectively. These results show that the nomogram discriminates survival outcomes better than the AJCC stage at every time point. The calibration curves for 3-, 5-, and 8-year OS are displayed in Figure 5 and show a high level of agreement between the survival probabilities predicted by the nomogram and the observed values, indicating good calibration. Furthermore, Figure 6 shows the DCA, in which the nomogram yields a favorable net benefit across threshold probabilities in both the training and validation sets. This suggests that the nomogram has good clinical utility and can assist in making informed decisions.
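The net benefit plotted in a decision curve can be sketched as follows. The example assumes that every patient's outcome at the chosen time point is known (no censoring), which is a simplification; decision curves for survival data need an adjustment for censored follow-up. The risks and outcomes are invented.

```python
def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk is at least `threshold`."""
    n = len(outcomes)
    true_positives = sum(1 for risk, died in zip(predicted_risks, outcomes)
                         if risk >= threshold and died == 1)
    false_positives = sum(1 for risk, died in zip(predicted_risks, outcomes)
                          if risk >= threshold and died == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_positives / n - (false_positives / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Toy data: predicted risk of death and observed outcome (1 = died, 0 = alive)
predicted_risks = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
outcomes = [1, 1, 0, 1, 0, 0, 0, 1]
model_nb = net_benefit(predicted_risks, outcomes, threshold=0.5)
treat_all_nb = net_benefit_treat_all(outcomes, threshold=0.5)
print(model_nb, treat_all_nb)  # 0.25 versus 0.0 at a 50% threshold
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.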

The AUC of the ROC curve used to compare the predictive nomogram and AJCC stage at 3 (A), 5 (B), and 8 years (C). AUC, area under the curve; ROC, receiver operating characteristic.

The calibration curve for predicting overall survival. (A) The model training set at 3, 5, and 8 years. (B) The model validation set at 3, 5, and 8 years.

The decision curve analyses used to compare the predictive nomogram and AJCC stage at 3 (A), 5 (B), and 8 years (C).
Discussion
To the best of our knowledge, this study is the first to develop a nomogram specifically tailored to evaluate OS in patients aged 18 to 59 years with NPC. Our approach involved the identification of prognostic factors and a comprehensive evaluation of the nomogram’s predictive accuracy and validity. This rigorous evaluation was conducted to prevent overfitting of the prediction model and ensure its practicality in clinical settings. Additionally, we employed DCA to assess the clinical utility of the nomogram, further enhancing its applicability in guiding clinical decision-making.
Nomograms, as predictive tools, offer several advantages in the field of prognostic assessment. They provide a visual representation of various prognostic risk factors in a graphical format, enabling quick interpretation and ease of use. Due to their accuracy and readability, nomograms have become increasingly popular in predictive models. While the AJCC stage is a widely used tool for evaluating tumor severity, it primarily relies on factors such as tumor size, lymph node involvement, and distant metastasis. Our research therefore aimed to develop a predictive model that incorporates a broader range of factors to provide more personalized and accurate predictions. By considering additional factors, our nomogram can offer a more comprehensive assessment of an individual’s probability of survival with NPC. This approach goes beyond relying solely on the AJCC stage, providing a more nuanced and individualized prediction. The intuitive and user-friendly nature of nomograms has contributed to their widespread adoption in medical research and clinical practice. They serve as valuable tools in facilitating prognostic assessments and guiding personalized treatment decisions. By incorporating multiple prognostic factors into a single visual representation, nomograms enhance the understanding and interpretation of complex predictive models.
Constructing a prognostic model specifically for NPC patients aged 18 to 59 is motivated by two main reasons. Firstly, NPC is known to be a prevalent tumor in young patients, often characterized by early metastasis and a higher likelihood of recurrence. However, there is a lack of effective tools to identify factors that influence the prognosis of young NPC patients. Therefore, developing a specific and comprehensive nomogram tailored for young NPC patients would hold significant clinical value, providing insights into prognosis and aiding in treatment decision-making for this specific population. Secondly, older NPC patients have been observed to have a higher mortality rate, which introduces potential confounding bias when assessing overall prognostic outcomes. These observations have been documented in earlier studies, such as the 1984 study by Morales et al, 10 which reported a 5-year actuarial survival of 14% in NPC patients under 30 years old in Puerto Rico. Additionally, a 2020 study by Yeh et al 11 found that pretreatment age was an independent predictor for NPC with synchronous second primary cancer, emphasizing the importance of early detection in this population. However, most studies in the field tend to focus on elderly NPC patients. 6 Therefore, it is crucial to explore the relationship between NPC and the young population to gain a better understanding of the disease in this specific age group. By specifically targeting young NPC patients and investigating their prognostic factors, this study aims to fill the gap in knowledge and provide valuable insights into the relationship between NPC and the young population. This information can contribute to improved risk assessment, tailored treatment strategies, and ultimately better outcomes for young patients with NPC.
Our research indicates that age is an independent risk factor for OS in young NPC patients. Gundog et al 12 also found that NPC patients under 45 years old had better OS compared to those aged 46 to 60 years old after treatment. However, it is important to acknowledge that despite the improved clinical prognosis, the quality of life of long-term survivors can be significantly affected by radiotherapy-related damage. 13
Chen et al 14 made an important discovery indicating that 5-year OS significantly decreases in NPC patients with tumor volumes ≥50 ml. Furthermore, other researchers have also identified total tumor volume as an independent prognostic factor in NPC OS. 15 However, despite the significance of these findings, they have not yet gained widespread attention. 16 Prior to this study, few researchers had considered the impact of tumor size on the prognosis of young patients with NPC. In contrast, studies in breast cancer 17 and lung cancer 18 have highlighted the importance of primary tumor size as an indicator for patient prognosis. Nevertheless, the prognostic models for primary tumor size in young patients with NPC have not been thoroughly explored. Therefore, our research incorporated tumor size as a variable of interest, and we found that tumor size is a major risk factor for survival in young NPC patients. Consistent with previous findings, we observed that larger tumors were associated with worse prognosis. 5 Notably, tumors larger than 70 mm emerged as a strong predictive indicator for NPC OS. It is plausible that larger tumors may be linked to more advanced AJCC stage and the presence of occult disseminated tumor cells. Additionally, tumor size may not only reflect the proliferative and angiogenic capacity of cancer cells but also the biological characteristics of the extracellular matrix and stroma. The intrinsic characteristics of tumor size warrant further investigation.
NPC is a tumor type that exhibits high sensitivity to radiotherapy and chemotherapy. 19 Our research findings align with previous studies, demonstrating a significant correlation between radiotherapy and the prognosis of NPC. However, the benefits of adjuvant chemotherapy for nasopharyngeal cancer remain a topic of debate. 20 Interestingly, we found that chemotherapy did not improve OS in young patients with NPC. Meta-analyses have also reported inconsistent results regarding the magnitude of chemotherapy’s benefit and the optimal chemotherapy sequence for individuals with advanced NPC. 21-23 In clinical practice, achieving complete clinical remission remains a challenge even with standardized treatment protocols. This finding highlights the need for further consideration and research. In addition to age, tumor size, and radiotherapy, our study also identified sex, race, marital status, histological type, and AJCC stage as prognostic factors, consistent with previous studies. 24-27
We have validated that our model exhibits excellent discrimination, calibration, and clinical utility, making it a valuable tool for predicting the prognosis of young patients with NPC. The model offers several advantages. Firstly, it is based on readily available clinical data that are commonly collected during routine patient assessments. This means that the model can be easily implemented in clinical practice without the need for additional specialized tests or procedures. Secondly, the model serves as an effective tool for predicting the prognosis of young NPC patients, providing valuable information to clinicians. By utilizing the model, healthcare professionals can better understand the potential outcomes for individual patients and make informed decisions regarding personalized adjuvant treatments and follow-up plans. This individualized approach is crucial for optimizing long-term prognosis. Moreover, the model facilitates effective communication between patients, their families, and oncologists. By providing clear prognostic information, the model enables open and informed discussions about treatment options, potential risks, and expected outcomes. This shared decision-making process enhances patient satisfaction and ensures that treatment plans align with patients’ preferences and goals.
This study developed a novel predictive model for assessing the prognosis of young patients with NPC using the SEER database. The model demonstrated superior personalization compared to the AJCC staging system. However, several limitations should be acknowledged. Firstly, the SEER database primarily consists of US data, which may limit the generalizability of the findings to endemic areas for NPC. Moreover, the database lacks detailed information regarding radiotherapy, such as specific regimens and the duration of treatment, which could be important factors in predicting prognosis. Secondly, it is important to note that this study was retrospective in nature, which inherently introduces biases and limitations. Potential confounding variables and selection biases may have influenced the results. Prospective studies with more rigorous methodologies would provide stronger evidence. Lastly, although the model was internally validated using the SEER database, external validation in independent cohorts or different populations is currently lacking. External validation is crucial to assess the generalizability and reliability of the model in different settings and patient populations.
Conclusion
The current study aimed to identify independent prognostic factors for OS in NPC patients aged 18 to 59 years. Age, sex, race, marital status, tumor size, histological type, AJCC stage, and radiotherapy were found to be significant predictors of OS. Building upon these findings, a novel nomogram was developed to predict OS at 3-, 5-, and 8-year intervals specifically for young NPC patients. In comparison to the AJCC staging system, the nomogram demonstrated superior discrimination, calibration, and clinical utility, making it a more accurate and practical predictive tool for clinical decision-making in this patient population. Its use enables oncologists to more accurately assess the individual risk of young NPC patients and guide personalized treatment strategies. This personalized approach may improve patient care through tailored interventions and, ultimately, lead to better outcomes.
Footnotes
Authors’ Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organization, or those of the publisher, the editors, and the reviewers.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by Air Force Medical Central (No. 2021LC017 and No. 2021LC007).
