Abstract
Introduction
The purpose of this study was to construct and validate a nomogram for predicting cancer-specific survival (CSS) in undifferentiated pleomorphic sarcoma (UPS) patients at 3, 5, and 8 years after the diagnosis.
Methods
Data for UPS patients were extracted from the SEER (Surveillance, Epidemiology, and End Results) database. The patients were randomly divided into a training cohort (70%) and a validation cohort (30%). A backward stepwise Cox regression model was used to select independent prognostic factors. The selected factors were integrated into the nomogram to predict the CSS rates in UPS patients at 3, 5, and 8 years after the diagnosis. The nomogram's performance was then validated using multiple indicators, including the area under the time-dependent receiver operating characteristic curve (AUC), concordance index (C-index), calibration curve, decision-curve analysis (DCA), integrated discrimination improvement (IDI), and net reclassification improvement (NRI).
Results
This study included 2,009 UPS patients. Ten prognostic factors were identified after analysis of the Cox regression model in the training cohort, which were year of diagnosis, age, race, primary site, histological grade, T, N, M stage, surgery status, and insurance status. The nomogram was then constructed and validated internally and externally. The relatively high C-indexes and AUC values indicated that the nomogram has good discrimination ability. The calibration curves revealed that the nomogram was well calibrated. NRI and IDI values were both improved, indicating that our nomogram was superior to the AJCC (American Joint Committee on Cancer) system. DCA curves demonstrated that the nomogram was clinically useful.
Conclusions
The first nomogram for predicting the prognosis of UPS patients has been constructed and validated. Its usability and performance showed that the nomogram can be applied to clinical practice. However, further external validation is still needed.
Keywords
Introduction
Undifferentiated pleomorphic sarcoma (UPS), which was previously known as malignant fibrous histiocytoma, was first described by O’Brien et al. in 1964.1-3 It is thought to be a high-grade, aggressive sarcoma originating in histiocytes.1-4 It shows a storiform pattern and consists of spindled (fibroblast-like) and rounded (histiocyte-like) cells, accompanied by pleomorphic giant cells and inflammatory cells.1-4 Undifferentiated pleomorphic sarcoma occurs most frequently in the soft tissues of the extremities (50% in lower limbs, 20% in upper limbs), and occasionally in bone and viscera.5-8 Its incidence peaks around the age of 60 years, and it is more common in males. 6 As one of the most common subtypes, it accounts for about 5-10% of adult soft-tissue sarcomas (STS).1-3 The present study focused on UPS occurring in soft tissues.
Undifferentiated pleomorphic sarcoma is normally treated by extensive surgical resection and adjuvant radiotherapy, with adjuvant chemotherapy considered for high-grade tumors. 9 Accurate prognosis is critical for treatment decisions. At present, the prognostic tool most commonly used for UPS is the American Joint Committee on Cancer (AJCC) staging system for STS. In addition to tumor (T), node (N), and metastasis (M) staging, which are used as predictors for all types of malignant tumors, the AJCC staging system for STS also considers the histological grade.10-12 However, this approach applies to all subtypes of STS rather than being specifically designed for UPS, and the prognostic features of different STS subtypes are widely considered to vary greatly. In addition, many other important prognostic factors are neglected by this staging system. It therefore does not provide a very accurate prognosis for individual UPS patients; in other words, the prognosis of UPS patients in the same AJCC stage may vary greatly. Nomograms have been widely used as an alternative to the AJCC staging system. By broadly integrating prognostic factors and quantifying individual risks, a nomogram can provide more accurate and intuitive individualized predictions. 13 In many oncology studies, the performance of a nomogram has been shown to be superior to that of the AJCC staging system.14-17 However, to the best of our knowledge, a nomogram for UPS has not yet been reported. 18
The aim of this study was to construct and validate a prognostic nomogram for predicting the cancer-specific survival (CSS) rates of UPS patients at 3, 5, and 8 years after the diagnosis, based on the Surveillance, Epidemiology, and End Results (SEER) database. In clinical practice, a nomogram can be used as a convenient and reliable reference tool to help assess the prognosis of patients and develop individualized treatment strategies.
Methods
The SEER database was established by the National Cancer Institute in 1973 with the aim of reducing the burden of cancer. It is one of the most representative tumor databases in the United States.12,19 With its 18 population-based registries, the SEER database covers about 28% of the US population. 7 Each registry collects demographic, clinicopathological, and follow-up survival data, providing a broad basis for the study of tumors, especially rare tumors. 20
The SEER*Stat (version 8.3.6) software was used to extract data from the SEER database. The selected UPS patients had been diagnosed between 1990 and 2015. The subdatabase used had the following designation: “Incidence-SEER 18 Regs Custom Data (with additional treatment fields), Nov 2018 Sub (1975-2016 varying).” The end date of follow-ups for this version of the subdatabase was December 31, 2016.
ICD-O-3 (third revision of the International Classification of Diseases for Oncology) criteria were applied to identify patients with soft tissue as the primary site (codes C47.0-47.9 and C49.0-49.9) and with the histological type of UPS (code 8830/3). Patients who met any of the following criteria were excluded: multiple primary tumors, diagnosed at autopsy or by death certificate, or with incomplete information on the variables of interest.
The following variables were extracted: age at diagnosis, year of diagnosis, race and origin, sex, marital status, laterality, primary site, T, N, M stage, histological grade, surgery status, radiotherapy status, chemotherapy status, insurance status, Yost index, cause of death, and survival time.
The X-tile software was used to determine the optimal cutoffs to turn continuous variables into categorical variables. Age was divided into three groups: ≤65, 66-75, and >75 years. The Yost index is derived from principal-components analyses of census group variables including income, education, and occupation. A higher Yost index value indicates a higher socioeconomic level. In the present study, the index values were categorized into three levels: low, intermediate, and high. Year of diagnosis was divided into three groups: 1990-1999, 2000-2009, and 2010-2015.
Since there were only small numbers of non-Hispanic Asians or Pacific Islanders and non-Hispanic American Indians/Alaska Natives, we combined them into the group NHAA. Marital status was classified as unmarried, married, and separated (including divorced, separated, and widowed). The primary site location was reclassified as head and neck (HN), upper limb and shoulder (US), lower limb and hip (LH), thorax, abdomen, pelvis, trunk, and not otherwise specified. Laterality was reclassified as left, right, and other. Since the SEER database does not contain information on the seventh edition of the AJCC staging system before 2004, we recoded the T, N, and M stages for cases from 1990 to 2003 using the three EOD 10 variables for size, nodes, and extent. It should be noted that the most common histological grading system for STS is the French Fédération Nationale des Centres de Lutte Contre le Cancer (FNCLCC) grading system, which is based on the following three items: degree of differentiation, mitotic activity, and necrosis. 9 Unfortunately, this information is missing for the vast majority of UPS patients in the SEER database, so we used the histological grade based on the degree of differentiation instead.
All of the variables were presented as frequencies and proportions except for survival time, which was presented as median and interquartile-range values. We randomly allocated 70% of the cases into a training cohort and the remaining 30% into a validation cohort, which were used to establish and validate the nomogram, respectively. The difference in composition ratio between the two cohorts was evaluated using the chi-square test.
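The cohort allocation and the comparison of composition ratios described above can be sketched as follows. This is a minimal illustration only, not the authors' code (the analyses in this study were run in R). The function names `split_cohort` and `chi_square_statistic` and the seed value are assumptions introduced for the example, and the P value would in practice be obtained from the chi-square distribution with the returned degrees of freedom.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2020):
    """Randomly allocate patients to a training and a validation cohort.

    The seed is an arbitrary, hypothetical value used only for reproducibility.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Each row is a cohort and each column is one category of a variable.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# 2,009 patients split 70% / 30%, as in the study design.
train_ids, valid_ids = split_cohort(range(2009))
```

A statistic close to zero for a given variable suggests that its categories are distributed similarly in the two cohorts.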
All variables were applied to a Cox regression model and the backward stepwise selection method was used to identify prognostic factors. On the basis of these results, we then constructed a nomogram to predict the CSS rates of UPS patients at 3, 5, and 8 years after the diagnosis.
We performed internal and external validations of the nomogram. The area under the time-dependent receiver operating characteristic curve (AUC) and the concordance index (C-index) were the indicators used to assess the discrimination ability of the nomogram. 15 We plotted the calibration curve to evaluate the agreement between the observed frequency and the predicted probability. 21 Bootstrapping with 500 resamples was used to calculate the C-index and calibration curve. The integrated discrimination improvement (IDI) and the net reclassification improvement (NRI) were used to quantify the performance improvement provided by the nomogram relative to the AJCC staging system.22,23 Finally, we evaluated the value of the nomogram in clinical applications by using decision-curve analysis (DCA).14,24
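As an illustration of the discrimination measure, the C-index can be sketched as the proportion of comparable patient pairs in which the patient with the higher predicted risk experiences the event earlier. The function below is a simplified version of Harrell's C-index written for this example: it ignores tied survival times and does not include the bootstrap correction used in the study. The function name and the toy data are assumptions.

```python
def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times: follow-up times; events: 1 = cancer-specific death, 0 = censored;
    risk_scores: higher value = higher predicted risk.
    A pair (i, j) is comparable when patient i had the event and a strictly
    shorter follow-up time than patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


# Toy data: the predicted risk falls as survival time rises.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to chance-level discrimination and a value of 1.0 to perfect ranking of the patients.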
R software (version 4.0.0) was used to perform all statistical analyses, with the survival, rms, foreign, survivalROC, and nricens packages. The criterion for statistical significance was P<.05 in two-sided tests.
Results
Demographic and clinicopathological characteristics of the included patients.
Abbreviations: NHW, non-Hispanic white; NHB, non-Hispanic black; NHAA, Non-Hispanic Asian or Pacific Islander and Non-Hispanic American Indian/Alaska Native; HN, head and neck; US, upper limb and shoulder; LH, lower limb and hip; TH, thorax; AB, abdomen; PE, pelvis; TR, trunk.
Multivariate Cox regression analysis of cancer-specific survival in the training cohort.
Abbreviations: NHW, non-Hispanic white; NHB, non-Hispanic black; NHAA, Non-Hispanic Asian or Pacific Islander and Non-Hispanic American Indian/Alaska Native; HN, head and neck; US, upper limb and shoulder; LH, lower limb and hip; TH, thorax; AB, abdomen; PE, pelvis; TR, trunk.
A nomogram was constructed based on the prognostic factors to predict CSS rates in UPS patients at 3, 5, and 8 years after the diagnosis (Figure 1). It can be seen that the M stage had the strongest prognostic value, followed by surgery status, histological grade, T stage, insurance status, age, primary site, N stage, year of diagnosis, and race. Each prognostic feature of a patient corresponds to a score, which helps to make the nomogram easy to understand. A total score is calculated by adding up the scores for all items, which can be translated into the probabilities of 3-, 5-, and 8-year CSS.
Figure 1. Nomogram for predicting 3-, 5-, and 8-year cancer-specific survival in undifferentiated pleomorphic sarcoma.
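The scoring logic of a nomogram can be sketched as follows: each level of a factor receives points proportional to its Cox regression coefficient (the log hazard ratio), scaled so that the factor with the widest coefficient range spans 0 to 100 points. This is a common convention and is assumed here rather than taken from the authors' code. The example uses only the hazard ratios reported in the Discussion for age and insurance status (the hazard ratio for Medicaid is not reported and is omitted), so the resulting points are illustrative and will not match Figure 1, where M stage defines the 100-point scale. The function name `nomogram_points` is hypothetical.

```python
import math


def nomogram_points(log_hazard_ratios):
    """Convert Cox coefficients into nomogram points.

    log_hazard_ratios: {variable: {level: beta}}, with beta = 0 for the
    reference level. The variable with the largest beta range gets 0-100.
    """
    ranges = {
        var: max(levels.values()) - min(levels.values())
        for var, levels in log_hazard_ratios.items()
    }
    max_range = max(ranges.values())
    points = {}
    for var, levels in log_hazard_ratios.items():
        lowest = min(levels.values())
        points[var] = {
            level: 100.0 * (beta - lowest) / max_range
            for level, beta in levels.items()
        }
    return points


# Hazard ratios as reported in the Discussion (reference levels have HR = 1).
betas = {
    "age": {"<=65": 0.0, "66-75": math.log(1.602), ">75": math.log(2.394)},
    "insurance": {"insured": 0.0, "uninsured": math.log(2.608)},
}
points = nomogram_points(betas)

# Total score for a hypothetical uninsured patient older than 75 years.
total_points = points["age"][">75"] + points["insurance"]["uninsured"]
```

Translating the total score into 3-, 5-, or 8-year CSS probabilities additionally requires the baseline survival function of the fitted Cox model, which is not reproduced here.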
The C-indexes for the nomogram in the training and validation cohorts (.759 and .766) were both higher than those for the AJCC system (.708 and .706). As shown in Figure 2, the 3-, 5-, and 8-year AUC values of the nomogram were also higher than the corresponding values for the AJCC system. All of these findings indicated that the nomogram had good discrimination ability. The calibration curves in Figure 3 were all close to the diagonal line, indicating that the nomogram was well calibrated.
Figure 2. ROC curves. ROC curves were generated to validate the discrimination of the newly established nomogram, as measured by the areas under the ROC curves. (A) Training set; (B) validation set.
Figure 3. Calibration curves for 3-, 5-, and 8-year cancer-specific survival. Calibration curves depict the calibration of the newly established nomogram in terms of the agreement between the predicted probabilities and observed frequencies of the training set (A, B, C) and validation set (D, E, F).

The nomogram was then compared with the AJCC staging system. The NRI values were .331 for 3 years (95% CI = .211-.526), .343 for 5 years (95% CI = .254-.507), and .354 for 8 years (95% CI = .240-.496) after diagnosis in the training cohort and .343 for 3 years (95% CI = .125-.578), .356 for 5 years (95% CI = .141-.558), and .487 for 8 years (95% CI = .256-.667) after diagnosis in the validation cohort. The corresponding IDI values were .046, .050, and .051 in the training cohort and .062, .061, and .063 in the validation cohort (all P < .001). All of these findings indicated that our nomogram outperformed the AJCC staging system.
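To make the two comparison measures concrete, the sketch below computes a category-free NRI and the IDI for a binary outcome at a fixed time point. It is a simplified illustration under stated assumptions: censoring is ignored, whereas the analysis in this study used the nricens package in R, which accounts for it. The function names and the toy risks are hypothetical; `old_risk` stands for the reference model (here, the AJCC system) and `new_risk` for the nomogram.

```python
def continuous_nri(old_risk, new_risk, events):
    """Category-free NRI: net upward movement of predicted risk among events
    plus net downward movement among non-events."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = sum(events)
    n_nonevent = len(events) - n_event
    for old, new, event in zip(old_risk, new_risk, events):
        if new > old:
            up_event += event
            up_nonevent += 1 - event
        elif new < old:
            down_event += event
            down_nonevent += 1 - event
    nri_events = (up_event - down_event) / n_event
    nri_nonevents = (down_nonevent - up_nonevent) / n_nonevent
    return nri_events + nri_nonevents


def idi(old_risk, new_risk, events):
    """IDI: gain in mean predicted risk among events minus the gain among
    non-events."""
    def mean(values):
        return sum(values) / len(values)

    new_e = mean([r for r, e in zip(new_risk, events) if e == 1])
    old_e = mean([r for r, e in zip(old_risk, events) if e == 1])
    new_n = mean([r for r, e in zip(new_risk, events) if e == 0])
    old_n = mean([r for r, e in zip(old_risk, events) if e == 0])
    return (new_e - old_e) - (new_n - old_n)


# Toy example: the new model raises risk for events and lowers it otherwise.
toy_events = [1, 1, 0, 0]
toy_old = [0.5, 0.5, 0.5, 0.5]
toy_new = [0.7, 0.6, 0.3, 0.4]
toy_nri = continuous_nri(toy_old, toy_new, toy_events)
toy_idi = idi(toy_old, toy_new, toy_events)
```

Positive values of both measures indicate that the new model assigns higher risks to patients who experience the event and lower risks to those who do not, relative to the reference model.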
The DCA curves for 3, 5, and 8 years after diagnosis in the training and validation cohorts are presented in Figure 4. Our nomogram had greater net benefits in predicting the prognosis of UPS patients than the AJCC system, demonstrating the clinical usefulness of the nomogram.
Figure 4. DCA of the training set (A, B, C) and validation set (D, E, F) for 3-, 5-, and 8-year cancer-specific survival. The abscissa is the threshold probability and the ordinate is the net benefit. The horizontal line represents the assumption that all patients are negative and none are treated, giving a net benefit of zero. The oblique line represents the assumption that all patients are positive and all are treated, giving a net benefit that falls with a negative slope as the threshold probability increases. The blue dotted line represents the DCA of the newly established nomogram, and the red dotted line represents the DCA of the AJCC staging system. Abbreviations: DCA, decision-curve analysis.
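The net benefit plotted in a decision curve can be sketched with the standard formula: the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. The example below treats the outcome as binary at a fixed time point and ignores censoring, which is a simplification of the survival setting in this study. The function names and toy data are assumptions introduced for illustration.

```python
def net_benefit(risks, events, threshold):
    """Net benefit of treating patients whose predicted risk is at or above
    the threshold probability."""
    n = len(risks)
    true_pos = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 1)
    false_pos = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(events, threshold):
    """Net benefit of the 'treat all' strategy (the oblique reference line)."""
    prevalence = sum(events) / len(events)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: predicted risk of the event and the observed outcome.
toy_risks = [0.9, 0.8, 0.2, 0.1]
toy_events = [1, 1, 0, 0]
nb_model = net_benefit(toy_risks, toy_events, threshold=0.5)
nb_all = net_benefit_treat_all(toy_events, threshold=0.5)
nb_none = 0.0  # the 'treat none' strategy (the horizontal reference line)
```

A model is considered clinically useful over the range of threshold probabilities in which its net benefit exceeds that of both reference strategies.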
Discussion
Undifferentiated pleomorphic sarcoma is a high-grade sarcoma originating in mesenchymal tissue, showing pleomorphic nuclei and no definitive cellular differentiation.4,25 Undifferentiated pleomorphic sarcoma is common in STS, but the low incidence of STS means that UPS is also a rare tumor. 3 As a result, almost all previous studies on UPS employed a small sample from a single institution.3-5,26-29 The lack of representative samples resulted in poor extrapolation of their results. The SEER database contains data from more than 20 regions and covers 28% of the US population. 6 It collects sociodemographic, clinicopathological, treatment, follow-up, and other information on cancer patients. 30 This can provide high-quality, multicenter, large-sample data for the study of rare cancers. The present study identified 2,009 eligible UPS patients, which is a fairly large sample for studies on UPS, from the SEER database by applying specific inclusion and exclusion criteria.
Undifferentiated pleomorphic sarcoma has a poor prognosis, and the most common treatment strategy involves a combination of surgical resection, radiotherapy, and chemotherapy. 31 It is well known that decisions around treatment strategies are based on an assessment of the prognosis. The AJCC staging system, which is currently used to assess the prognosis of UPS patients, cannot provide very accurate predictions for individuals. 10 This shows the importance of developing new, better-performing prognostic tools. In the present study, we used backward stepwise Cox regression analysis to identify independent prognostic factors for UPS patients and developed (to the best of our knowledge) the first nomogram for predicting the 3-, 5-, and 8-year CSS rates for UPS patients. The prognostic factors combined in the nomogram included year of diagnosis, age, race, primary site, histological grade, T, N, M stage, surgery status, and insurance status.
As can be seen from the nomogram (Figure 1), histological grade and T, N, and M stage, which are the indicators used in the AJCC system for STS, still have great prognostic value. We found that UPS in the HN had worse survival rates than UPS in the US and LH. This was consistent with the results of Ibanez et al. for UPS of the skin. 32 The head and neck area has a rich vascular supply and a dense lymphatic network, and the surgical margin tends to be narrower given the importance of the anatomical site. 32 This may be why UPS in the head and neck has a poor prognosis. However, as a retrospective study, our study was inevitably subject to bias, so the prognostic value of the primary site still needs to be verified in prospective studies.
In the present study, the cancer-specific mortality (CSM) risk in patients aged 66-75 and >75 years was 1.602 and 2.394 times that in patients aged ≤65 years, respectively (66-75 vs ≤65 years: HR = 1.602, 95% CI = 1.284-1.999; >75 vs ≤65 years: HR = 2.394, 95% CI = 1.954-2.933). Many other studies on UPS have also supported the prognostic value of age.3,5,8,27 This is not surprising since elderly patients are generally less physically active and often have other chronic diseases. Many cancer-related studies have found race to be an important prognostic factor, with white patients having better prognoses than black patients.15,21,32 Our study obtained the same result, with the CSM risk being higher in NHB patients than in NHW patients (HR = 1.469, 95% CI = 1.115-1.934). This may be attributable to black patients having a worse socioeconomic status and less access to health care.
The prognostic value of surgery for UPS is clear, but chemotherapy and radiotherapy for UPS are controversial. Some studies suggested that chemotherapy and radiotherapy did not significantly prolong UPS patient survival, and some researchers even believed that their side effects may be harmful to the prognosis of patients.32-34 Radiotherapy and chemotherapy had no prognostic value in the present study, which may be related to radiotherapy and chemotherapy commonly being used in high-risk UPS patients. The year of diagnosis may influence the results due to the new therapeutic developments for STS, such as immunotherapy and targeted therapy. 35
One of our novel findings was that the CSM rate was higher in uninsured patients than in insured patients (HR = 2.608, 95% CI = 1.313-5.181). The difference in CSM rates between insured patients and patients with Medicaid was not statistically significant. To some extent, this reflected the influence of socioeconomic factors on the survival of UPS patients. Socioeconomic factors have been shown to be prognostically significant for many types of cancer.23,36 For example, Zhang et al. found that poverty was a prognostic risk factor for chondrosarcoma. However, the Yost index (a composite index reflecting income, education, and occupation) was not a significant prognostic factor in our analysis, which may be due to the sample not being large enough to detect a weak prognostic effect.
Nomograms are widely used to combine important prognostic factors with specific endpoints, quantifying the risks for cancer prognosis. 21 The nomogram developed in the present study broadly incorporated prognostic factors for UPS patients, with the information on these factors being easily obtained from medical records. It can provide clinicians with a convenient auxiliary tool to develop individualized treatment strategies and design clinical trials. 37
Internal and external validations were also performed on the nomogram. The evaluation parameters included were the C-index, AUC, calibration curve, NRI, IDI, and DCA. The findings presented in the results section showed that our nomogram had a statistical advantage in performance and clinical usefulness over the AJCC system. This was mainly because our nomogram combined more important prognostic factors.
A major advantage of our study over previous studies was the use of high-quality, multicenter, large-sample data, which increased the reliability of our conclusions. There were also some limitations to this study. First, many cases were excluded due to a lack of information, which may have led to selection bias. Second, some potential prognostic factors were not analyzed because their documentation in the SEER database was incomplete, which reduces the accuracy of the nomogram. It is clearly impossible to include all prognostic factors in the nomogram, so the nomogram should only be used as a reference for clinicians when making decisions, rather than as a source of completely accurate prognoses. Third, the AJCC staging system used in clinical practice is now in its eighth edition, but the latest edition available in SEER is the seventh edition, and SEER lacks some of the variables required by the eighth edition, so our study was still based on the seventh edition. Finally, our nomogram was based on a retrospective cohort and was only validated using data from the SEER database. Therefore, further validation using a prospective cohort is required before clinical application.
Conclusion
We have constructed and validated the first nomogram for predicting the CSS rates of UPS patients at 3, 5, and 8 years after the diagnosis, based on the SEER database. The clinical usefulness and good performance showed that it can be applied in clinical practice as an auxiliary tool for clinicians to develop individualized treatment strategies and design clinical trials. However, further external validation is still needed.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was supported by The National Social Science Foundation of China (grant/award no. 16BGL183).
Ethics Approval
This study was conducted in accordance with the Declaration of Helsinki.
Informed Consent
Institutional review board approval and informed consent were not required in the current study because SEER research data are publicly available and all patient data are de-identified.
