Abstract
Objectives
Gastric cancer combined with multiple primary malignancies (GCM) is increasingly common. This study investigated GCM clinical features and survival time.
Methods
Patients with GCM and GC only (GCO) were selected from the Surveillance, Epidemiology and End Results (SEER) database. Survival was compared between GCM and GCO groups using propensity score matching. Then, the GCM group was divided into a training cohort and a validation cohort. These cohorts were used to establish a nomogram for survival prediction in patients with GCM.
Results
Survival time was significantly longer in the GCM group than in the GCO group. All-subsets regression was used to identify four variables for nomogram establishment: age, gastric cancer sequence, N stage, and surgery. The concordance index and time‐dependent receiver operating characteristic curve indicated that the nomogram had favorable discriminative ability. Calibration plots of predicted and actual probabilities showed good consistency in both the training and validation cohorts. Decision curve analysis and risk stratification showed that the nomogram was clinically useful; it had favorable discriminative ability to recognize patients with different levels of risk.
Conclusions
Compared with GCO, GCM is a relatively indolent malignancy. The nomogram developed in this study can help clinicians to assess GCM prognosis.
Introduction
Multiple primary malignancies (MPMs)—two or more primary malignant tumors in one patient—were first described in 1889. 1 The first report using the name MPMs was published in 1932 2 ; current diagnostic criteria were derived from that report. MPMs are classified as synchronous when the interval between diagnosis of the first and second primary tumors is <6 months; they are considered metachronous when that interval is >6 months. 3
The overall reported frequency of MPMs varies from 2.4% to 17% 4 ; triple localization occurs in 5% to 8% of cases. 5 Possible underlying causes of MPMs include host and lifestyle-related factors, environmental and genetic factors, and treatment-related factors. Patients with MPMs typically have less aggressive malignancies, an earlier stage of cancer, a strong family history of similar cancers, and indolent cancers with longer survival. 6 The prevalence of MPMs is increasing because better survival increases the likelihood of subsequent diagnosis with another cancer. 7 However, there is limited population-based literature regarding MPMs. The use of public databases to study MPMs may provide valuable insights.
Gastric cancer (GC) is the fifth most common malignancy and the fourth leading cause of cancer-related death. 8 Although the age-standardized global incidence of GC has substantially decreased since 1990, 9 there were 1,089,103 new cases and 719,523 deaths worldwide in 2020. 8 The treatment of GC includes endoscopic mucosal resection for localized lesions, gastrectomy with chemotherapy and radiotherapy for advanced lesions, and targeted therapy for HER2-positive lesions. 10 Patients with GC have increased risks of MPMs involving cancers of the head and neck, esophagus, colon and rectum, bones and soft tissues, ovaries, bladder, and kidneys, as well as non-Hodgkin lymphoma; radiotherapy and chemotherapy are independent risk factors for MPMs. 11 The association between GC and breast cancer may be related to CDH1 gene mutations; the clustering of colorectal cancer, small intestinal cancer, and bladder cancer may be related to Lynch syndrome. 12 Furthermore, among patients with GC, survival is worse when metachronous or synchronous MPMs are present. 13 However, prediction models for these patients are not robust.
In the past decade, nomograms have been widely used for prediction in oncology.14–17 They meet the requirements of integrated models and individualized medicine, 14 while providing a convenient approach for clinicians.14,18,19 This study compared survival between patients with GC combined with MPMs (GCM) and patients with GC only (GCO), and then established a nomogram to predict GCM prognosis using the Surveillance, Epidemiology and End Results (SEER) database.
Materials and methods
Patient selection
Data concerning patients with GC from 2010 to 2019 were extracted from the SEER database (reference number for access permission: 17902-Nov2021). The SEER database (https://seer.cancer.gov/seerstat/) contains both incidence and survival characteristics for malignancies from 17 cancer registries (covering the period 2000–2019) in the United States. Clinicopathological information in the SEER database is publicly available and anonymous; thus, ethical approval and patient consent were not required for this study. The following inclusion criteria were used: cases occurred between 2010 and 2019, GC was confirmed by histology, and patients were alive or had died of cancer. The exclusion criteria were: no follow-up records and missing information about TNM staging.
Patients with GC who met the above criteria were divided into a GCO group and a GCM group according to the number of primary tumors (Figure 1). Propensity score matching (PSM) was used to ensure that baseline data were comparable between the GCM and GCO groups. Patients with GCM who met the above criteria were randomly divided (at a 7:3 ratio) into a training cohort and an internal validation cohort for nomogram construction and verification, respectively.

Figure 1. Flowchart of patient inclusion and separation into cohorts. Database name: Incidence–SEER Research Plus Data, 17 Registries, Nov 2021 Sub (2000–2019).
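The matching and splitting steps described above can be sketched as follows. This is a minimal illustration, not the study's code: the study used the MatchIt package in R, whereas this sketch assumes that propensity scores have already been estimated, uses greedy 1:1 nearest-neighbour matching without replacement, and introduces a hypothetical caliper of 0.05. The function names, the caliper, and the random seed are all assumptions.

```python
import random


def match_one_to_one(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping patient id -> propensity score.
    caliper: hypothetical maximum allowed score difference (assumption).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)
    pairs = []
    # Process treated patients from the highest propensity score downward.
    ordered = sorted(treated.items(), key=lambda kv: kv[1], reverse=True)
    for t_id, t_score in ordered:
        if not available:
            break
        # Closest remaining control on the propensity score.
        c_id = min(available, key=lambda k: abs(available[k] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs


def split_7_3(ids, seed=0):
    """Randomly split ids into training (70%) and validation (30%) lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic, rounds down
    return shuffled[:cut], shuffled[cut:]


# Toy propensity scores (illustrative values only).
pairs = match_one_to_one(
    {"a": 0.80, "b": 0.30},
    {"x": 0.78, "y": 0.31, "z": 0.10},
)
train, valid = split_7_3(range(7857))
```

Rounding the 70% cut-off down is a design choice in this sketch; a different rounding rule would shift the cohort sizes by one patient.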
Study variables
Variables in this study were patient characteristics (age, ethnicity, sex, and marital status), tumor characteristics (primary tumor location, number of primary tumors, T stage, N stage, and M stage), treatment information (surgery, radiation, and chemotherapy), and survival information (vital status and survival time). Some variables were regrouped during the analysis. For example, patients were regrouped according to age at diagnosis: <60 years, 60–74 years, and ≥75 years.
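The age regrouping can be expressed as a small helper. This is only a sketch; the function name and the label strings are hypothetical, and only the cut-points (60 and 75 years) come from the text.

```python
def age_group(age_years):
    """Map age at diagnosis to the three groups used in the analysis."""
    if age_years < 60:
        return "<60"
    if age_years < 75:
        return "60-74"
    return ">=75"


groups = [age_group(a) for a in (45, 60, 74, 75, 88)]
```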
Survival analysis
Survival rates were compared between GCO and GCM to determine the effect of MPMs on survival in patients with GC. Then, univariate Cox regression was used to identify relevant prognostic factors for GCM. Multivariate Cox regression analysis was used to analyze independent risk factors for worse survival in patients with GCM. Hazard ratios and 95% confidence intervals (CIs) were recorded.
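For readers who want to see how survival curves and median survival times of this kind are typically obtained, the following is a minimal sketch of the Kaplan–Meier product-limit estimator. It assumes that this estimator underlies the group comparison (the text names Kaplan–Meier curves explicitly only for the risk groups); the function names and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times: follow-up times (e.g., months).
    events: 1 if the event (death) was observed, 0 if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Toy data (illustrative only, not from the study).
curve = kaplan_meier([5, 8, 12, 12, 20, 30], [1, 0, 1, 1, 0, 1])
median = median_survival(curve)
```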
Nomogram construction and validation
Based on the result of all-subsets regression, a nomogram was constructed to predict 12-, 36-, and 60-month cancer-specific survival in patients with GCM. Nomogram calibration was assessed using calibration curves; the relationship between predicted and observed values was determined by bootstrap sampling (1000 samples), and values close to the diagonal indicated good model accuracy. The concordance index (C-index) and area under the time‐dependent receiver operating characteristic curve (time‐dependent AUC) were used to evaluate the discriminative ability of the nomogram.
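To make the C-index concrete, the sketch below computes Harrell's concordance index for right-censored data by direct pair counting. It is an illustration under stated assumptions: a higher score is taken to mean higher risk (shorter survival), ties in score count as half-concordant, and the function name and toy data are hypothetical. The exact estimator used by the study's software is not specified in the text.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the patient who failed earlier also has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: survival months, event indicator, nomogram-style total points.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risk_scores=[200, 150, 160, 90],
)
```

In this toy example, five pairs are comparable and four of them are concordant, so the C-index is 0.8.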
Evaluation of clinical utility
Decision curve analysis (DCA) was used to determine the net clinical benefit at each risk threshold. 20 Subsequently, patients were divided into low-risk, middle-risk, and high-risk groups according to each patient’s nomogram score. Cut‐off points for risk stratification were selected using X‐tile software. 21 The Kaplan–Meier curve and log-rank test were used to compare survival among risk groups.
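The net benefit plotted in a decision curve can be illustrated with the standard formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below is a simplification: it assumes a binary outcome at a fixed time point with no censoring, whereas survival-based DCA requires censoring adjustments. Function names and data are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(outcome)
    pairs = list(zip(predicted_risk, outcome))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    odds = threshold / (1 - threshold)
    return true_pos / n - (false_pos / n) * odds


def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: assume every patient will experience the event."""
    prevalence = sum(outcome) / len(outcome)
    odds = threshold / (1 - threshold)
    return prevalence - (1 - prevalence) * odds


# Toy data (illustrative only). The "treat none" strategy is always 0.
risk = [0.9, 0.8, 0.3, 0.7, 0.2]
died = [1, 1, 0, 1, 0]
nb_model = net_benefit(risk, died, threshold=0.5)
nb_all = net_benefit_treat_all(died, threshold=0.5)
```

A model is considered useful at a given threshold when its net benefit exceeds both reference strategies, as it does in this toy example.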
Statistical analysis
Statistical analyses were performed using R software version 4.2.1 (https://www.r-project.org). The MatchIt package was used to implement propensity score matching. The leaps package was used for all-subsets regression to achieve optimal subset selection [i.e., the model with the maximum adjusted R² value and the minimum Bayesian Information Criterion and Mallows' Cp values]. The regplot package was used to construct the nomogram. Time-dependent cumulative/dynamic receiver operating characteristic curves were constructed using the “KM” method in the timeROC package; C-index and AUC values >0.7 were considered indicative of a reasonable estimate. DCA in the ggDCA package was used to quantify net benefits at distinct threshold probabilities to determine the clinical usefulness of the model. Categorical variables were compared between groups using the chi‐squared test. P values <0.05 were considered statistically significant.
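The three subset-selection criteria can be written out explicitly. The sketch below uses common textbook forms for a linear model with p predictors plus an intercept; these are assumptions and may differ by constants or scaling from the values reported by the leaps package. All function and argument names are hypothetical.

```python
import math


def adjusted_r2(rss, tss, n, p):
    """Adjusted R^2: penalises the residual sum of squares for model size."""
    return 1 - (rss / (n - p - 1)) / (tss / (n - 1))


def bic(rss, n, p):
    """Gaussian BIC up to an additive constant; smaller is better."""
    return n * math.log(rss / n) + (p + 1) * math.log(n)


def mallows_cp(rss, sigma2_full, n, p):
    """Mallows' Cp; sigma2_full is the error variance of the full model."""
    return rss / sigma2_full - n + 2 * (p + 1)


# Toy numbers (illustrative only): n = 101 patients, 4 predictors.
r2_adj = adjusted_r2(rss=20.0, tss=100.0, n=101, p=4)
cp_full = mallows_cp(rss=20.0, sigma2_full=20.0 / 96, n=101, p=4)
```

With these definitions, the full model always has Cp equal to its number of parameters (here 5), so candidate subsets are judged by how close to or below that benchmark they fall.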
Results
Patient characteristics and survival
In total, 7857 patients with GCM and 25,758 patients with GCO were included in this study; after propensity score matching, 7601 patients with GCO were matched to patients with GCM at a 1:1 ratio. Demographic and clinical characteristics are shown in Table 1. Furthermore, patients with GCM were randomly divided into a training cohort (n = 5499) and a validation cohort (n = 2358). The median follow‐up intervals were 46 [interquartile range (IQR): 19–76] months among all patients with GCM, 45 (IQR: 19–76) months among patients in the training cohort, and 48 (IQR: 20–77) months among patients in the validation cohort. The cohorts were comparable in terms of demographic and clinical characteristics. Other demographic and clinical characteristics of these patients are summarized in Table 2.
Table 1. Clinicopathological characteristics of patients with GCM and patients with GCO.
GCM, gastric cancer combined with multiple primary malignancies; GCO, gastric cancer only; PSM, propensity score matching.
Table 2. Clinicopathological characteristics of patients with GCM in training and validation cohorts.
GCM, gastric cancer combined with multiple primary malignancies.
Subsequent survival analyses of all enrolled patients with GCO and patients with GCM indicated that the median survival time was 50 months (95% CI: 45–55 months) among patients with GCM, whereas it was 28 months (95% CI: 27–29 months) among patients with GCO (P < 0.05; Figure 2a). After PSM, survival results were similar: the median survival time was 49 months (95% CI: 44–54 months) among patients with GCM, whereas it was 39 months (95% CI: 36–43 months) among patients with GCO (P = 0.015; Figure 2b).

Figure 2. Comparison of survival between GCM and GCO groups. (a) Comparison of survival between GCM and GCO before PSM. (b) Comparison of survival between GCM and GCO after PSM.
Nomogram variable screening and construction
Univariate and multivariate Cox regression analysis (Table 3) revealed that 11 variables (age, marital status, sex, primary site, GC sequence, number of primary cancers, T stage, N stage, M stage, chemotherapy, and surgery) were independent prognostic factors for GCM. After all-subsets regression and collinearity analysis, age, N stage, GC sequence, and surgery were included in the final model. Figure 3 shows an example of using the nomogram to predict survival probability for a specific patient. The total score is the sum of individual scores shown in the nomogram. Total risk scores for patients in this study ranged from 74 to 249.
Table 3. Univariate and multivariate Cox analyses of variables influencing survival in patients with GCM.
CI, confidence interval; GCM, gastric cancer combined with multiple primary malignancies; HR, hazard ratio.

Nomogram for prognostic prediction in a patient with GCM. The patient was aged <60 years, and gastric cancer was not the patient’s first tumor. The patient’s N stage was N1, and the patient did not undergo surgery. The density plots of total points and age show their distributions. The distributions of categorical variables are indicated by box size. To use the nomogram, an individual patient’s specific points (black dots) are placed on each variable axis. Red lines and dots are drawn upward to determine the points contributed by each variable; the sum (238) of these points is placed on the total points axis, and a line is drawn downward to the survival axes to determine the probabilities of 12‐month (47.7%), 36-month (17.0%) and 60-month (9.8%) survival.
Nomogram validation
C-index values were 0.779 (95% CI: 0.758–0.799) in the training cohort and 0.786 (95% CI: 0.753–0.815) in the validation cohort. In both the training and validation cohorts, the time‐dependent AUC was >0.7 for the prediction of survival within 60 months (Figure 4a, b), indicating that the nomogram has good discriminative power. In the training cohort, the 12-, 36-, and 60-month AUC values were 0.820, 0.848, and 0.849, respectively (Figure 4c); in the validation cohort, these values were 0.830, 0.850, and 0.853, respectively (Figure 4d). The calibration curves showed strong agreement between predicted and observed survival probabilities in the training and validation cohorts (Figure 4e, f). Overall, the GCM nomogram had good discriminative and calibrating abilities.

Figure 4. Evaluation of nomogram discriminative and calibration abilities. (a, b) Time-dependent AUC values in the training and validation cohorts. (c, d) Twelve-, 36-, and 60-month ROC curves and AUC values in the training and validation cohorts. (e, f) Calibration curves of 12-, 36-, and 60-month overall survival for patients with GCM in the training and validation cohorts. Dots in calibration curves were calculated by bootstrap sampling (1000 samples); they represent nomogram performance. A predicted value close to the ideal line (diagonal) indicates greater accuracy in survival prediction.
Clinical utility
The clinical utility of the nomogram was evaluated by DCA (Figure 5a, b), which revealed good positive net benefits in terms of survival modeling among most threshold probabilities at 12, 36, and 60 months.

Figure 5. Clinical utility of the nomogram. (a, b) DCA of the nomogram for the training and validation cohorts. Twelve-, 36-, and 60-month DCA lines represent the net clinical benefit over a range of threshold probabilities; horizontal red lines assume no patients will experience the event, whereas the all-12 months, all-36 months, and all-60 months lines assume all patients will experience the event. (c, d) Kaplan–Meier survival curves for the training and validation cohorts at different risk levels, stratified using the nomogram.
A risk classification system was constructed based on the total nomogram score. Patients with GCM were divided into three risk groups: low (total score <144), middle (144 ≤ total score <192), and high (total score ≥192). Kaplan–Meier curves showed robust discriminative power among the three risk groups. In the training cohort, the 12-, 36-, and 60-month survival rates were 79.8%, 51.4%, and 31.6% in the low-risk group; 64.9%, 34.9%, and 19.3% in the middle-risk group; and 49.9%, 14.8%, and 6.7% in the high-risk group, respectively (Figure 5c). In the validation cohort, the 12-, 36-, and 60-month survival rates were 80.5%, 53.0%, and 34.3% in the low-risk group; 65.8%, 35.5%, and 21.1% in the middle-risk group; and 48.1%, 14.7%, and 9.2% in the high-risk group, respectively (Figure 5d).
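The risk classification reduces to two cut-offs on the total nomogram score. A minimal sketch follows, using the cut-offs of 144 and 192 points stated above; the function name and group labels are hypothetical.

```python
LOW_CUTOFF = 144   # scores below this are low risk
HIGH_CUTOFF = 192  # scores at or above this are high risk


def risk_group(total_points):
    """Assign a patient to a risk group from the total nomogram score."""
    if total_points < LOW_CUTOFF:
        return "low"
    if total_points < HIGH_CUTOFF:
        return "middle"
    return "high"


# The example patient in Figure 3 has a total of 238 points.
example_group = risk_group(238)
```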
Discussion
The incidence of MPMs is gradually increasing, with considerable variation according to ethnicity, age, cancer type, and registry. 22 In Japan, MPMs cause approximately 25% of deaths among long-term survivors. 23 However, it remains difficult to conduct single-center studies on MPMs because of their sporadic nature and the diverse combinations of tumors involved. Thus far, most descriptions of MPMs in patients with GC (i.e., GCM) have been provided in case reports.24–26 Therefore, the present study used a public database (SEER) for analysis of GCM. The database contained 29,791 cases of GCM between 2010 and 2019, which was sufficient for a large-scale analysis.
In the present work, survival time was significantly longer in patients with GCM than in patients with GCO. There were similar findings in a study of patients with lung cancer: survival was slightly better among patients with multiple cancers than among patients with lung cancer alone. 27 Taken together, the previous findings and our real-world results indicate that some MPMs are indolent tumors that do not lead to poor patient outcomes. However, another study showed that survival was worse among patients with GCM than among patients with GCO. 13 Notably, only 70 patients with GCM were included in that study; the small sample size may have influenced the findings.
TNM stage, chemotherapy, and surgery clearly have an effect on GC prognosis 28 ; the present study confirmed their effects on GCM prognosis. However, Cox regression showed that radiotherapy (hazard ratio = 1.085, P = 0.054) tended to be an independent risk factor for worse survival in patients with GCM; further analysis revealed that this was mainly because most patients receiving radiotherapy had an advanced TNM stage. Reanalysis after PSM showed no effect of radiotherapy on survival time in patients with GCM. A previous study also demonstrated that adjuvant radiation therapy after D2 gastrectomy for node-positive GC does not improve survival. 29
Importantly, the present study showed that age, sex, number of primary cancers, and GC sequence were also independent risk factors for worse survival in patients with GCM. Among them, older age and male sex have previously been identified as risk factors for MPMs in the digestive system.30,31 A study of 24,105 patients with MPMs revealed that the male-to-female ratio was 1.45 to 1 in patients with MPMs and 2.7 to 1 in patients with GCM. 32 Regarding GC sequence and number of primary tumors, to the best of our knowledge, the present study is the first to demonstrate that they are independent risk factors for worse survival in patients with GCM. According to our statistical analysis, GC as the first tumor is generally a favorable prognostic factor for patients with GCM. Surprisingly, the presence of >2 tumors was a positive prognostic factor in patients with GCM.
Nomograms, graphical scoring tools for prediction models, are widely used to calculate event probability33,34 and constitute an important component of modern medical practice. Validation of our model demonstrated good accuracy and consistency; it is effective for predicting survival in patients with GCM. The C‐index and time‐dependent AUC findings confirmed the accuracy of the nomogram.
DCA showed that the nomogram had good net clinical benefits; these benefits were replicated in the validation cohort. After patients with GCM had been divided into low‐, middle‐, and high‐risk groups according to nomogram total score, Kaplan–Meier curves demonstrated significant differences in survival time among the three risk groups. These findings indicate that the nomogram has good clinical utility.
This study had some limitations. First, the SEER database does not allow investigation of information about non-GC tumors in patients with GCM (e.g., cancer type, TNM stage, and treatment). Thus, we could not explore the specific effects of these tumors on patients with GCM. Nonetheless, statistical analysis showed that GCM was a more indolent tumor, suggesting that some cancer combinations contributed to its indolence. Second, with respect to metachronous GCM, patients with good GC survival may be more likely to experience another malignancy. However, the SEER database does not clarify whether a patient had synchronous or metachronous GCM. To address this issue, we aligned GC baseline data between the two groups (by PSM) to minimize the effects of GC clinical characteristics on prognosis. Importantly, we found that survival was better among patients with GCM than among patients with GCO (Figure 2b), confirming that GCM is a relatively indolent tumor compared with GCO. Third, multicenter clinical validation is required to determine the external utility of our nomogram.
Conclusions
Compared with GCO, GCM is a relatively indolent malignancy. Age, marital status, sex, primary site, GC sequence, number of primary cancers, T stage, N stage, M stage, chemotherapy, and surgery were independent risk factors for worse survival in patients with GCM. The nomogram developed in this study can help clinicians to assess GCM prognosis.
Footnotes
Authors' contributions
XL and LZ conceived the study; LM analyzed the data and drafted the manuscript; JF and XZ organized the data and revised the manuscript; XL and LZ were responsible for the supervision of project and approval of the final version of the manuscript. All authors contributed to the manuscript and approved its submission for publication.
Availability of data and materials
The datasets used and analyzed during the present study are available from the corresponding author on reasonable request.
Declaration of conflicting interest
The authors declare that there is no conflict of interest.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
