Abstract
Introduction
Thyroid carcinoma (TC) is the most common endocrine tumor but has a favorable prognosis among malignancies, with a 5-year relative survival of approximately 98.3%. 1 In the past few decades, the incidence of TC has increased rapidly around the world, owing in part to more careful detection of this neoplasm. TC rarely metastasizes to distant sites (a rate of about 2%), but outcomes worsen remarkably once distant metastasis (DM) develops, as thyroid cancer-specific death accounts for 73.2%. 2 The lung is the most common metastatic site in TC patients, followed by the bone, liver and brain.2–6 Most thyroid cancer metastases are asymptomatic and are difficult to identify until they are detected by a full-body metastatic examination. Because it is asymptomatic, lung metastasis (LM) is often overlooked at the initial diagnosis of TC. Chest computed tomography (CT) and whole-body nuclear medicine bone scanning are advised 2-3 years after surgery or when patients develop symptoms suggestive of DM. Patients with advanced or recurrent thyroid cancer lack effective treatment, resulting in a high mortality rate. The 20-year survival from initial diagnosis has been reported to be roughly half as high in TC patients with LM as in patients without DM (51.2% vs 98%). 7
Radioactive iodine 131 therapy is the main treatment for TC patients with lymph node metastasis or distant metastases.6, 8–10 However, resistance to radioactive iodine 131 significantly reduces survival in these patients. In recent years, the treatment of LM has improved, and locoregional resection of pulmonary metastases has brought survival benefits to patients with LM.11, 12 Small molecule tyrosine kinase inhibitors (TKIs) can usually control disease and improve progression-free survival. 13 Among them, lenvatinib and sorafenib have been approved for radioiodine-refractory differentiated thyroid cancer after phase III clinical trials showed significant response rates and longer progression-free survival than placebo.14–16 Nevertheless, survival in thyroid cancer lung metastasis (TCLM) remains poor owing to late detection. A retrospective study demonstrated that early identification of asymptomatic DM was beneficial for the long-term survival of TC patients. 17 Therefore, identifying patients at high risk of LM is of great importance for improving the prognosis of TC patients.
In the present study, based on data from the Surveillance, Epidemiology, and End Results (SEER) database, we aimed to identify risk indicators related to LM and then to construct and validate a nomogram for predicting the risk of LM in newly diagnosed stage IV TC patients. Furthermore, we assessed the clinical utility of this model by examining the association between predicted risk and survival and by conducting a simulated preventive intervention trial. This study aims to provide a basis for LM screening and preventive intervention in TC patients.
Methods
Data Selection and Exclusion Criteria
We collected the data of newly diagnosed stage IV thyroid cancer (TC) patients using SEER*Stat 8.3.9. The study period was limited to 2010 to 2015 because the cases included in this study were staged using the 7th edition of the American Joint Committee on Cancer (AJCC) TNM staging system, which is recorded in the SEER database only for patients diagnosed from 2010 to 2015.
Demographic characteristics (age at diagnosis, sex, race, and marital status), disease characteristics (histological grade, histology, and TNM stage), treatment characteristics (surgery, radiotherapy, and chemotherapy), and survival status (survival time and cause of death) were included in our research. We first excluded patients who were diagnosed by autopsy or death certificate. The further exclusion criteria were as follows: (1) TC was not the only primary tumor; (2) age at diagnosis under 18 years; (3) missing clinicopathological data; (4) survival time equal to 0. The detailed selection process is displayed in Supplementary Figure 1. Ethical approval and informed consent were not required because the identifying information of individuals was not disclosed in the SEER database.
The metastatic status of all patients (lung, bone, brain, and liver) was evaluated at primary diagnosis, so the patients had not previously received any treatment for thyroid cancer. Finally, we identified 1407 patients, of whom 235 (16.7%) had lung metastasis, which was defined as the outcome event. We randomly divided the patients into a training cohort and a validation cohort at a ratio of 7:3. The training cohort was used to build the prediction model of LM, and the validation cohort was used to verify the model.
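The random 7:3 division can be sketched as follows. This is a minimal Python illustration rather than the authors' R code; the function name `split_cohort`, the seed, and the use of integer patient indices are assumptions made for the example. Because the partitioning procedure is not described in detail, the sketch is not expected to reproduce the reported cohort sizes exactly.

```python
import random

def split_cohort(n_patients, train_fraction=0.7, seed=2023):
    """Randomly split patient indices into training and validation lists."""
    indices = list(range(n_patients))
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is repeatable
    rng.shuffle(indices)
    n_train = round(n_patients * train_fraction)
    return indices[:n_train], indices[n_train:]

train_idx, valid_idx = split_cohort(1407)
print(len(train_idx), len(valid_idx))
```

A stratified split, which keeps the proportion of LM cases equal in both cohorts, is a common alternative when the outcome is relatively rare.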
Statistical Analysis
Pearson's Chi-squared test or Fisher's exact test was used to compare baseline characteristics and to evaluate the relationship between LM and clinicopathologic characteristics. Variables that were significant in these tests (P value < .05) were entered into a multivariate logistic regression analysis in the training set, and the variables that remained significant (P value < .05) were used to build the nomogram.
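The two univariate tests can be illustrated for the simplest case of a 2 x 2 table. The sketch below is a plain-Python illustration, not the authors' R code: the function names and the example counts are hypothetical, the Chi-squared statistic is computed without continuity correction, and variables with more than two levels (such as grade or histology) would need the general r x c form of the test.

```python
from math import comb, erfc, sqrt

def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared statistic and P value (1 df, no continuity correction)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2))  # upper tail of the Chi-squared distribution with 1 df
    return stat, p_value

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P value: sum over tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    return sum(prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows = factor present/absent, columns = LM yes/no
stat, p_chi = chi_squared_2x2(3, 1, 1, 3)
p_fisher = fisher_exact_2x2(3, 1, 1, 3)
print(stat, p_chi, p_fisher)
```

Fisher's exact test is usually preferred when expected cell counts are small, which is why both tests appear in the analysis.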
We evaluated the nomogram performance in terms of discrimination, calibration, and clinical utility. The discriminative ability of the nomogram was assessed using the area under the receiver operating characteristic curve (AUC). The predictive accuracy was evaluated by calibration curves using the bootstrapping method (1000 repeats). To quantify the clinical utility of the model, we treated the predicted risk of LM as a new variable (below median or above median) and performed Cox regression analysis together with other factors of thyroid cancer in the whole cohort. Kaplan–Meier curves and log-rank tests were applied to compare overall survival (OS) between the two groups. A simulation trial for preventing LM was also conducted to verify the clinical utility of this model. We calculated the individual risk of LM according to the nomogram in the total cohort and set different thresholds as intervention conditions. The number of patients receiving the preventive intervention and the number in whom LM would be successfully prevented reflect the clinical value of this nomogram in the prevention of TCLM.
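As a reminder of what a Kaplan–Meier curve estimates, the product-limit calculation can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis (which used the R survival package); the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct time with an observed death.

    times  : follow-up time of each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at time t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up times (months) and event indicators
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(curve)
```

Computing one such curve for the below-median group and one for the above-median group gives the two curves that the log-rank test compares.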
All of these analyses were executed using packages (including caret, rms, foreign, and survival) in R software (version 4.0.4; http://www.r-project.org). All statistical analyses were two-sided, and P < .05 was defined as statistically significant.
Results
Patient Characteristics
After applying the exclusion criteria, we extracted a total of 1407 patients with initially diagnosed stage IV thyroid cancer. A total of 235 (16.7%) patients had clinical evidence of LM. Supplementary Table 1 shows the baseline characteristics of the entire cohort, and there was no significant difference between the training set (N = 987) and the validation set (N = 420). As shown in Table 1, Pearson's Chi-squared test or Fisher's exact test was performed for the training set, and the results demonstrated that age at diagnosis (P = .002), grade (P < .001), histology (P < .001), T stage (P < .001), N stage (P < .001), bone metastasis (P < .001), brain metastasis (P < .001), and liver metastasis (P < .001) were associated with lung metastasis.
Table 1. Participant Characteristics for the Training Set and Validation Set.
For marital status, unmarried consists of single, divorced, separated, and widowed.
For race, ‘other’ includes American Indian, AK Native, Asian, and Pacific Islander.
For grade, Grade I means well-differentiated, Grade II means moderately differentiated, Grade III means poorly differentiated, Grade IV means undifferentiated or anaplastic.
For histology, PTC means papillary TC, ATC means anaplastic TC, FTC means follicular TC, MTC means medullary TC, and other includes all histological types of thyroid cancer except for the above four types.
Multivariate Logistic Regression Analysis
Based on the results of Pearson's Chi-squared test or Fisher's exact test, 8 statistically significant variables were included in the multivariate logistic regression analysis. Grade, histology, N stage, bone involvement, and liver involvement were found to be independent risk predictors of LM (Table 2).
Table 2. Multivariate Logistic Regression Analysis of Predictive Factors of Lung Metastasis in the Training Set.
Abbreviations: OR, odds ratio; CI, confidence interval.
Nomogram Establishment
We used the 5 independent risk predictors to establish a nomogram to predict the risk of LM (Figure 1). In the nomogram, every variable was assigned a specific score (Supplementary Table 2), and the probability of LM in an individual patient can be estimated by summing the scores of all variables. The nomogram showed that histology contributed the most to the risk of LM, followed by grade. Strikingly, the relationship between N stage and LM risk was not monotonic: patients at N1a had the highest risk of LM, followed by patients at N1b, and those at N0 had the lowest risk.

Figure 1. A nomogram for predicting lung metastasis in initially diagnosed stage IV thyroid cancer patients.
Nomogram Performance
The performance of the nomogram in predicting LM was evaluated in terms of discrimination, calibration, and clinical utility. Discrimination and predictive accuracy were evaluated by the area under the ROC curve (AUC) and calibration curves, respectively. The nomogram had a moderate discrimination ability, with an AUC of 0.794 in the training set (Figure 2A) and 0.819 in the validation set (Figure 2B). Calibration plots of the model were then built with 1000 bootstrap resamples (Figure 2C and D). The calibration curves showed good agreement between the bias-corrected curve and the ideal curve in the training and validation sets, indicating that the nomogram was well calibrated in both sets.
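The AUC can be read as the probability that a randomly chosen patient with LM receives a higher predicted risk than a randomly chosen patient without LM. The following Python sketch computes it directly from that definition; the function name and the toy scores are hypothetical and are not taken from the study data.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks and observed LM status (1 = LM, 0 = no LM)
example_auc = auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
print(example_auc)
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of patients with and without LM.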

Figure 2. Receiver operating characteristic (ROC) curves and calibration curves (bootstrap = 1000 repetitions) of the training set and the validation set for lung metastasis. The model had an area under the curve (AUC) of 0.794 in the training set (A) and 0.819 in the validation set (B).
We calculated the total points of every patient based on the nomogram. Total points ranged from 176 to 448, with a median of 219, and each value of total points corresponds to a value of the LM prediction risk. To evaluate the clinical utility of the nomogram, we defined a new variable, the LM prediction risk (below median or above median), and performed Cox regression analysis together with other clinical parameters of thyroid cancer (Table 3). Multivariate Cox regression analysis showed that the LM prediction risk obtained from the nomogram was one of the most significant parameters associated with OS [P = .009, hazard ratio (HR): 1.812, 95% CI: 1.163-2.824]. Moreover, significant differences were observed between the below-median and above-median groups in the Kaplan–Meier analysis (Figure 3, P < .0001). These results demonstrated that the LM prediction risk could serve as a prognostic indicator: TC patients with a higher LM prediction risk had a worse prognosis.
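The correspondence between total points and predicted risk can be made concrete with the five point-risk pairs reported for the simulation-trial thresholds (200, 250, 300, 350, and 400 points). The Python sketch below checks that these pairs are consistent with the logistic link of a logistic-regression nomogram by fitting a straight line to the logit of the reported risks. The fitted slope and intercept are inferred from those five pairs only; they are not the published model coefficients, and the function names are illustrative.

```python
from math import exp, log

# (total points, predicted LM risk) pairs as reported for the simulation-trial thresholds
REPORTED = [(200, 0.0187), (250, 0.0755), (300, 0.2596), (350, 0.6008), (400, 0.8661)]

def logit(p):
    """Log-odds of a probability."""
    return log(p / (1 - p))

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

points = [pt for pt, _ in REPORTED]
slope, intercept = fit_line(points, [logit(r) for _, r in REPORTED])

def risk_from_points(total_points):
    """Approximate LM risk implied by the fitted logistic link (an inference, not the published formula)."""
    return 1 / (1 + exp(-(intercept + slope * total_points)))

for pt, reported in REPORTED:
    print(pt, reported, round(risk_from_points(pt), 4))
```

Because the logit of the reported risks is very close to linear in total points, a single slope and intercept reproduce all five reported values closely, which is what one would expect if each nomogram point adds a fixed amount to the log-odds of LM.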

Figure 3. Kaplan–Meier curves comparing the overall survival for patients with a predictive risk of lung metastasis above or below the median.
Table 3. Univariate and Multivariate Cox Regression Analysis of Prognostic Factors for Overall Survival in the Total Cohort.
Abbreviations: HR, hazard ratio; CI, confidence interval.
We further designed a simulation trial of preventing LM in the whole cohort to corroborate the clinical utility of the nomogram. In the simulation trial, we set total points of 200, 250, 300, 350, and 400 as thresholds (corresponding to LM prediction risks of 1.87%, 7.55%, 25.96%, 60.08%, and 86.61%, respectively), which were regarded as the conditions for receiving the prophylactic intervention. As shown in Table 4, when all patients received preventive treatment without selection, LM could be prevented to the greatest extent, but many patients would be over-treated. When we selected patients with a nomogram-predicted risk of LM >7.55%, 38.3% (539/1407) of the population would receive treatment, and about 71.5% (168/235) of LM cases could potentially be prevented. Assuming that prophylactic treatment reduces LM risk by 75%, the number of LM cases could decrease to about 46.4% of the original number. Other preventive thresholds set from our nomogram and the corresponding effects on preventing LM are displayed in Table 4. In short, our model is expected to identify patients at high risk of LM who could receive prophylactic treatment, thereby improving the prognosis of TC patients.
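The arithmetic for the >7.55% threshold can be reproduced from the counts quoted in this paragraph. The sketch below is a Python illustration with hypothetical function and key names; the 75% efficacy is the assumption stated in the text. Under this reading, the 46.4% figure corresponds to the LM cases remaining relative to the original 235 cases, which is an interpretation of the reported number rather than a result taken from Table 4.

```python
def simulate_threshold(n_total, n_lm, n_treated, n_lm_treated, efficacy):
    """Summarise one prophylactic-treatment threshold in the simulated trial."""
    prevented = n_lm_treated * efficacy  # LM cases avoided among treated patients
    remaining = n_lm - prevented         # LM cases still expected in the cohort
    return {
        "treated_fraction": n_treated / n_total,
        "lm_covered_fraction": n_lm_treated / n_lm,
        "remaining_lm_fraction": remaining / n_lm,    # relative to the original LM count
        "remaining_incidence": remaining / n_total,   # relative to the whole cohort
    }

result = simulate_threshold(1407, 235, 539, 168, efficacy=0.75)
for key, value in result.items():
    print(key, round(value, 3))
```

Repeating the call with the counts for the other thresholds would give the trade-off between the share of patients treated and the share of LM cases covered at each cut-off.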
Table 4. Clinical Utility of the Nomogram Evaluated by the Virtual Trial With Several Thresholds for Prophylactic Treatment and the Corresponding Effects on the Prevention of Lung Metastasis.
Abbreviations: PT, prophylactic therapy; TCLM, thyroid cancer lung metastasis.
Discussion
The lung has the densest capillary bed in the body and is the first reservoir of most lymphatic drainage entering the venous system, which makes it the organ most frequently affected by metastasis in malignancies. 18 Autopsies of patients who died of malignancy have found pulmonary metastases in over 50% of cases. 19 Early LM of thyroid cancer lacks typical symptoms and is generally found on imaging examination as multiple small nodular shadows. 20 Cough, hemoptysis, and dyspnea are the typical symptoms of advanced LM, which rapidly impair quality of life and hasten death. Therefore, early identification of LM at the primary diagnosis of TC is of vital significance for choosing suitable therapies and improving the prognosis. Recently, nomograms have been widely used to predict the effect of treatment or the risk of clinical events. In this research, we identified independent risk factors for LM based on a population-based database and constructed a nomogram to estimate the risk of LM in newly diagnosed stage IV TC patients, so as to assist clinicians in identifying individuals at high risk of LM and implementing early preventive interventions to extend survival.
Our study yielded several notable findings. We identified 5 independent factors associated with LM: grade, histology, N stage, bone metastasis, and liver metastasis. Prior studies have demonstrated that poor differentiation is a high-risk factor for developing LM and that progressively increasing grade is associated with an increased risk of LM.4, 21 This was consistent with our results: patients with Grade IV (undifferentiated or anaplastic) TC were more likely to have LM than patients with Grade III (poorly differentiated) TC, followed by those with Grade II (moderately differentiated) TC, and the risk was lowest for patients with Grade I (well-differentiated) TC. Regarding histological type, an existing study reported that follicular TC was associated with lung metastasis. 4 Similar conclusions were observed in our study, suggesting that FTC histology was an independent risk factor in predicting TCLM. Moreover, lymphatic metastasis has also been reported as a high-risk factor for developing LM,21, 22 which was confirmed in this study. It is generally accepted that lateral lymph node metastasis is more likely than central lymph node metastasis to invade distant organs. In contrast, the relationship between N stage and LM was not monotonic in our study: patients at N1a had the highest risk of LM, followed by patients at N1b, and the lowest risk was at N0. Several reasons might explain this result. First, the difference might be caused by the small number of patients included in the study. Second, N stage does not capture the lymph node ratio (LNR) or lymph node size. LNR is generally defined as the number of metastatic lymph nodes divided by the number of removed lymph nodes and is regarded as a variable reflecting both the extent of lymph node metastasis and the completeness of resection, and lymph node size is defined as the maximal diameter of tumor in the metastatic lymph node.
Many studies have shown that a high LNR and the presence of large lymph node metastases are significant risk factors for thyroid cancer recurrence.23–27 In addition, for MTC patients undergoing thyroidectomy and lymphadenectomy, a high lymph node number predicted poorer survival in the subpopulation of node-positive patients,28,29 and a similar result was reported in PTC patients. 30 Another study reported that a higher number of involved lymph nodes implied a higher risk of LM in MTC patients. 31 Accordingly, the LNR, lymph node size, and the number of metastatic lymph nodes have an important impact on disease progression and prognosis in thyroid cancer. However, this information was not included in our study because of incomplete data records in the SEER database, which might have led to results that do not conform to conventional understanding. Therefore, studies with larger populations and rigorous prospective designs are necessary in the future. Furthermore, our results indicated that bone metastasis and liver metastasis were significant variables in predicting LM, whereas brain metastasis had no significant relationship with LM.
Our nomogram has several clinical uses for the identification of patients at high risk of LM. First, clinicians can increase the frequency of lung CT for high-risk patients to achieve early diagnosis of pulmonary metastasis. Moreover, in terms of local therapy, identifying lung metastasis early and performing local resection of pulmonary metastatic sites can prolong survival. 11 Therefore, the early identification of LM allows TC patients to undergo local surgical treatment as soon as possible, which is expected to improve prognosis. In addition, for systemic therapy, lenvatinib, sorafenib, and apatinib could significantly improve progression-free survival in patients with progressive locally advanced or radioactive iodine-refractory differentiated thyroid cancer.14–16 Apatinib plus camrelizumab (anti-PD1 therapy, SHR-1210) has shown clinical benefits in a variety of malignancies,32–39 suggesting its potential therapeutic value in advanced TC. These novel targeted drugs could be given earlier to high-risk patients to prevent LM. In our research, by estimating the number of patients in whom LM would be successfully prevented after prophylactic treatment at different intervention thresholds, the simulation trial also supported the health economic value of this nomogram in predicting TCLM.
We conducted this study using data from the SEER database. The SEER database covers approximately 30% of the US population and may therefore reflect the actual clinical outcomes of TC patients with LM better than single-center studies. In addition, we included only stage IV TC patients, aiming to overcome the limitation of overlapping DM high-risk factors (such as T stage and tumor stage) in studies covering the whole TC population. Nevertheless, our study has some notable limitations. First, recurrence and follow-up information are absent from the SEER database, so our model can be applied only to newly diagnosed TC patients and cannot be used to predict the risk of developing LM later in the course of disease. Second, owing to the lack of typical symptoms, the actual incidence of LM in newly diagnosed thyroid cancer might have been underestimated. Third, because of the small number of included patients and the retrospective nature of the study, further prospective studies with larger samples are warranted to verify the conclusions of this study.
Conclusion
In conclusion, this was the first study to explore the risk indicators of LM in TC, and we built an effective model to predict LM. The model is expected to assist clinicians in identifying TCLM patients early and in guiding early preventive interventions.
Supplemental Material
Supplemental material for A High-Quality Nomogram for Predicting Lung Metastasis in Newly Diagnosed Stage IV Thyroid Cancer: A Population-Based Study by WenYi Wang, JiaJing Liu, XiaoFan Xu, LiQun Huo, XuLin Wang and Jun Gu in Technology in Cancer Research & Treatment:
sj-jpg-1-tct-10.1177_15330338231167807
sj-docx-2-tct-10.1177_15330338231167807
sj-docx-3-tct-10.1177_15330338231167807
sj-pdf-4-tct-10.1177_15330338231167807
Footnotes
Abbreviations
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Ethical Statement
The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This article does not contain any studies with human participants or animals performed by any of the authors. Ethical approval and informed consent were not required because the identifying information of individuals was not disclosed in the SEER database.
Informed Consent
Not applicable.
Data Availability Statement
The data generated during the current study are available in the SEER database.
Supplemental Material
Supplemental material for this article is available online.
References
