Abstract
BACKGROUND:
Therapeutic possibilities for non-small cell lung cancer (NSCLC) have considerably increased during recent decades.
OBJECTIVE:
To summarize the prognostic relevance of serum tumor markers (STM) for early and late-stage NSCLC patients treated with classical chemotherapies and novel targeted and immune therapies.
METHODS:
A PubMed database search was conducted for prognostic studies on carcinoembryonic antigen (CEA), cytokeratin-19 fragment (CYFRA 21-1), neuron-specific enolase, squamous-cell carcinoma antigen, progastrin-releasing-peptide, CA125, CA 19-9 and CA 15-3 STMs in NSCLC patients published from 2008 until June 2022.
RESULTS:
Out of 1069 studies, 141 were identified as meeting the inclusion criteria. Considerable heterogeneity regarding design, patient number, analytical and statistical methods was observed. High pretherapeutic CYFRA 21-1 levels and insufficient decreases indicated unfavorable prognosis in many studies on NSCLC patients treated with chemo-, targeted and immunotherapies or their combinations in early and advanced stages. Similar results were seen for CEA in chemotherapy; however, high pretherapeutic levels were sometimes favorable in targeted therapies. CA125 is a promising prognostic marker in patients treated with immunotherapies. Combinations of STMs further increased the prognostic value over single markers.
CONCLUSION:
Protein STMs, especially CYFRA 21-1, have prognostic potential in early and advanced stage NSCLC. For future STM investigations, better adherence to comparable study designs, analytical methods, outcome measures and statistical evaluation standards is recommended.
Introduction
Lung cancer is still the second most frequent cancer type, accounting for 11.4% of all cancers, and is the leading cause of cancer mortality, with an estimated 1.8 million deaths per year worldwide (18%) [1, 2]. Over the last decade, the incidence and mortality of lung cancer have steadily declined [3], mainly due to improvements in both diagnostic and therapeutic areas, such as the introduction of low-dose computed tomography for early lung cancer detection in high-risk groups [4] and the approval of novel surgical and systemic treatment approaches including targeted tyrosine kinase inhibitor (TKI) therapies and immune checkpoint inhibitor (ICI) therapies [5, 6]. Consequently, the prognosis for early-stage non-small cell lung cancer (NSCLC) has improved in recent years, with a 5-year survival rate of 72% for lung adenocarcinoma (LUAD) and 48% for squamous-cell lung cancer (LUSC) [7, 8]. However, 55% of NSCLC patients continue to be diagnosed with unresectable advanced stages IIIB to IV, which are associated with a 5-year survival rate of only 9.5% [9] and a median survival of 8 to 18 months [10–12]. The advent of targeted and ICI therapies, as well as of new combination regimens [6], has also steadily improved survival in late-stage disease [13]. Notably, for patients ineligible for targeted or ICI therapies, combination chemotherapy regimens remain the recommended systemic therapy for LUSC and LUAD [14, 15].
Patient guidance goes beyond the molecular classification of lung tumors, which enables precise patient stratification using predictive “companion diagnostics” that indicate the likelihood of response to specific targeted or ICI therapies [16, 17]. It also involves estimating overall prognosis, individually monitoring therapy response, and post-therapeutic surveillance using radiological and biochemical biomarkers [18, 19].
At present, considerable efforts are devoted to developing predictive molecular diagnostics, such as screening for tumor-specific genomic alterations in EGFR, ALK, ROS1, BRAF, NTRK1/2/3, RET, MET genes, for tumor mutational burden (TMB), mismatch repair and microsatellite instability amongst others, that are assessed in tumor tissue and on cell-free tumor DNA (ctDNA) circulating in the blood plasma [19–26].
To estimate prognosis, clinical markers, such as TNM stage, performance score, weight loss, lymph node involvement, metastases and the histologic subtypes [20, 27], as well as blood-based biochemical markers like routine lab parameters and tumor-associated proteins, provide valuable information in daily clinical practice. In the future, novel molecular markers like mRNA, miRNA, genetic and epigenetic changes in tumor and plasma DNA will further expand the array of prognostic markers [20, 29]. Regarding serum-based protein tumor markers (STM), numerous original studies and reviews have demonstrated prognostic relevance, particularly for cytokeratin-19 fragments (CYFRA 21-1), as well as carcino-embryonic antigen (CEA), neuron-specific enolase (NSE), squamous cell cancer antigen (SCCA), carbohydrate antigens 19-9 and 125 (CA 19-9 and CA 125) in NSCLC patients [30].
The present survey aims to update the findings of our 2010 review [27], which compiled all studies up to 2008 concerning the prognostic significance of the serum tumor markers CEA, CYFRA 21-1, NSE, CA 125, CA 19-9, CA 15-3, SCCA, and ProGRP in both early and late-stage NSCLC. In this updated review, we incorporate all prognostic research conducted from 2008 until June 2022, presenting the results and grading the evidence based on criteria established by Hayes et al. [31]. We categorize the examined studies by stage due to the varying prognostic situations and therapeutic implications in early and advanced NSCLC stages. As in the previous review, the majority of studies focus on patients undergoing chemotherapy, and the most pertinent tumor markers are discussed individually, with comprehensive and detailed overviews provided in tables. Furthermore, we expanded the search to include the predictive value of STM in advanced stage NSCLC patients treated with targeted or ICI therapies. Finally, we critically discuss the limitations in study comparability due to heterogeneity and inconsistencies in the use of prediction and prognosis terminology [28, 32].
Methods
A search in the PubMed database was performed using the terms (and corresponding terms) “non-small cell lung cancer” (or “NSCLC”) AND “prognostic value” (or “prognosis” or “survival” or “prediction”) AND serum biomarkers: “CEA” (or “carcinoembryonic antigen”) or “CYFRA 21-1” (or “CYFRA21-1” or “cytokeratin-19 fragment”) or “NSE” (or “neuron-specific enolase” or “neuron specific enolase”) or “SCCA” (or “squamous cell carcinoma antigen” or “SCC-Ag”) or “CA19-9” (or “CA 19-9” or “carbohydrate antigen 19-9”) or “CA15-3” (or “CA 15-3” or “cancer antigen 15-3”) or “CA125” (or “CA 125” or “cancer antigen 125”) since the year 2008 (and three studies from 2007, not included in the last review) until June 2022. We supplemented the structured literature inquiry with a search of the reference lists from the included articles, to find additional eligible studies. Figure 1 displays a flow chart outlining the search process.
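The search logic described above can be sketched as a single boolean query string. This is only an illustration: the text does not give the exact syntax submitted to PubMed, so the quoting, the grouping with parentheses and the helper names (`or_group`, `build_query`) are assumptions, and the date restriction is left out.

```python
# Term groups as listed in the Methods. Synonyms inside a group are OR-ed
# and the three groups are AND-ed; this grouping syntax is an assumption.
DISEASE = ["non-small cell lung cancer", "NSCLC"]
OUTCOME = ["prognostic value", "prognosis", "survival", "prediction"]
MARKERS = [
    "CEA", "carcinoembryonic antigen",
    "CYFRA 21-1", "CYFRA21-1", "cytokeratin-19 fragment",
    "NSE", "neuron-specific enolase", "neuron specific enolase",
    "SCCA", "squamous cell carcinoma antigen", "SCC-Ag",
    "CA19-9", "CA 19-9", "carbohydrate antigen 19-9",
    "CA15-3", "CA 15-3", "cancer antigen 15-3",
    "CA125", "CA 125", "cancer antigen 125",
]


def or_group(terms):
    """Quote each term and join the synonyms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query():
    """Combine disease, outcome and marker groups with AND."""
    return " AND ".join(or_group(group) for group in (DISEASE, OUTCOME, MARKERS))


query = build_query()
print(query)
```

Keeping the synonym lists as data makes it easy to see which spellings of each marker were covered by the search.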

Figure 1. Flow diagram of the literature search in PubMed. NSCLC (non-small cell lung cancer).
Inclusion criteria were: article in English (or German) language, no double publication, NSCLC patients identifiable, no mixed histology investigations with SCLC, minimum number of participants N > 40, appropriate “prognostic” study design and statistical survival analysis evaluation, relevant serum biomarkers, no case reports. The following items are listed in Tables 1–3: study type, number of patients, tumor stage, histology, therapy, endpoint investigated, STMs investigated, analytics and analyzer used, evaluation of results, the level of evidence, statistically significant prognostic STMs and additional investigated markers.
Table 1. Summary of prognostic biomarker studies in patients with early-stage non-small cell lung cancer
Findings are presented as positive predictive for the corresponding endpoint in multivariate analysis (low tumor marker levels reflect longer endpoint), unless otherwise specifically described. If not otherwise stated, baseline serum tumor marker levels are given. LOE (level of evidence), OS (overall survival), DFS (disease-free survival), RFS (recurrence-free survival), LRFS (local relapse-free survival), DMFS (distant metastasis-free survival), PFS (progression-free survival), PRS (post-recurrence survival), ORR (overall response rate), PPS (post-progression survival), DCB (durable clinical benefit), DCR (disease control rate), STM (serum tumor marker), NSCLC (non-small cell lung cancer), LUAD (lung adenocarcinoma), LUSC (lung squamous cell carcinoma), CEA (carcinoembryonic antigen), CYFRA21-1 (cytokeratin-19 fragment), CA19-9 (carbohydrate antigen 19-9), CA 15-3 (cancer antigen 15-3), CA125 (cancer antigen 125), NSE (neuron-specific enolase), SCCA (squamous cell carcinoma antigen), ProGRP (pro-gastrin releasing peptide), TPSA (tissue polypeptide specific antigen), NLR (neutrophil lymphocyte ratio), SLex (sialyl Lewis x), OPN (osteopontin), FEV 1s (forced expiratory volume in 1 second), RT (radiotherapy), ChT (chemotherapy), RChT (radiochemotherapy), PS (performance status), IPI (inflammatory-prognostic index), TMII (tumor marker and inflammation index), PLT (platelet count), TKI (tyrosine kinase inhibitor), ICI (immune checkpoint inhibitor), PD-L1 (programmed death-ligand 1), PD-1 (programmed cell death protein 1), EGFR (epidermal growth factor receptor), ALK (anaplastic lymphoma kinase), TGF-alpha (transforming growth factor alpha), LDH (lactate dehydrogenase), HB-EGF (heparin binding epidermal growth factor like factor), TK (thymidine kinase), NA (no data), GPS (Glasgow Prognostic Score), TIMP1 (tissue inhibitor of metalloproteinase-1), TrxR (thioredoxin reductase), PLR (platelet-lymphocyte ratio), PAR (platelet-activated receptor), EGFR mut (epidermal growth factor receptor mutation status), VEGFR (vascular endothelial growth factor receptor), SCS (simplified comorbidity score), LVI (lymphatic vascular invasion), Ca (calcium), HGF (hepatocyte growth factor), CLIA (chemiluminescent immunoassay), ECLIA (electro-chemiluminescence immunoassay), ELISA (enzyme-linked immunosorbent assay), IRMA (immunoradiometric assay), IEA (immunoenzymatic assay), RIA (radioimmunoassay), uni (univariate).
Table 2. Summary of prognostic biomarker studies investigating patients with all stages of non-small cell lung cancer
Findings are presented as positive predictive for the corresponding endpoint in multivariate analysis (low tumor marker levels reflect longer endpoint), unless otherwise specifically described. If not otherwise stated, baseline serum tumor marker levels are given. LOE (level of evidence), OS (overall survival), DFS (disease-free survival), RFS (recurrence-free survival), LRFS (local relapse-free survival), DMFS (distant metastasis-free survival), PFS (progression-free survival), ORR (overall response rate), PPS (post-progression survival), DCB (durable clinical benefit), DCR (disease control rate), STM (serum tumor marker), NSCLC (non-small cell lung cancer), LUAD (lung adenocarcinoma), LUSC (lung squamous cell carcinoma), CEA (carcinoembryonic antigen), CYFRA21-1 (cytokeratin-19 fragment), CA19-9 (carbohydrate antigen 19-9), CA 15-3 (cancer antigen 15-3), CA125 (cancer antigen 125), NSE (neuron-specific enolase), SCCA (squamous cell carcinoma antigen), ProGRP (pro-gastrin releasing peptide), TPSA (tissue polypeptide specific antigen), NLR (neutrophil lymphocyte ratio), SLex (sialyl Lewis x), RT (radiotherapy), ChT (chemotherapy), RChT (radiochemotherapy), PS (performance status), IPI (inflammatory-prognostic index), PLT (platelet count), TKI (tyrosine kinase inhibitor), ICI (immune checkpoint inhibitor), ALK (anaplastic lymphoma kinase), TGF-alpha (transforming growth factor alpha), LDH (lactate dehydrogenase), HB-EGF (heparin binding epidermal growth factor like factor), TK (thymidine kinase), NA (no data), GPS (Glasgow Prognostic Score), TIMP1 (tissue inhibitor of metalloproteinase-1), TrxR (thioredoxin reductase), PLR (platelet-lymphocyte ratio), PAR (platelet-activated receptor), EGFR mut (epidermal growth factor receptor mutation status), VEGFR (vascular endothelial growth factor receptor), SCS (simplified comorbidity score), LVI (lymphatic vascular invasion), Ca (calcium), HGF (hepatocyte growth factor), LAMC (laminin subunit gamma 2), CLIA (chemiluminescent immunoassay), ECLIA (electro-chemiluminescence immunoassay), ELISA (enzyme-linked immunosorbent assay), IRMA (immunoradiometric assay), uni (univariate).
Table 3. Summary of prognostic biomarker studies in patients with advanced non-small cell lung cancer
Findings are presented as positive predictive for the corresponding endpoint in multivariate analysis (low tumor marker levels reflect longer endpoint), unless otherwise specifically described. If not otherwise stated, baseline serum tumor marker levels are given. LOE (level of evidence), NA (no data), OS (overall survival), DFS (disease-free survival), RFS (recurrence-free survival), LRFS (local relapse-free survival), DMFS (distant metastasis-free survival), PFS (progression-free survival), ORR (overall response rate), PPS (post-progression survival), FFS (failure-free survival), DCB (durable clinical benefit), DCR (disease control rate), STM (serum tumor marker), NSCLC (non-small cell lung cancer), LUAD (lung adenocarcinoma), LUSC (lung squamous cell carcinoma), CEA (carcinoembryonic antigen), CYFRA21-1 (cytokeratin-19 fragment), CA19-9 (carbohydrate antigen 19-9), CA 15-3 (cancer antigen 15-3), CA125 (cancer antigen 125), NSE (neuron-specific enolase), SCCA (squamous cell carcinoma antigen), ProGRP (pro-gastrin releasing peptide), TPSA (tissue polypeptide specific antigen), NLR (neutrophil lymphocyte ratio), SLex (sialyl Lewis x), RT (radiotherapy), ChT (chemotherapy), RChT (radiochemotherapy), PS (performance status), IPI (inflammatory-prognostic index), PLT (platelet count), TKI (tyrosine kinase inhibitor), ICI (immune checkpoint inhibitor), PD-L1 + 2 (programmed death-ligand 1 + 2), PD-1 (programmed cell death protein 1), sEGFR (soluble epidermal growth factor receptor), EGFR (epidermal growth factor receptor), ALK (anaplastic lymphoma kinase), TGF-alpha (transforming growth factor alpha), LDH (lactate dehydrogenase), HB-EGF (heparin binding epidermal growth factor like factor), TK (thymidine kinase), GPS (Glasgow Prognostic Score), TIMP1 (tissue inhibitor of metalloproteinase-1), TrxR (thioredoxin reductase), PLR (platelet-to-lymphocyte ratio), PAR (platelet-to-albumin ratio), EGFR mut (epidermal growth factor receptor mutation status), ALP (alkaline phosphatase), CLIA (chemiluminescent immunoassay), ECLIA (electro-chemiluminescence immunoassay), ELISA (enzyme-linked immunosorbent assay), IRMA (immunoradiometric assay), RIA (radioimmunoassay), uni (univariate).
Grade of evidence was rated according to the criteria suggested and adapted by Hayes et al. [31]:
Level I: Evidence from a single, high-powered, prospective, controlled study that is specifically designed to test the marker, or evidence from a meta-analysis, pooled analysis or overview of level II or III studies.
Level II: Evidence from a study in which marker data are determined in relationship to a prospective therapeutic trial that is performed to test a therapeutic hypothesis but is not specifically designed to test marker utility.
Level III: Evidence from large prospective or retrospective studies.
Level IV: Evidence from small retrospective studies.
Level V: Evidence from small pilot studies.
Figure 2 presents the number of investigations, rather than the number of studies or patients, as in some studies multiple endpoints or baseline and additional kinetics of STMs were investigated. Consequently, in some studies, several investigations were conducted and considered separately.

Figure 2. Results of tumor marker investigations in non-small cell lung cancer for all stages. The size of circles reflects the number of investigations, since baseline values, values post therapy or kinetics are investigated separately in some studies. Hence, the size of circles does not represent the number of studies but the number of investigations of the tumor marker. Positive predictive (low tumor marker levels reflect longer endpoint), negative predictive (high tumor marker levels reflect longer endpoint), NS (not significant), CEA (carcinoembryonic antigen), CYFRA21-1 (cytokeratin-19 fragment), NSE (neuron-specific enolase), CA125 (cancer antigen 125), SCCA (squamous cell carcinoma antigen), CA19-9 (carbohydrate antigen 19-9), ProGRP (pro-gastrin releasing peptide), CA15-3 (cancer antigen 15-3), TT (targeted therapy), ICI (immune checkpoint inhibitor), ChT (chemotherapy), OS (overall survival), PFS (progression-free survival).
Since 2008, numerous prognostic protein biomarker studies have been published. A total of 1069 articles were identified in the PubMed database search for publications between 2008 and June 2022. Of these, 822 articles were excluded in the abstract screening as they did not fulfil the inclusion criteria. In full text screening of the remaining 247 articles, a further 133 were found not to be eligible. Finally, a total of 114 studies were included in the review: 16 papers for the evaluation of all stage NSCLC, 36 papers for early-stage NSCLC and 62 for advanced stage NSCLC (Fig. 1). Among patients with advanced stages who were treated with either tyrosine kinase inhibitors (TKI) or immunotherapy (ICI), further studies were identified that claimed predictive value and conducted survival analysis. These studies investigated the same endpoints, primarily OS and PFS, making it difficult to differentiate them from studies on prognostic value. These studies are discussed in a separate section.
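The screening counts reported above can be checked with simple arithmetic. The short sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts taken from the screening description in the text.
identified = 1069            # articles found in the PubMed search
excluded_at_abstract = 822   # excluded during abstract screening
excluded_at_full_text = 133  # excluded during full text screening

full_text_screened = identified - excluded_at_abstract
included = full_text_screened - excluded_at_full_text

# Included prognostic studies grouped by the stage category they address.
by_stage = {"all stages": 16, "early stage": 36, "advanced stage": 62}

print(full_text_screened, included, sum(by_stage.values()))
```

The per-stage counts add up to the number of included prognostic studies, so the three tables together cover the full set.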
The majority of prognostic studies were single-center (102 out of 114), retrospective (86 out of 114) observations of single or multiple marker combinations at baseline (98 out of 114), before the initiation of therapy. Tumor marker kinetics during the course of treatment were considered more frequently (25 out of 114), especially in advanced stage NSCLC (20 out of 62). The primary endpoint for predicting prognosis was overall survival (OS; 95 out of 114) followed by the surrogate endpoints, progression-free survival (PFS; 45 out of 114) and disease-free survival (DFS; 9 out of 114) (Tables 1–3).
The most frequently investigated tumor markers were CEA (98 out of 114), CYFRA 21-1 (72 out of 114), and NSE (33 out of 114), while other markers such as SCCA, CA 125, CA 19-9, CA 15-3 and tissue polypeptide specific antigen (TPS) were investigated in single studies (Tables 1–3). Furthermore, routine blood parameters like C-reactive protein (CRP) [33–35], sodium [36], albumin [37, 38], ferritin [39], neutrophil-lymphocyte ratio (NLR) [40–42] and lactate dehydrogenase (LDH) [43] were identified as independent prognostic factors in studies investigating serum tumor markers in NSCLC (Tables 1–3). Over 90% of studies provided evidence levels 3 and 4, according to Hayes et al. [31].
In early stage NSCLC, most studies investigated tumor markers in patients undergoing surgery with or without additional chemotherapy (Table 1). Patients in studies investigating all stages were mainly treated with chemotherapy; however, treatment strategies were highly heterogeneous (Table 2). Reflecting the therapeutic advancements in late-stage NSCLC, chemotherapy regimens (18 out of 64) have been increasingly supplemented or substituted by tyrosine kinase inhibitor (TKI) (32 out of 64) or immune checkpoint inhibitor (ICI) (12 out of 64) therapies (Table 3).
Cytokeratin-19 fragments – CYFRA 21-1
As already reported in the previous review [27], CYFRA 21-1 is one of the most valuable prognostic tumor markers in early and late-stage NSCLC. CYFRA 21-1 is the soluble fragment of cytokeratin 19 that is released after proteolytic degradation of the cytoskeleton of epithelial cells into the blood stream [44, 45].
In early-stage NSCLC, surgical resection of the tumor is applied as potentially curative therapy. However, 5-year OS is only 61%, which leaves about 40% of patients with a worse prognosis, underlining the need for adjuvant chemotherapies [7]. The most homogeneous prognostic studies focus on a subgroup, e.g. only stage I disease. Eighty percent of the reviewed early-stage prognostic studies consistently confirm the independent unfavorable prognostic value of high pretherapeutic CYFRA 21-1 levels (Table 1). Several studies combined CYFRA 21-1 with CEA in a so-called tumor marker index (TMI), which was prognostically more informative than CYFRA 21-1 or CEA alone [46–48].
In a retrospective study [49] including 227 patients, subjects with elevated baseline CYFRA 21-1 and CEA levels (high-risk group) had a shorter PFS than the low-risk group in the whole cohort and in the LUSC subgroup, but not in patients with LUAD. On the other hand, Chen et al. (2021) investigated 2654 NSCLC patients [50] and reported that high CYFRA 21-1 levels were associated with worse recurrence-free survival (RFS) in LUAD but not in LUSC patients, which concurred with several other studies [51–53]. In a cohort of 1016 early stage NSCLC patients, Jiang et al. [54] found shorter OS and DFS for high CYFRA 21-1 levels in LUAD patients with EGFR-mutated, but not with EGFR wild-type tumors. These studies highlight the importance of histological subgroup analyses and consideration of EGFR mutation status.
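The text does not state how the TMI is calculated. One formulation found in the literature takes the geometric mean of the two marker concentrations after each has been divided by a cut-off value. The sketch below assumes that form; the function name and the cut-off values in the usage example are placeholders, not values taken from the reviewed studies.

```python
import math


def tumor_marker_index(cyfra, cea, cyfra_cutoff, cea_cutoff):
    """Geometric mean of cut-off-normalized CYFRA 21-1 and CEA values.

    Assumed formulation; all concentrations and cut-offs must share
    the same unit (e.g. ng/mL) within each marker.
    """
    if cyfra < 0 or cea < 0:
        raise ValueError("marker concentrations must be non-negative")
    if cyfra_cutoff <= 0 or cea_cutoff <= 0:
        raise ValueError("cut-off values must be positive")
    return math.sqrt((cyfra / cyfra_cutoff) * (cea / cea_cutoff))


# Hypothetical patient and hypothetical cut-offs, for illustration only.
example_tmi = tumor_marker_index(cyfra=4.0, cea=9.0, cyfra_cutoff=1.0, cea_cutoff=1.0)
print(example_tmi)
```

With this form, a patient whose two markers both sit exactly at their cut-offs has an index of 1, so values above 1 indicate that the markers are elevated on average.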
Studies on the prognostic value of STM in all NSCLC stages (I–IV) are more difficult to interpret as the results mix completely different clinical situations and therapeutic options. Once again, high pretherapeutic CYFRA 21-1 levels were mainly associated with poor OS [36, 55–60]. In times of multiple therapy options that can be applied sequentially or in combination, a meta-analysis with 6395 patients [45] is of particular interest: it confirmed the strong prognostic value of high CYFRA 21-1 levels for worse OS and PFS, with pooled hazard ratios (HR) of 1.6 and 1.41, respectively. Additional significant associations were observed in patients treated with platinum-based chemotherapy (HR 1.53), EGFR-TKIs (HR 1.83) and surgery (HR 1.94), as well as in subgroup analyses by stage (early vs. late), ethnicity (Asian vs. Caucasian) and study design (prospective vs. retrospective) [45].
However, conflicting results might be a consequence of different settings and proportions of squamous-cell carcinoma and adenocarcinoma patients across studies. Chakra et al. [57] reported prognostic significance of high (>3.6 ng/mL) CYFRA 21-1 levels for shorter survival (HR 1.5) in 451 NSCLC patients, 55% of whom were diagnosed with LUSC. In a prospective study, Cho et al. [61] compared three cytologic and serum tumor markers, CYFRA 21-1, CEA and SCCA, in 253 patients and could not find a significant prognostic value for CYFRA 21-1; however, only 18% (n = 47) of patients were diagnosed with LUSC. On the other hand, Zhang et al. [58] reported that high CYFRA 21-1 levels were an independent, unfavorable prognostic factor in patients with LUAD (HR 1.86) but not in patients with LUSC alone. However, in combined histology investigations, CYFRA 21-1 was a significant prognostic marker of OS in stage I-II (HR 3.67), stage III (HR 1.92) and stage IV (HR 1.47). Takahashi et al. [55], investigating survival in 1202 NSCLC patients, also found prognostic significance of high CYFRA 21-1 levels for shorter survival (HR 2.02, p = 0.001). However, they selected a high cut-off of 18 ng/mL, which exemplifies the inconsistent choice of cut-off levels.
In advanced stage NSCLC, the comparability of studies is complicated by vast changes and improvements in diagnostic possibilities and therapeutic options (Table 3). Both baseline tumor marker levels before treatment and STM kinetics during the course of treatment were taken into consideration, the latter acknowledging individual marker levels and changes instead of stipulating a fixed cut-off [62–65]. Most of the investigations found CYFRA 21-1 baseline values and/or a reduction of the values to be prognostically significant when assessed prior to or after one to three cycles of therapy in patients mainly treated with chemotherapy (Fig. 2).
Sato et al. [66] investigated CYFRA 21-1, CEA and CA 19-9 levels of 246 stage IIIB/IV lung adenocarcinoma patients treated with chemotherapy. Patients with initial low levels of CYFRA 21-1 or CA 19-9 had a significantly longer survival (HR 0.47 and 0.60, respectively). In line with these results, Rumende et al. [67] found high CYFRA 21-1 levels (≥10.9 ng/mL) to be a negative prognostic factor for 1-year survival in 111 patients treated or not treated with chemotherapy (HR 1.74); high initial CEA levels (≥21.3 ng/mL), however, were not significantly associated with shorter survival.
Single investigations questioning CYFRA 21-1 as an independent marker for survival in patients in advanced stages treated predominantly with chemotherapy were mainly retrospective, included a limited number of patients, or confirmed prognostic significance only when CYFRA 21-1 was combined with other markers [33, 68–70] (Tables 3, 4). Baek et al. [33] could not find prognostic significance of low baseline CYFRA 21-1 levels alone for longer survival; however, a combination of low CYFRA 21-1 levels and high (>4.7 ng/mL) pretreatment CEA levels (HR 0.52) had significant prognostic value. Studies discussing advanced stage NSCLC patients treated with TKIs or immunotherapy are considered separately.
Table 4. Overview and general presentation of the significant results in multivariate survival analysis for survival, progression-free survival and other endpoints investigated
+ (low tumor marker levels reflect longer endpoint (positive prognostic)), – (high tumor marker levels reflect longer endpoint (negative prognostic)), NS (not significant), uni (only univariate analysis was performed), B (baseline), K (kinetics), pOP (postoperative), preOP (preoperative), NSCLC (non-small cell lung cancer), LUAD (lung adenocarcinoma), LUSC (lung squamous cell carcinoma), CEA (carcinoembryonic antigen), CYFRA21-1 (cytokeratin-19 fragment), NSE (neuron-specific enolase), CA125 (cancer antigen 125), SCCA (squamous cell carcinoma antigen), CA19-9 (carbohydrate antigen 19-9), ProGRP (pro-gastrin releasing peptide), PFS (progression-free survival), DFS (disease-free survival), DCR (disease control rate), PRS (post-recurrence survival), PPS (post-progression survival), RFS (recurrence-free survival), TTP (time to progression), TMI (tumor marker index), CSS (cancer-specific survival), PLT (platelet count), LRFS (local relapse-free survival), DMFS (distant metastasis-free survival), RR (response rate), ORR (overall response rate).
Carcinoembryonic antigen – CEA
CEA is an oncofetal glycoprotein [30] that plays an important role in cell adhesion and is normally produced during fetal development [71]. Known as a “pan-marker”, CEA is used as a tumor marker in several types of cancers with different origins, including NSCLC, and it is especially associated with adenocarcinoma [72, 73]. CEA has proven to be a relevant marker in the management of lung cancer [27]; however, it is primarily used for disease monitoring [56]. Several studies consistently confirm the independent unfavorable prognostic value of high pretherapeutic CEA levels (Table 4).
Wang et al. [74] investigated the prognostic relevance of CEA in a meta-analysis of 16 studies with 4296 patients in all stages of NSCLC, emphasizing stage I NSCLC. High levels of preoperative CEA had a significant correlation with poor OS (HR 2.28) in both Asian and non-Asian study populations. Other studies [56, 75–79] were not able to show a prognostic value of elevated CEA levels for survival (Tables 2 and 4, Fig. 2). Differences in the composition of the study populations in terms of size, staging or histology, as well as different cut-offs or varying lengths of follow-up and censoring, could explain the differing results.
In studies on early-stage NSCLC, CEA was investigated with regard to pre- and postsurgical levels and its kinetics in order to identify high-risk patients in need of additional adjuvant therapies (Table 1). Chen et al. [80] analyzed the longitudinal change in serum CEA levels in stage I NSCLC patients after surgery and found no prognostic value for baseline levels alone; however, high CEA levels both before and after surgery (>10 ng/mL; HR 10.27) and increasing kinetics (HR 4.67) were associated with unfavorable prognosis for RFS. Prognostic significance of preoperative STM levels, however, may vary with radiological features or histologic subtypes of NSCLC. In a large retrospective study (n = 2654) by Chen et al. [50], who investigated six STMs in histological subgroups of NSCLC, CEA was an independent predictor of RFS in LUAD (HR 1.25) but not in LUSC. The use of a combination of STMs [46–48, 81] and other blood biomarkers [82, 83], such as CRP, was repeatedly mentioned, as it enhanced the prognostic value over single marker measurements (Table 1).
Due to the recent changes of treatment approaches in NSCLC from classical chemotherapies to modern TKI and ICI-based regimens, prognostic investigations concerning STM in patients treated with chemotherapy after 2010 are limited. As in earlier studies, high baseline serum levels of CEA before the initiation of chemotherapy or a missing reduction after therapy in late-stage NSCLC were associated with unfavorable outcomes [35, 64]; however, the majority of studies reported non-significant results for the prognostic relevance of CEA (11 out of 14) (Table 3, Fig. 2).
Other serum tumor markers and combinations
NSE is a glycolytic enzyme present in neurons and peripheral neuroendocrine tissues and is found in cancers of neuroendocrine cellular origin [84, 85], especially in small cell lung cancer (SCLC) [84, 86]. However, the prognostic value of NSE in NSCLC is still controversial. A pooled analysis of eight studies including 2389 patients treated with chemo- or radio-chemotherapy could not find a prognostic value of NSE in patients with NSCLC [87], concurring with several prospective studies [62, 89]. Yan et al. [90], however, showed significantly shorter PFS and OS in 363 advanced stage NSCLC patients with elevated NSE levels treated with EGFR-TKIs or chemotherapy. The proportion of patients with LUSC (47%) and the optimal cut-off value (≥26.1 ng/mL) chosen were relatively high, which could have led to an overestimation of the significance of NSE as a prognostic biomarker. In line with this assumption is the histological subgroup analysis, emphasizing the prognostic value of NSE for OS particularly in LUSC but not in LUAD. Rather high numbers of LUSC patients were also seen in several other studies stating prognostic significance of NSE [36, 91]. Overall, however, the prognostic significance of NSE could not be confirmed in early or late stage NSCLC patients in almost 70% of the investigations (Fig. 2).
Other markers like CA 125 or CA 15-3 were investigated in single studies (Tables 1–4, Fig. 2). Zhai et al. [92] assessed the baseline levels of CEA, CYFRA 21-1 and CA 125 in 1011 patients with stage III-N2 NSCLC after R0 resection. Patients with normal CA 125 (<35 ng/mL) achieved higher five-year OS, PFS, local relapse-free survival (LRFS) and distant metastasis-free survival (DMFS) than patients with elevated levels. Further, a simple prognostic model combining baseline CEA, CYFRA 21-1 and CA 125 levels, which classified patients into high, medium, and low risk groups, accurately predicted all outcome endpoints mentioned above. Several studies consistently showed that combined investigations of different tumor markers could enhance the prognostic significance (Tables 1–3) [47, 93].
Serum tumor markers in targeted therapy
EGFR mutations are present in about 50% of Asian NSCLC patients and around 10% of patients in Western countries [94], and are more frequently observed in females, non-smokers and patients with adenocarcinoma [95]. Numerous studies have demonstrated the efficacy of anti-EGFR tyrosine kinase inhibitor (TKI) treatments in a subset of patients with various driver EGFR-activating mutations, leading to molecular/biological EGFR testing becoming a standard diagnostic procedure in lung cancer patients [96–98]. Nevertheless, it remains an open question whether STMs are relevant for prognosis or response prediction in EGFR mutation-positive or -negative patients, or whether they can serve for monitoring during and after TKI therapy.
Remarkably, low CEA levels were found to be a negative predictor for PFS in patients treated with TKIs [99–104], but also in those undergoing chemotherapy and/or radiotherapy [33] and immunotherapy [43], reflecting the low comparability of individual studies with different conclusions drawn. A randomized phase II trial [99] investigating 138 advanced NSCLC patients treated either with a combination of the VEGFR/PDGFR inhibitor linifanib and chemotherapy or with chemotherapy alone reported longer PFS in the TKI arm for patients with a signature of high CEA (>3 ng/mL) and low CYFRA 21-1 (<7 ng/mL). Kuo et al. (2020) [105] found extremely high pretreatment CEA levels (>100 ng/mL) to be a negative prognostic factor for OS and PFS in LUAD patients harboring EGFR mutations. In the analysis of post-progression survival (PPS), high CEA levels at initial diagnosis and low levels at the time of progression predicted longer PPS, suggesting a changed CEA expression pattern after EGFR TKI therapy. Arrieta et al. [106] investigated STM kinetics in 748 patients with elevated CEA levels treated with first-line TKI or chemotherapy. They reported that a CEA decrease of more than 20% was predictive of longer OS and PFS in patients treated with chemotherapy (adjusted HR 0.75 and 0.71, respectively) and of longer PFS in patients treated with TKIs (HR 0.67). Again, the choice of study design, STM thresholds and endpoints, as well as the varying statistical evaluation and reporting of results, influences the conclusions.
More consistent results were obtained from studies that evaluated CYFRA 21-1 levels, which generally found low levels to be a favourable prognostic marker for OS and PFS. Nonetheless, roughly 50% of the studies concluded that CYFRA 21-1 did not have any prognostic significance (Fig. 2). Takeuchi et al. [107] (n = 95) found high CYFRA 21-1 levels (>3.5 ng/mL) to be predictive for shorter PFS (HR 2.17) but not for OS. In line with these results, Tanaka et al. [108] observed that high CYFRA 21-1 levels (>2 ng/mL) were prognostic for shorter PFS (HR 1.27) but not for OS. Although no control cohort was included, they suggested a predictive but not a prognostic value of CYFRA 21-1 in patients treated with EGFR-TKIs.
Serum tumor markers in immunotherapy
Several studies evaluated the prognostic value of CYFRA 21-1 and CEA in patients treated with immune checkpoint inhibitor (ICI) therapies. Dall’Olio et al. [40] investigated pre-therapeutic blood levels and their kinetics in 296 patients treated with second-line nivolumab or atezolizumab, first-line pembrolizumab and a control cohort treated with chemotherapy only. They identified high baseline CYFRA 21-1 levels (>8 ng/mL) as an independent negative prognostic biomarker in all cohorts (HR 1.90), thereby suggesting a higher impact of CYFRA 21-1 levels for OS in patients treated with ICI than with chemotherapy. High CEA levels, however, were only significant in pretreated patients undergoing second-line ICI therapy. An early reduction of at least 20% of STM levels correlated with OS for both CYFRA 21-1 (HR 0.19) and CEA (HR 0.12), which indicated prognostic and predictive validity of CEA and CYFRA 21-1. In line with these findings is a prospective study of 308 ICI-treated patients by Zhang et al. [109], who evaluated the dynamic changes of four STMs: CEA, CYFRA 21-1, CA125 and SCC. Six weeks after therapy initiation, a decrease of at least 20% in more than two STMs was associated with significantly longer PFS and OS and better overall response rates, suggesting a prognostic benefit. This was also confirmed in histologic subgroup analyses.
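The kinetic criteria described above (a relative decrease of at least 20% from baseline, counted across several markers) can be illustrated with a minimal Python sketch. The function names, the patient values and the handling of missing follow-up measurements are assumptions made for illustration only; they do not reproduce the analysis of any cited study.

```python
def relative_change(baseline, follow_up):
    """Fractional change from baseline; -0.25 means a 25% decrease."""
    if baseline <= 0:
        raise ValueError("baseline level must be positive")
    return (follow_up - baseline) / baseline


def count_decreased_markers(baseline_levels, follow_up_levels, min_decrease=0.20):
    """Count markers that fell by at least `min_decrease` (as a fraction)."""
    count = 0
    for marker, base_value in baseline_levels.items():
        if marker not in follow_up_levels:
            continue  # marker not re-measured, so no kinetics available
        change = relative_change(base_value, follow_up_levels[marker])
        if change <= -min_decrease:
            count += 1
    return count


# Hypothetical levels for a single patient (illustrative only, not study data).
baseline_levels = {"CEA": 12.0, "CYFRA 21-1": 8.0, "CA125": 60.0, "SCC": 2.0}
week6_levels = {"CEA": 9.0, "CYFRA 21-1": 4.0, "CA125": 58.0, "SCC": 1.5}

n_decreased = count_decreased_markers(baseline_levels, week6_levels)
print(n_decreased)  # 3 (CEA, CYFRA 21-1 and SCC fell by at least 20%)
```

In this hypothetical case three of four markers meet the 20% criterion, so the patient would satisfy a rule requiring a decrease in more than two STMs. The threshold and the required number of markers are parameters that differ between studies.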
Lang et al. [110] provided further evidence supporting these findings. Their study examined 84 ICI-treated NSCLC patients at their initial staging exams and found that those with a >2-fold increase in the leading tumor markers (CEA, CYFRA 21-1 or CA 19-9) were more likely to have shorter PFS and OS (HR 9.08). This was also true in patients who were initially radiologically classified as non-responders. Muller et al. [111] prospectively measured five STMs at baseline and every other week in order to identify responders and non-responders early in 376 patients treated with nivolumab or pembrolizumab. They found that an increase of >50% of a single STM (CEA, CYFRA 21-1 or NSE), as well as diverse STM combinations (CEA+CYFRA 21-1 or CEA+CYFRA 21-1+NSE), predicted non-response with a sensitivity of 38.4% at a specificity of >95% for both combinations, as early as six weeks after initiation of ICI therapy. In univariate survival analysis, OS and PFS were significantly prolonged with a negative result of CYFRA 21-1 or CEA. The benefit of combined investigations of several STMs was shown by Tang et al. [41] in 124 Chinese patients with advanced NSCLC. They reported the combination of neutrophil-to-lymphocyte ratio (NLR) and dynamic changes of the leading STM as an independent indicator of OS. Chai et al. [34] developed a prognostic nomogram for OS probability at three, six and twelve months, based on STM and clinical parameters before the start of ICI therapy in advanced NSCLC patients, with a C-index of 0.81, emphasizing the importance of including existing prognostic factors and covariates.
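For readers less familiar with the C-index reported for such nomograms, the following Python sketch shows one simplified way to compute Harrell's concordance index for right-censored survival data. It is an illustration under stated assumptions, not the method used in the cited study: pairs with tied survival times are ignored, and all data values are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C for right-censored data.

    times:       follow-up time per patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: higher score = higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort of four patients (months, event flag, model risk score).
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risk_scores)
print(round(c_index, 2))  # 0.8 (4 of 5 comparable pairs ordered correctly)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which puts a reported C-index of 0.81 into context.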
Discussion
Many efforts have been made to assess the clinical significance of STMs for predicting and monitoring therapy response, as well as for the prognosis of NSCLC patients. Although many studies provide strong evidence of the high relevance of STMs for prognosis and prediction in both traditional chemotherapy and new targeted and immune therapies, no STM has been incorporated into guidelines or routine clinical practice. This may be due to the often retrospective nature of the studies, particularly in the chemotherapy era, and the lack of randomized controlled trials. As a result, many studies only attained evidence levels of 3 or 4 [31], while only a few high-quality pooled or meta-analyses reached higher levels. Noticeable efforts have been made since 2008 to adhere to the existing guidelines for reporting prognostic biomarkers, known as the REMARK recommendations [112], which were first introduced in 2005. In addition, improvements in study design and harmonization of study populations through subgroup investigations, particularly with respect to stage and histology, have been observed. However, the approval and introduction of new therapies that offer diverse treatment options and drug combinations have contributed to increased heterogeneity within patient cohorts. This, in turn, has made study reports heterogeneous, inconsistent and sometimes conflicting, thereby complicating direct comparisons.
The most commonly investigated STMs were CYFRA 21-1, CEA and NSE. CYFRA 21-1 in particular demonstrated high prognostic relevance across various therapeutic settings, stages and histologic subgroups. While elevated STM levels were often associated with poor prognosis, the relationship with CEA in TKI therapies was more controversial, as high CEA levels also predicted longer OS [113] and PFS in several studies [99–104]. A growing number of studies considered the inclusion of established clinical prognostic markers such as performance score, TNM stage and histology, which were also the most important clinical parameters with prognostic relevance. However, data on factors potentially affecting STM levels, like concomitant diseases, were seldom provided. Although many studies adhered to REMARK guidelines, the reporting of pre-analytical specimen handling was often inadequate, while the documentation of analytical methods improved. Cut-off levels for the STMs were primarily taken from manufacturer instructions or defined by the study authors through receiver operating characteristic (ROC) curve analysis, leading to a range of inconsistencies. Concerning statistical survival analysis, most studies employed a multivariate Cox proportional hazards model, but often relied on one-sided stepwise variable selection methods, including univariate prognostic variables without (nested) cross-validation. In some instances, the prognostic impact may have been overestimated due to selective variable assessment and subgroup evaluations that did not account for significant prognostic variables, small sample sizes, and too few events per variable in the multivariate analysis. There was a notable absence of control group investigations and of large, prospective, multicentric studies.
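To make the ROC-based cut-off selection mentioned above concrete, the following Python sketch picks the threshold that maximizes Youden's J (sensitivity + specificity − 1). Youden's J is only one common criterion and is assumed here for illustration; the reviewed studies did not necessarily use it. The marker values and outcomes are hypothetical.

```python
def youden_cutoff(values, outcomes):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    values:   marker levels
    outcomes: 1 = event (e.g. death within follow-up), 0 = no event
    A patient counts as test-positive when value >= cutoff.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, o in zip(values, outcomes) if v >= cutoff and o == 1)
        true_neg = sum(1 for v, o in zip(values, outcomes) if v < cutoff and o == 0)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical marker levels (ng/mL) and outcomes for eight patients.
marker_levels = [1.2, 2.0, 2.8, 3.3, 3.6, 4.5, 6.0, 9.1]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]

cutoff, j_statistic = youden_cutoff(marker_levels, outcomes)
print(cutoff, j_statistic)  # 3.3 0.75
```

Because such a cut-off is tuned on the same cohort in which it is evaluated, its apparent performance tends to be optimistic. This is one reason why study-specific thresholds are hard to compare and why validation in independent data is needed.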
With the advent of new targeted and immune therapies and the definition of various first-, second- and even third-line therapy sequences, there is an increasing need for predictive and prognostic biomarkers that inform treatment decisions and long-term outcomes. Particular attention has to be paid to the distinction between predictive and prognostic biomarkers when evaluating outcomes in patients receiving specific treatments [18]. Predictive markers interact with treatment and directly affect patient outcomes by distinguishing responders from non-responders, while prognostic biomarkers are associated with differential disease outcomes regardless of the treatment applied [18, 114]. Unfortunately, inconsistent and interchangeable use of the terms “prediction” and “prognosis”, particularly when progression-free survival is the study endpoint, has led to confusion. To establish a biomarker as predictive for a specific treatment’s benefit, a control group receiving a different treatment must be included to rule out the possibility that the biomarker is merely prognostic, indicating survival in both cohorts [18, 115]. Ideally, prognostic and predictive value should be validated simultaneously, as the presumed therapy benefit and, consequently, predictive value could merely reflect the prognostic significance of the marker [18, 115].
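The distinction between prognostic and predictive value can be expressed formally with a Cox proportional hazards model that includes a marker-by-treatment interaction term. The notation below is an assumption introduced for illustration and is not taken from the cited references.

```latex
% Assumed notation (not from the source):
%   x = marker status (1 = elevated, 0 = normal)
%   z = treatment arm (1 = therapy of interest, 0 = control)
\[
  h(t \mid x, z) = h_0(t)\,\exp\!\left(\beta_{m}\,x + \beta_{t}\,z + \beta_{mt}\,x z\right)
\]
% Treatment hazard ratio by marker status:
\[
  \mathrm{HR}_{\text{treatment}}(x = 0) = e^{\beta_{t}}, \qquad
  \mathrm{HR}_{\text{treatment}}(x = 1) = e^{\beta_{t} + \beta_{mt}}
\]
```

In this formulation, a marker is purely prognostic if $\beta_{m} \neq 0$ while $\beta_{mt} = 0$, since it shifts the hazard equally in both arms. It is predictive if $\beta_{mt} \neq 0$, since the treatment effect then differs by marker status. Estimating $\beta_{mt}$ requires patients in both arms, which is why a control group is indispensable for claims of predictive value.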
Prognostic biomarkers are typically defined by evaluating various survival endpoints such as overall survival (OS), disease-free survival (DFS) and progression-free survival (PFS) [20, 116]. However, each outcome measure’s limitations must be considered. OS is objective, is considered the gold standard and can be accurately assessed due to its definite endpoint of death or disease-related death, but it requires larger sample sizes and longer follow-up periods [118]. In times of multiple sequential therapy options, ‘surrogate endpoints’ like PFS and DFS are used to expedite drug approval or therapy changes in the event of treatment failure, often with shorter follow-up periods and smaller sample sizes [116]. Challenges with these surrogate endpoints include i) the necessity of frequent radiological controls, ii) at well-defined time intervals, iii) controlling evaluation bias due to interobserver variations, and iv) precise, clinically meaningful definitions of tumor response (complete response, partial response, stable disease etc.) or ‘progression event’ [118, 119]. The response evaluation criteria in solid tumors (RECIST 1.1) [117, 120] serve as the foundation for surrogate endpoint determinations, particularly for evaluating cytotoxic chemotherapy responses [121]. However, atypical response patterns (pseudoprogression and hyperprogression) observed in patients undergoing immune therapies [122] have made disease monitoring challenging using this measure, leading to the introduction of immune-based RECIST criteria [123]. Moreover, non-measurable lesions, asymmetrical tumor size changes, multiple metastatic lesions, differing dynamics of tumor size versus tumor activity, and the critical definition of “stable disease” present additional challenges for applying RECIST criteria in evaluating treatment outcomes [123, 124]. Some of these issues may be addressed by emerging developments like metabolic imaging (e.g. 18F-FDG PET with PERCIST and iPERCIST criteria) [121, 125] or radiological image pattern analyses (radiomics), in which medical imaging analysis and data mining methods are combined [126]. Furthermore, the combination with clinical aspects [127] and liquid biopsies [25], or more futuristic approaches including deep learning (artificial intelligence algorithms) [121], could result in personalized disease profiles and individualized therapy strategies [125, 128].
In recent years, molecular liquid profiling of cell-free tumor DNA (ctDNA) in blood plasma has opened a whole new field of biomarkers and gained increasing interest. Several studies have explored the potential of liquid biopsies for prognosis, prediction, monitoring of therapy response and detection of disease progression in lung cancer. These studies have investigated the additional use of circulating tumor cells (CTCs) [129] and ctDNA [130] alongside STMs. Results have shown that changes in these biomarkers over time may correlate with longer PFS and OS in certain cancer types. In addition, newer approaches such as the use of cell-free RNA (cfRNA) in addition to STM testing have also shown promising results for early detection and monitoring of NSCLC [131]. These findings suggest that joint liquid profiling and STM investigations have the potential to be valuable tools in the clinical management of cancer patients.
In general, the noticeable heterogeneity in study designs, patient characteristics, pre-analytical and analytical methods, and statistical evaluations made it difficult to confidently assess the prognostic validity of STMs. Moreover, because these studies are not comparable, it is currently not possible to provide concrete recommendations on how to use STMs for prognostic approaches, including clinically significant timepoints in early and late-stage therapies, absolute value thresholds and kinetics in serial evaluations. This prevents their timely incorporation into existing lung cancer protocols.
To address the unresolved and fundamental issues raised previously [27, 133] and above in future studies, we suggest creating a core set of study criteria for conducting consistent, comprehensive and comparable studies that yield reliable clinical and biomarker data, thereby producing a more robust evidence base for specific tumor marker testing. A standardized core set could assist in planning, in the correct and sufficient evaluation of generated data and, especially, in the reporting of results. Our proposal is to create such a core set through a Delphi panel, with an overview provided in Fig. 3.

Fig. 3. Checklist for prognostic and predictive serum tumor marker studies. STM (serum tumor markers), NPV (negative predictive value), PPV (positive predictive value).
The present survey updates and reaffirms the significant prognostic value of individual STMs and their combinations, particularly CYFRA 21-1 and CEA, in both early and advanced NSCLC patients undergoing chemotherapy, despite the considerable heterogeneity in study design and reporting. Furthermore, the clinical utility of STMs for prognosis and prediction in novel TKI and ICI therapies is demonstrated. To achieve a higher evidence level for STM studies, it is recommended to include STMs in translational biomarker substudies of randomized phase III trials. These trials should include a large number of patients in both treatment and control groups, adhere to well-regulated (post-)treatment protocols, employ standardized outcome measures, establish well-defined blood collection schedules, and maintain standardized preanalytics, biobanking, analytics and statistical evaluations. There remains substantial work to be done to fully harness the potential of protein-based blood biomarkers in traditional and emerging targeted and immune therapies.
Author contributions
Conception: IT, SH.
Interpretation or analysis of data: IT and SH.
Preparation of manuscript: IT and SH.
Revision for important intellectual content: IT and SH.
Supervision: SH.
Conflict of interest
IT has declared no conflict of interest.
SH has received research funding or honoraria from Roche Diagnostics, Bristol Myers Squibb, Merck KGaA, Sysmex Inostics and Volition SPRL. SH is also an editorial board member of Tumor Biology but had no involvement in the peer review process of this article.
