Abstract
Background:
Tumor mutation burden (TMB) has been validated as a predictive biomarker for immunotherapy response and survival in numerous cancer types. Limited data are available on the inherent prognostic role of TMB in early-stage tumors.
Objective:
To evaluate the prognostic role of TMB in early-stage, resected non-small cell lung cancer (NSCLC).
Design:
Systematic review and meta-analysis of pertinent prospective and retrospective studies.
Data sources and methods:
Publication search was performed in PubMed, Embase, Cochrane Library, and Web of Science databases. Based on the level of heterogeneity, a random- or fixed-effects model was used to calculate pooled effects of hazard ratio (HR) for overall survival (OS) and disease-free survival (DFS). The source of heterogeneity was investigated using sensitivity analysis, subgroup analysis, and publication bias assessment.
Results:
Ten studies comprising 2520 patients were included in this analysis. There was no statistically significant difference in OS (HR, 1.18, 95% CI, 0.70, 1.33; p = 0.53, I2 = 80%; phet < 0.0001) or DFS (HR, 1.18, 95% CI, 0.91, 1.52; p = 0.53, I2 = 75%; phet = 0.0001) between the high-TMB and low-TMB groups. Subgroup analyses indicated that DFS was poorer in the high-TMB group among patients of East Asian ethnicity, in studies that measured TMB using whole exome sequencing, and in studies with <100 patients.
Conclusion:
The inherent prognostic role of TMB is limited in early-stage NSCLC. Ethnic differences in mutation burden must be considered while designing future trials on neoadjuvant immunotherapy. Further research in the harmonization and standardization of panel-based TMB is essential for its widespread clinical utility.
Introduction
Lung cancer (LC) comprises molecularly and histologically distinct subtypes, of which non-small cell lung cancer (NSCLC) accounts for 85% of new lung cancer cases. 1 Approximately 39% of new NSCLC cases are diagnosed at an early stage (I-IIIA). 2 Despite curative treatment, the 5-year overall survival (OS) rate falls from 92% in clinical stage IA1 to 36% in stage IIIA. 3 The Tumor, Node, Metastasis (TNM) staging system is the standard prognostic assessment tool currently used in clinical practice for LC. 4 Because tumors are biologically heterogeneous, a purely anatomical classification has inherent deficiencies; therefore, reliable prognostic biomarkers must be continuously sought. 5
The number of somatic non-synonymous mutations within a tumor genome represents tumor mutational burden (TMB). 6 Lung adenocarcinomas (LUAD) and squamous cell carcinomas (LSCC) have among the highest prevalences of somatic mutations [above 10 somatic mutations per megabase (Mb) (Mut/Mb) of coding DNA] and are thus more likely to form neoantigens that can be detected by autologous T-cells. 7 Based on this phenomenon, TMB has been explored extensively as a biomarker of immunotherapy response in various cancer types.8–10 High-TMB patients showed significantly improved OS, progression-free survival (PFS), and objective response rate (ORR) following immune checkpoint inhibitor (ICI) therapy in advanced NSCLC, independent of programmed cell death ligand-1 (PD-L1) expression.9,11–13 Interestingly, a recent meta-analysis (10 studies, 491 participants) identified TMB as a promising predictive biomarker for major pathological response [Odds Ratio (OR), 3.40, 95% confidence interval (CI) 1.33–8.70; p = 0.0109] and pathological complete response (pCR) [OR, 1.98, 95% CI 1.08–3.63, p = 0.0265] to neoadjuvant immunotherapy in early-stage NSCLC, suggesting an emerging role of TMB in early-stage NSCLC. 14
The inherent prognostic implication of TMB, independent of ICIs, in early-stage, resected NSCLC is variable, where some studies have shown a positive 15 or negative association16,17 between high-TMB and survival, while others have not.18,19 Ethnicity, NSCLC subtypes, TMB detection methods, including the different regions of genes analyzed, cutoff values, and sample sources may further influence this association in individual studies. Consequently, we performed a systematic review and meta-analysis of studies that assessed the prognostic role of TMB for early-stage resected NSCLC survival.
Materials and methods
The Preferred Reporting Items for Systematic Review and Meta-analyses (PRISMA) guidelines were utilized to conduct this systematic review and meta-analysis 20 (Supplemental Table S1). An a priori-defined protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO) (CRD42023392846).
Literature search
PubMed, Embase, Web of Science, and Cochrane Library were comprehensively searched for articles through 31 January 2023. The following Medical Subject Headings (MeSH)/Emtree terms were used: (tumor mutational burden OR tumor mutation burden OR tumor mutational burden OR tumor mutational load OR tumor mutation load OR TMB OR TML) AND (lung cancer OR lung neoplasms OR lung tumor OR non-small cell lung cancer OR cancer of lung OR lung adenocarcinoma OR adenocarcinoma of lung OR squamous cell carcinoma of lung OR lung squamous cell carcinoma OR human pulmonary neoplasm OR human pulmonary carcinomas) (Supplemental Table S2). To include the maximum number of appropriate studies, conference abstracts from the American Society of Clinical Oncology, European Society for Medical Oncology (ESMO), and World Lung Cancer Congress were also searched. In addition, the references of the full-text eligible studies were consulted to collect as many suitable studies as possible.
Study selection
Article titles, abstracts, and full texts (if required) were screened by two independent reviewers (DW and PH) and evaluated against the inclusion criteria. Disagreements were resolved by mutual discussion or by involving a third reviewer (SG).
The inclusion criteria were as follows: (1) studies with patients diagnosed with histologically confirmed early-stage NSCLC (I-IIIA); (2) studies categorizing TMB into two or three groups (high TMB and/or intermediate TMB versus low TMB levels) based on clear TMB cut-off values; (3) studies with OS and/or DFS as the clinical endpoints; and (4) studies with either reported or extractable time-to-event data as HR and its 95% CI related to the TMB level. 21 No language restriction was applied.
The PICOS criteria for this review were as follows:
The exclusion criteria were as follows: Studies without original data (narrative or systematic review, editorial, commentary, and case reports), and studies with incomplete data.
Data extraction and quality assessment
The following data were extracted: Name of the first author, the country where the study was performed, year of publication, number of total participants and each group (high TMB and low/intermediate-TMB), study design, NSCLC subtypes, demographic data (male and ever-smokers), stage, sample source, TMB detection method, TMB threshold, median TMB value and its range, median follow-up duration, and clinical endpoints. In addition, HR and 95% CI, source of time-to-event data, and associated adjusted variables were also reported. The HR was reported as high TMB versus low TMB.
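Pooling by inverse variance needs each study's log HR and its standard error, which can be recovered from a reported HR and 95% CI. The following is a minimal sketch of that conversion, assuming a CI that is symmetric on the log scale; the function names are illustrative, and the example input is the pooled DFS estimate reported in the Results, used here only as sample numbers.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Return (log HR, SE of log HR) from an HR and its 95% CI.

    Assumes the CI is symmetric around the estimate on the log scale.
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_hr, se

def ci_geometric_midpoint(ci_low, ci_high):
    """Geometric midpoint of the CI; it should be close to the reported HR."""
    return math.sqrt(ci_low * ci_high)

# Example input: HR 1.18 (95% CI 0.91, 1.52)
log_hr, se = log_hr_and_se(1.18, 0.91, 1.52)
midpoint = ci_geometric_midpoint(0.91, 1.52)
```

Comparing the geometric midpoint with the reported HR is a quick plausibility check when transcribing estimates during data extraction.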
The methodology quality of studies was assessed using the Newcastle-Ottawa Scale (NOS). 22 Each study was scored under three categories: selection (representativeness of exposed cohort, selection of non-exposed cohort, and ascertainment of exposure), comparability of cohorts, and outcome (assessment of outcome, adequacy, and length of follow-up). The cumulative scores defined the risk of bias in the individual studies: 7–9 as low risk, 4–6 as moderate risk, and 0–3 as a high risk of bias.
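The score-to-category mapping described above can be written as a small helper. This is an illustrative sketch only; the function name is hypothetical and the thresholds are the ones stated in the text.

```python
def nos_risk_of_bias(score):
    """Map a cumulative Newcastle-Ottawa Scale score (0-9) to a risk category."""
    if not 0 <= score <= 9:
        raise ValueError("NOS score must be between 0 and 9")
    if score >= 7:
        return "low"       # 7-9
    if score >= 4:
        return "moderate"  # 4-6
    return "high"          # 0-3
```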
Statistical methods and data analysis
The association between NSCLC survival and TMB was determined by pooled HR derived from the inverse variance method. The between-study heterogeneity was assessed using Cochran’s Q test and measured by the I2 indices. Significant heterogeneity was considered if I2 > 60%, and a random-effects model was used. In the case of moderate (I2 = 30-60%) and low (I2 < 30%) heterogeneity, a fixed-effects model was utilized.
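The model-selection rule above can be sketched in a few lines. The code below is a hedged illustration, not the Review Manager implementation: it pools log HRs by inverse variance, computes Cochran’s Q and I2, and switches to a random-effects estimate when I2 exceeds 60%. The DerSimonian-Laird estimator for the between-study variance is an assumption, since the text does not name the estimator, and all function and variable names are hypothetical.

```python
import math

def pool_hazard_ratios(log_hrs, ses, i2_threshold=60.0, z=1.96):
    """Inverse-variance pooling with an I2-based choice of model."""
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_hrs)) / sum_w

    # Cochran's Q and the I2 index (percent)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Between-study variance (DerSimonian-Laird, assumed here)
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    if i2 > i2_threshold:
        model = "random"
        weights = [1.0 / (se ** 2 + tau2) for se in ses]
        sum_w = sum(weights)
        estimate = sum(w * y for w, y in zip(weights, log_hrs)) / sum_w
    else:
        model = "fixed"
        estimate = fixed

    se_pooled = math.sqrt(1.0 / sum_w)
    return {
        "model": model,
        "hr": math.exp(estimate),
        "ci": (math.exp(estimate - z * se_pooled),
               math.exp(estimate + z * se_pooled)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }

# Hypothetical inputs: two strongly conflicting studies, then three identical ones
conflicting = pool_hazard_ratios([math.log(2.0), math.log(0.5)], [0.1, 0.1])
consistent = pool_hazard_ratios([0.1, 0.1, 0.1], [0.2, 0.2, 0.2])
```

With the conflicting inputs I2 is far above 60%, so the random-effects branch is used; with identical inputs Q is zero and the fixed-effects estimate is returned.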
Sources of heterogeneity were investigated through publication bias assessment, sensitivity analyses, and subgroup analyses. Funnel plot asymmetry and a significant Egger’s regression test (p < 0.1) suggested publication bias. 23 In that case, Duval and Tweedie’s trim and fill method was employed to derive an adjusted pooled estimate [adjusted HR (aHR)]. 24 Sensitivity analyses were conducted using the leave-one-out method. Subgroup analyses were performed based on NSCLC subtypes (LUAD and others), ethnicity (East Asian and non-Asian), adjustment of clinical covariates (multivariate and univariate/extracted data), sample size (</> 100 patients), TMB cut-off values (</> 10 Mut/Mb), and TMB assessment methods [whole exome sequencing (WES) and targeted gene panels]. A subgroup analysis was performed only if at least two studies reported the relevant clinical attribute. To reduce the chance of type I error from multiple testing on the same dataset, the Bonferroni correction was applied to the subgroup analyses. 25 Primary, sensitivity, and subgroup analyses were performed using Review Manager 5.4 (The Cochrane Collaboration, Copenhagen, Denmark). Publication bias analyses were performed in the JASP software (JASP 0.16, the JASP team). 26 The certainty of the evidence for primary outcomes was evaluated using GRADE criteria based on the number of studies, type of evidence, study quality, consistency, directness, and effect size. 27
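The leave-one-out procedure can be sketched as follows: the pooled estimate is recomputed once per study with that study removed, and the resulting estimates are compared with the overall result. This is a simplified illustration that uses fixed-effect inverse-variance pooling for brevity; the study labels and numbers are hypothetical and do not come from the included studies.

```python
import math

def pooled_log_hr(log_hrs, ses):
    """Fixed-effect inverse-variance pooled log HR."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * y for w, y in zip(weights, log_hrs)) / sum(weights)

def leave_one_out(labels, log_hrs, ses):
    """Return {omitted study: pooled HR of the remaining studies}."""
    results = {}
    for i, label in enumerate(labels):
        kept_y = log_hrs[:i] + log_hrs[i + 1:]
        kept_se = ses[:i] + ses[i + 1:]
        results[label] = math.exp(pooled_log_hr(kept_y, kept_se))
    return results

# Hypothetical data: study C is an outlier with HR 2.0
labels = ["A", "B", "C"]
log_hrs = [math.log(1.0), math.log(1.0), math.log(2.0)]
ses = [0.2, 0.2, 0.2]
loo = leave_one_out(labels, log_hrs, ses)
```

A study whose omission shifts the pooled HR substantially (here, study C) is a candidate source of heterogeneity.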
Results
Literature search
Of 2975 citations following the initial database search including 76 conference abstracts, 874 duplicates were removed and 1982 articles were excluded due to irrelevant topics, absence of original data, or incomplete data. Manual reference search did not yield any new article. Subsequently, 119 articles underwent full-text screening and 10 studies finally fulfilled the inclusion criteria.16–19,28–33 The PRISMA flowchart for the study selection process is presented in Figure 1.

PRISMA flow diagram for the study selection process.
Study characteristics and quality assessment
Ten studies comprised 2520 patients (range 55–1008) with a median sample size of 124. Table 1 features the detailed study characteristics. All studies were retrospective cohorts in design conducted in North America,28–31,33 Europe,18,32 and Asia.16,17,19 Self-reported non-Hispanic-East Asian ethnicity was observed in 15% of the study population. Approximately 56% of the entire population were males and 43% were ever-smokers. Data were available on LUAD from six studies17,18,28,29,31,32 and LSCC from two studies.28,33
Study characteristics.
DFS, disease-free survival; F1 CDx, FoundationOne CDx™; GMS, genomic modeling system; h/o, history of; LUAD, lung adenocarcinoma; LCSS, lung cancer-specific survival; Mut/Mb, mutations per megabase; MV, multivariate; NA, not available; NGS, next-generation sequencing; NOS, Newcastle-Ottawa Scale; OS, overall survival; PI, pleural invasion; PS, performance status; SCC, squamous cell carcinoma; TTR, time to response; UV, univariate; VI, vascular invasion; WES, whole exome sequencing.
The sample source was reported in all except one study, 17 and most studies used formalin-fixed paraffin-embedded (FFPE) resected primary tumor specimens. WES (included in survival analysis) was employed in four studies,16,17,19,30 whereas the rest of the studies used targeted gene panels for TMB assessment. Most studies divided the comparative groups into high-TMB and low-TMB based on TMB cut-off values. Devarakonda et al. divided the patients into three groups, namely, high-TMB (⩾8 Mut/Mb), intermediate-TMB (4–8 Mut/Mb), and low-TMB (⩽4 Mut/Mb), of which we included high- versus intermediate-TMB for survival analyses. 28 The TMB cut-off values varied considerably between individual studies and are reported in detail in Table 1.
Stages I and II had similar distribution, seen in 38% of patients. Median follow-up ranged between 16.1 and 87 months. Data on OS and DFS were available from seven,16,18,19,28,31–33 and nine studies,16–19,28,30–32 respectively. Five studies evaluated the effects of TMB level on survival outcome through multivariate Cox proportional hazards regression models, adjusting for clinicopathological variables, such as age, gender, smoking status, stage, adjuvant therapy, histological subtypes, and histological grades.16,18,19,28,32 Furthermore, Owada-Ozaki et al. incorporated TP53, EGFR, KRAS, and ERBB2 mutations in this model. 16 Two studies required the extraction of data from the survival curves.17,30
The NOS assessment revealed that seven studies had a low risk16,18,19,28,29,31,32 and the rest had a moderate risk of bias (Table 1 and Supplemental Table S3). Most studies provided independent and comparable cohorts, adequate ascertainment of exposure, appropriate statistical analysis, and thorough experimental results. The major deficiencies, seen in a few studies, were a lack of data on the adequacy of follow-up and on whether follow-up was long enough for the event to occur.17,19,28,33 The median and mean scores were both 7 (range 5–9), suggesting an overall high methodological quality.
Pooled effects of HR for OS
From seven studies and 1760 patients, the pooled HR for OS was 1.18 (95% CI, 0.70, 1.33; p = 0.53), indicating no significant difference in OS between the high-TMB and low-TMB groups. Significant between-study heterogeneity was observed (I2 = 80%, phet < 0.0001; Figure 2).

Forest plot of HR for OS in patients with high TMB versus low TMB.
The subgroup analysis results are presented in Table 2 and Supplemental Figures S1, S2, S3, S4, and S5. None of the subgroups’ pooled effects showed a statistically significant association between TMB and OS. All except NSCLC subtypes subgroup analyses showed significant between-study heterogeneity. For NSCLC subtypes subgroup analysis, the LUAD subgroup [I2 = 85%, phet < 0.0001] had a significant heterogeneity whereas LSCC did not [I2 = 0%, phet = 0.48]. Due to a limited number of studies, TMB cutoff value subgroup analysis could not be carried out.
Summary of the meta-analysis results.
NSCLC, non-small cell lung cancer; TMB, tumor mutation burden; WES, whole exome sequencing. Bold values suggest a statistically significant effect estimate.
Pooled effects of HR for DFS
From nine studies and 2365 patients, the pooled HR for DFS was 1.18 (95% CI, 0.91, 1.52; p = 0.53), suggesting no significant survival difference between the high-TMB and low-TMB groups. Significant heterogeneity was observed (I2 = 75%, phet = 0.0001; Figure 3).

Forest plot of HR for DFS in patients with high TMB versus low TMB.
The subgroup analysis results are presented in Table 2. The subgroup analyses indicated that ethnicity (I2 = 80.8%, psubgroup = 0.02), sample size (I2 = 78%, psubgroup = 0.03), and TMB assessment method (I2 = 83.8%, psubgroup = 0.01) significantly modified the association of TMB with DFS. In the East Asian subgroup, the DFS of patients with high TMB was shorter than that of patients with low TMB (HR, 1.68, 95% CI 1.24, 2.26; p = 0.0007; Supplemental Figure S6). For the sample size subgroup, when the sample size was fewer than or equal to 100 participants, the DFS of patients with high TMB was shorter than that of patients with low TMB (HR, 1.59, 95% CI 1.17, 2.17; p = 0.003; Supplemental Figure S7). In the WES subgroup, DFS was shorter in patients with high TMB levels (HR, 1.59, 95% CI 1.26, 2.00; p = 0.0001), whereas DFS was similar between high- and low-TMB groups in the targeted-gene panel subgroup (Supplemental Figure S8). Meanwhile, heterogeneity was resolved within the East Asian subgroup, the subgroup of studies with fewer than 100 patients, and the WES subgroup. The association of DFS and TMB level was unrelated to survival analysis methods (Supplemental Figure S9) and the TMB cut-off values (Supplemental Figure S10). Due to a limited number of studies, NSCLC subtypes subgroup analysis could not be performed. However, an additional analysis restricted to LUAD-only studies showed a result consistent with the primary analysis (HR, 0.97, 95% CI 0.67, 1.40; p = 0.85; I2 = 75%, phet = 0.003; Supplemental Figure S11).
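The reported psubgroup values can be related to the accompanying I2 values with a short calculation. The sketch below assumes that each I2 comes from a test for subgroup differences between exactly two subgroups (one degree of freedom), so that I2 = (Q − df)/Q can be inverted to give Q and a chi-square p-value. That reading of the reported statistics is an assumption, and the function names are illustrative.

```python
import math

def subgroup_q_from_i2(i2_percent, df=1):
    """Invert I2 = (Q - df) / Q to recover Q (valid when I2 > 0)."""
    return df / (1.0 - i2_percent / 100.0)

def chi2_p_value_df1(q):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))

p_ethnicity = chi2_p_value_df1(subgroup_q_from_i2(80.8))
p_sample_size = chi2_p_value_df1(subgroup_q_from_i2(78.0))
p_method = chi2_p_value_df1(subgroup_q_from_i2(83.8))
```

Under this assumption the three values round to 0.02, 0.03, and 0.01, matching the psubgroup values reported above.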
Sensitivity and publication bias
Leave-one-out analyses showed no statistically significant variation in the pooled OS and DFS effects. Symmetrical funnel plots and non-significant Egger’s regression tests [OS, p = 0.107; DFS, p = 0.828] indicated a lack of publication bias (Supplemental Figures S12 and S13). Based on the GRADE criteria, the certainty of the evidence for OS and DFS was low (Supplemental Table S4).
Discussion
Our meta-analysis evaluated whether TMB is associated with survival in early-stage, resected NSCLC. Data from 10 studies comprising 2520 patients suggested that patients with high TMB have OS and DFS similar to those of patients with low TMB. The primary analyses were associated with significant between-study heterogeneity. Subgroup analysis for OS indicated that NSCLC subtypes might explain the high heterogeneity, whereas subgroup analyses for DFS revealed that ethnicity, sample size, and the TMB assessment method might explain the significant between-study heterogeneity. In addition, East Asian ethnicity, studies with fewer than 100 patients, and the use of WES were associated with poor DFS in the high-TMB group. Overall, the GRADE criteria indicated a low certainty of the evidence for both primary outcomes.
Prior studies have shown that high TMB was associated with improved OS and PFS following ICI therapy in melanoma, colorectal cancer, and NSCLC patients.34–36 An examination of the prognostic role of TMB, assessed using MSK-IMPACT in a large cohort of patients with advanced solid cancers who did not receive ICIs, indicated a lack of association between high TMB (highest mutation load quintile, 20%) and OS (HR, 1.12, p = 0.11). 37 A similar lack of prognostic impact was observed in metastatic NSCLC (n = 623; HR 1.108, p = 0.508). 37 Similarly, a pooled analysis of seven studies on advanced solid tumors not treated with ICIs showed no survival difference between high and low TMB. 38 In early-stage NSCLC, high TMB seems to have no general prognostic benefit per our findings. This lack of prognostic benefit was also observed within both NSCLC subtypes.
Because TMB is a genetic biomarker, ethnic or racial differences in TMB appear intuitive.39,40 In a continuously enrolled cohort of advanced NSCLC patients, patients of African genomic ancestry had the highest median TMB level (8.75 Mut/Mb). In contrast, patients of Asian ancestry had the lowest median TMB level (3.75 Mut/Mb), despite stratification for smoking status (p < 0.001). 41 One of the key findings in our study was the presence of ethnicity-specific differences in DFS outcome by TMB, where patients of East Asian ethnicity had worse DFS in the high-TMB group (p = 0.0007). This finding was consistent with an analysis of the non-ICI-treated patients in The Cancer Genome Atlas (TCGA) and MSK-IMPACT cohorts, where high TMB (>10 Mut/Mb) negatively affected survival in Asian patients (HR 2.99; 95% CI 1.22, 4.39). 42 The ethnicity-specific association between TMB and clinical outcome is likely due to (1) the lack of statistical power to detect subgroup differences due to unequal covariate distribution in the subgroups (East Asian: 3 studies, 380 participants; non-East Asian: 6 studies, 1985 participants); (2) residual analytical issues in TMB assessment: five of the six studies in the non-Asian subgroup used tumor-only panels, which have been shown to produce ‘TMB inflation’ that is greater in patients of Asian or African ancestry than in patients of European ancestry;18,28,29,31,32,43 and (3) implications of social determinants of health, including differential smoking habits and healthcare disparities.44,45 However, considering comparable healthcare access between Asians and non-Hispanic Whites and a similar mutational profile between Asian and Black patients, race-related genetic and environmental factors have a multi-faceted relationship with TMB and thus require further exploration.46,47
Targeted gene panel assays are increasingly being considered in clinical practice compared to WES because of their short turnaround time, identification of targetable mutation, and cost-effectiveness, despite the latter being the current reference standard. 48 Numerous preanalytical factors such as specimen types, quality, and quantity; sequencing factors such as gene list, panel size, and sequencing depth; and bioinformatics algorithms with somatic and germline variant calling and filtering are likely to affect TMB estimates.49,50 These factors may also explain the targeted-gene panel subgroup as one of the main sources of heterogeneity in our review.
Based on the mounting evidence on the prognostic role of TMB assessed in circulating tumor DNA, B-FIRST (NCT02848651), a phase II clinical trial, revealed that advanced NSCLC patients (IIIB-IV) who received first-line atezolizumab had longer OS and higher ORR if blood TMB (bTMB) was ⩾16 (14.5 Mut/Mb).51–53 Nine studies in our review reported the use of FFPE tumor specimens, which may lead to TMB overestimation due to deamination from formalin fixation.16,18,19,28–33,54 Only one study reported tumor fractionation, and even low tumor fractions might result in deflated TMB levels.29,55 Four studies utilized U.S. Food and Drug Administration-approved gene panel tests, namely, FoundationOne, FoundationOne CDx (F1 CDx), and MSK-IMPACT, which have shown acceptable concordance with WES in empirical and in-silico TMB assessment.18,29,31,32,56,57 Similarly, a pooled analysis from the TCGA database did not find a significant prognostic impact of TMB on LUAD (HR 1.35, 95% CI 0.89, 2.05) and LSCC (HR, 0.77, 95% CI 0.50, 1.18) survival independent of TMB assessment methods (WES, MSK-IMPACT, and FoundationOne CDx). 58 Two international consortiums, the Friends of Cancer Research and Quality Assurance Initiative Pathology, validated bioinformatic algorithms, standardized panel-based TMB estimates using in-silico analysis, and assessed between-TMB panel variations in FFPE tumor specimens for multiple cancer types, thereby taking TMB one step closer to a more refined prognostic biomarker.59–61
Numerous landmark clinical trials in NSCLC that evaluated TMB, assessed using the F1 CDx assay, as a predictive biomarker for ICI response used ⩾10 Mut/Mb (~200 mutations by WES) as the cut-off for defining ‘high TMB’.52,62–64 A 23-gene TMB estimation panel evaluated in an early-stage, resected NSCLC Chinese cohort (n = 89) showed a statistically significant association between DFS and TMB only when TMB cut-offs were between two and four mutation counts, despite having a high concordance with WES (Spearman r = 0.8487, p < 0.0001). 65 The TMB threshold for prognostic purposes depends on the tumor microenvironment and may vary according to the cancer type, the assay, and clinical factors such as race and treatment history.43,66 Six studies in our review used either tertile cut-offs or median TMB levels, contributing to high heterogeneity.16,19,28,30,31,33 Due to the lack of comparability, these TMB cut-offs cannot be used in future clinical studies. 67 Phase III of the Friends of Cancer Research (FoCR) study aims to validate TMB thresholds for individual cancer types, thereby allowing cross-trial comparisons and, eventually, widespread clinical implementation. 68
Several aspects of the present study strengthen its role in the contemporary NSCLC therapeutic landscape. Our meta-analysis is the first to report the pooled effect estimates of OS and DFS in early-stage, resected NSCLC patients by TMB levels. The lack of inherent prognostic value of TMB in this large, non-ICI-treated cohort provides a benchmark for future studies, considering the accumulating evidence of the role of neoadjuvant immunotherapy in early-stage NSCLC.69–71
Despite our efforts to deliver a statistically robust meta-analysis, certain limitations remain. Although two studies in the OS and three studies in the DFS analyses showed a prognostic association of high TMB in early-stage NSCLC, the limited sample sizes (ranging from 55 to 148) in those individual studies imply limited statistical power. Our data support this contention, as the association between NSCLC DFS and TMB levels is significantly stronger in cohorts with fewer than 100 participants (p = 0.003). Although publication bias in the OS and DFS analyses was not evident, neither Egger’s regression nor funnel plot symmetry is reliable in a meta-analysis with fewer than 10 studies.72,73 Lastly, variable study designs, multiple TMB assessment methods, unadjusted clinicopathological characteristics, and the different survival analysis methods may further confer heterogeneity. However, we strived to ascertain the sources of heterogeneity through subgroup analyses. Based on GRADE criteria, the true effect might be markedly different from the estimated effect, thus requiring more research into this area. 74
Conclusion
Our systematic review and meta-analysis provides evidence against an inherent prognostic role of TMB in early-stage, resected NSCLC patients. In East Asian patients, high TMB was associated with shorter DFS; however, this association was not apparent in non-Asian patients. Ethnicity, TMB assessment methods, and sample size were the sources of between-study heterogeneity for DFS. Overall, as per the GRADE criteria, the certainty of the evidence is low, and thus further research in harmonization and standardization of TMB assays is required for its widespread utilization in clinical practice.
Supplemental Material
sj-docx-1-tam-10.1177_17588359231195199 – Supplemental material for The prognostic value of TMB in early-stage non-small cell lung cancer: a systematic review and meta-analysis
Supplemental material, sj-docx-1-tam-10.1177_17588359231195199 for The prognostic value of TMB in early-stage non-small cell lung cancer: a systematic review and meta-analysis by Durgesh Wankhede, Sandeep Grover and Paul Hofman in Therapeutic Advances in Medical Oncology