Abstract
Lung cancer is the leading cause of cancer death worldwide. Along the entire timeline of lung cancer identification, diagnosis and treatment, clinicians and patients face challenges in clinical decision-making that could be aided by useful biomarkers. In this review, we discuss the development of biomarkers and the qualities that are ideal in a biomarker candidate, the types of biospecimens that can be utilized for biomarker development in lung cancer, and how biomarkers could be clinically useful at various points along the lung cancer timeline. We then review biomarkers that have been validated and are clinically available to assist with the management of lung nodules and the diagnosis of lung cancer, including blood-based biomarkers that assist with decision-making prior to an invasive diagnostic procedure, as well as biomarkers from specimens obtained during a bronchoscopy and applied in cases of an inconclusive biopsy result. Finally, we discuss challenges in biomarker application and recent publications relevant to future lung cancer biomarker development.
Background
Lung cancer is the leading cause of cancer-related death globally, with an estimated 2 million deaths per year and an incidence of 2.3 million new diagnoses per year. 1 The overall five-year survival for lung cancer remains only 23%. 2 When lung cancer is diagnosed at a localized or earlier stage, the five-year survival is higher at 56–80%, but only 16% of cases are diagnosed at an early stage. 3 Therefore, there has been significant focus on improving early detection and diagnosis of lung cancer. Accurate biomarkers to predict lung cancer risk may help identify the patients at highest risk of developing lung cancer and thereby improve early detection. Proper integration of non-invasive biomarkers into the evaluation of lung nodules may prevent unnecessary invasive interventions and costs for benign disease. Development of novel biomarkers will shape the future landscape of prediction, diagnosis, prognostication, and monitoring of lung cancer.
This article overviews the current landscape of biomarkers, including the types of specimens and their possible applications as well as implementation along the diagnostic and treatment spectrum. It will then focus on the current state of development of bronchoscopic biomarkers and future directions.
Development of biomarkers
The advent of -omics technologies, which enable measurement and analysis of large amounts of patient-centered data to identify signatures of disease and response to treatment, has fueled numerous innovative studies of non-invasive biomarker testing in lung cancer. This research has focused on the challenges that standards of practice face in the evaluation and diagnosis of lung cancer. The Centers for Disease Control and Prevention has developed a model to aid the development of biomarkers, called the ACCE model, which has five stages: discovery, analytical validation, clinical validation, clinical utility, and implementation. 4 This model optimizes the likelihood that the biomarkers developed can have a significant clinical impact on lung cancer care. Artificial intelligence has the potential to assist in the discovery of biomarkers for lung cancer, yet the other stages of biomarker development remain relevant to maximizing clinical impact. 5
Biomarker qualities
In order to have a clinical impact on lung cancer care, there are certain attributes a biomarker should demonstrate. A successful biomarker must advance and surpass current standard of practice. 6 It should have optimal accuracy, precision and reproducibility. 7 Often biomarkers undergo validation with testing on multiple training sets prior to testing in a cohort of intended use subjects. It is ideal if the training sets match the characteristics of the intended use population to optimize clinical validation. 7 Depending on the stage of testing the biomarker is intended for, the burden of disease or stage of cancer may impact the applicability of the test.
To optimize applicability in a large population, clinical validation should be performed in a diverse population including patients from different races, ethnicities and geographic backgrounds. This is particularly important given the known health disparities in lung cancer for rural and minority populations. African American and Native Hawaiian individuals have the highest incidence of lung cancer and African American men have the highest lung cancer mortality of all groups. 2 There is significant underrepresentation of African Americans and other racial and ethnic minorities in biobanking programs. 8
The application of biomarkers in practice should influence clinical decision making and patient outcomes. Even biomarkers with adequate accuracy and reproducibility may not impact clinical outcomes. 4 Biomarkers can serve to identify patients at risk of developing cancer, identify cancer early, minimize patient risk from invasive interventions, and reduce downstream costs. However, careful consideration should be given to the potential harms of biomarker use. Biomarker results should be easy to interpret and describe to patients. Misinterpretation of results can lead to missed diagnosis and upstaging, or to unnecessary procedures or interventions. Difficult-to-interpret results may also affect patient distress and anxiety levels. Further exploration of how to integrate biomarker results into shared decision making with patients is critical and lacking. Application of a biomarker should also be cost-effective, which in cost-effectiveness analysis (CEA) is often defined using quality-adjusted life years (QALY) and the incremental cost-effectiveness ratio (ICER). 4
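As a minimal sketch of the cost-effectiveness arithmetic mentioned above, the ICER is the difference in cost between two strategies divided by the difference in QALYs they yield. The function name, the per-patient costs, the QALY values and the willingness-to-pay threshold below are all hypothetical and chosen only for illustration; they are not taken from any study cited in this review.

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    delta_cost = cost_new - cost_old
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when both strategies yield equal QALYs")
    return delta_cost / delta_qaly


# Hypothetical per-patient values (illustration only, not from any cited study)
usual_care = {"cost": 10_000.0, "qaly": 8.00}
with_biomarker = {"cost": 10_500.0, "qaly": 8.05}

ratio = icer(
    with_biomarker["cost"], usual_care["cost"],
    with_biomarker["qaly"], usual_care["qaly"],
)

# Hypothetical willingness-to-pay threshold per QALY gained
threshold = 100_000.0
print(f"ICER: {ratio:,.0f} per QALY gained")
print("Below threshold" if ratio <= threshold else "Above threshold")
```

With these assumed inputs the ICER is roughly 10,000 per QALY gained. A negative ratio needs separate interpretation, since it can mean that the new strategy is either cheaper and better or costlier and worse.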
Biospecimens
Biospecimens should be easily accessible and obtainable, simple to prepare and store, and available in sufficient amounts for measurement. 7 With advances in specimen collection and processing technologies, numerous biomolecules have been discovered from a variety of sources. Most of the biomarkers developed have focused on the adenocarcinoma and squamous cell carcinoma subtypes of lung cancer, with few pertaining to small cell lung cancer or other subtypes of non-small cell lung cancer. Biospecimens can be obtained from multiple sources, most commonly blood, but also expectorated sputum, bronchial lavage or aspirate, airway epithelium, urine, saliva and even exhaled air. Samples can be analyzed to identify numerous biomolecules. Figure 1 lists many of the types of lung cancer biomarkers under development along with their respective sources.

Figure 1. Types of biomarkers and biospecimen sources. ctDNA: circulating tumor DNA; miRNA: microRNA; mRNA: messenger RNA; SNPs: single nucleotide polymorphisms; VOCs: volatile organic compounds.
Autoantibodies develop in response to tumor antigens and have been identified in all histological types and stages of lung cancer. 9 A well-validated autoantibody panel was found to have high specificity (91%), but low sensitivity (37%). 10 Lung cancer can also activate the complement pathway increasing downstream split products in plasma. C4d levels have been linked to increased lung cancer risk and when evaluated in patients with incidental pulmonary nodules (IPN), C4d was found to be elevated in patients with malignant nodules. This test also had high specificity (89%) with low sensitivity (44%) with 84% negative predictive value (NPV) and 54% positive predictive value (PPV). 11
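Predictive values depend on how common cancer is in the tested population, whereas sensitivity and specificity do not. The sketch below shows that relationship using Bayes' rule. The sensitivity (44%) and specificity (89%) are the C4d figures quoted above, but the prevalence values and the function name are assumptions made for illustration and do not describe the cohort of the cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as quoted in the text for C4d;
# the prevalences are hypothetical and for illustration only.
for prevalence in (0.05, 0.25, 0.50):
    ppv, npv = predictive_values(0.44, 0.89, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

The same test yields a higher PPV and a lower NPV as the prevalence of malignancy rises. For that reason, reported predictive values should be read alongside the characteristics of the population in which they were measured.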
Many biomarkers are based on measurable serum antigens or proteins. Some of the proteins identified include carcinoembryonic antigen, cancer antigen 125, cytokeratin fragment 21–1, cancer antigen 15-3, neuron-specific enolase, and pro-gastrin-releasing peptide. These are often evaluated as panels that combine multiple proteins as well as autoantibodies.12,13 The PANOPTIC trial evaluated a two-protein biomarker ratio in combination with the Mayo model lung nodule risk calculator in patients with intermediate-risk nodules and found that the integrated classifier had a 40% relative risk reduction of invasive testing in benign nodules. 14
Host-tumor interactions can produce circulating microRNA (miRNA) in plasma, which has been evaluated as a biomarker for cancer diagnosis and prognosis. In one study, a liquid miRNA test and miRNA signature classifier (MSC) resulted in a fourfold and fivefold reduction in the LDCT false-positive rate. It has also shown good performance in monitoring post-surgical patients for recurrence or relapse. 15 Circulating tumor DNA (ctDNA) is better established as a biomarker for advanced-stage lung cancer, where it is used with next-generation sequencing to detect mutational burden. Sensitivity of these tests for stage I cancer (15%) is lower than for metastatic cancer (up to 100%), with an overall sensitivity of 48%. 16
Lung cancer and other malignant tissue may exhibit abnormal methylation patterns of DNA with global hypomethylation but hypermethylation of the promoter region of tumor suppressor genes. 17 Similar patterns can also be seen in other pulmonary diseases like COPD and fibrotic interstitial lung disease. 18 One study by Hulbert et al. assessed a three-gene model in sputum and plasma samples from patients with suspicious nodules on CT imaging, which was found to have 93% sensitivity and 62% specificity in plasma and 98% sensitivity and 71% specificity in sputum. 19
Airway epithelial gene expression represents another area of interest, in which bronchial genomic classifiers are identified as biomarkers in the context of a “field of injury” concept. Patients with intermediate-risk nodules undergoing diagnostic bronchoscopy may benefit from additional bronchial epithelium biomarker testing, especially in the setting of indeterminate or non-diagnostic results. 20 Similar evaluation has been done on nasal epithelium gene expression. 21 Volatile organic compounds (VOCs) are small molecular mass compounds that can be detected and measured in exhaled breath and are believed to reflect metabolic processes at a tissue level, including inflammation and oxidative stress. More than 3000 VOCs have been identified that could be related to lung cancer. 22 While individual VOC measurements lack specificity for disease, a signature selection of VOCs may be an optimal non-invasive biomarker to detect lung cancer.
The tumor microenvironment (TME) includes not only tumor epithelial cells but also the surrounding vasculature, cancer-associated fibroblasts, extracellular matrix and infiltrating immune cells. Elements of the tumor microenvironment may have prognostic utility and help determine therapeutic response. 23 Cancer-associated fibroblasts indicate poor prognosis and higher risk of recurrence. 24 The extracellular matrix may predict response to adjuvant chemotherapy. 25 Favorable prognosis and survival have been associated with increased tumor-infiltrating lymphocytes with a high CD4:CD8 T cell ratio. 26
Several urine metabolites have been explored to help non-invasively identify lung cancer risk and diagnosis with variable sensitivity and specificity, though most await further validation. Limitations include the small sample size of some of these studies and the exogenous effects that dietary and drug intake may have on performance. 27
Radiomics is an advancing field that uses advanced image analysis to extract high-content information from medical images and create radiomic signatures, which serve as imaging biomarkers to assess lung cancer risk, characteristics, treatment response and prognosis. Radiomics evaluates morphological, histogram, texture and airway characteristics. 28 Recently, a commercially available artificial intelligence (AI) radiomics-based CAD tool was shown to improve prediction of cancer risk in pulmonary nodules, increasing sensitivity from 38% to 56% when used in combination with the Mayo model calculator. 29 Widescale validation, however, is still lacking.
Biomarker applications in lung cancer timeline
There are multiple points along the lung cancer timeline in which biomarkers can influence patient care and outcomes. Figure 2 illustrates multiple intervals along the lung cancer timeline in which biomarkers could alter clinical outcomes.

Figure 2. Biomarker application along the lung cancer timeline. LCS: lung cancer screening.
Risk assessment
There are numerous risk calculators to estimate an individual's risk of developing lung cancer. Most use patient characteristics such as age, gender, race, smoking history, personal or family history of lung cancer, asbestos exposure, education, BMI or history of COPD to determine risk. 30 However, numerous other exposures that are not included in these risk calculators have been linked to the development of lung cancer, such as radon, domestic fuel smoke, occupational exposures, and infectious and inflammatory disease. 31 About 15% of non-small cell lung cancer patients are never smokers. 32 Addition of biomarkers, either alone or in conjunction with risk model calculators, can help identify patients in the general population at risk of developing lung cancer. Biomarkers could also identify high-risk patients who may benefit from chemoprevention trials.
Lung cancer screening with low-dose computed tomography (LDCT) is one of the major efforts to improve early detection and diagnosis. The National Lung Screening Trial (NLST) evaluated patients at high risk of developing lung cancer and found a 20% relative risk reduction in lung cancer mortality when patients were screened with LDCT yearly. 33 Implementation of lung cancer screening has been poor, as fewer than 5% of eligible patients are screened due to wide-ranging barriers. 34 Thirteen percent of initial LDCTs have a false-positive result, that is, a benign nodule, which may lead to unnecessary risk from interventions, patient anxiety and cost. 35 In 2021, the USPSTF modified the eligibility criteria for lung cancer screening in order to include patients at high risk of developing lung cancer who were excluded by the original criteria defined by the NLST. 36 Integration of a biomarker into lung cancer screening may improve accuracy in determining who should undergo LDCT screening. This could lead to decreased lung cancer deaths without increasing harms or cost and help identify patients unlikely to benefit from lung cancer screening. It could also be incorporated into personalized shared decision making with patients.
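A relative risk reduction such as the 20% reported by the NLST does not by itself convey the absolute benefit, which depends on the baseline mortality in the screened population. The following sketch relates relative risk reduction, absolute risk reduction and number needed to screen. The two mortality rates and the function name are hypothetical: the rates were chosen only so that the relative reduction equals 20%, and they are not the NLST event rates.

```python
def risk_reduction(control_rate, intervention_rate):
    """Summarize the effect of an intervention on an event rate."""
    relative_risk = intervention_rate / control_rate
    rrr = 1 - relative_risk                   # relative risk reduction
    arr = control_rate - intervention_rate    # absolute risk reduction
    # Number needed to screen to prevent one event; infinite if no benefit
    nns = 1 / arr if arr > 0 else float("inf")
    return {"rrr": rrr, "arr": arr, "nns": nns}


# Hypothetical lung cancer mortality over the follow-up period
# (illustration only, not the NLST event rates)
result = risk_reduction(control_rate=0.020, intervention_rate=0.016)
print(f"RRR: {result['rrr']:.0%}")
print(f"ARR: {result['arr']:.2%}")
print(f"Number needed to screen: {result['nns']:.0f}")
```

Under these assumptions the same 20% relative reduction corresponds to an absolute reduction of 0.4 percentage points. A biomarker that enriches the screened population for higher baseline risk would therefore be expected to lower the number needed to screen.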
Diagnosis
About 1.6 million incidental pulmonary nodules are identified annually, 5.2% of which will represent a primary lung cancer. 37 Guidelines for management of pulmonary nodules encourage physicians to determine the pre-test probability that the nodule(s) represent lung cancer, splitting them into low, intermediate and high risk categories. Low risk nodules undergo continued CT surveillance and high risk nodules are directed to definitive surgical treatment for acceptable candidates. Intermediate risk nodules have the option of close CT surveillance, PET and biopsy. 38
There are several validated prediction calculators; however, Tanner et al. showed that physician specialist assessment of risk is more accurate than these models. Despite this, physicians did not follow guideline-based recommendations. 39 Several biomarkers have been developed or are under development to assist physicians in determining whether intermediate-risk nodules should undergo biopsy. Biomarkers may give physicians the reassurance needed to follow guidelines in the management of nodules and may avoid unnecessary procedures, complications, and costs for benign disease.
While percutaneous biopsy may have a high diagnostic yield, it carries a higher risk of pneumothorax than bronchoscopic biopsy, and the diagnostic yield of navigational bronchoscopy is highly variable. 40 For patients with indeterminate biopsy results, the addition of a biomarker could help determine whether the patient should undergo continued surveillance or a repeat or more definitive biopsy. There is significant practice variability regarding the acceptable rate of biopsy or resection of benign nodules. Incorporation of biomarkers into practice could help minimize intervention on benign disease.
Prognosis
Classically, prognosis has been determined by the stage of lung cancer using the TNM staging criteria. There has been some emphasis on lymphatic vessel invasion on pathology as an indicator of worse patient prognosis. 41 The presence of certain oncogenes and the expression of tumor suppressor genes can aid in prognostication. For example, EGFR, ERCC and RRM have been linked to favorable prognosis, whereas KRAS, p53 and HER2 have been linked to poor prognosis. 42 Biomarkers could be utilized to non-invasively estimate prognosis at diagnosis.
Within the spectrum of lung cancer behavior, there are more indolent or slow-growing lung cancers for which the choice between surveillance and treatment is debated. 43 Biomarkers could play a role in predicting the development of invasive disease that requires treatment as opposed to surveillance.
Response to treatment
Clinical practice already incorporates the use of biomarkers to predict response to therapy in patients undergoing targeted therapy or immunotherapy. Especially in later-stage lung cancer, measurement of PD-1/PD-L1, CTLA-4, mutational burden or the presence of tumor-infiltrating lymphocytes has been used to estimate response to treatment. 42 Circulating tumor DNA has been used to diagnose and predict response to treatment with targeted therapies. 44 With the advent and adoption of neoadjuvant chemotherapy and immunotherapy, there may be a role for biomarkers to predict pathologic response prior to resection. Biomarkers could function as a quantitative measure of disease burden.
Recurrence and surveillance
Both the National Comprehensive Cancer Network (NCCN) and the American College of Chest Physicians (ACCP) recommend surveillance of patients who have undergone curative-intent treatment of lung cancer with serial chest CT every 6 months for 2 years, followed by annual CT. However, only 61% of patients receive guideline-adherent surveillance. 45 Findings on surveillance CT can be nonspecific, reflecting either normal treatment change or recurrence, and can require additional imaging or invasive interventions. Location of recurrence within previously radiated fields or along suture lines may also hinder accurate biopsy. Adding biomarkers to surveillance may avoid the cost of additional testing and the risk of interventions needed to confirm recurrence, and may allow relapse to be diagnosed earlier and more accurately.
Clinically available diagnostic biomarkers
Pulmonary nodule diagnostic evaluation can be aided by biomarkers prior to the decision to obtain a biopsy, as well as after a biopsy attempt. A diagnostic biomarker can be used to help with the risk stratification of a nodule, potentially moving it from the intermediate-risk category into a high-risk or low-risk category to more clearly direct next steps in management. These biomarkers should be measured in specimens that can be collected without an invasive procedure, such as blood, nasal swabs or exhaled breath. Several biomarkers have shown promise in this area, and some have been developed into commercially available tests. The ideal biomarker would be one with both high sensitivity and high specificity for identifying a patient with lung cancer; however, this holy grail has been elusive in the lung nodule setting. In the absence of such a single biomarker, investigators have focused on identifying adequate “rule-in” or “rule-out” tests.
The Nodify XL2 test (Biodesix) is a blood-based proteomic test that measures two proteins, LG3BP and C163A. The ratio of these two proteins was shown to impact the probability of a nodule being diagnosed as benign.14,46 The PANOPTIC trial demonstrated that this biomarker, used in conjunction with the Mayo nodule risk calculator, had a sensitivity of 97%, specificity of 44% and negative predictive value of 98%. This trial reported that the integrated classifier of the biomarker plus risk calculator outperformed available risk calculators, physician estimates of lung cancer risk, and PET scans. 14 In a clinical utility study, the impact of this integrated classifier on patient diagnostic workup and outcomes was studied with a propensity-matched design. With this study design, investigators found that patients for whom the integrated classifier was applied were less likely to undergo an invasive procedure, with an absolute difference between groups of 14%. 47
The EarlyCDT-Lung Test (OncImmune) is a blood-based test of a panel of 7 autoantibodies that can be produced in response to the presence of lung cancer. The test has a high specificity (98%) and high positive predictive value (78%). 48 It can be used as a “rule-in” test by assessing a blood sample from a patient with intermediate risk of lung cancer; if the autoantibodies are positive, the patient's prior estimated risk can be adjusted upward. One analysis estimated use of the EarlyCDT-Lung test to be cost-effective. 49 To our knowledge, clinical utility data for this test have not been published. In 2019, Biodesix acquired OncImmune's incidental pulmonary nodule malignancy EarlyCDT-Lung test in the United States. Biodesix now offers both the Nodify XL2 test and the now-named Nodify CDT test as proteomic tests that can be performed on blood-based samples to “rule-out” with the Nodify XL2 and “rule-in” with the Nodify CDT, and in combination to shift patients out of the intermediate-risk range into the low- or high-risk range.
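One generic way to reason about how a “rule-out” or “rule-in” result shifts a nodule out of the intermediate-risk range is to convert the pre-test probability to odds and multiply by a likelihood ratio. The sketch below illustrates that calculation only; it is not the algorithm used by either commercial test. The 97% sensitivity, 44% specificity and 98% specificity are the figures quoted above, while the 30% pre-test probability, the 40% sensitivity assumed for the rule-in test and the function names are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability via odds: post-test odds = pre-test odds x LR."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


pre_test = 0.30  # hypothetical intermediate-risk nodule

# Rule-out: negative result, using the sensitivity and specificity quoted above
_, lr_neg = likelihood_ratios(sensitivity=0.97, specificity=0.44)
after_negative = post_test_probability(pre_test, lr_neg)

# Rule-in: positive result; specificity as quoted above,
# sensitivity of 0.40 is an assumption made for this example
lr_pos, _ = likelihood_ratios(sensitivity=0.40, specificity=0.98)
after_positive = post_test_probability(pre_test, lr_pos)

print(f"After negative rule-out test: {after_negative:.1%}")
print(f"After positive rule-in test:  {after_positive:.1%}")
```

In this simplified model a negative high-sensitivity test moves the estimate well below the starting 30%, and a positive high-specificity test moves it well above. The reported trial figures apply to specific study populations, so actual risk estimates should come from the validated classifiers rather than from this calculation.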
When patients are sent for bronchoscopy, despite advances in techniques and tools to access pulmonary nodules, some procedures yield inconclusive results. The Percepta Genomic Sequencing Classifier (GSC; Veracyte) is a test that can be performed on an endobronchial brush sample obtained from normal-appearing mucosa of the right mainstem bronchus at the time of bronchoscopy. 20 Building on findings that the entire respiratory tract is involved in the “field of injury” of cigarette smoke, investigators found that gene expression changes that are relevant to lung cancer risk can be detected from this endobronchial brush. 50 The brush sample is collected during a diagnostic bronchoscopy and analyzed with whole-transcriptome RNA sequencing of over 1000 genes; if the diagnostic samples are inconclusive, the Percepta GSC can then be used as a biomarker for further risk stratification. 51 A study of Percepta application demonstrated that for cases with non-diagnostic initial bronchoscopies, Percepta GSC results down-classified the malignancy risk in 34%, with physicians deciding not to pursue an additional invasive procedure in 74% of these instances. 52 Another study of cases in which Percepta GSC increased the risk of malignancy from high-risk to very high-risk demonstrated a shift toward more aggressive next steps in management and improved provider confidence in decision making. 53 Follow-up work has further demonstrated the ability of this test to re-classify patients’ lung cancer risk to both very high-risk and low-risk and to impact clinical decisions. 54
Potential future biomarkers in bronchoscopy
There have been published accounts of additional potential biomarkers for lung nodule diagnosis that are in various stages of development, and we are eager to see the results of ongoing and future studies in this area.55–57 Although the biomarker tests discussed above are clinically available, we have no data regarding their uptake and clinical application. As with any new technology, there will be variability in the uptake of these new tests, and understanding the barriers to uptake will be important. Correct interpretation of results, and appropriate discussion of them with patients, can be complicated even with the most straightforward of new tests, and is likely more complex with the biomarkers currently available given the types of tests they are. Discussing risk estimates with patients is a challenge for even the most experienced clinician, and patient and clinician understanding and preference have not been studied in this space. We hope that clinicians become more comfortable with available biomarkers and their application, and that there is continued development of biomarker tests to apply toward this clinical challenge.
The optimal characteristics for a biomarker test, as discussed earlier in this review, are difficult to achieve, but important factors to consider. A benefit of using ctDNA as a biomarker is that it can be highly specific. We also speculate clinicians may have higher confidence in a test with results that are directly linked to the presence of cancer cells and are more familiar to interpret. There are challenges, however, to use of ctDNA as a biomarker in early stage lesions, and the vast majority of ctDNA studies in lung cancer have been done in the setting of prognostication and disease monitoring.58,59 ctDNA has been difficult to study in a pulmonary nodule population given the low amount of ctDNA detected in the circulation from smaller nodules, though technologies and techniques are advancing.60,61 Sensitivity of ctDNA may be limited in some cases of less invasive adenocarcinoma that have a low-shedder phenotype. 61
While the ideal situation would be to use a biomarker to avoid invasive testing, for the foreseeable future we expect to be using bronchoscopy for diagnosis of lung cancer, which comes with the potential for a non-diagnostic result. The application of a biomarker in this clinical scenario is a relevant one, as work with the Percepta GSC test has shown. Other sample types, including bronchoalveolar lavage fluid, can be accessed via bronchoscopy and may be used for biomarker evaluation. 62 For example, cell-free tumor DNA has been successfully isolated from BAL fluid.63–65
While non-small cell lung cancer is the most common lung cancer diagnosis, other types of lung malignancy do arise and can have their own management challenges. For example, small cell lung cancer can be aggressive and metastasize quickly, making timely diagnosis and treatment paramount. In order to adequately answer the question of whether a nodule represents lung cancer, not just non-small cell lung cancer, biomarker validation must include all types of lung malignancy in sufficient numbers.
Conclusions
Biomarkers have the potential to play an important role at all points along the timeline of lung cancer development. Various biospecimen types have the potential to provide access to relevant biomarkers. In the area of lung nodule diagnostic evaluation, several biomarkers have been developed to the point of clinical availability and have demonstrated the potential to move the needle on risk stratification, resulting in fewer invasive procedures for low-risk nodules and a more aggressive diagnostic approach for high-risk nodules. Advancements in technologies and techniques may pave the way for future biomarker candidates to be studied and developed.
Footnotes
Author contributions
N.T. and M.N. both contributed to the conception, literature review, preparation and revision of the manuscript.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
