Abstract
BACKGROUND:
Lung cancer is the leading cause of cancer mortality worldwide. Exhaled breath condensate (EBC) can be collected non-invasively and may have enormous potential as a source of biomarkers for the early detection of lung cancer.
OBJECTIVE:
To investigate the proteomic differences of EBC between lung cancer and CT-detected benign nodule patients, and determine whether these proteins could be potential biomarkers.
METHODS:
Proteomic analysis was performed on individual samples from 10 lung cancer patients and 10 CT-detected benign nodule patients using data-independent acquisition (DIA) mass spectrometry.
RESULTS:
A total of 1,254 proteins were identified, and 21 proteins were differentially expressed in the lung adenocarcinoma group compared to the benign nodule group (
CONCLUSION:
Significantly differentially expressed proteins were detected between lung cancer and the CT-detected benign nodule group from EBC samples, and these proteins might serve as potential novel biomarkers of EBC for early lung cancer detection.
Introduction
Lung cancer is a major public health problem and ranks first among the leading causes of cancer morbidity and mortality worldwide according to GLOBOCAN 2018 [1]. Over the past several decades, the incidence of lung cancer has been increasing in China, due mostly to tobacco smoking, severe outdoor air pollution, and indoor air pollution from cooking fumes [2]. Most lung cancer cases are diagnosed in the late stages (Stage III and IV), which are hard to cure [3]. In contrast, about 80% of low-dose computed tomography (LDCT)-detected lung tumors are found at an early curable stage [4]. Thus, the early detection of lung cancer is essential to reduce mortality from the disease. The most effective method for the early detection of lung cancer in at-risk individuals in China and elsewhere is LDCT. The landmark National Lung Screening Trial (NLST) was carried out to measure whether LDCT could reduce the mortality of lung cancer [5]. The results showed that in asymptomatic long-term smokers, lung cancer mortality could be reduced by 20% through annual LDCT screening. However, the NLST also found a high false-positive rate of about 25% [5]. Most of the nodules detected by LDCT are benign, and many patients with positive scans receive unnecessary treatment.
Exhaled breath condensate is a biological fluid that comes from the respiratory tract and consists of water vapor and aerosolized particles [6]. It contains a variety of biological markers, such as proteins, metabolites, and DNA, which have been used to study the causes of pulmonary diseases [7, 8, 9]. The collection of EBC is simple, non-invasive, and cost-efficient, which allows for its potential application in low-dose CT lung cancer screening to help differentiate benign from malignant lesions.
Several studies have considered how various types of biomarkers can be measured in EBC, and have profiled the presence of specific biomarkers in various disease states [7, 10]. Recently, proteomic technology has developed rapidly [11, 12], which has improved the in-depth protein profiling for biomarker identification. Although EBC is considered to be an ideal tool to explore biomarkers for lung cancer, the low protein concentration (approximately 1
Data-independent acquisition (DIA) is a new mass spectrometry technology developed in recent years [17, 18]. In contrast with data-dependent acquisition (DDA), DIA uses a different data scanning mode: the entire full scan range of the mass spectrum is divided into several windows, and then all ions in each window are detected and fragmented [19]. Therefore, the information of all ions in the sample can be obtained without omission. Based on these technical advantages, DIA is extremely suitable for traceable analysis of large-scale samples and could complete massive analysis on EBC samples [20, 21, 22, 23].
Thus, we designed a study to explore proteomic differences between lung cancer and CT-detected benign nodules in EBC samples using the DIA method, and conducted bioinformatics analysis to identify potential biomarkers for lung cancer and the underlying functions and pathways of the differentially expressed proteins. Our goal is to discover new biomarkers that might improve the criteria for LDCT and help to reduce the false-positive rate and the harms from unnecessary treatment.
Baseline characteristics of all the participants
Sample collection
A total of 20 subjects were included in this study: 10 lung cancer patients and 10 benign nodule patients (Table 1). EBC samples were collected from subjects between May 21, 2018 and December 2018 at Shanghai Chest Hospital (Shanghai, China). R-tubes (Respiratory Research Inc., USA) were used for EBC collection, and nose clips were used to avoid nasal influence according to the guidelines of the American Thoracic Society and European Respiratory Society (ATS/ERS) [14]. Participants breathed normally through the tube for about 10–12 min, and the final volume of EBC was 1.5–2 mL. Samples were collected before 10:00 am, and stored at
The lung cancer group (LC) included newly diagnosed patients with histopathologically confirmed lung adenocarcinoma. Inclusion criteria included FEV1/FVC greater than 70% (FEV1: forced expiratory volume in one second; FVC: forced vital capacity). The exclusion criteria were prior chemotherapy or lung cancer surgery, airway inflammation or other lung infections in the past 3 months, or other types of cancer. The diagnosis of lung cancer was made according to the IASLC/ATS/ERS classification for lung adenocarcinoma [15] and the WHO classification of lung tumors [16].
The benign pulmonary nodule group (PN) included patients who had a positive LDCT scan and were diagnosed with lung nodules. Inclusion criteria included 50–74 years of age, no signs or symptoms of lung cancer, a tobacco smoking history of 30 pack-years or greater, and being a current smoker or having quit smoking within the last 15 years. The exclusion criteria for this group were anti-tumor or anti-inflammatory treatment before sample collection, other lung diseases, or a lesion larger than 3 cm.
This study was performed according to the Code of Ethics of the World Medical Association and approved by the Ethics Committee of East China University of Science and Technology. Informed consent was signed by each participant.
Sample preparation
An in-house filter-aided sample preparation (FASP) protocol was used for sample digestion. A 10 kDa filter membrane was used for sample concentration, and mixed with 100
LC-MS/MS analysis
The nano-flow HPLC system (Easy nLC-1200) (Thermo Fisher Scientific, CA, USA) was equipped with a C18 column (75
For DDA analysis, the peptides were separated by a 120 min gradient at the flow rate of 250 nL/min. Mobile phase A consisted of 0.1% FA in water, and mobile phase B consisted of 0.1% FA in acetonitrile. The gradient program was set as follows: 0–97 min, mobile phase B from 8% to 30%; 97–110 min, mobile phase B from 30% to 100%; 110–120 min, held at 100% for mobile phase B.
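The gradient program above can be read as a piecewise-linear function of time. The following is a minimal sketch of that reading, assuming linear ramps between the stated breakpoints; the names `GRADIENT` and `percent_b` are illustrative and do not come from the instrument method.

```python
# Breakpoints taken from the gradient program in the text:
# (time in minutes, % mobile phase B).
GRADIENT = [(0.0, 8.0), (97.0, 30.0), (110.0, 100.0), (120.0, 100.0)]


def percent_b(t_min: float) -> float:
    """Return % mobile phase B at time t_min.

    Assumption: the composition changes linearly between breakpoints.
    """
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the 0-120 min gradient")
    # Walk over consecutive breakpoint pairs and interpolate in the matching one.
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")
```

Under this assumption, the shallow 8% to 30% ramp occupies 97 of the 120 minutes, which is where most peptide separation would be expected to occur.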
After separation, a full scan was performed on a Q-Exactive HF mass spectrometer between 300 and 1800 m/z. The detection mode was positive ion, the AGC target was 3e6, and the maximum injection time was 50 ms. Each full scan was followed by 20 ddMS2 scans (MS2 scans). For these, the isolation window was 1.6 Th, the resolution was 30K, the AGC target was 3e6, the maximum ion injection time was 120 ms, and the normalized collision energy was 27.
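To illustrate how a data-dependent cycle of this kind chooses what to fragment, the sketch below picks the most intense precursors from one full scan and computes the isolation range around each. It assumes that the 20 ddMS2 scans target the 20 most intense precursors and it ignores charge-state filtering and dynamic exclusion; the function names are hypothetical and this is not the instrument's actual selection logic.

```python
def select_top_n(precursors, n=20):
    """Pick the n most intense precursors from one full scan.

    precursors: list of (mz, intensity) tuples.
    Assumption: selection is by intensity alone.
    """
    return sorted(precursors, key=lambda p: p[1], reverse=True)[:n]


def isolation_range(precursor_mz, width_th=1.6):
    """Return the (low, high) m/z bounds of an isolation window centered on the precursor."""
    half = width_th / 2.0
    return (precursor_mz - half, precursor_mz + half)
```

Because selection depends on intensity, low-abundance precursors may never be fragmented in a crowded scan, which is the bias toward abundant peptides that motivates the DIA approach.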
Volcano plot of the proteins identified in lung cancer group and benign pulmonary nodule group. Volcano plot showed 21 proteins were differentially expressed in lung cancer group (
DIA analysis was conducted on the same LC-MS/MS system with DDA. 5
For DIA analysis, each cycle consisted of one full MS scan at a resolution of 120K between 350 and 1650 m/z (AGC target 3e6, maximum ion injection time 50 ms), followed by 30 DIA windows acquired at a resolution of 30K (AGC target 3e6). The MS2 activation type was HCD with a collision energy of 30. Spectral data were recorded in profile mode.
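The windowing idea behind DIA can be sketched as follows. The scan range (350 to 1650 m/z) and the number of windows (30) come from the text; equal-width, non-overlapping windows are an assumption made only for illustration, since the actual window boundaries are not stated here, and the function names are hypothetical.

```python
def dia_windows(mz_start=350.0, mz_end=1650.0, n_windows=30):
    """Split [mz_start, mz_end] into n_windows contiguous isolation windows.

    Assumption: equal widths and no overlap between windows.
    """
    width = (mz_end - mz_start) / n_windows
    return [(mz_start + i * width, mz_start + (i + 1) * width)
            for i in range(n_windows)]


def window_index(precursor_mz, windows):
    """Return the index of the window containing precursor_mz, or None if out of range."""
    last = len(windows) - 1
    for i, (lo, hi) in enumerate(windows):
        # Half-open intervals, except that the last window includes its upper edge.
        if lo <= precursor_mz < hi or (i == last and precursor_mz == hi):
            return i
    return None
```

In this scheme every precursor inside the scan range falls into some window and is fragmented in every cycle, regardless of its intensity.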
Spectral library generation
DDA data were analyzed with MaxQuant (version 1.5.3.17) against the Uniprot human database, and the sequence of iRT was added to the database. Tryptic cleavage was chosen and two missed cleavages were acceptable at most. Fixed modification was Carbamidomethyl, variable modifications were Oxidation(M) and Acetyl(Protein N-term). The results were filtered with an FDR
GO classification and enrichment results of differentially expressed proteins. (A) GO classification in biological process, molecular function and cellular component; (B) enriched GO terms in biological process, molecular function and cellular component.
Differentially expressed proteins in lung cancer group compared to benign pulmonary nodule group
The DIA data have been deposited to the ProteomeXchange Consortium via the MassIVE partner repository with the dataset identifier PXD020919 and MSV000085953.
To ensure the validity and accuracy of subsequent biological information and statistical analysis, according to general principles, proteins with quantitative values in more than 50% of the samples were used for subsequent statistical and bioinformatics analysis. Differentially expressed proteins were identified with a fold change
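The missing-value filter described above can be sketched as below. This is a minimal illustration, assuming the 50% rule is applied across all samples together rather than per group, and that missing quantitative values are represented as `None`. The fold-change helper returns the log2 ratio of group means; no cutoff is applied in the code, so any threshold has to be supplied by the reader. All function names are hypothetical.

```python
import math


def quantified_fraction(values):
    """Fraction of samples in which a protein has a quantitative value (None = missing)."""
    return sum(v is not None for v in values) / len(values)


def filter_proteins(quant, min_fraction=0.5):
    """Keep proteins quantified in MORE than min_fraction of the samples.

    quant: dict mapping protein name -> list of per-sample values.
    """
    return {p: v for p, v in quant.items() if quantified_fraction(v) > min_fraction}


def log2_fold_change(case_values, control_values):
    """log2 ratio of group means, ignoring missing values."""
    case = [v for v in case_values if v is not None]
    ctrl = [v for v in control_values if v is not None]
    if not case or not ctrl:
        raise ValueError("each group needs at least one quantified sample")
    return math.log2((sum(case) / len(case)) / (sum(ctrl) / len(ctrl)))
```

With 20 samples, "more than 50%" under this reading means a protein must be quantified in at least 11 samples to be retained.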
Gene Ontology (GO) analysis was performed on the targeted protein set using Blast2GO (
Statistical analysis
The mean
Results
Identification of differentially expressed proteins (DEPs)
In total, 1,254 proteins were identified in this study, and the full list of identified proteins is shown in Supplementary Table S1. With a cutoff
Functional analysis
To further understand the characteristics of these differentially expressed proteins, GO classification and enrichment analysis were conducted (Fig. 2). In molecular function, proteins were mostly enriched in cation and transition metal ion binding. In biological process, neutrophil degranulation, granulocyte activation, neutrophil activation involved in immune response, neutrophil activation, neutrophil mediated immunity, and leukocyte degranulation were the most enriched terms. In cellular component, most proteins were associated with the cytosol, secretory granule, and membrane.
Fifteen pathways were annotated based on the KEGG analysis, and the results suggested that the differentially expressed proteins play an essential role in carbohydrate metabolism and amino acid metabolism. The most enriched pathways were pyruvate metabolism (hsa00620) and propanoate metabolism (hsa00640), shown in Fig. 3.
Enriched KEGG pathways in differentially expressed proteins. Enriched KEGG results showed that differentially expressed proteins were mostly involved in propanoate metabolism and pyruvate metabolism.
Protein-protein interactions (PPIs) were analyzed for the differentially expressed proteins (Fig. 4). We found that ME1 and LDHB had the most complex interactions among these proteins. The PPIs of all the proteins are presented in Supplementary Fig. S1.
Protein-protein interactions of the differentially expressed proteins.
Discussion
Few studies have conducted proteomic analysis from EBC, and the application of this approach for differentiating lung cancer from benign nodule patients has not yet been explored. Codreanu et al. [18] conducted proteomic profiling on lung cancer and benign lung nodules on biopsied samples, and found that 10 proteins showed nodule-specific abundance. Although the benign group was not ascertained through LDCT, this study provides evidence that proteomic differences between benign nodule and lung adenocarcinoma have the potential to be biomarkers for use in LDCT. In our study, a thorough proteomic study was conducted to explore the EBC proteomic differences in lung cancer and CT-detected benign nodule patients, and the results showed that the differentially expressed proteins held great potential for the early detection of lung cancer.
Other studies have reported that EBC protein profiling has the biomedical potential to discriminate different pulmonary diseases, including asthma, COPD, and lung cancer [19, 20, 21, 22, 23]. Lopez et al. [24] conducted proteomic research on 192 EBC samples from healthy controls, lung cancer patients, and COPD patients, and identified a total of 348 proteins. In the lung cancer group, the total amount of protein was significantly higher than in the control group, and the ROC curve suggested the differences had diagnostic value. Like ours, this study also showed that the proteomics of EBC could help to develop biomarkers for the early detection of lung cancer. Sun et al. [25] carried out a proteomic analysis using tandem mass tags (TMTs) on COPD patients, and identified a total of 257 proteins. Their study showed that the proteins in EBC could reflect airway-related disease and might be potential biomarkers. The use of TMTs helped to identify more proteins compared to previous studies, but also had a limitation: only six samples could be analyzed at one time, which may not be suitable for a study with a large sample size. Although previous studies attempted to investigate the biomedical potential of EBC using proteomics, the lack of a high-throughput method for protein identification limited the implications of these findings.
Our proteomic analysis was done by a DIA approach, which divides the full scan range of the mass spectrum into several windows so that all ions in each window can be detected and fragmented [26]. Unlike DIA, DDA is biased toward high-abundance peptides [27]. One potential application of DIA is the proteomic profiling of individual EBC samples, which are collected non-invasively and could potentially yield biomarkers for the diagnosis of lung cancer. The DIA method we used in this study was previously described [28]. In our previous study, we developed the DIA approach for the proteomics of individual EBC samples and conducted a pilot proteomic study in lung cancer patients and healthy controls. We showed that DIA was suitable for EBC proteomics even if only a small amount of EBC sample was used for analysis. The proteins upregulated in lung cancer were related to human disease and mostly involved in the Ras signaling pathway, suggesting the biomedical potential of EBC. Our previous EBC proteomics study showed great potential for studying lung cancer, which provided the basis for further exploration of the proteomic differences between lung cancer and CT-detected benign lung nodules.
In this study, a total of 1,254 proteins were identified, which substantially expanded the EBC proteome compared to previous studies. Through GO and KEGG analysis, we found that these DEPs were mostly involved in neutrophil-related activities, and that the most enriched pathways were pyruvate metabolism (hsa00620) and propanoate metabolism (hsa00640), which helps to better understand the potential biological mechanisms of lung cancer and benign nodules.
The tumor microenvironment is vital for tumorigenesis, and a chronic inflammatory response is one of its main characteristics [29]. Neutrophils are an important part of the tumor microenvironment and play a critical role in the local inflammatory response [30]. Tumor-associated neutrophils (TANs) are neutrophils that have infiltrated the tumor; they originate from blood and enter the tumor tissue after being attracted by chemokines [31]. The function of TANs in the tumor microenvironment is complicated, and they are reported to play a dual role in tumor progression [32]. In the early stage of lung cancer, TANs could stimulate T cell responses and inhibit the development of cancer [33]. Houghton et al. [34] reported that neutrophil elastase could promote the proliferation of tumor cells in lung adenocarcinoma by degrading insulin receptor substrate-1. Wculek and Malanchi [35] reported that neutrophils are the drivers of lung colonization by metastasis-initiating breast cancer cells, suggesting neutrophils could help tumors spread. The functional analysis of our study showed that 5 DEPs were annotated to neutrophil-related activities: APRT, SERPINB12, HRNR, ARG1, and DSG1. Majer et al. [36] observed the elevation of APRT in cell bronchogenic carcinoma. ARG1 was reported to inhibit the CD8+ response to promote tumor development [37], and Colegio et al. [38] reported that the expression of ARG1 was significantly related to tumor growth. Saaber et al. [39] found downregulated expression of DSG1 in lung cancer cell lines, suggesting its role as a potential diagnostic marker. Most of these proteins (except HRNR) have not been reported in previous EBC studies, and might be novel biomarkers of lung cancer.
Pyruvate has been reported to be a critical molecule in human metabolism [40]. It is the end-product of glycolysis and plays an essential role in mitochondrial ATP generation and central carbon metabolism [41]. Some studies have reported that altered pyruvate metabolism may lead to cancer [40], including lung cancer [42]. The Warburg effect is a metabolic switch that is used to define cancer, and indicates that glycolytic carbon flux is upregulated in cancer cells [43, 44, 45]. In our study, 7 proteins were annotated to the pyruvate metabolism pathway: MDH2, LDHA, LDHB, PKM, MDH1, ME1, and GLO1. LDHB and ME1 were also in our list of differentially expressed proteins, so these two proteins might play an essential role in lung cancer.
LDHB is one of the isoforms of lactate dehydrogenase (LDH), which is an important metabolic enzyme in the tumor microenvironment [46]. It plays a vital role in the conversion of pyruvate to lactate and has been proven to be an essential player in cancer metabolism [47]. Some studies have reported the dysregulation of LDHB in several tumor types, such as breast cancer [48, 49], pancreatic cancer [50, 51, 52], and lung cancer [53, 54], suggesting that LDHB might have a role in tumorigenesis, progression, and tumor cell survival [52]. McCleland et al. reported that LDHB can regulate cell proliferation in lung adenocarcinoma, and that its upregulated expression may be related to poor clinical outcome, suggesting that LDHB could be a novel therapeutic target [54]. To date, however, no previous study has reported LDHB in EBC, particularly in lung cancer and CT-detected benign nodules. Our study might be the first to report significant differential expression of LDHB in EBC samples.
ME1 is a NADP
Although the NLST project shows that the use of LDCT could reduce the mortality of lung cancer by 20%, it also has a high false-positive rate (23.3%) [5]. Considering the radiation risk and the cost of LDCT, alternative or complementary methods for early detection are needed [61]. EBC sampling might be an ideal method because it measures physiologic changes in the airway and alveolar spaces, and is a non-invasive method that causes no harm to patients [62]. Our study provided evidence for discriminating lung cancer and benign nodule patients through the EBC protein profile.
The limitations of this study are the lack of an independent validation set, and that biomarkers used in conjunction with LDCT screening need to consider the variability in lung cancer diagnosis. For example, our group of subjects included only lung adenocarcinomas without COPD.
In summary, this study compared the proteomic differences between CT-detected benign pulmonary nodule and lung cancer, and the use of the DIA approach helped to extend our knowledge of differentially expressed proteins in EBC. Our protein data could provide evidence for the diagnostic potential of EBC in lung cancer and benign pulmonary nodules.
Conclusion
This was the first study comparing the proteomic differences between CT-detected benign pulmonary nodule and lung cancer patients using a DIA approach. Twenty-one proteins were differentially expressed, and all of them were upregulated in the lung adenocarcinoma group. GO and KEGG analyses suggested these proteins were annotated to neutrophil-related biological processes and mostly involved in pyruvate metabolism. Our study showed that EBC has the potential for early detection of lung cancer, and might help to reduce the false-positive rate in low-dose CT screening.
Author contributions
Conception: Guangli Xiu, Guanghong Xiu.
Interpretation and analysis of data: Lin Ma, Raghu Sinha.
Preparation of the manuscript: Lin Ma.
Revision for important intellectual content: Joshua Muscat, Dongxiao Sun.
Supervision: Guangli Xiu.
Supplementary data
The supplementary files are available to download from http://dx.doi.org/10.3233/CBM-203269.
Acknowledgments
This research was supported by the Key R&D Plan Project of Yantai, Shandong Province, China (grant number 2017YT06000331), and the National Natural Science Foundation of China (grant number 21707035).
