Abstract
BACKGROUND:
Analysis of molecular changes in sputum may help diagnose lung cancer. Long non-coding RNAs (lncRNAs) play vital roles in various biological processes, and their dysregulation contributes to the development and progression of lung cancer. Herein, we determine whether aberrant lncRNAs could be used as potential sputum biomarkers for lung cancer.
METHODS:
Using reverse transcription PCR, we measured expression levels of lung cancer-associated lncRNAs in sputum of a discovery cohort of 67 lung cancer patients and 65 cancer-free smokers with benign diseases and a validation cohort of 59 lung cancer patients and 60 cancer-free smokers with benign diseases.
RESULTS:
In the discovery cohort, four of the lncRNAs displayed a significantly different level in sputum of lung cancer patients vs. cancer-free smokers with benign diseases (all
CONCLUSION:
We have shown for the first time that the analysis of lncRNAs in sputum might be a noninvasive approach for the diagnosis of lung cancer.
Introduction
Approximately 155,870 Americans will die from lung cancer each year, more than from the other three leading cancers combined (breast, prostate, and colorectal cancers). Over 85% of lung cancers are non-small cell lung cancers (NSCLC). NSCLC mainly consists of adenocarcinoma (AC) and squamous cell carcinoma (SCC). Tobacco smoking is the major cause of NSCLC. In a large randomized trial, early detection of lung cancer using low-dose CT (LDCT) yielded a 20% reduction in mortality compared with chest X-rays [1]. Therefore, early detection of lung cancer can increase curability and save lives. LDCT is recommended for the early detection of lung cancer among smokers in the USA [2, 3]. However, LDCT is associated with over-diagnosis, excessive cost, and radiation exposure [2, 3]. The development of non-invasive biomarkers that can accurately and cost-effectively diagnose early stage lung cancer remains clinically imperative [2, 3].
It is well accepted that lung tumors develop from a field defect characterized by molecular abnormalities resulting from repeated exposure of the entire airway to tobacco carcinogens [4, 5]. Kadara et al. [6, 7] found that the molecular alterations in the large bronchial airways reflected those in primary lung tumors in the distal lung, regardless of the anatomic location relative to the tumors. Spira et al. [8, 9] demonstrated that molecular changes in epithelial cells collected from the normal-appearing mainstem bronchus of smokers could be developed as biomarkers for NSCLC. Sputum contains bronchial epithelial cells from the lungs and lower respiratory tract. Based on field cancerization, examination of the exfoliated bronchial epithelial cells in sputum might detect lung tumor-related molecular alterations, and hence provide a non-invasive and specific means for the diagnosis of NSCLC [6, 7, 8, 10, 11, 12].
Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides that have important functions in various biological processes [13, 14]. Dysfunctions of lncRNAs play vital roles in the development of lung cancer [15]. Specifically, lncRNAs can regulate different molecular signaling pathways by changing gene expression, and are therefore implicated in numerous mechanisms of lung carcinogenesis [16, 17, 18, 19, 20]. Since aberrant lncRNAs detected in clinical specimens are associated with lung tumorigenesis, exploration of these molecules for clinical diagnosis is a growing area of research [15, 21, 22]. For example, Liu et al. found that an lncRNA, HOTAIR, was significantly up-regulated in NSCLC tissues and might be a prognostic biomarker for the disease [23]. Furthermore, we have found that expression levels of SNHG1 and RMRP can be reliably measured in plasma, and that these lncRNAs may provide cell-free circulating biomarkers for lung cancer [24]. In addition, we recently showed that integrated analysis of lncRNAs, miRNAs, and mRNAs in plasma could have a synergistic impact on lung cancer early detection [25]. However, aberrant lncRNA levels have not been examined in sputum, which is a surrogate material for noninvasive and specific diagnosis of lung cancer.
We recently identified six lncRNAs whose aberrant expression levels in surgically resected tumor tissues were associated with lung cancer [26, 27]. The objective of this study was to investigate whether these lung cancer-associated lncRNAs could be used as sputum biomarkers for distinguishing lung cancer patients from smokers with benign diseases.
Materials and methods
Study population
The study protocol was approved by the Institutional Review Board of the University of Maryland Medical Center. Written informed consent was obtained from all participants. Participants aged 55–80 were recruited at the point of their referral for suspected lung cancer. Exclusion criteria included pregnancy, current pulmonary infection, surgery within 6 months, radiotherapy within 1 year, and life expectancy of
Characteristics of NSCLC patients and cancer-free smokers in a training cohort
Abbreviations: NSCLC, non-small cell lung cancer.
Sputum was collected and processed from the subjects as previously described [10, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43], before they received any treatment (e.g., surgery, preoperative adjuvant chemotherapy, and radiotherapy). The sputum samples were centrifuged at 800 g for 10 min. The cell pellets were mixed with phosphate-buffered saline solution (PBS) (Sigma-Aldrich, St. Louis, MO). Cytospin slides were prepared from the cell pellets and underwent Papanicolaou staining to evaluate whether the specimens were representative of deep bronchial cells. All sputum samples were of lower respiratory origin, as indicated by the presence of macrophages and bronchial epithelial cells. Cytologic diagnosis was made using the classification of Saccomanno [44]. Positive cytology included both carcinoma in situ and invasive carcinoma [28, 29]. The remaining sputum cells were stored at
The analysis of lncRNAs in sputum by reverse transcription-PCR (RT-PCR)
RNA was extracted from cell pellets of sputum as previously described [31, 32, 33, 34, 36, 37]. The purity and concentration of RNA were determined by OD260/280 readings using a dual beam UV spectrophotometer (Eppendorf AG, Hamburg, Germany). RNA integrity was determined by capillary electrophoresis using the RNA 6000 Nano Lab-on-a-Chip kit and the Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). The expression levels of the six lncRNAs (SNHG1, MALAT1, HOTAIR, H19, MEG3, and RMRP) were determined in sputum by using RT-PCR with Taqman assays (Applied Biosystems, Foster City, CA, USA) as previously described [24, 25]. The cycle threshold (Ct) was defined as the number of cycles required for the fluorescent signal to cross the threshold. An internal control gene, U6, was analyzed in parallel in the specimens. Relative expression of a targeted lncRNA in a given sample was computed using the equation 2-
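The normalization step described above can be illustrated with a minimal sketch. It assumes the comparative Ct form 2^-ΔCt, where ΔCt is the Ct of the target lncRNA minus the Ct of the U6 internal control; the exact equation is cut off in the text, so this form, the function name, and the Ct values are illustrative assumptions rather than the authors' stated method or data.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2^-(Ct_target - Ct_reference).

    ASSUMPTION: the 2^-dCt form with U6 as the reference gene. The source
    text truncates the equation, so this is a plausible reading only.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for illustration only (not data from this study).
ct_lncrna = 28.0   # Ct of a target lncRNA in one sputum sample
ct_u6 = 25.0       # Ct of the U6 internal control in the same sample

print(relative_expression(ct_lncrna, ct_u6))  # 0.125
```

A lower Ct for the target relative to U6 gives a larger relative expression value, so an lncRNA that crosses the threshold three cycles after U6 is reported at one eighth of the reference level under this assumption.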
Statistical analysis
We used univariate analysis to identify the lncRNAs whose expression levels were associated with NSCLC. We analyzed the significantly associated lncRNAs by using logistic regression models with parameters constrained by the least absolute shrinkage and selection operator (LASSO) to eliminate the less relevant variables [45, 46, 47]. We estimated functions of the combined lncRNA biomarkers by logistic regression with or without adjustment for known risk factors for NSCLC. The performance of the lncRNAs was evaluated by receiver-operator characteristic (ROC) curve analysis. We also generated a 95% confidence interval for the difference in the areas under the ROC curves (AUCs) by using the bootstrap [48]. We established the optimal cut-off value by using the Youden index [49, 50]. The results of the training cohort were blindly validated in the testing cohort by using leave-one-out cross validation [51]. To compare different approaches for lung cancer diagnosis, we computed their AUCs to determine the sensitivity and specificity as previously described [52].
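The ROC-based steps above (AUC, Youden cut-off, bootstrap confidence interval) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and scores are hypothetical, the AUC is computed with the rank-based (Mann-Whitney) estimator, and the bootstrap shown is a simple percentile interval for a single AUC rather than for the difference between two AUCs.

```python
import random


def auc(cases, controls):
    """AUC as the fraction of case/control pairs in which the case scores
    higher (ties count one half); the Mann-Whitney estimator."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def youden_cutoff(cases, controls):
    """Return (cutoff, sensitivity, specificity) maximising
    J = sensitivity + specificity - 1, calling score >= cutoff positive."""
    best = None
    for c in sorted(set(cases) | set(controls)):
        sens = sum(x >= c for x in cases) / len(cases)
        spec = sum(y < c for y in controls) / len(controls)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


def bootstrap_auc_ci(cases, controls, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC; each group is resampled with
    replacement. ASSUMPTION: a single-marker CI, shown for simplicity."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        boot_cases = [rng.choice(cases) for _ in cases]
        boot_controls = [rng.choice(controls) for _ in controls]
        stats.append(auc(boot_cases, boot_controls))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted probabilities (illustration only, not study data).
case_scores = [0.9, 0.8, 0.7, 0.6, 0.55]
control_scores = [0.5, 0.3, 0.2, 0.58]

print(auc(case_scores, control_scores))            # 0.95
print(youden_cutoff(case_scores, control_scores))  # (0.6, 0.8, 1.0)
print(bootstrap_auc_ci(case_scores, control_scores))
```

To obtain an interval for the difference between two AUCs, as the text describes, one would resample the same subjects for both markers in each bootstrap replicate and record the difference of the two AUCs instead of a single value.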
Expression levels of sputum lncRNAs in the training cohort of lung cancer patients and cancer-free individuals
Abbreviations: SEM, the standard error of the mean; AUC, the area under receiver operating characteristic curve; CI, confidence interval.
Diagnostic values of the sputum lncRNA biomarkers in the training and testing cohorts
Abbreviations: NSCLC, non-small cell lung cancers; AC, adenocarcinoma; SCC, squamous cell carcinoma; CI, confidence interval.
Developing a panel of sputum lncRNA biomarkers for the early detection of lung cancer
All targeted six lncRNAs had
Sensitivity and specificity of sputum lncRNA biomarkers and cytology for diagnosis of lung cancer. The sputum lncRNA biomarker panel has a higher sensitivity and a similar specificity compared with sputum cytology (
To evaluate the diagnostic performance of the sputum biomarkers, the three lncRNAs (SNHG1, H19, and HOTAIR) were assessed in sputum of an additional 59 NSCLC patients and 60 cancer-free individuals. Used in combination, the three lncRNAs could differentiate the NSCLC patients from cancer-free individuals with 81.36% sensitivity and 88.33% specificity (Table 4). Furthermore, the panel of lncRNA biomarkers had a higher sensitivity for SCC (88.46%) than for AC (75.76%) of the lungs (
Discussion
As a mirror to lung diseases, sputum is noninvasively obtained and contains bronchial epithelial cells from the lungs and lower respiratory tract, and thus has advantages as a surrogate material for specifically diagnosing lung cancer. Herein, we report for the first time the feasibility of detecting lncRNAs in sputum and develop a panel of sputum lncRNA biomarkers for lung cancer. Furthermore, the biomarker panel has a higher sensitivity and a similar specificity compared with sputum cytology, a standard clinical test. Like other sputum molecular biomarkers, the sputum lncRNA biomarkers display a higher diagnostic performance for SCC than for AC of the lungs. Interestingly, the sensitivity and specificity of the biomarkers are independent of the stage of lung cancer. This stage independence might be an important characteristic if the biomarkers are employed for the diagnosis of NSCLC, particularly SCC, at an early stage. Nevertheless, the potential of the sputum lncRNA biomarkers for the early detection of lung cancer needs to be prospectively validated in a large cohort study.
The three lncRNAs (SNHG1, H19, and HOTAIR) have diverse and important functions in lung tumorigenesis through regulating different molecular pathways. For instance, elevated expression of SNHG1 was frequently observed in lung cancer tissues [53]. Furthermore, SNHG1 could promote NSCLC progression via miR-101-3p/SOX9/Wnt/
There are some limitations in the present study. First, the size of the cohorts is small. We will perform a large prospective study to validate the diagnostic value of the sputum lncRNA biomarkers for the early detection of lung cancer. Second, the panel of three biomarkers was developed from only six lung cancer-associated lncRNAs. Although showing promise, the three biomarkers’ diagnostic performance is not high enough to be used in clinical settings. By searching published data, we recently found 21 lncRNAs whose malfunction was characterized in lung tumorigenesis [19, 60, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86]. Our ongoing study is evaluating whether these additional lncRNAs can be added to the biomarker panel to improve the diagnosis of lung cancer. Third, the objective of the present study is to develop sputum biomarkers for differentiating lung cancer patients from cancer-free smokers who have benign diseases. Toward that end, we collected sputum from lung cancer patients and from cancer-free smokers with benign diseases, who served as control individuals. Sputum of healthy controls was not available in this study. Therefore, we are not able to detect differences in sputum lncRNAs between smokers with benign diseases and healthy controls. In the future, it might be important to collect sputum from nonsmokers who do not have any benign disease to determine whether lncRNA changes in sputum differ between smokers with benign diseases and healthy controls.
In sum, we have developed sputum lncRNA biomarkers that could potentially be used as a noninvasive approach for differentiating early stage NSCLC patients from cancer-free smokers. Nevertheless, a prospective study to further validate the sputum biomarkers in a large cohort is required.
Footnotes
Acknowledgments
This work was supported in part by VA Merit Award I01 CX000512, the Geaton and JoAnn DeCesaris Family Foundation, NIH/NCI-1R21CA240556, University of Maryland Marlene & Stewart Greenebaum Comprehensive Cancer Center Pilot Grant Program, and DoD-Idea Development Award (F.J.), and NIH/NCI-Early Detection Research Network-5U24CA11509 (S.S.).
Conflict of interest
The authors declare no conflict of interest.
Supplementary
The receiver-operator characteristic (ROC) curve of sputum lncRNA biomarkers for lung cancer diagnosis. Three lncRNAs (SNHG1, H19, and HOTAIR) were selected as the best ones and incorporated into an algorithm: Probability
