Abstract
Background:
Lung cancer is the most common cause of cancer-related deaths in men and women worldwide. Novel diagnostic biomarkers are urgently required to enable the early detection and treatment of lung cancer, and using novel methods to explore tumor-related biomarkers is a hot topic in lung cancer research. The purpose of this study was to use ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS) metabolomics analysis technology combined with multivariate data processing methods to identify potential plasma biomarkers for non-small cell lung cancer (NSCLC).
Methods:
Plasma samples from 99 NSCLC patients and 112 healthy controls were randomly divided into a screening group and a validation group. UPLC-MS metabolomics analysis combined with multivariate data processing methods was used to identify potential plasma biomarkers for NSCLC.
Results:
A total of 254 metabolites were detected and validated in plasma. Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) modeling indicated that 28 endogenous metabolites were present at significantly different levels in patients with NSCLC than in healthy controls (variable importance in projection (VIP) > 1 and P < 0.001 (independent samples t-test) in both the screening group and the validation group). Further analysis revealed that cortisol, cortisone, and 4-methoxyphenylacetic acid had high sensitivity and specificity values as biomarkers for discriminating between NSCLC and healthy controls. Significant associations between specific plasma metabolites and the pathological type or stage of NSCLC were also observed.
Conclusions:
Metabolomics has the potential to distinguish between NSCLC patients and healthy controls, and may reveal new plasma biomarkers for the early detection of NSCLC.
Introduction
Lung cancer is the main cause of cancer-related death worldwide, and non-small cell lung cancer (NSCLC) accounts for approximately 85% of cases. 1 Despite the different therapeutic strategies developed to date, the 5-year survival rate of patients with NSCLC is only 10% to 15%. Because no specific symptoms occur at the early stage, most cases of lung cancer are diagnosed at the advanced stage. 2 Therefore, early diagnosis and treatment are very important, and seeking non-invasive biomarkers with high sensitivity and specificity has major clinical significance in lung cancer. Currently, the most commonly used blood tumor markers for early detection of NSCLC are protein biomarkers. Carcinoembryonic antigen, a human embryo antigen determinant of acidic glycoprotein, was first used for the diagnosis of colon cancer and is now widely used as a marker for lung adenocarcinoma. Serum carcinoembryonic antigen levels are elevated in 30%–70% of patients with lung cancer, and an elevated serum carcinoembryonic antigen level before surgery is an independent prognostic factor in lung cancer. 3 Cytokeratin is a component of intermediate filaments in epithelial cells and is a reliable marker of epithelial differentiation. Cytokeratin 19 fragment antigen (CYFRA21-1) is the most commonly used biomarker for the diagnosis of lung cancer, and is released into bodily fluids by tumor cells when they undergo necrosis. CYFRA21-1 has the highest sensitivity and is considered the best plasma biomarker for lung squamous cell carcinoma, and its serum levels and sensitivity increase with progression of the disease. However, the use of these markers is recognized to be sometimes inappropriate, with a high rate of over-utilization and potential negative repercussions both for individual subjects and for the health care system.
Squamous cell carcinoma antigen (SCCA) is a novel tumor marker of diagnostic value in patients with lung cancer. The diagnostic sensitivity of SCCA for lung cancer varies from 15% to 55%, with the highest sensitivity observed for lung squamous cell carcinoma. However, even when SCCA is combined with CYFRA21-1, the diagnostic accuracy does not increase markedly; therefore, this marker has limited clinical potential.4,5 New plasma biomarkers are therefore needed for the early detection of NSCLC.
Metabolomics (or metabonomics) examines the changes in metabolites or metabolic changes over time after genetic stimulation or pathological disturbances. 6 Metabolomics involves the analysis of endogenous metabolites with a molecular weight below 1000 Da, including sugars, amino acids, lipids, organic acids, and carnitine,7-9 and employs high throughput analysis and statistical data processing.
A range of internal and external interactions lead to tumor initiation and progression; environmental factors are associated with the development of more than 80% of malignant tumors. The genomic and proteomic changes that lead to cancer eventually cause changes in the metabolism of small molecules. 10 During the development of cancer, the tumor cells are subjected to various external environmental stimuli, and small changes in the internal environment can lead to the production of unusual metabolites, eventually causing metabolic changes in the tumor microenvironment and bodily fluids such as serum. 11 Analysis of the metabolic changes that occur during the development of lung cancer using metabolomics may help to identify diagnostic biomarkers for lung cancer. It may also help us to further understand the pathogenesis of lung cancer and provide specific targets for drug development.
At present, commonly used techniques in metabolomics include nuclear magnetic resonance (NMR) spectroscopy, gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). NMR can analyze metabolic components without damaging the sample and requires only simple sample pretreatment, but it has relatively low sensitivity. GC-MS has good sensitivity and reproducibility; however, the derivatization methods required for sample pretreatment are complicated. LC-MS possesses the advantages of high sensitivity, simple sample treatment, clinical applicability, and strong practicability. In this research, we used ultra-high-performance liquid chromatography tandem mass spectrometry (UPLC-MS) to conduct a metabolomic study of plasma samples from 99 patients with lung cancer and 112 healthy controls. The aim of this study was to comprehensively analyze the metabolic changes that occur in the plasma of patients with NSCLC, in order to identify small molecular biomarkers specific to NSCLC and to provide further data on the metabolic changes that occur during the development of NSCLC.
Materials and methods
Patients and samples
This study was conducted at the First Affiliated Hospital of Nanjing University, and all procedures were approved by the Ethics Committee of that institution. Patients with newly diagnosed primary NSCLC (n = 99) who had not undergone previous surgery, chemotherapy, radiotherapy, or any other treatment were included in this study. The healthy control group (n = 112) was recruited from individuals undergoing general physical examinations (routine blood, blood biochemistry, blood tumor marker, urinalysis, chest X-ray, and ultrasound tests) who were not diagnosed with cancer or any other serious diseases. None of the patients with NSCLC or healthy controls had any history of chronic long-term diseases requiring medication (Table 1).
Table 1. Characteristics of lung cancer patients and controls.
AD: adenocarcinoma; BMI: body mass index; n: number; SC: squamous cell carcinoma; TNM: tumor, node, metastasis.
All subjects fasted for 12 h, and 2 mL of peripheral blood was collected between 7 a.m. and 9 a.m. into tubes containing EDTA anticoagulant. The tubes were transported to the laboratory within 30 min, centrifuged at 3000 rpm for 10 min at 4°C, and the plasma was removed into 1.5 mL Eppendorf tubes and stored at −80°C until analysis.
Reagents and instruments
Chromatography grade methanol, acetonitrile and formic acid were provided by Merck (Germany). Water was purified using an ultra-pure water machine (Thermo Fisher Scientific Company, Carlsbad, CA, USA).
Sample pretreatment
Frozen plasma samples were thawed and vortexed at 4°C. To precipitate proteins, 1200 µL of pure methanol was added to 400 µL of plasma, and the mixture was centrifuged at 16,000 rpm for 15 min. The supernatant was transferred to a clean tube and dried under nitrogen. The residue was then reconstituted in 20 μL of pure methanol, vortexed for 30 s, and centrifuged at 12,000 rpm for 10 min, and the resulting supernatant was subjected to metabolomic analysis.
UPLC-MS metabolomics analysis
The samples were analyzed in a random manner to avoid bias related to the injection sequence. UPLC-MS metabolomic analysis was performed on a UPLC series Thermo Fisher LTQ-FT linear ion trap cyclotron resonance mass spectrometer (provided by Waters Acquity Company, Milford Massachusetts, USA) according to a previous report 12 with minor modifications: the UPLC autosampler temperature was set at 4°C and the injection volume for each sample was 5 μL.
Chromatography was carried out on a UPLC Ultimate 3000 system (Dionex) using a Hypersil GOLD C18 column (Thermo Scientific; 100 mm × 2.1 mm, 1.9 μm) at a column temperature of 40°C. A multistep gradient of mobile phase A containing 0.1% formic acid in ultra-pure water and mobile phase B consisting of acetonitrile acidified with 0.1% formic acid was used; the flow rate was 400 μL/min over a run time of 15 min.
The conditions for heated electrospray ionization were: source capillary temperature, 250°C; heater temperature, 425°C; sheath gas flow, 50 arbitrary units (AU); auxiliary gas flow, 13 AU; spray voltage, 3.5 kV for positive mode and 2.5 kV for negative mode; spare gas flow, 0 AU; tube lens voltage, 60 V. The molecular weight scanning range was 70 m/z to 1050 m/z, at a resolution of 70,000 (m/z = 200), with an automatic gain control target of 1 × 10^6 charges and a maximum injection time of 30 ms. Metabolites were identified by reference to the mass and retention time of standard compounds.
Statistical analysis
Single factor analysis of variance (ANOVA) was used to assess the significance of the differences in age and body mass index (BMI) between groups and the Pearson’s chi square test was used to assess the significance of the differences in gender between groups; independent sample t-tests were used for other statistical comparisons between the two groups.
The mean fold-difference was calculated as the ratio of mean levels of metabolites in the NSCLC group relative to the healthy control group. A fold-change > 1 indicated the metabolite was higher in the plasma of the NSCLC group and a fold-change < 1 indicated the metabolite was lower in the plasma of the NSCLC group. An orthogonal projections to latent structures discriminant analysis (OPLS-DA) model based on the metabolomics data was fit using SIMCA-P 13.0 (Umetrics, Umea, Sweden). A P-value < 0.001 and variable importance in projection (VIP) > 1 were considered statistically significant.
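The fold-change and the selection rule described above can be illustrated with a minimal Python sketch. The function names and the example intensities are hypothetical and are not taken from the study; the P-value and VIP are treated as inputs that would come from the t-test and the OPLS-DA model, respectively.

```python
from statistics import mean


def fold_change(nsclc_levels, control_levels):
    """Ratio of the mean level in the NSCLC group to the mean level in controls."""
    return mean(nsclc_levels) / mean(control_levels)


def is_significant(p_value, vip, p_cutoff=0.001, vip_cutoff=1.0):
    """Selection rule stated in the text: P < 0.001 and VIP > 1."""
    return p_value < p_cutoff and vip > vip_cutoff


# Hypothetical relative intensities (unitless) for a single metabolite.
nsclc = [8.0, 11.0, 9.0, 12.0]
control = [21.0, 19.0, 20.0, 20.0]

fc = fold_change(nsclc, control)
direction = "higher" if fc > 1 else "lower"
print(f"fold-change = {fc:.3f} ({direction} in NSCLC plasma)")
```

A fold-change below 1 can also be read through its reciprocal: a value of 0.5, as in this made-up example, means the control mean is twice the NSCLC mean.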
Results
Patient characteristics
According to the 2004 World Health Organization classification of tumors, the study population included 52 cases of adenocarcinoma and 47 cases of squamous cell carcinoma. On the basis of the 7th edition of the American Joint Committee on Cancer-tumor, node, metastasis (AJCC-TNM) staging system, the study population included 11 cases of stage I NSCLC, 22 cases of stage II, 31 cases of stage III, and 35 cases of stage IV.
All subjects were randomly assigned to the screening group (53 patients with NSCLC, 56 healthy controls) or the validation group (46 patients with NSCLC, 56 healthy controls). The age (P = 0.086), gender (P = 0.916), BMI (P = 0.410) and the smoking status (P = 0.174) of the patients with lung adenocarcinoma, patients with lung squamous cell carcinoma and the healthy controls were not significantly different.
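The randomization procedure itself is not described; one simple way to obtain a reproducible split of the reported sizes is sketched below. The subject identifiers, the seed, and the function name are assumptions made for illustration only.

```python
import random


def split_group(subject_ids, n_screening, seed=0):
    """Shuffle subject IDs and return (screening, validation) lists."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_screening], shuffled[n_screening:]


# Hypothetical identifiers: 99 NSCLC patients and 112 healthy controls.
patients = [f"P{i:03d}" for i in range(1, 100)]
controls = [f"C{i:03d}" for i in range(1, 113)]

pat_screen, pat_valid = split_group(patients, 53)
ctl_screen, ctl_valid = split_group(controls, 56)
print(len(pat_screen), len(pat_valid), len(ctl_screen), len(ctl_valid))
# prints: 53 46 56 56
```

Patients and controls are shuffled separately so that each group keeps the stated case-to-control composition.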
OPLS-DA model
During metabolomic analysis of the plasma samples, 254 metabolites were identified in all samples. The two-dimensional distribution of the patients with NSCLC and the healthy controls when stratified by the OPLS-DA analysis is shown in Figure 1. The OPLS-DA modeling parameters for differentiating between the patients with lung adenocarcinoma and the healthy controls in the screening group were R2X = 0.48, R2Y = 0.869, and Q2 = 0.617 (Figure 1(a)), and R2X = 0.433, R2Y = 0.847, and Q2 = 0.711 for differentiating between the patients with lung squamous cell carcinoma and healthy controls in the screening group (Figure 1(b)). The OPLS-DA modeling parameters for differentiating between the patients with lung adenocarcinoma and the healthy controls in the validation group were R2X = 0.566, R2Y = 0.954, and Q2 = 0.737 (Figure 1(c)), and R2X = 0.558, R2Y = 0.915, and Q2 = 0.759 (Figure 1(d)) for differentiating between the patients with lung squamous cell carcinoma and healthy controls in the validation group. These results indicate that the OPLS-DA model was reliable and had a high predictive value for the differentiation of patients with NSCLC and the healthy controls.
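For readers unfamiliar with these parameters, the conventional definitions can be written as follows. This is a sketch of the usual textbook form, not a statement of how SIMCA-P computes them internally; the symbols are introduced here for illustration, with y_i the class label of sample i, \hat{y}_i the value fitted by the full model, and \hat{y}_{(i)} the value predicted when sample i is held out during cross-validation.

```latex
R^2Y = 1 - \frac{\sum_{i}\left(y_i - \hat{y}_i\right)^2}{\sum_{i}\left(y_i - \bar{y}\right)^2},
\qquad
Q^2 = 1 - \frac{\sum_{i}\left(y_i - \hat{y}_{(i)}\right)^2}{\sum_{i}\left(y_i - \bar{y}\right)^2}
```

R2X is defined analogously as the fraction of variation in the metabolite matrix X that the model explains. R2Y therefore describes goodness of fit, whereas Q2 describes predictive ability under cross-validation.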

Figure 1. OPLS-DA modeling. Lung cancer group in the A, D quadrant (green dot); the healthy control group in B, C quadrant (blue dot).
Identification and validation of selected differentially expressed metabolites in patients with NSCLC and healthy controls
Metabolites with a VIP > 1 and P < 0.001 (independent samples t-test) in the OPLS-DA model for the screening group were selected for further analysis in the validation group; 28 metabolites met these criteria in the validation group and may have potential as biomarkers for differentiating between patients with NSCLC and healthy individuals; further details of these 28 metabolites are shown in Table S1.
To further investigate the potential of the selected metabolites for screening patients with NSCLC, metabolites with a fold-change < 0.4 or > 2.5 between patients with NSCLC and healthy controls were selected for further analysis. Three small plasma metabolites met these criteria: cortisol, cortisone and 4-methoxyphenylacetic acid (Table 2). The receiver operating characteristic curves, the area under the curve values, and the sensitivity and specificity of each of these metabolites, alone and in combination, for differentiating patients with NSCLC from the healthy controls are shown in Figure S1 and Table 3.
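The fold-change filter and the diagnostic measures referred to above can be illustrated with a short Python sketch. The scores and the cutoff are made-up values, and the function names are hypothetical; the sketch assumes that higher values indicate NSCLC, so for a metabolite that is lower in patients the values would be negated first.

```python
def passes_fold_change_filter(fc, low=0.4, high=2.5):
    """Filter stated in the text: fold-change < 0.4 or > 2.5."""
    return fc < low or fc > high


def auc(case_scores, control_scores):
    """Area under the ROC curve, computed as the probability that a random
    case scores above a random control (ties count one half)."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Call a sample NSCLC when its score is at or above the cutoff."""
    true_pos = sum(1 for s in case_scores if s >= cutoff)
    true_neg = sum(1 for s in control_scores if s < cutoff)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical relative levels of one marker in cases and controls.
cases = [3.1, 2.8, 3.5, 1.9]
healthy = [1.2, 2.0, 1.0, 1.5]

print(auc(cases, healthy))                           # 0.9375
print(sensitivity_specificity(cases, healthy, 2.5))  # (0.75, 1.0)
```

The pairwise form of the AUC is quadratic in sample size, which is acceptable for a cohort of this size and keeps the definition transparent.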
Table 2. Potential early detection markers of NSCLC.
NSCLC: non-small cell lung cancer; VIP: variable importance in projection.
Note: A fold-change (lung cancer/control) of 0.121 indicates that the level of 4-methoxyphenylacetic acid in lung cancer patients is 8.26 times lower than in healthy controls (1/0.121 ≈ 8.26).
Table 3. Sensitivity and specificity of the potential markers.
AUC: area under the curve.
Relationship between the levels of selected plasma metabolites and the clinicopathological features of NSCLC
We analyzed the levels of the 28 selected metabolites in patients with lung adenocarcinoma and patients with squamous cell carcinoma. The levels of homovanillic acid (P = 0.047) and serotonin hydrochloride (P = 0.014) were significantly different between patients with lung adenocarcinoma and patients with squamous cell carcinoma.
Additionally, we analyzed the levels of the 28 metabolites in different stages of NSCLC. The levels of stearamide (P = 0.045), cortisol (P = 0.033), DL glyceraldehyde (P = 0.026), testosterone (P = 0.010), and myristic acid (P = 0.035) were significantly different between patients with early stage (stage I and II) and advanced stage (stage IV) NSCLC.
Discussion
In metabolomics research, a large number of metabolites are detected and those that differ between groups are identified. The measurement reflects the proportion of each metabolite relative to total metabolites rather than its absolute content; the results are therefore relative levels and have no unit. In this study, the plasma levels of small molecule metabolites in patients with NSCLC and healthy controls were analyzed using UPLC-MS technology. The mean fold-difference was calculated as the ratio of mean levels of metabolites in the NSCLC group relative to the healthy control group. Meanwhile, the OPLS-DA model, which was based on the metabolomics data, was used to calculate the VIP value of every metabolite. OPLS-DA modeling revealed that the plasma metabolic profiles of patients with NSCLC were distinct from those of the healthy controls, reflecting obvious differences in plasma metabolites due to the pathological condition.
Differences in the plasma amino acid profiles of patients with NSCLC and healthy controls have previously been reported,31 indicating that plasma amino acid analysis may provide a potential screening tool for NSCLC. In this study, 254 metabolites were detected in the patients with NSCLC and in the healthy controls, and 28 of these metabolites could be used to distinguish between the patients with NSCLC and the healthy controls. In the present study, all enrolled patients were newly diagnosed primary NSCLC patients who had not undergone any previous treatment. Therefore, the alterations in these metabolites are unlikely to be attributable to other disease conditions or to prior treatment. For clinical diagnosis purposes, we identified the three metabolites with the highest power to discriminate between NSCLC and healthy controls: cortisol, cortisone, and 4-methoxyphenylacetic acid. Previous studies have shown that these three metabolites are highly correlated with malignant tumors, including NSCLC, whereas the other metabolites have not been reported in lung cancer research.14-23 Further analysis revealed that cortisol, cortisone, and 4-methoxyphenylacetic acid all had high sensitivity and specificity values as biomarkers for discriminating between NSCLC and healthy controls.
Cortisol is an important glucocorticoid hormone. Research examining the relationship between the total plasma cortisol concentration and the DNA repair capacity of human peripheral lymphocytes cultured in vitro indicates that high concentrations of cortisol inhibit DNA repair. 14 Flattening of the diurnal salivary cortisol rhythm is associated with an increased risk of death in patients with NSCLC, which is independent of other prognostic indicators. 15 The administration of cortisone acetate to Swiss mice in conjunction with an intravenous injection of tumor cell suspensions induced widespread metastasis; it was concluded that cortisone may affect the reticuloendothelial tissues and result in the failure of tissue-specific antibody production. 16 The metabolite 4-methoxyphenyl acetic acid is widely used as an intermediate for the synthesis of pharmaceutical compounds with anti-angiogenic and antitumor activity. 17 It is also used to synthesize drugs that can enhance immunity and myocardial contractility, protect myocardial cells, lower blood pressure, and prevent platelet aggregation. Hou et al. 18 used 4-methoxyphenylacetic acid as a raw material to obtain a high yield of resveratrol. Numerous studies have shown that the syntheses of resveratrol are significantly lower in a variety of tumor cells19-23 compared to normal cells. This study reveals that the healthy controls had a significantly higher plasma concentration of 4-methoxyphenylacetic acid than the patients with NSCLC. These data indicate that 4-methoxyphenylacetic acid may play a protective role to prevent the development of lung cancer; this effect and the mechanism of action of 4-methoxyphenylacetic acid require further research. Previous studies have investigated the differences between serum and plasma metabolomes applying holistic metabolic profiling platforms, either GC/MS 27 or LC-MS.24,25 Both platforms showed distinct metabolic profile shift in lung adenocarcinoma from healthy controls. 
The alterations of these metabolites, such as glycerophosphocholines, erythritol, creatinine, hexadecenoic acid, and glutamine, suggested disturbances in the metabolism of amino acids, phospholipids, and fatty acids in lung adenocarcinoma. All of these studies have indicated the utility of metabolomic analyses in clinical prognosis and the particular utility of plasma in relation to the clinical management of lung cancer.
Although metabolomics technology has certain advantages for the early detection of cancer, many challenges remain in this field. Metabolic profiling of biological samples is easily affected by other clinicopathological factors, such as age, gender, race, lifestyle habits, diet, drug use, environmental conditions, and other diseases. The mechanism and biology of tumor metabolism remain unclear. According to the Warburg effect, 26 cancer cells undergo increased aerobic glycolysis and produce higher levels of lactic acid; however, the precise mechanisms of tumor energy metabolism are poorly understood. In addition, most research on cancer metabolism is based on tumor cells,27,28 while the levels and effects of altered tumor metabolites in the blood have received less attention. Further research on larger numbers of samples from healthy individuals and patients with NSCLC both at diagnosis and after treatment, as well as patients with other pathological conditions, is required to verify and assess the efficacy of the potential diagnostic biomarkers for NSCLC identified in the present study.
Limitations of the present study include the relatively small sample size and possible selection bias due to its retrospective nature, which prevent firm conclusions from being drawn. Another weakness is that a mix of different types and stages of cancer was considered. In future studies, we plan to increase the number of samples and to distinguish different types and stages of lung cancer based on the results of this study.
In conclusion, UPLC-MS was used in this study to analyze the changes in the plasma metabolites of patients with NSCLC, which revealed 28 potential plasma biomarkers for NSCLC. The mechanisms leading to the changes in the plasma levels of these metabolites and the related metabolic pathways, as well as the role of these metabolites in the development of lung cancer, merit further investigation. This study provides a new direction of research for the early detection of NSCLC.
Supplemental Material
Supplemental material, supplementary_fig_S1_and_table_S1 for Cortisol, cortisone, and 4-methoxyphenylacetic acid as potential plasma biomarkers for early detection of non-small cell lung cancer by Chengcheng Xiang, Shidai Jin, Juan Zhang, Minjian Chen, Yankai Xia, Yongqian Shu and Renhua Guo in The International Journal of Biological Markers
Footnotes
Author contributions
Chengcheng Xiang and Shidai Jin contributed equally to this work and should be considered as co-first authors.
Declaration of conflicting interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by Jiangsu Province Clinical science and technology projects (Clinical Research Center, BL2012008).
Supplemental material
Supplementary material for this article is available online.
