Abstract
PURPOSE:
Functions associated with glycolysis could serve as targets or biomarkers for cancer therapy. Our purpose was to establish a prognostic model that could evaluate the importance of glycolysis-related lncRNAs in breast cancer.
METHODS:
Gene expression in breast cancer was evaluated using The Cancer Genome Atlas (TCGA) database, and Pearson correlations were calculated to identify potentially related lncRNAs. Differentially expressed genes were identified via criteria of FDR
RESULTS:
Eighty-nine differentially expressed lncRNAs were identified from 420 glycolysis-related lncRNAs. Fourteen lncRNAs were correlated with prognosis in the training set and were selected to establish the prognostic model. The low-risk group had better prognosis in both training (
CONCLUSION:
Our study indicated that glycolysis-related lncRNAs play a significant role in facilitating individualized survival prediction in breast cancer patients and may represent potential therapeutic targets.
Keywords
Introduction
Glycolysis was first described by Warburg as a distinctive manner in which cancer cells metabolize glucose[1, 2]. Altered glucose metabolism is an important hallmark of tumor cells and has been applied to cancer diagnosis and the assessment of treatment response. Interestingly, targeting glycolysis is also a potential therapeutic strategy for cancers[3, 4]. It is increasingly evident that glycolysis facilitates tumor growth and promotes chemotherapy resistance[5, 6, 7].
Breast cancer is a highly heterogeneous disease and the most commonly diagnosed malignancy in women. Clinical tumor staging, lymph node status, histological grade and molecular characteristics are considered prognostic factors by the current guideline of the AJCC (American Joint Committee on Cancer,
Long noncoding RNAs (lncRNAs) are defined as RNAs longer than 200 nucleotides that rarely encode proteins. LncRNAs can regulate mRNA expression at various levels and take part in many critical biological processes. They are associated with the diagnosis, treatment and prognosis of breast cancer[8, 9]. However, the mechanism by which lncRNAs regulate gene transcription remains largely unknown. The prognosis of breast cancer patients is also correlated with the status of immune infiltrates [10]. This study aims to identify glycolysis-related lncRNAs and explore the status of immune infiltrates in breast cancer patients.
Materials and methods
Datasets and study cohort
RNA expression data and clinical information for tumor tissues and normal breast tissues (1053 tumor tissues and 111 normal tissues) were obtained from the TCGA database (
Bioinformatic analysis and statistical analysis
The Genome Reference Consortium human genome build (GRCh38) was used to identify lncRNAs. Pearson’s correlation was calculated to screen for glycolysis-related lncRNAs, identified as square of correlation coefficient
All samples were randomly separated into training and validating sets. In the training set, univariate Cox regression analysis was performed to identify prognostic lncRNAs. We identified 14 prognostic lncRNAs in the training set, and these genes were compressed into a single score. The Sankey diagram and co-expression network were drawn with R v4.0.2 and Cytoscape 3.8.0. The risk score was calculated using the following formula to construct the prognostic model.
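The scoring step can be illustrated with a minimal sketch. It assumes the form commonly used for such models, in which a patient's risk score is the sum of each prognostic lncRNA's expression value multiplied by its Cox regression coefficient, and it assumes that patients above the cohort median are labeled high risk. The gene names, coefficients and expression values are hypothetical and are not taken from this study.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Weighted sum of expression values; the weights are Cox coefficients (assumed form)."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, otherwise 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical coefficients and expression values, for illustration only.
coefs = {"lncRNA_A": 0.5, "lncRNA_B": -0.25}
patients = {
    "P1": {"lncRNA_A": 2.0, "lncRNA_B": 4.0},
    "P2": {"lncRNA_A": 4.0, "lncRNA_B": 0.0},
    "P3": {"lncRNA_A": 0.0, "lncRNA_B": 4.0},
    "P4": {"lncRNA_A": 6.0, "lncRNA_B": 4.0},
}

scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

A positive coefficient raises the score as expression increases, and a negative coefficient lowers it, so the sign of each weight determines the direction in which a gene moves a patient relative to the median cutoff.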
Kaplan-Meier curves were drawn to compare survival between groups. Clinicopathological variables (e.g., age, stage and TNM staging) and risk scores were compared through univariate and multivariate Cox analyses. ROC curves were drawn for the clinicopathological variables and the risk score. To test the feasibility of our model, its performance was validated in the validating set.
Data on tumor-infiltrating immune cells were downloaded from the Tumor Immune Estimation Resource (TIMER) database. The infiltration levels of neutrophils, dendritic cells, macrophages, T cells and B cells were evaluated to identify their correlation with the risk score.
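For readers unfamiliar with the survival curves used here, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is not the code used in this study; the follow-up times and event indicators are hypothetical, and the function name `kaplan_meier` is chosen only for illustration.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical follow-up times (months) and event indicators for two groups.
low_risk = kaplan_meier([12, 24, 36, 48], [0, 1, 0, 1])
high_risk = kaplan_meier([6, 12, 18, 24], [1, 1, 0, 1])
print(low_risk)
print(high_risk)
```

Censored patients do not lower the curve at their own follow-up time, but they shrink the risk set, which makes each later death count for a larger drop.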
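The correlation step can be sketched as follows, assuming that Pearson's coefficient (the statistic named above for lncRNA screening) is also used to relate risk scores to infiltration levels; the text does not state which coefficient was applied here. The risk scores and infiltration values are hypothetical and do not come from TIMER.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical risk scores and infiltration estimates for five patients.
risk = [0.5, 1.0, 1.5, 2.0, 2.5]
infiltration = {
    "CD8+ T cells": [0.30, 0.25, 0.20, 0.15, 0.10],  # falls as risk rises
    "Macrophages": [0.1, 0.3, 0.2, 0.3, 0.1],        # no linear trend
}

correlations = {cell: pearson(risk, levels) for cell, levels in infiltration.items()}
for cell, r in correlations.items():
    print(f"{cell}: r = {r:.3f}")
```

A coefficient near -1 indicates that infiltration falls as the risk score rises, whereas a value near 0 indicates no linear relationship.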
Results
Identification of DEGs and prognostic glycolysis-related lncRNAs
There were 14142 lncRNAs in the breast cancer dataset downloaded from the TCGA database, and among them, 420 were glycolysis-related ones identified according to selection criteria (correlation coefficient
The characteristics of training and validation set
A: Heatmap of DEGs. Blue and red indicate lower expression and higher expression, respectively. B: Volcano plot of the expression of glycolysis-related lncRNAs. C: The Sankey diagram shows the relationship between 14 lncRNAs and 18 mRNAs. D: The co-expression network shows the relationship between 14 lncRNAs and 18 mRNAs.
Univariate Cox analysis of 14 prognostic glycolysis-related lncRNAs in training set
A: Kaplan-Meier curve of the low-risk and high-risk groups according to the risk score in the training set. B: Distribution of the prognostic index in the training set. C: Survival status of patients in the training set. D: Forest plot of Cox univariate analysis in the training set. E: Forest plot of Cox multivariate analysis in the training set. F: Multi-parameter ROC curves for risk score, age, stage, lymph node status, metastasis status and tumor size in the training set.
A: Kaplan-Meier curve of the low-risk and high-risk groups according to the risk score in the validating set. B: Distribution of the prognostic index in the validating set. C: Survival status of patients in the validating set. D: Forest plot of Cox univariate analysis in the validating set. E: Forest plot of Cox multivariate analysis in the validating set. F: Multi-parameter ROC curves for risk score, age, stage, lymph node status, metastasis status and tumor size in the validating set.
We randomly divided all samples into a training set (480 patient samples) and a validating set (478 patient samples) at a 1:1 ratio. The characteristics of the two sets are presented in Table 1. There were no differences in clinicopathological variables between the two sets. We identified 14 prognostic lncRNAs via univariate Cox regression analysis in the training set (Table 2). The Sankey diagram (Fig. 1C) and co-expression network (Fig. 1D) show the relationship between 14 lncRNAs and 18 mRNAs. Among the 14 prognostic lncRNAs, 8 lncRNAs have a positive (HRs
The correlation between neutrophils, CD4+ T cells, CD8+ T cells, B cells, macrophages, dendritic cells and risk score.
A risk score was derived from the prognostic lncRNAs, and a prognostic model was thus constructed according to the formula. Finally, breast cancer patients were separated into a high-risk group and a low-risk group according to the median risk score for survival analysis. Kaplan-Meier curves indicated that low-risk patients were associated with better prognosis (
The neutrophils (
Discussion
Breast cancer is a heterogeneous disease that is commonly detected in clinical practice. Although multiple therapeutic strategies for breast cancer are available, there were half a million deaths from breast cancer in 2012 [11]. In the future, cancer treatment based on tumor biology and early therapy response is the trend for improving the clinical outcomes of cancer patients [12]. Numerous studies have found that lncRNAs are critical regulators of tumorigenesis, growth and drug resistance in breast cancer[13, 14, 15, 16]. In our analysis, we investigated the expression levels of 420 lncRNAs in breast cancer, narrowing down to 87 DEGs. We randomly divided all samples into a training set (480 patient samples) and a validating set. In the training set, we identified 6 oncogenic lncRNAs and 8 tumor-suppressor lncRNAs. We constructed a prognostic model for breast cancer from these prognostic genes. Cox regression analyses showed that glycolysis-related lncRNAs were independent prognostic factors for breast cancer. This result was confirmed in the validating set.
Lactic acid is a key product of glycolysis. The accumulation of lactic acid may promote biological processes via the tumor microenvironment [17]. On the other hand, the differentiation of CD4+ and CD8+ T cells relies on glycolysis[18]. The status of immune infiltrates was evaluated to identify their correlation with the risk score of glycolysis-related lncRNAs. We observed that the risk score was negatively correlated with neutrophils, CD4+ T cells, CD8+ T cells, B cells and dendritic cells. Previous studies found that CD8+ T cells were a predictive factor for pathological complete remission after primary systemic therapy and were associated with better clinical outcomes[19, 20, 21]. Although CD4+ T cells may serve as an antitumor factor [22], a recent study found that high CD4+ T cell levels had an unfavorable effect on breast cancer patients[21]. The main mechanism of the anti-tumor immune response of CD4+ T cells is still being explored.
So far, among the 14 prognostic lncRNAs, the functions of AC002546.1, AC010503.4, AC092142.1, AC092718.4, AL031316.1, AL136084.3 and U62317.1 have not been explored in breast cancer or other cancers. LINC01929 has been found to be positively correlated with tumor progression in liver cancer, oral squamous cell carcinoma and lung cancer[23, 24, 25]. Che et al. indicated that LINC01929 promotes tumor progression through the miR-137-3p/FOXC1 axis in oral squamous cell carcinoma [23]. LINC01929 has not yet been reported in breast cancer. We found that LINC01929 acted as a carcinogenic lncRNA in breast cancer; its specific oncogenic mechanism remains unclear and is worth exploring. In the present study, LINC02446 was consistently identified as a prognostic gene and might serve as a promising therapeutic biomarker. LINC01614 was identified as an unfavorable prognostic biomarker in breast cancer[26, 27], which is consistent with our analysis. Previous studies have indicated that the oncogenic LINC01614 stimulated the development of osteosarcoma, gastric cancer, non-small cell lung cancer and lung adenocarcinoma, which provides new insight into the development of cancer treatment[28, 29, 31].
MIAT was first discovered as a susceptibility locus for myocardial infarction[32]. In recent years, a growing body of evidence has indicated a vital role for MIAT in regulating the development of human cancers[33]. A previous study indicated that silencing MIAT inhibits the proliferation of MCF-7 cells, whereas its overexpression has the opposite effect[34]. In addition, MIAT is correlated with 5-fluorouracil (5-FU) resistance in MCF-7 cells. Yao et al. found that the expression levels of GRP78, MIAT, OCT4 and AKT were dysregulated in 5-FU resistant MCF cells, and that these genes contributed to drug resistance [35]. However, we found that upregulation of MIAT was associated with better overall survival. MIAT is reported to be expressed at lower levels in hormone receptor-negative tumor tissues than in hormone receptor-positive ones. Hormone receptor-positive subtypes account for 60% of breast cancers, and their prognosis is better than that of hormone receptor-negative subtypes[11, 36]. In addition, the expression level of MIAT increases in early-stage breast cancer rather than in stages III and IV[34, 37]. These factors may contribute to the inconsistent findings.
Our findings suggested that glycolysis-related lncRNAs may serve as therapeutic targets against breast cancer progression. Our analysis has several limitations. Our data were downloaded from the TCGA database, and most of the breast cancer patients were Americans. Breast cancer patients from other countries should be studied. In addition, the samples record only breast cancer-related events, which limits discussion of events at other organ sites. In vitro and in vivo experiments to validate our data mining results are lacking and are needed in the future.
In summary, we have identified transcripts corresponding to 14 genes that may serve as prognostic markers in the training set and constructed a prognostic model of breast cancer. This analysis provides a promising method for predicting individualized survival in breast cancer. In the prognostic model, the risk score was correlated with the infiltration levels of neutrophils, dendritic cells, macrophages, T cells and B cells. Our results might provide potential therapeutic targets for breast cancer.
Footnotes
Supplementary data
The supplementary files are available to download from http://dx.doi.org/10.3233/CBM-210446.
Conflict of interest
None.
Author contributions
Data curation: Qi Zhu, Yanlin Gu.
Data analysis and statistical analysis: Yanlin Gu, Jiayue Zou and Xiaohua Li.
Figure construction: Qi Zhu, Yanlin Gu.
Bioinformatic analysis: Jiayue Zou.
Manuscript draft: Jiayue Zou.
Review and supervision: Lei Qin.
