Abstract
OBJECTIVES:
We aimed to analyze lncRNAs, miRNAs, and mRNA expression profiles of bladder cancer (BC) patients, thereby establishing a gene signature-based risk model for predicting prognosis of patients with BC.
METHODS:
We downloaded the expression data of lncRNAs, miRNAs and mRNA from The Cancer Genome Atlas (TCGA) as training cohort including 19 healthy control samples and 401 BC samples. The differentially expressed RNAs (DERs) were screened using limma package, and the competing endogenous RNAs (ceRNA) regulatory network was constructed and visualized by the cytoscape. Candidate DERs were screened to construct the risk score model and nomogram for predicting the overall survival (OS) time and prognosis of BC patients. The prognostic value was verified using a validation cohort in GSE13507.
RESULTS:
Based on 13 selected. lncRNAs, miRNAs and mRNA screened using L1–penalized algorithm, BC patients were classified into two groups: high-risk group (including 201 patients ) and low risk group (including 200 patients). The high-risk group’s OS time ( hazard ratio [HR], 2.160; 95% CI, 1.586 to 2.942;
CONCLUSIONS:
The risk model based on 13 DERs patterns could well predict the prognosis for patients with BC.
Introduction
Bladder cancer (BC) is the most prevalent carcinoma of urinary system in China and even the whole world, and the ninth killer in all kinds of cancers in the world [1, 2]. Although current treatment of BC has improved the 5-year overall survival (OS), poor prognosis is also existing due to the absence of typical symptoms. And it is found that most cases of BC patients had advanced and distal metastasis [3, 4, 5]. Therefore, it is of great need to develop and improve the diagnosis and treatment methods of BC.
Recently, molecular biomarkers including lncRNA, miRNAs and mRNA can be used for detection, diagnosis, recurrence prediction and post-treatment monitoring of BC [6, 7, 8, 9]. Moreover, it has been found that the lncRNAs could interact with miRNAs as competing endogenous RNA (ceRNA) to regulate BC. For example, LncRNA GAS6-AS2 has been identified as a new prognostic marker of BC could promote BC proliferation through miR-298-CDK9 axis [10]; LncRNA LINC00641 was a potential factor to predict prognosis and nhibit BC developing through miR-197-3p-KLF10-PTEN-PI3K-AKT aixs [11]. lncRNA TINCR can reduce the OS rate of BC patients by inhibiting miR-7 and regulating mTOR [12]. Above all, based on the above researches, we found that the RNAs could be a potential biomarkers for BC and also were associated with BC through the ceRNA mechanism [13, 14, 15].
Thus, this study aimed to construct a RNA-based risk model to predict the prognosis of BC. We firstly conducted a ceRNA regulatory network analysis to screen the RNAs which could be used as the biomarkers for BC, and subsequently, we integrated genomic and clinical features to build a nomogram model based on prognostic RNAs to predict prognosis of patients with BC.
Methods
RNAs expression data
The mRNA, lncRNA and miRNA profile was obtained from The Cancer Genome Atlas (TCGA) (
Differentially expressed RNAs screening
We used limma package in R language [16] (Version 3.34.7) to identify the differentially expressed RNAs (DERs) including mRNA (DEmRNA), lncRNAs (DElncRNAs), and microRNAs (DEmiRNAs) in TCGA dataset. The criterion was set to FDR
Competing endogenous RNAs regulatrory network construction
The relationship between the DEmiRNAs and DElncRNAs was predicted using DIANA-LncBasev2 [18, 19], and the correlation of DEmiRNAs and mRNA using starBase (Version 2.0) [20]. Subsequently, we retained the negative regulation among DEmiRNAs, DElncRNAs and mRNA. The ceRNA regulatory network including the DEmiRNAs, DElncRNAs and mRNA was constructed and visualized by the cytoscape (Version 3.6.1) [21]. Moreover, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment were performed to analyze the function of mRNA involved in the ceRNA regulatory network with the cutoff of p value
Prognosis related RNAs screening
We used univariate and multivariate Cox regression analysis in the survival package (Version 2.41–1) [24] to identify DERs related to prognosis with the threshold of log-rank
Analysis of significantly differentially expressed RNAs (DERs). (A) the volcano plot showed the significantly DERs, red points were up-regulated RNAs, blue points were down-regulated RNAs; (B) The two-way clustering of DERs, 
In order to obtain the RNAs’ performance in predicting prognosis and OS time for BC patients, we established risk score using coefficients weighted by the penalty Cox model, and calculated the risk score according to the follow formula
The median value of risk score in training dataset was considered the cut-off point as classifying cohort to high- and low-risk group. Subsequently, we analyzed the OS with Kaplan-Meier plots and used the receiver operating characteristic (ROC) analysis to evaluate their performance using the value of area under curve (AUC) [27, 28]. The same analysis was performed in the validation cohorts of GSE13507. We performed univariate and multivariate Cox regression analysis to analyze the independent significance of different factors with the threshold of p value
Nomograms were constructed using the rms package (
Results
DERs analysis
A total of 15860 mRNA, 1116 lncRNAs and 458 miRNAs were obtained from TCGA dataset. Subsequently, 635 mRNA, 49 lncRNAs and 326 miRNAs were differentially expressed in the training cohorts, which were visualized using Volcano plots (Fig. 1A). The heatmap revealed that RNAs expression patterns between BC and healthy control were distinguishable (Fig. 1B).
KEGG and GO enrichment for the genes in the ceRNA regulatory network
KEGG and GO enrichment for the genes in the ceRNA regulatory network
Note: KEGG, Kyoto Encyclopedia of Genes and Genomes; GO, Gene Ontology.
According to the DIANA-LncBasev2, 130 regulatory relationships between DEmiRNAs and DElncRNAs were identified. Subsequently, a total of 148 regulatory relationships between DEmiRNAs and mRNAs were obtained using the starBase Version 2.0, which included 31 DElncRNAs, 30 DEmiRNAs and 51 mRNA (Fig. 2). The mRNA in ceRNA network were enriched in 3 KEGG pathways including axon guidance, cell adhesion molecules, stem cell pluripotency regulating signaling pathways, and 18 GO terms, such as cell morphogenesis involved in differentiation, axonogenesis, and sensory organ development (Table 1). Taken together, we successfully constructed a ceRNA network, the DERs of which might involve in biological function in BC.
ceRNA regulatory network. Differentially expressed lncRNAs (DElncRNAs) were showed by the represented squares; the triangles showed differentially expressed microRNAs (DEmiRNAs) while differentially expressed mRNA (mRNA) were shown by the circles; the color key represented the log2FC. The red connecting lines represented the DElncRNA-DEmiRNA, while the gray connecting lines represented the DEmiRNA-DEmRNA regulatory connection.
A total of 31 DERs (20 mRNA, 5 DElncRNAs and 6 DEmiRNAs) were significantly related to the prognosis. Subsequently, 16 DERs were selected using the multivariate cox regression analysis. At last, 13 optimized DERs including 3 lncRNAs (LINC00161, ZFHX4-AS1 and DSCR9), 2 miRNAs (hsa-miR-431-5p and hsa-miR-590-3p) and 8 mRNAs (GPM6A, ISL1, SH3GL3, PRIMA1, KCNQ5, PCDH9, NTNG1 and POU4F1) were identified (Table 2). Collectively, the 13 optimized DERs were correlated with the prognosis of BC patients, which was selected for the construction of risk score model.
The optimized RNAs information
The optimized RNAs information
Note: HR, hazard ratio; CI, confidence interval; Coef, cox regression coefficient.
Clinical factor statistics and cox regression analysis
Note: TCGA, The Cancer Genome Atlas; HR, hazard ratio; CI, confidence interval.
Following are the risk scores in the training cohort:
Risk score
The performance of 13 differentially expressed RNAs expression signature in predicting overall survival. (A) In training cohort, low risk group including 200 samples, and high risk group including 201 patients; (B) In validation cohort, low risk group including 82 samples, and high risk group including 83 patients.
With the cut-off point of median risk score in the training cohort, we found that 201 patients in high-risk (50.1%) group had a lower OS (hazard ratio (HR), 2.160; 95% CI cores, 1.586 to 2.942;
To screen the clinical factors significantly related to prognosis, we analyzed the significance of clinical characteristics between BC and the controls using univariate and multivariate analyses. As shown in Table 3, we found that age, tumor recurrence and PS model status were the significant clinical factors related to prognosis. Further analyses of KM curves indicated that OS of patients aged younger than 60 years higher than the patients older than 60 years (HR 1.033, 95% CI 1.017 to 1.049;
ROC curve parameter information table for each type of model
ROC curve parameter information table for each type of model
Note: TCGA, The Cancer Genome Atlas.
Overall survival was evaluated using Kaplan-Meier curves in line with age and recurrence-tiered low-risk and high-risk scores. (A) Age, age 
Nomograms that predict the 3- and 5-year overall survival.
A nomogram was built based on clinicopathological factors in the TCGA cohort to predict the prognosis of samples (Fig. 5). The predictors included age, recurrence and prognostic score. The calibration plots have good predictive ability for 3- and 5-year OS with the c index was 0.683 and 0.674, respectively.
The ROC curve of training set TCGA (A) and validation set GSE13507 (B) which based on the results of each type of model.
The combination of all the features including age, tumor recurrence, 3 lncRNAs, 2 miRNAs, 8 mRNA could best discriminate the BC to healthy controls with the AUC of 0.901 for TCGA and 0.758 for GSE13507 (Fig. 6, Table 4). All these results suggested that the combined model based on the 13-DERs signature and clinical factors could accurately predict the prognosis and OS for BC patients.
The present study identified novel prognostic RNAs (including 3 lncRNAs, 2 miRNAs , and 8 mRNA ) in BC and constructed a risk model for predict the prognosis of patients with BC. We validated the tool and found that it could improve the ability to predict prognosis in BC patients compared with clinical risk factors. Additionally, a nomogram comprising prognostic score and clinicopathological data including age and tumor recurrence was obtained to predict the OS. Our results indicated that the 13 DERs signature could be an indicator that categorized patients into a high-risk and a low-risk group, and the OS of the training and validation dataset differed significantly.
We constructed a ceRNA-related 13-gene signature to explore the correlation between BC and lncRNA-associated ceRNA network based on the TCGA cohort using Lasso-penalized Cox regression algorithm. Previous studies have confirmed that the risk score model based on the novel signature could predict the prognosis and survival time of patients with BC [32, 33, 34, 35]. These studies explored the model based on one type such as lncRNA or mRNA, which is inconsistent with our study. In our study, we constructed a ceRNA-related 13-gene signature including lncRNAs, miRNAs and mRNA that have high accuracy of prediction for prognosis of BC patients. Compared to unique type RNA biomarker, the biomarkers screened in this study is a mix of different type of RNAs including lncRNAs, miRNAs and mRNA [36, 37, 38]. The combination of the three type RNAs has the best AUC of 0.852 compared with the unique type of RNA.
Moreover, increasing evidence has implied that some of them might be involved in crucial biological functions in the progression of cancer. From the results of KEGG pathways, we found that NTNG1 (one mRNA in the 13-RNA signature) was involved in cell adhesion molecules pathway. Cell adhesion molecules are complex protein and carbohydrate molecules of many different types found on the surfaces of all cells. They are important in many aspects of cell biology including development, differentiation, and motility. These processes are frequently disturbed in cancer and recent work has demonstrated that disturbances in cell adhesion molecule expression are also common in malignancy [39]. Furthermore, NTNG1 has been reported to be potential tumor diagnostic markers in multiple cancers including BC, Breast invasive carcinoma, and Colon adenocarcinoma [40]. In addition, ISL1 involved in Signaling pathways regulating pluripotency of stem cells. For classic bladder exstrophy, ISL1 has been reported as a significant susceptibility gene, and it regulates the development of urinary tract [41, 42, 43]. Its abnormal methylation could use to predict BC patients outcome [41, 44, 45, 46]. The two mRNA were the important biomarkers to predict prognosis, which might be regarded for targeted therapy for patients with BC. The other DERs need much experiments to explore their function in BC. This study provided a basic therapies for application of the DERs in targeted therapy.
Notably, the signature was validated in another dataset related to BC (GSE13507). Interestingly, patients with BC in the low-risk group have a favorable survival advantage through KM curves. Indeed, the each AUC values of ROC plots in 1, 3 and 5 years were nearly high in the training dataset and validation dataset. These results suggested that the model had excellent accuracy for predicting the prognosis of patient with BC. Furthermore, we also found that risk score in model was related to age and recurrence. Previous studies have indicated that age and recurrence were significantly correlated with the prognosis of BC patients [47, 48]. The two factors have potential prediction ability for patients with BC. Besides, we also found that a nomogram was built based on tumor recurrence and age to predict the OS which could provide simple, accurate prognosis predictions for BC.
The study also has limitations. First, in order to be used widely, an operation-friendly methods such as qPCR should be performed to further validate the model. Second, additional biological experiments should be performed to validate the mechanism of the biomarkers especially the non-coding RNAs. Third, the sample of the validation dataset used in this study is tissue which is not non-invasive. So the data from the blood or the urine should be included to validate the model in the future.
Conclusion
In summary, this study suggested that lncRNA-related ceRNA signature is highly correlated with BC. Moreover, we constructed the prognosis model based on 13-RNAs signature that is a potential prognostic tool for predicting the prognosis of BC patients. Additionally, the 13 lncRNA-related RNAs might be considered as the reliable therapeutic biomarkers for predicting the prognosis of BC patients.
Data availability statement
The datasets generated and/or analysed during the current study are available in the NCBI repository,
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethics of approval statement
Not Applicable.
Patient consent statement
Not Applicable.
Permission to reproduce material from other sources
Not Applicable.
Clinical trial registration
Not Applicable.
Author contributions
Conception: Yin Cao and Zhiyi Zhao.
Interpretation or analysis of data: Lingyun Liu and Hongliang Wang.
Preparation of the manuscript: Lingyun Liu.
Revision for important intellectual content: Zhiyi Zhao, Lingyun Liu, Hongliang Wang.
Supervision: Yin Cao and Zhiyi Zhao.
Footnotes
Conflict of interest
The authors have no conflicts of interest to declare.
