Abstract
Background:
Glioblastoma (GBM) is the most common clinical intracranial malignancy worldwide, and the most common supratentorial tumor in adults. GBM mainly causes damage to the brain tissue, which can be fatal. This research explored potential gene targets for the diagnosis and treatment of GBM using bioinformatic technology.
Methods:
Public data from patients with GBM and controls were downloaded from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) were identified by Gene Expression Profiling Interactive Analysis (GEPIA) and GEO2R. A protein–protein interaction network was constructed and a significant module was identified. Subsequently, hub genes were identified, and their expression was compared between patients with GBM and controls by real-time quantitative (RT-q)PCR.
Results:
GSE122498 (GPL570 platform), GSE104291 (GPL570 platform), GSE78703_DMSO (GPL15207 platform), and GSE78703_LXR (GPL15207 platform) datasets were obtained from the GEO. A total of 130 DEGs and 10 hub genes were identified by GEPIA and GEO2R between patients with GBM and controls. Of these, strong connections were identified in correlation analysis between
Conclusions:
The hub genes
Keywords
Introduction
Glioblastoma (GBM) is the most common intracranial malignancy in the clinic, accounting for 15% of all intracranial tumors and 50% of gliomas; it is also the most common supratentorial tumor in adults (Alexander and Cloughesy, 2017). GBM can occur at any age, although the average age of onset is 57 years. Surgery is typically advised as a treatment for GBM because it removes as much of the tumor as possible with little nerve damage; however, because GBM cells microinfiltrate the surrounding tissue, the entire tumor cannot be dissected and relapse is possible (Brown, et al., 2016). GBM mainly damages brain tissue by compression, which results in a loss of neural function and can be fatal.
Glioma is a general term for neuroepithelial tumors that involve the differentiation of cells into glial cells (Korshunov, et al., 2016), and it is the most common primary intracranial tumor. The World Health Organization divides glioma into 4 pathological grades (I-IV) with increasing disease severity; GBM belongs to grade IV which has an unfavorable prognosis. GBM has 2 subtypes in clinical diagnosis. Secondary glioblastoma, which develops from low-grade glioma, accounts for 5%-10% of all GBM, and mainly affects patients younger than 55 years. Primary glioblastoma (pGBM), which is typically diagnosed during the initial consultation, accounts for 90%-95% of all GBM, and mainly affects patients older than 55 years. GBM affects any part of the central nervous system, especially the deep white matter of the cerebral hemisphere (Jain, et al., 2014). Typically, both frontal and temporal lobes are involved at the same time, with deep infiltration and extensive invasion.
Because the prognosis of GBM is very poor, studies have investigated genetic markers to aid its prediction, diagnosis, and treatment. Identifying precise molecular targets involved in the occurrence and progression of GBM is of great importance for prolonging the survival of patients with GBM. For example, Toll-like receptor 4 (TLR4) expression in the central nervous system is closely associated with GBM (Sanson and Idbaih, 2013), and the interaction between the TLR4 signaling pathway and micro (mi)RNA has become a target for modern GBM immunotherapy. TLR4 activation promotes the expression of programmed death ligand-1, resulting in the autocrine induction of local immunosuppression in the GBM microenvironment. TLR4 also promotes the Wnt/DKK-3/Claudin-5 signaling pathway, limiting GBM invasion (Litak, et al., 2020).
Recent studies have shown that miRNAs, which account for 1%-3% of the human genome, have the potential to provide new immune checkpoints and hypotheses for the control of GBM based on genome sequencing. For example, miR-29c inhibits O6-methylguanine-DNA methyltransferase via specificity protein 1, and miR-29c-induced G1 phase arrest was shown to promote apoptosis and inhibit cell migration and invasion, thus blocking glioblastoma cell proliferation. At least part of this antitumor effect is mediated by the specific down-regulation of CDK6 expression (Wang, et al., 2013). Moreover, miR-124-3p suppresses the expression of endothelin receptor type B to impede the development of GBM (Mazurek, et al., 2020).
Studies on screening and identifying GBM genetic targets by the integration and analysis of big data are still limited because of the high false positive rate of single-center studies and lack of data. Bioinformatic technology is an emerging tool of big data mining that can determine differences in gene expression between patients and healthy controls (Wilson, et al., 2018), and has been proven to be an effective means of identifying biomarkers of diseases.
Therefore, in this study, 4 gene expression datasets from patients with GBM and control individuals were downloaded and analyzed. We conducted functional enrichment analysis, survival analysis, and correlation analysis to screen differentially expressed genes (DEGs) and hub genes related to the occurrence and progression of GBM, and we discuss possible molecular mechanisms involved in the disease.
Materials and methods
DEG identification by Gene Expression Profiling Interactive Analysis (GEPIA)
The expression profiles of genes showing differential expression between GBM patients and control samples were observed using GEPIA (http://gepia.cancer-pku.cn/).
Public databases
The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo) is an open platform to store genetic data (Edgar, et al., 2002). Four expression profiling datasets [GSE122498 (GPL570 platform), GSE104291 (GPL570 platform), GSE78703_DMSO (GPL15207 platform), and GSE78703_LXR (GPL15207 platform)] were obtained from GEO. Within these datasets, GSE122498 contained 16 GBM samples and 1 healthy brain sample; GSE104291 contained 24 and 2; GSE78703_DMSO contained 3 and 3; and GSE78703_LXR contained 3 and 3, respectively.
DEG identification by GEO2R
GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) is an interactive online tool to identify DEGs from GEO series (Barrett, et al., 2013) and was used here to identify DEGs between GBM and healthy brain tissue samples. The Benjamini-Hochberg adjustment was applied to the P-value (adj. P) to control the false discovery rate and to balance the risk of false positives against the detection of significant genes. If a probe set lacked a homologous gene, or if a gene had multiple probe sets, the data were removed. The fold-change (FC) threshold was set at ≥2, and adj. P ≤0.01 was considered statistically significant. Venn diagrams were constructed with FunRich software (www.funrich.org).
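The filtering step described above can be illustrated with a minimal sketch. It is not the GEO2R implementation; it only reproduces the Benjamini-Hochberg adjustment and the two thresholds in plain Python. The function names, the gene labels, and the reading of "FC ≥2" as an absolute log2 fold-change of at least 1 are assumptions made for illustration.

```python
from math import log2
from typing import List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    rows: List[Tuple[str, float, float]],
    fc_min: float = 2.0,
    adj_p_max: float = 0.01,
) -> List[str]:
    """Keep genes passing both thresholds.

    rows holds (gene, log2 fold-change, raw P-value) tuples.
    Assumption: FC >= 2 is applied in both directions, i.e. |log2FC| >= 1.
    """
    adj = benjamini_hochberg([p for _, _, p in rows])
    cutoff = log2(fc_min)
    return [
        gene
        for (gene, log2fc, _), q in zip(rows, adj)
        if abs(log2fc) >= cutoff and q <= adj_p_max
    ]


# Hypothetical input; these are placeholder labels, not genes from the study.
example_rows = [
    ("GENE_A", 2.5, 0.0001),
    ("GENE_B", 0.3, 0.0001),
    ("GENE_C", -1.8, 0.5),
    ("GENE_D", -3.0, 0.0002),
]
print(select_degs(example_rows))
```

In this toy input, GENE_B fails the fold-change threshold and GENE_C fails the adjusted P-value threshold, so only GENE_A and GENE_D are retained.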
Functional annotation for DEGs using Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses
The Database for Annotation, Visualization and Integrated Discovery (DAVID; https://david.ncifcrf.gov/home.jsp) (version 6.8) tool suite (Huang, et al., 2007) was used to perform GO (Ashburner, et al., 2000) and KEGG (https://www.kegg.jp/) (Kanehisa, 2002) analyses. GO analysis classified ontologies into 3 categories: biological process (BP), cellular component (CC), and molecular function (MF). P<0.05 was considered statistically significant.
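As a rough illustration of what an enrichment P-value measures, the sketch below computes a one-sided hypergeometric test for a single GO term or KEGG pathway. This is an assumption-laden simplification: DAVID reports its own EASE score, a more conservative variant of the Fisher exact test, so the values here would not match DAVID output exactly. The function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Probability of seeing at least k annotated genes by chance.

    N: genes in the background
    K: background genes annotated with the term
    n: genes in the DEG list
    k: DEGs annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the upper tail P(X >= k) of the hypergeometric distribution.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 3 DEGs drawn from 10 genes, 4 of which carry the term.
print(hypergeom_enrichment_p(k=2, n=3, K=4, N=10))
```

A term would then be called enriched when this probability falls below the chosen cut-off (P<0.05 in the text).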
Construction of the protein–protein interaction (PPI) network and identification of significant module
The Search Tool for the Retrieval of Interacting Genes (STRING, http://string.embl.de/) (Szklarczyk, et al., 2015) was used to construct the DEG PPI network, which was visualized with Cytoscape software (version 3.6.1) (Smoot, et al., 2011). A confidence score >0.4 was set as the inclusion criterion. Next, Molecular Complex Detection (MCODE; version 1.5.1, a Cytoscape plug-in) was used to identify the most important module of the network using the following criteria: degree cut-off = 2, MCODE score >5, max depth = 100, node score cut-off = 0.2, and k-core = 2 (Bader and Hogue, 2003).
Analysis and identification of hub genes
Hub genes were defined as nodes with a degree ≥10. Subsequently, KEGG and GO analyses in the DAVID database were used to functionally annotate the hub genes. cBioPortal (http://www.cbioportal.org) (Cerami, et al., 2012) was used to obtain a co-expression network of hub genes and to perform clustering and survival analyses of hub genes, including Kaplan–Meier analysis. Hub gene expression profiles in GBM and control samples, as well as in different organs, were analyzed and displayed using GEPIA. GEPIA was also used to display hub gene expression profiles in different tumor types and to compare them by correlation analysis.
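The two network criteria (interaction confidence >0.4 and node degree ≥10) can be sketched as follows. This is a minimal illustration in plain Python, not the STRING or Cytoscape workflow itself; the function name, the edge-list layout, and the node labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def hub_genes(
    edges: Iterable[Tuple[str, str, float]],
    min_score: float = 0.4,
    min_degree: int = 10,
) -> List[str]:
    """Return nodes whose degree meets the cut-off, highest degree first.

    edges holds (gene_a, gene_b, confidence_score) tuples.
    """
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b, score in edges:
        # Keep only interactions above the confidence threshold; skip self-loops.
        if score > min_score and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    degree = {gene: len(partners) for gene, partners in neighbors.items()}
    hubs = [gene for gene, d in degree.items() if d >= min_degree]
    return sorted(hubs, key=lambda gene: (-degree[gene], gene))


# Hypothetical star-shaped network: one node with ten confident partners,
# plus one low-confidence edge that is filtered out.
example_edges = [("HUB_1", f"PARTNER_{i}", 0.9) for i in range(10)]
example_edges.append(("PARTNER_0", "PARTNER_1", 0.2))
print(hub_genes(example_edges))
```

Using sets of neighbors rather than edge counts means that duplicate edges between the same pair of proteins do not inflate the degree.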
GBM patients and controls
Twelve participants were recruited to the study from Zhejiang Cancer Hospital and The Fourth Hospital of Hebei Medical University between 1 April 2017 and 1 April 2019. These included 6 GBM patients (3 males and 3 females, average age: 60 ± 5 years old) and 6 control individuals with mesial temporal lobe epilepsy (3 males and 3 females, average age: 60 ± 5 years old). During surgery, brain samples were taken from control individuals and from GBM tumors in patients with GBM. The research conformed to the Declaration of Helsinki and was authorized by the Human Ethics and Research Ethics Committees of Zhejiang Cancer Hospital (approval no. ZJCH-2017012). Written informed consent was obtained from all participants.
Real-time quantitative (RT-q)PCR
Total RNA was extracted from brain samples using the RNAiso Plus (TRIzol) kit (Thermo Fisher Scientific, Waltham, MA, USA) and reverse-transcribed into cDNA. RT-qPCR was performed using a Light Cycler® 4800 System with specific primers for hub genes (Table 1). Relative gene expression was determined using the 2^(−ΔΔCt) method, where Ct is the threshold cycle, and is presented as the fold-change in gene expression relative to the control group. Glyceraldehyde 3-phosphate dehydrogenase was used as the endogenous control.
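The relative quantification can be written out as a short worked example. The sketch below assumes that replicate Ct values are averaged per group before the difference is taken; the function name and the Ct values are hypothetical and are not data from this study.

```python
from statistics import mean
from typing import Sequence


def fold_change_ddct(
    target_ct_case: Sequence[float],
    reference_ct_case: Sequence[float],
    target_ct_control: Sequence[float],
    reference_ct_control: Sequence[float],
) -> float:
    """Fold-change of a target gene relative to the control group.

    delta Ct = Ct(target) - Ct(reference gene), computed per group;
    delta-delta Ct = delta Ct(case) - delta Ct(control);
    fold-change = 2 ** (-delta-delta Ct).
    """
    delta_case = mean(target_ct_case) - mean(reference_ct_case)
    delta_control = mean(target_ct_control) - mean(reference_ct_control)
    return 2 ** -(delta_case - delta_control)


# Hypothetical Ct values: the target amplifies 3 cycles earlier in the case
# group after normalization to the reference gene.
print(
    fold_change_ddct(
        target_ct_case=[22.0, 22.0, 22.0],
        reference_ct_case=[18.0, 18.0, 18.0],
        target_ct_control=[25.0, 25.0, 25.0],
        reference_ct_control=[18.0, 18.0, 18.0],
    )
)
```

A lower Ct means earlier amplification, so a negative ΔΔCt corresponds to higher expression in the case group and a fold-change above 1.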
Results
Screening of DEGs in GBM
Several DEGs were identified between GBM and control samples (Figure 1A). After analysis of the GSE122498, GSE104291, GSE78703_DMSO, and GSE78703_LXR datasets with GEO2R, differences between GBM tissues and control samples were presented in volcano plots (Figure 1B–E). A total of 130 DEGs were shown to be common to all 4 datasets in a Venn diagram (Figure 1F).
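The overlap shown in the Venn diagram amounts to a set intersection across the per-dataset DEG lists. The following is a minimal sketch of that operation; the dataset keys follow the names used in the text, whereas the function name and gene labels are placeholders rather than results from the study.

```python
from typing import Dict, Iterable, List


def common_degs(deg_lists: Dict[str, Iterable[str]]) -> List[str]:
    """Return the genes present in every dataset's DEG list, sorted by name."""
    sets = [set(genes) for genes in deg_lists.values()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Hypothetical DEG lists keyed by dataset name.
example_lists = {
    "GSE122498": ["GENE_A", "GENE_B", "GENE_C"],
    "GSE104291": ["GENE_A", "GENE_C", "GENE_D"],
    "GSE78703_DMSO": ["GENE_A", "GENE_C"],
    "GSE78703_LXR": ["GENE_C", "GENE_A", "GENE_E"],
}
print(common_degs(example_lists))
```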

Identification of DEGs. (A) DEGs between GBM and control samples. Over-expressed genes are shown in red, and under-expressed genes in green. (B) Volcano plot showing the difference between GBM and control samples after GSE122498 dataset analysis with GEO2R. (C) Volcano plot showing the difference between GBM and control samples after GSE104291 dataset analysis with GEO2R. (D) Volcano plot showing the difference between GBM and control samples after GSE78703_DMSO dataset analysis with GEO2R. (E) Volcano plot showing the difference between GBM and control samples after GSE78703_LXR dataset analysis with GEO2R. (F) Venn diagram showing the 130 DEGs common to all 4 datasets. DEG, differentially expressed gene; GBM, glioblastoma.
DEG functional annotation using KEGG and GO analyses
GO analysis showed that, in the BP category, DEGs were mainly enriched in cell division, mitotic nuclear division, DNA replication, chromosome segregation, sister chromatid cohesion, cell proliferation, the G2/M transition of the mitotic cell cycle, and the G1/S transition of the mitotic cell cycle. In the CC category, DEGs were mainly enriched in the nucleoplasm, nucleus, cytoplasm, condensed chromosome kinetochore, and cytosol, whereas in the MF category they were enriched in protein binding and ATP binding (Table 2). KEGG analysis showed that DEGs were mainly enriched in the cell cycle, DNA replication, and the Fanconi anemia pathway (Table 2).
Construction of the PPI network and identification of the significant module
A total of 1394 edges and 98 nodes were identified in the PPI network (Figure 2A), and 1006 edges and 47 nodes in the significant module (Figure 2B).

(A) PPI network of DEGs consisting of 1394 edges and 98 nodes. (B) The significant module network selected from the PPI network consisting of 1006 edges and 47 nodes. (C) Ten hub genes identified by the criterion of judgment (degrees ≥10), including
Hub gene selection and analysis
Ten hub genes were identified using Cytoscape:
A co-expression network of these hub genes was obtained with cBioPortal (Figure 2D). Hierarchical clustering showed that the hub genes could differentiate female GBM samples from male GBM samples (Figure 2E). cBioPortal was also used to obtain Kaplan-Meier estimates of progression-free and overall survival. In a total of 206 GBM patients from The Cancer Genome Atlas, worse overall survival was seen when there were no mutations in any of the 10 hub genes identified in the present study (Figure 3A–T).
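For readers unfamiliar with the Kaplan-Meier estimator, a minimal sketch of the product-limit calculation is given below. It is not the cBioPortal implementation; the function name and the follow-up data are hypothetical, and censored patients are assumed to leave the risk set after any deaths recorded at the same time point.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with an observed death.

    events[i] is 1 if the death was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months); the second patient is censored.
print(kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1]))
```

Comparing two such curves, for example patients with and without alterations in a given gene, would additionally require a log-rank test, which is not shown here.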

Kaplan-Meier estimates of progression-free and overall survival by cBioPortal for
GEPIA analysis showed that the expression of all hub genes was significantly higher in GBM samples than control samples (P<0.05; Figure 4A-J). Moreover, hub genes were expressed at significantly higher levels in different organs of patients with GBM compared with those in healthy controls (P<0.05; Figure 4K-T). Comparing the expression of hub genes among various tumors, all hub genes were shown to be significantly up-regulated in GBM samples compared with other tumor samples (P<0.05; Figure 5A). Correlation analysis identified strong connections between

Comparison of hub gene expression between GBM and control samples for

(A) Comparison of hub gene expression among tumor types. (B) Correlation analysis between
RT-qPCR analysis
RT-qPCR analysis showed that relative expression levels of

Relative expression of
Discussion
Recent studies have focused on screening immune-related molecular markers to predict the prognosis of patients with GBM. Overall survival was shown to differ significantly among patients with GBM according to isocitrate dehydrogenase gene mutation status, with those harboring mutations found to have a better prognosis (Wang, et al., 2013). Another study identified 10 miRNA molecular markers for GBM, of which 7 (miR-31, miR-222, miR-148a, miR-221, miR-146b, miR-200b, and miR-193a) were associated with increased risk, whereas 3 (miR-20a, miR-106a, and miR-17-5P) were protective (Srinivasan, et al., 2011). Moreover, according to an mRNA expression profile, 3 genes (
Blocking the malignant progression of GBM is a key aim of targeted therapy (Polivka Jr, et al., 2017), and the exploration of relevant molecular biological mechanisms of GBM through new bioinformatic techniques is an important direction in neural tumor studies. The identification of potential biomarkers for efficient GBM diagnosis and treatment is urgently needed, and herein we identified 130 DEGs and 10 hub genes from the GEO database using bioinformatic technology which are potential therapeutic targets or biomarkers of GBM. Of the 10 hub genes, 4 were identified as being strongly connected through correlation analysis (
The
The protein encoded by
The product encoded by
A limitation of this study is that it was restricted to bioinformatics analysis, with no
Conclusions
The present study identified 130 DEGs and 10 hub genes between patients with GBM and control individuals, which could serve as biomarkers for the diagnosis and treatment of GBM. By achieving an early diagnosis or targeted treatment, the survival rate and quality of life of patients with GBM could be greatly improved.
Footnotes
List of abbreviations
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Ethics approval and consent to participate
Our study was approved by The Human Ethics and Research Ethics Committees of Zhejiang Cancer Hospital. All patients provided written informed consent prior to enrollment in the study.
Conflict of interests statement
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
