Abstract
CD133 is a valuable prognostic marker in multiple types of cancer. However, the expression, methylation levels, and prognostic relevance of CD133 have not been evaluated from a pan-cancer perspective. The expression and methylation levels of CD133 across different types of cancer were determined using The Cancer Genome Atlas (TCGA) dataset. Univariate Cox regression and Kaplan-Meier survival analysis were used to determine the prognostic significance of CD133 expression and methylation. CD133 was highly expressed in papillary renal cell carcinoma (PRCC) and pancreatic adenocarcinoma (PAAD). Correspondingly, PAAD and PRCC had low CD133 methylation levels. In the pan-cancer analysis, high CD133 expression was a poor prognostic factor in lower grade glioma (LGG), whereas it was a good prognostic factor in PRCC. Moreover, genes positively correlated with CD133 expression were associated with poor clinical outcomes in LGG. In PRCC, genes negatively correlated with CD133 expression were associated with poor overall survival. Furthermore, CD133 expression levels were highly correlated with CD133 methylation levels in both LGG and PRCC. Correspondingly, CD133 hypermethylation was a good prognostic factor in LGG, whereas CD133 hypomethylation was a good prognostic factor in PRCC. We also found that CD133 was highly expressed and hypomethylated in the IDH wild-type subgroup of LGG, and in low-stage and type 1 PRCC. In summary, high CD133 expression and hypomethylation were poor prognostic factors in LGG but good prognostic factors in PRCC.
Keywords
Introduction
Prominin-1 (PROM1, CD133) is a transmembrane glycoprotein that has been used to identify a subpopulation of cancer cells termed cancer stem cells in brain tumor,1–3 colon cancer,4,5 pancreatic cancer,6 lung cancer,7 prostate cancer,8 liver cancer,9,10 ovarian cancer,11,12 acute myeloid leukemia (AML) 13 and acute lymphoblastic leukemia (ALL). 14 Cancer stem cells are resistant to chemotherapy 15 and induce tumor growth and recurrence, 16 representing a potential target in cancer therapy.17–19 However, in some cases, the use of CD133 as a cancer stem cell biomarker is controversial. For example, in glioma, CD133-negative cells show self-renewal and drug-resistance characteristics similar to those of CD133-positive cells.20–22 Although the functions of CD133 still need further study, the identification of cancer stem cells provides a therapeutic target for anti-cancer treatment.23,24
The prognostic relevance of CD133 in various types of tumor has also been studied. In glioma, 25 breast cancer, 26 colon cancer, 27 stomach cancer, 28 non-small cell lung cancer 29 and liver cancer, 30 patients with higher CD133 expression have worse clinical outcomes than patients with lower CD133 expression. However, in endometrial cancer 31 and renal cell carcinoma, 32 CD133-positive tumor status is correlated with a favorable prognosis. The methylation level of CD133 is also a prognostic marker. In glioma and gastrointestinal stromal tumors, CD133 hypomethylation is associated with high tumor recurrence.33,34 In contrast, in head and neck cancer, hypermethylation of CD133 is associated with poor prognosis. 35 In liver cancer patients, the subcellular location of CD133 carries opposite prognostic significance: cytoplasmic CD133 is correlated with unfavorable outcomes, whereas nuclear CD133 is associated with favorable outcomes. 36 These results highlight the divergent prognostic effects of CD133 in different types of tumor. A pan-cancer evaluation of the expression and prognostic relevance of CD133 is therefore needed.
The Cancer Genome Atlas (TCGA) project contains the molecular signatures and clinical characteristics of more than 11,000 human cancer patients across 33 different tumor types.37,38 With the available TCGA dataset, in the present study, we systematically evaluated the expression and methylation levels of CD133, and determined the prognostic effects of CD133 across different tumor types.
Materials and methods
Data collection
The TCGA RNA-seq and HumanMethylation450 datasets, along with the clinical datasets, were downloaded from the TCGA hub (tcga.xenahubs.net). The Chinese Glioma Genome Atlas (CGGA) datasets were obtained from the http://www.cgga.org.cn/index.jsp website.39–41 The gene expression matrices of glioblastoma multiforme (GBM) samples were downloaded from the Gene Expression Omnibus (GEO) website (www.ncbi.nlm.nih.gov/geo), including the GSE7696 42 and GSE13041 43 datasets. DNA methylation was expressed as beta values. Higher beta values represented higher levels of DNA methylation (hypermethylation), and lower beta values represented lower levels of DNA methylation (hypomethylation).
Univariate cox regression analysis
Univariate Cox regression analysis was performed using the “survival” package (version 3.1-8; https://cran.r-project.org/web/packages/survival/index.html) in R statistics software (version 3.5.0; https://www.r-project.org/). The “survival” package and its documentation are available from CRAN at the URL given above. The log-rank test was used to calculate the p values. In the univariate Cox regression analysis, the coefficient measured the impact of the covariate on the hazard: a positive coefficient indicated a higher risk of death with increasing covariate values, and a negative coefficient indicated a lower risk. The exponentiated coefficients were the hazard ratios (HR).
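As an illustration of the relationship between coefficients and hazard ratios, the following minimal Python sketch exponentiates the two CD133 expression coefficients reported in Table 1 (0.31 for LGG and −0.2 for PRCC). The function name `hazard_ratio` is introduced here for illustration only and is not part of the original R analysis.

```python
import math


def hazard_ratio(coefficient: float) -> float:
    """Exponentiate a Cox regression coefficient to obtain the hazard ratio."""
    return math.exp(coefficient)


# Coefficients for CD133 expression as reported in Table 1 of this study.
reported_coefficients = {"LGG": 0.31, "PRCC": -0.2}

hazard_ratios = {
    tumor: hazard_ratio(coef) for tumor, coef in reported_coefficients.items()
}

for tumor, hr in hazard_ratios.items():
    # HR > 1 means risk rises with the covariate; HR < 1 means risk falls.
    direction = "higher risk" if hr > 1 else "lower risk"
    print(f"{tumor}: HR = {hr:.2f} per unit increase in the covariate ({direction})")
```

Under these values the hazard ratio is about 1.36 for LGG and about 0.82 for PRCC, which is consistent with CD133 expression being unfavorable in LGG and favorable in PRCC.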
Kaplan-Meier survival analysis
Kaplan-Meier plots were created using “survival” package in R statistics software. Tumor patients were divided into two sub-clusters based on the mean expression levels or methylation levels of different genes. Kaplan-Meier estimator was applied to determine the overall survival of those two sub-clusters of patients. p values were determined using Log-rank test.
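The procedure described above (mean split, Kaplan-Meier estimation, log-rank comparison) can be sketched in plain Python as follows. This is a simplified illustration and not the R code used in the study; the function names and the toy expression and survival values are hypothetical, and the p value assumes a chi-square distribution with one degree of freedom for the two-group log-rank statistic.

```python
import math
from statistics import mean


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def logrank_p_value(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns the p value (chi-square, 1 df)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical toy cohort: expression value, follow-up (months), death (1) or censored (0).
expression = [2.1, 3.5, 1.8, 4.2, 3.9, 1.5, 2.8, 4.5]
times = [60, 20, 72, 12, 18, 80, 45, 9]
events = [0, 1, 1, 1, 1, 0, 1, 1]

# Split patients into two sub-clusters at the mean expression level.
cutoff = mean(expression)
high = [i for i, x in enumerate(expression) if x > cutoff]
low = [i for i, x in enumerate(expression) if x <= cutoff]

km_high = kaplan_meier([times[i] for i in high], [events[i] for i in high])
km_low = kaplan_meier([times[i] for i in low], [events[i] for i in low])
p_value = logrank_p_value(
    [times[i] for i in high], [events[i] for i in high],
    [times[i] for i in low], [events[i] for i in low],
)
print(km_high, km_low, p_value)
```

A mean split is simple to reproduce, although the resulting groups can be unbalanced when the expression distribution is skewed.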
Correlation plots
Correlation plots were created using the “corrplot” package (version 0.84; cran.r-project.org/web/packages/corrplot/index.html) in R statistics software. Spearman’s rank correlation was used to calculate the correlation coefficients.
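For readers unfamiliar with Spearman’s correlation, a minimal Python sketch is given below: values are converted to ranks (with tied values sharing their average rank) and the Pearson correlation of the ranks is computed. The helper names and the toy expression and beta values are hypothetical and serve only to illustrate an inverse relationship between expression and methylation.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values: expression and methylation beta values per patient.
toy_expression = [1.2, 2.5, 3.1, 4.8, 5.0]
toy_methylation = [0.90, 0.70, 0.75, 0.30, 0.20]

rho = spearman(toy_expression, toy_methylation)
print(f"Spearman rho = {rho:.2f}")
```

Because only ranks are used, the coefficient is insensitive to monotonic transformations such as the log2 scaling of expression values.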
Kyoto Encyclopedia of Genes and Genomes (KEGG) signaling pathways and transcription factors enrichment analysis
The Database for Annotation, Visualization and Integrated Discovery (DAVID) website (version 6.8; https://david.ncifcrf.gov) was used to determine the KEGG signaling pathways and transcription factors that were associated with CD133 expression. Enriched signaling pathways and transcription factors with a p value less than 0.05 were considered statistically significant.
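The idea behind such an enrichment p value can be sketched with a one-sided hypergeometric test, which asks how likely it is to observe at least the given overlap between a gene list and a pathway by chance. This is a simplified stand-in: DAVID reports a modified Fisher exact p value (the EASE score), which is more conservative than the plain test below. The function name and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(overlap, list_size, pathway_size, background_size):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from background_size genes, pathway_size of which belong to the pathway."""
    total = comb(background_size, list_size)
    max_overlap = min(list_size, pathway_size)
    favorable = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favorable / total


# Hypothetical counts: 200 top correlated genes, 8 of which fall in a
# pathway of 140 genes, against a background of 20,000 genes.
p_enrichment = hypergeom_upper_tail(
    overlap=8, list_size=200, pathway_size=140, background_size=20000
)
print(f"enrichment p value = {p_enrichment:.2e}")
```

With these hypothetical counts only about 1.4 pathway genes would be expected in the list by chance, so an overlap of 8 yields a small p value.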
Statistical analysis
The box plots were generated using GraphPad Prism 5.0 software (https://www.graphpad.com/). Statistical analysis was performed using the two-tailed unpaired Student’s t test. A p value less than 0.05 was considered statistically significant.
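For reference, the test statistic of the unpaired Student’s t test with pooled variance can be computed as in the Python sketch below. The function name and the two toy groups are hypothetical; the two-tailed p value is then obtained from the t distribution with the returned degrees of freedom, a step left to statistical software and not implemented here.

```python
import math
from statistics import mean, variance


def student_t_statistic(group_a, group_b):
    """Unpaired Student's t statistic (equal-variance form) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical log2 expression values in two patient groups.
group_low_grade = [3.1, 2.8, 3.4, 3.0, 2.9]
group_high_grade = [3.9, 4.2, 3.7, 4.0, 4.4]

t_stat, dof = student_t_statistic(group_low_grade, group_high_grade)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```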
Results
The expression and methylation levels of CD133 across different types of tumor
Collectively, 7930 cancer patients from the TCGA RNA-seq datasets were used to examine the mRNA expression levels of CD133 across 24 different types of tumor. The abbreviation and number of cancer patients for each tumor type are shown in Table 1. Compared with other types of tumor, CD133 was most highly expressed in patients with PRCC (Figure 1(a)). Patients with PAAD or colon adenocarcinoma (COAD) also showed high CD133 expression (Figure 1(a)). In contrast, patients with adrenocortical cancer (ACC), large B-cell lymphoma (DLBC), or liver hepatocellular carcinoma (LIHC) showed low CD133 expression levels (Figure 1(a)).
Table 1. Univariate Cox regression analysis was used to determine the prognostic significance of CD133 expression across different types of tumor.
ACC, Adrenocortical cancer; BLCA, Bladder urothelial carcinoma; BRCA, Breast invasive carcinoma; CESC, Cervical squamous cell carcinoma; COAD, Colon adenocarcinoma; DLBC, Large B-cell lymphoma; ESCA, Esophageal carcinoma; GBM, Glioblastoma multiforme; HNSC, Head and neck squamous cell carcinoma; KIRC, Renal clear cell carcinoma; PRCC, Papillary renal cell carcinoma; LAML, Acute myeloid leukemia; LGG, Brain lower grade glioma; LIHC, Liver hepatocellular carcinoma; LUAD, Lung adenocarcinoma; LUSC, Lung squamous cell carcinoma; OV, Ovarian serous cystadenocarcinoma; PAAD, Pancreatic adenocarcinoma; PRAD, Prostate adenocarcinoma; READ, Rectum adenocarcinoma; SARC, Sarcoma; SKCM, Skin cutaneous melanoma; STAD, Stomach adenocarcinoma; UCEC, Uterine corpus endometrial carcinoma; UCS, Uterine carcinosarcoma; HR, hazard ratio; CI, confidence interval.

Figure 1. The expression and methylation levels of CD133 across different types of tumor: (a) box plot showing the expression levels (log2 count) of CD133 based on TCGA RNA-seq datasets across different types of tumor, and (b) methylation levels (beta values) of CD133 in different types of tumor, retrieved from TCGA HumanMethylation450 datasets.
The DNA methylation levels of CD133 across different types of tumor were also analyzed. The number of tumor patients used for DNA methylation analysis is shown in Table 2. Corresponding to the high expression levels of CD133 in PRCC, PAAD, and COAD patients, the DNA methylation levels of CD133 in these patients were relatively low (Figure 1(b)). In contrast, CD133 was hypermethylated in patients with prostate adenocarcinoma (PRAD) or skin cutaneous melanoma (SKCM) (Figure 1(b)). These markedly different expression and methylation levels suggested diverse functions of CD133 across different types of tumor.
Table 2. Univariate Cox regression analysis was used to determine the prognostic significance of CD133 methylation across different types of tumor.
Prognostic relevance of CD133 expression levels across different types of tumor
Using univariate Cox regression analysis, we determined the prognostic relevance of CD133 expression levels in each tumor type. Among the 25 studied types of tumor, CD133 expression was associated with overall survival only in LGG, PRCC, and PAAD (Table 1). Although it was previously reported that CD133 was a biomarker associated with the overall survival of COAD, stomach adenocarcinoma (STAD), and lung adenocarcinoma (LUAD), we did not obtain similar results using TCGA datasets (Table 1). The univariate Cox regression analysis also indicated opposite prognostic significance of CD133 expression in LGG and PRCC. CD133 expression was an unfavorable prognostic marker in LGG patients (Coefficient = 0.31, p = 2.77E-05), whereas it was a favorable prognostic marker in PRCC patients (Coefficient = −0.2, p = 3.04E-06) (Table 1).
The Kaplan-Meier survival analysis demonstrated the same opposite prognostic effects of CD133 in patients with LGG and PRCC. LGG patients with low CD133 expression had more favorable overall survival than LGG patients with high CD133 expression (Figure 2(a)). In contrast, PRCC patients with low CD133 expression had less favorable overall survival than PRCC patients with high CD133 expression (Figure 2(a)). Moreover, high CD133 expression was an unfavorable prognostic factor in patients with SKCM, whereas it was a favorable prognostic factor in patients with ACC (Figure 2(a)). The Kaplan-Meier survival analysis also showed no prognostic significance of CD133 expression in patients with PAAD, COAD, STAD, or LUAD (Figure 2(a)).

Figure 2. Prognostic relevance of CD133 expression levels across different types of tumor: (a) Kaplan-Meier plots showing overall survival in patients with high (red) and low (blue) CD133 expression in LGG, SKCM, PRCC, ACC, PAAD, COAD, STAD, or LUAD. p values were generated using the log-rank test, and (b) Kaplan-Meier plots showing the prognostic significance of CD133 expression levels in LGG in the CGGA dataset and in GBM in the CGGA, GSE7696, and GSE13041 datasets. The log-rank test was applied to compare the overall survival of glioma patients with high (red) or low (blue) CD133 expression levels.
The prognostic significance of CD133 expression levels in patients with LGG was further validated using the CGGA dataset. Consistent with the results derived from the TCGA dataset, LGG patients with low CD133 expression had better clinical outcomes than LGG patients with high CD133 expression in the CGGA dataset (Figure 2(b)). Although CD133 expression was not a prognostic marker in patients with glioblastoma multiforme (GBM) in the TCGA dataset, we found that in the CGGA, GSE7696, and GSE13041 datasets, GBM patients with high CD133 expression had worse overall survival than GBM patients with low CD133 expression (Figure 2(b)). These results highlighted the heterogeneity of cancer and suggested that the prognostic value of CD133 should be further validated in other patient cohorts.
Prognostic relevance of the genes correlated with CD133 expression in patients with LGG
In both univariate Cox regression and Kaplan-Meier survival analysis, CD133 was a significant prognostic marker for LGG patients. To further demonstrate the prognostic relevance of CD133 expression, we identified the top 200 genes that were most positively correlated with CD133 expression in the TCGA LGG dataset. These genes were highly associated with proteoglycans in cancer, apoptosis, and the Hippo signaling pathway (Figure 3(a)). Four genes, CASP3, CASP6, CASP8, and CASP10, were enriched in the apoptosis signaling pathway. As shown in the corrplots, the correlations of CD133 with CASP3, CASP6, CASP8, and CASP10 were significant (Figure 3(c)). Moreover, high CASP3, CASP6, CASP8, and CASP10 expression levels were unfavorable prognostic markers for LGG patients: patients with low expression of CASP3, CASP6, CASP8, or CASP10 had better clinical outcomes than patients with high expression of the same gene (Figure 3(d)).

Figure 3. Prognostic relevance of the genes correlated with CD133 expression in patients with LGG: (a) functional signaling pathway enrichment analysis of the top 200 genes that were most positively associated with CD133 expression in TCGA LGG patients. The significantly enriched pathways are shown, (b) transcription factor enrichment analysis of the top 200 genes that were most positively associated with CD133 expression, (c) corrplots showing the association of CD133 with genes in the apoptosis signaling pathway. The size of each circle represents the correlation coefficient, (d) Kaplan-Meier survival analysis comparing LGG patients with high (red) and low (blue) expression of the apoptosis signaling pathway genes CASP3, CASP6, CASP8, or CASP10 in the TCGA dataset. p values were generated using the log-rank test, and (e) Kaplan-Meier plots showing the prognostic significance of the transcription factors POU3F1, BACH1, FOXD3, and SOX9 in patients with LGG in the TCGA dataset.
The top 200 genes that were most positively correlated with CD133 expression were significantly enriched for targets of the transcription factors POU3F2, BACH1, FOXO4, SOX9, and SOX5 (Figure 3(b)). Moreover, the prognostic effects of the transcription factors POU3F1, BACH1, FOXD3, and SOX9 in patients with LGG were significant (Figure 3(e)). These results further highlighted the prognostic significance of CD133 in patients with LGG.
Prognostic relevance of the genes correlated with CD133 expression in patients with PRCC
The top 200 genes which were most positively correlated with CD133 expression in TCGA PRCC dataset were also identified. Those genes were associated with cytokine-cytokine receptor interaction, complement and coagulation cascades, and leukocyte transendothelial migration signaling pathways (Figure 4(a)). However, genes from those signaling pathways demonstrated no prognostic effects in patients with PRCC.

Figure 4. Prognostic relevance of the genes correlated with CD133 expression in patients with PRCC: (a) functional signaling pathway enrichment analysis of the top 200 genes that were most positively associated with CD133 expression in TCGA PRCC patients, (b) functional signaling pathway enrichment analysis of the top 200 genes that were most negatively associated with CD133 expression in TCGA PRCC patients, and (c) Kaplan-Meier survival analysis of the prognostic effects of the p53 signaling pathway genes CCNB1, CCNB2, CCNE1, CDK1, CDK4, GTSE1, and RRM2 in PRCC patients. p values were generated using the log-rank test.
We therefore identified the top 200 genes that were most negatively correlated with CD133 expression. The cell cycle and p53 signaling pathways were highly enriched among these negatively correlated genes (Figure 4(b)). Moreover, the p53 signaling pathway genes CCNB1, CCNB2, CCNE1, CDK1, CDK4, GTSE1, and RRM2 were unfavorable prognostic markers for patients with PRCC: patients with low expression of CCNB1, CCNB2, CCNE1, CDK1, CDK4, GTSE1, or RRM2 had better clinical outcomes than patients with high expression of the same gene (Figure 4(c)).
Prognostic relevance of CD133 methylation levels across different types of tumor
Next, we determined the prognostic relevance of CD133 methylation levels across different types of tumor. Using univariate Cox regression analysis, we showed that CD133 methylation levels were associated with overall survival in patients with ACC, bladder urothelial carcinoma (BLCA), GBM, head and neck squamous cell carcinoma (HNSC), renal clear cell carcinoma (KIRC), PRCC, acute myeloid leukemia (LAML), or LGG (Table 2). The prognostic significance of CD133 methylation in LGG and PRCC was also opposite. CD133 methylation was a favorable prognostic marker in LGG patients (Coefficient = −6.24, p = 9.62E-13), whereas it was an unfavorable prognostic marker in PRCC patients (Coefficient = 5.81, p = 1.84E-06) (Table 2).
Spearman correlation analysis demonstrated high correlation coefficients between CD133 expression levels and CD133 methylation levels in patients with LGG or PRCC in the TCGA dataset (Figure 5(a)). However, the correlation between CD133 expression levels and CD133 methylation levels in patients with GBM or HNSC was relatively low. Consistent with this, CD133 methylation levels, but not CD133 expression levels, were associated with the overall survival of patients with GBM or HNSC (Tables 1 and 2).
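The methylation coefficients are large in absolute value because beta values range only from 0 to 1, so a one-unit change spans the whole scale. The Python sketch below rescales the coefficients reported in Table 2 to a 0.1 increase in beta value, assuming the beta value entered the Cox model as an untransformed continuous covariate (an assumption, since the covariate scaling is not stated). The function name and the 0.1 step are chosen for illustration.

```python
import math


def hazard_ratio_for_change(coefficient: float, delta: float) -> float:
    """Hazard ratio associated with a change of `delta` in the covariate."""
    return math.exp(coefficient * delta)


# Coefficients for CD133 methylation as reported in Table 2 of this study.
methylation_coefficients = {"LGG": -6.24, "PRCC": 5.81}

# Illustrative step size (assumption): a 0.1 increase in beta value.
delta_beta = 0.1

hr_per_step = {
    tumor: hazard_ratio_for_change(coef, delta_beta)
    for tumor, coef in methylation_coefficients.items()
}

for tumor, hr in hr_per_step.items():
    print(f"{tumor}: HR = {hr:.2f} per {delta_beta} increase in beta value")
```

Under this assumption, a 0.1 increase in beta value corresponds to a hazard ratio of about 0.54 in LGG and about 1.79 in PRCC.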

Figure 5. Prognostic relevance of CD133 methylation levels across different types of tumor: (a) Spearman correlations between CD133 expression levels and CD133 methylation levels in patients with LGG, PRCC, GBM, or HNSC in TCGA datasets, and (b) overall survival of CD133 hypermethylated (red) and CD133 hypomethylated (blue) patients with LGG, GBM, BLCA, LAML, PRCC, HNSC, ACC, or KIRC. p values were generated using the log-rank test.
Kaplan-Meier survival analysis also showed that LGG, GBM, BLCA, or LAML patients with hypermethylated CD133 had better clinical outcomes than patients with hypomethylated CD133 (Figure 5(b)). On the contrary, PRCC, HNSC, ACC, or KIRC patients with hypermethylated CD133 had worse clinical outcomes (Figure 5(b)).
The expression and methylation levels of CD133 in different grades or subtypes of LGG
The expression levels of CD133 in different grades or subtypes of LGG were examined. CD133 was more highly expressed in grade III than in grade II LGG patients (Figure 6(a)). Furthermore, CD133 expression levels were even higher in grade IV glioma (Figure 6(a)). However, there were no significant differences in CD133 expression among the astrocytoma, oligoastrocytoma, and oligodendroglioma subtypes of LGG (Figure 6(a)). The expression levels of CD133 were also examined in LGG patients with IDH1 mutation and 1p19q codeletion, IDH1 mutation without 1p19q codeletion, or wild type IDH1. Consistent with the poor prognosis of IDH1 wild type LGG, the expression levels of CD133 were higher in LGG patients with wild type IDH1 (Figure 6(a)). These results were validated using the CGGA dataset, in which LGG patients with wild type IDH1 also showed high CD133 expression levels (Figure 6(b)) and grade IV glioma showed higher CD133 expression levels (Figure 6(b)).

Figure 6. The expression and methylation levels of CD133 in different grades or subtypes of LGG: (a) box plots showing the CD133 expression levels in different grades or subtypes of LGG in TCGA datasets. p values were calculated using the two-tailed unpaired Student’s t test, (b) box plots showing the expression levels of CD133 in different grades or subtypes of LGG in CGGA datasets, and (c) the methylation levels of CD133 in different grades or subtypes of LGG in TCGA datasets.
The methylation levels of CD133 in different grades or subtypes of LGG were also examined. Corresponding to the higher expression levels of CD133 in grade IV glioma, the methylation levels of CD133 in grade IV glioma were lower (Figure 6(c)). Likewise, LGG patients with wild type IDH1, who showed higher expression levels of CD133, had lower CD133 methylation levels (Figure 6(c)).
The expression and methylation levels of CD133 in different grades or subtypes of PRCC
We also examined the expression levels of CD133 in patients with different pathological stages of PRCC. CD133 expression was lower in patients with stage III PRCC than in patients with stage I or stage II PRCC (Figure 7(a)). Moreover, the expression levels of CD133 in patients with T3 stage PRCC were also lower (Figure 7(a)). Correspondingly, CD133 was hypermethylated in PRCC patients with stage III or T3 disease (Figure 7(b)).

Figure 7. The expression and methylation levels of CD133 in different subtypes of PRCC: (a) box plots showing the CD133 expression levels in different subtypes or stages of PRCC in TCGA datasets. p values were calculated using the two-tailed unpaired Student’s t test, and (b) the methylation levels of CD133 in different subtypes or stages of PRCC in TCGA datasets.
Besides the basic pathological stages, patients with PRCC were divided into two main subtypes, type 1 and type 2, characterized by different genetic alterations and clinical features. 44 We found that CD133 was more highly expressed in type 1 than in type 2 PRCC patients (Figure 7(a)). Correspondingly, the methylation levels of CD133 in type 1 PRCC patients were lower (Figure 7(b)).
Discussion
Using the TCGA dataset, we determined the prognostic effects of CD133 expression and methylation levels in a pan-cancer manner. Previously, CD133 was reported to be associated with the prognosis of breast cancer, 26 colon cancer, 27 stomach cancer 28 or liver cancer. 30 However, we did not find similar prognostic significance of CD133 based on the TCGA dataset. These results highlighted the complexity of cancer and suggested that different patient cohorts and methods of measurement influence the prognostic value of CD133 in different types of tumor. For example, CD133 had no prognostic significance in the TCGA GBM dataset, but was correlated with the clinical outcomes of GBM patients in the CGGA, GSE7696, and GSE13041 datasets (Figure 2(b)). The correlations between CD133 expression and clinical outcomes in other cohorts of patients with breast cancer, colon cancer, stomach cancer, or liver cancer should therefore be further studied.
Glioma is a common type of brain cancer and is divided into LGG (grade II-III glioma) and GBM (grade IV glioma) subtypes. 45 Previously, the use of CD133 as a cancer stem cell marker and prognostic marker was mainly focused on GBM, whereas the prognostic relevance of CD133 in LGG was unclear. We found that CD133 methylation was a more significant prognostic marker than CD133 expression levels in GBM. Moreover, CD133 expression levels were highly correlated with CD133 methylation levels in LGG, and both were associated with the clinical outcomes and sub-classifications of patients with LGG. It would be interesting to test whether CD133 is also a cancer stem cell marker for LGG.46,47
PRCC is a type of renal cell carcinoma. 48 Compared with other types of tumor, CD133 was most highly expressed in patients with PRCC. Using tissue microarrays, a previous report suggested that CD133 served as a favorable prognostic marker in PRCC. 49 Here, we found that high CD133 mRNA expression and CD133 hypomethylation were also favorable prognostic markers in patients with PRCC. Moreover, CD133 was highly expressed and hypomethylated in type 1 PRCC. Together, these results highlighted that the prognostic significance of CD133 depends strongly on the type of tumor. However, the precise functions of CD133 in PRCC and LGG should be further investigated.
The aim of the present study was to determine the expression, methylation, and prognostic relevance of CD133 across cancer types. Our comprehensive analysis revealed the opposite prognostic significance of CD133 in LGG and PRCC. However, this pan-cancer analysis has limitations. Our analyses were based on the published TCGA databases and lack further clinical validation. Therefore, quantitative PCR should be used to measure CD133 expression in patients with LGG or PRCC to validate the prognostic value of CD133.
Conclusions
CD133 demonstrated opposite prognostic significance in patients with LGG and PRCC. High CD133 expression and hypomethylation were poor prognostic factors in patients with LGG, whereas they were good prognostic factors in patients with PRCC.
Footnotes
Acknowledgements
We appreciate the generosity of the TCGA and GEO groups for sharing the huge amount of data.
Authors’ contributions
HW designed and performed data analysis. XW and LX analyzed the data. HW wrote the manuscript. HC and JZ designed and supervised the work.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The present study was supported by grants from the Fujian Maternity and Child Health Hospital (grant nos. YCXB 18-10 and YCXM 19-04). This study was also supported by the Natural Science Foundation of Fujian Province (grant no. 2020J01337).
