Abstract
In China, hepatocellular carcinoma (HCC) is the most commonly diagnosed cancer and the leading cause of cancer death in men, followed by lung and stomach cancer. There was an urgent need to identify novel prognostic biomarkers for HCC. We explored the expression pattern of m6A related proteins in HCC tissues by using TCGA in this study. We found that the m6A ‘reader’ YTHDF1 was significantly upregulated in HCC and was positive correlated with pathology stage. Kaplan-Meier analysis showed that Lower YTHDF1 expression level was associated with better survival of HCC patients. Furthermore, we performed GO and KEGG pathway analysis of YTHDF1 co-expressed genes and found YTHDF1 played an important role in regulating HCC cell cycle progression and metabolism. We believed that this study will provide a potential new therapeutic and prognostic target for HCC.
Keywords
Introduction
In China, hepatocellular carcinoma (HCC) is the most commonly diagnosed cancer and the leading cause of cancer death in men, followed by lung and stomach cancer. Moreover, according to Cancer Statistics in China of 2015 [1], HCC is the second and fifth most common cause of cancer death in males and females, respectively. Recently a series of therapies including surgery, chemotherapy, and radiotherapy were developed for patients with HCC, however, the mortality rate of HCC was still very high [2, 3, 4, 5]. Although tumor-node-metastasis (TNM) classification systems was the basic prognostic model, there was still an urgent need to identify novel prognostic biomarkers for HCC.
N
In this study, we explored the expression pattern of m6A related proteins in HCC tissues by using TCGA. We evaluated the correlations between m6A related proteins and HCC patients’ clinical features including age, pathological stage and recurrence free survival. We believed that this study will provide a potential new therapeutic and prognostic target for hepatocellular carcinoma.
Materials and methods
Patients and clinicopathological data
The level-3 expression data (RNA-seqV2) and clinicopathological data of 373 HCC patients and 50 normal samples were downloaded from The Cancer Genome Atlas (TCGA,
A total of 372 patients with survival data in the TCGA database were given a follow-up exam ranging from 0 to 120.73 months, and the median survival was 19.66 months. For the analysis of survival and follow-up, the date of surgery was used as the beginning of the follow-up period. Overall survival (OS) and disease-free survival (DFS) were calculated in months from the data of diagnosis to the time of death and to the time of tumor progression or recurrence, or death of the patient from HCC, respectively. All patients that died from diseases other than HCC or unexpected events were excluded from the cohort. In this study, we focused on exploring whether m6A related protein were associated with 5-years OS and DFS. In present study, the upper quarter m6A related mRNA expression in all HCC tissues was used as the cutoff point to divide all cases into gene-high and gene-low groups.
GO and KEGG pathway analysis
To identify functions of m6A related proteins in HCC, we performed GO function enrichment analysis by using MAS 3.0 (
Co-expression network construction and analysis
In present study, we performed co-expression analysis of candidate genes by calculating the Pearson correlation coefficient of paired genes. The co-expressed gene pairs with the absolute value of Pearson correlation coefficient
The expression level of seven m6A related genes (YTHDF1, YTHDF2, ALKBH1, ALKBH5, FTO, METTL3, and METTL14) in a TCGA dataset. YTHDF1 (A) and METTL3 (G) were up-regulated and YTHDF2 (B), ALKBH1 (D), ALKBH5 (E), FTO (F) and METTL14 (H) were down-regulated in hepatocellular carcinoma. Significance was defined as 
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING; string-db.org/) database is a pre-computed global resource for the investigation and analysis of associations between proteins. We used STRING to analyze the interactions between DEGs and constructed a PPI network. The database reveals protein interactions, including experimental and predicted protein interaction information. In this study, only experimentally validated interactions with a combined score
The correlation between the expression levels of m6A related proteins and clinicopathological characteristics. YTHDF1 (A) (but not YTHDF2 (B), YTDHF3 (C), ALKBH1 (D), ALKBH5 (E), FTO (F), METTL3 (G), METTL14 (H)) expression was significantly up-regulated in Stage III/ IV HCC compared to Stage I and Stage II HCC. Significance was defined as 
Numerical data were presented as the mean
The correlation between the expression levels of m6A related proteins and 5-year survival rate. Upregulation of YTHDF1 (A) and YTHDF2 (B) and down-regulation of FTO (F) was correlated with significantly shorter 5-year survival rate.
The correlation between the expression levels of m6A related proteins and DFS rates. Upregulation of YTHDF1 (A) and down-regulation of YTHDF3 (C), ALKBH5 (E), FTO (F) and METTL14 (H) was correlated with shorter DFS rate.
m6A related proteins were dysregulated in hepatocellular carcinoma
In present study, we analyzed TCGA data to explore the expression patterns of m6A related proteins in hepatocellular carcinoma. A total of 373 HCC patients and 50 normal samples were included in TCGA data. As shown in Fig. 1, we observed seven m6A related genes (YTHDF1, YTHDF2, ALKBH1, ALKBH5, FTO, METTL3, and METTL14) were differentially expressed in HCC compared to match normal patients. Of these genes, YTHDF1 (Fig. 1A,
YTHDF1 was positive correlated with pathology stage in hepatocellular carcinoma
We next analyzed the correlation between the expression levels of m6A related proteins and clinicopathological characteristics. As shown in Fig. 2, we found that YTHDF1 expression was significantly up-regulated in Stage III/IV HCC compared to Stage I (Fig. 2A,
Overexpression of YTHDF1 is associated with poor prognosis in patients with hepatocellular carcinoma
OS curves were plotted according to candidate gene expression level by the Kaplan-Meier method and log-rank tests. As shown in Fig. 3, overexpression of YTHDF1 (Fig. 3A,
Multivariate analysis of BCR-free survival for all HCC patients
Multivariate analysis of BCR-free survival for all HCC patients
overall survival. We also explored whether these m6A related genes were associated with disease free survival of HCC patients. Similar to OS analysis, we also found the DFS rates were lower in YTHDF1-high patients in both datasets comparing to YTHDF1-low patients (Fig. 4A,
(A) The protein-protein interaction network of YTHBF1 co-expressed genes in the HCC. GO (B) and pathway (C) analysis of YTHBF1 co-expressed genes.
Collectively, our results had showed overexpression of YTHDF1 was associated with the progression and could predict poor prognosis of HCC. However, the potential roles of YTHDF1 in HCC remains unclear. To predict the functions of the YTHBF1, we first constructed gene co-expression networks to identify interactions among differentially expressed mRNAs and YTHBF1 by using TCGA. The co-expressed YTHBF1-gene pairs with the absolute value of Pearson correlation coefficient
MAS 3.0 system was used to perform GO and pathway analysis on the co-expressed genes. A total of 413 differentially expressed genes were classified according to GO term. Our analysis revealed that these co-expressed genes were associated with transcription, cell cycle, cell division, mitosis, oxidation reduction, response to DNA damage stimulus, DNA repair, DNA replication, RNA splicing, chromatin modification, microtubule-based movement, protein amino acid phosphorylation, modification-dependent protein catabolism, proteolysis, lipid metabolism (Fig. 5B). KEGG pathway enrichment analyses showed co-expre- ssed genes of YTHBF1 mainly participate in regulating Valine, leucine and isoleucine degradation, Cell cycle, Complement and coagulation cascades, Propanoate metabolism, Fatty acid metabolism, Butanoate metabolism, Lysine degradation, beta-Alanine metabolism, Tryptophan metabolism, Caprolactam de- gradation, DNA polymerase, PPAR signaling pathway, Homologous recombination, Limonene and pinene degradation, Fatty acid elongation in mitochondria, etc. (Fig. 5C).
Discussion
Hepatocellular carcinoma is one of the most frequently diagnosed cancer worldwide [30]. In the past years, a series of therapies including surgery, chemo- therapy, and radiotherapy were developed for HCC treatment. Meanwhile many efforts were payed to identify prognostic biomarkers of HCC. A few proteins including HAO2 [31], ACTL6A [32], and RBMY [33, 34] were reported to be differentially expressed in HCC. Of note, long non-ding RNAs (such as ZEB1-AS1 [35], SNHG20 [36, 37] and ATB [38]) were also showed to be associated with HCC progression. However, the prognosis of HCC patients still remains dismal. Therefore, there was still an urgent need to identify novel prognostic biomarkers for HCC. In present study, we explored whether m6A related genes could serve as novel biomarkers for HCC. We found many m6A related proteins, including YTHDF1, YTHDF2 and FTO, were involved in HCC progression. For example, seven m6A related genes (YTHDF1, YTHDF2, ALKBH1, ALKBH5, FTO, METTL3, and METTL14) were differentially expressed in HCC compared to match normal patients. Moreover, we observed YTHDF1, YTHDF2 and FTO were correlated with 5-year survival rate. Furthermore, we found YTHDF1, YTHDF3, ALKBH5, FTO and METTL14 were correlated with DFS rate.
N6-methyladenosine (m6A), the most prevalent and internal modification of mRNAs, played crucial roles in regulating mRNA splicing, export, localization, translation, and stability. Several proteins, including YTHDF1, YTHDF2, YTHDF3, ALKBH1, ALKBH5, FTO, METTL3 and METTL14 were reported to mediated the m6A methylation. Previous studies had showed expression of these m6A related genes were dysregulated in several cancers (e.g., breast cancer, lung cancer and glioma). For example, overexpression of ALKBH5 were associated with lower survival time of glioma [39]. METTL3 was significantly up-regulated in non-small-cell lung carcinoma (NSCLC). These results suggest that these proteins were involved cancer development and progression. Moreover, Chen et al. found RNA N6-methyladenosine methyltransferase METTL3 promoted liver cancer progression through YTHDF2 dependent post-transcriptional silencing of SOCS2 [40]. In this study, we observed that YTHDF1 and METTL3 were up-regulated and YTHDF2, ALKBH1, ALKBH5, FTO and METTL14 were down-regulated in hepatocellular carcinoma.Moreover, we found that upregulation of YTHDF1 was significantly up-regulated in high grade of HCC. Kaplan-Meier analysis showed that overexpression of YTHDF1 and down-regulation of FTO was correlated with significantly shorter OS and DFS survival rate of HCC. These results showed YTHDF1 could serve as a biomarker for HCC. To the best of our knowledge, it was firstly reported that m6A related genes was involved in the prognosis of HCC.
YTHDF1, containing a YTH RNA-binding domain, could bind to m6A and was involved in Protein recruitment. In present study, we also explored the potential function roles of YTHDF1 in HCC by using GO and KEGG pathway analysis. Our analysis revealed that these co-expressed genes were associated with transcription, cell cycle, cell division, mitosis, oxidation reduction, response to DNA damage stimulus, DNA repair, DNA replication, RNA splicing, chromatin modification, microtubule-based move-
Conclusion
In conclusion, our study showed for the first time that YTHDF1 was significantly upregulated in HCC. Furthermore. our results showed YTHDF1was upregulated in high pathology stage HCC. Kaplan-Meier analysis showed that Lower YTHDF1expression level was associated with better survival of HCC patients. Furthermore, we performed GO and KEGG pathway analysis of YTHDF1 co-expressed genes and found YTHDF1 played an important role in regulating HCC cell cycle progression and metabolism. We believed that this study will provide a potential new therapeutic and prognostic target for HCC.
Footnotes
Conflict of interest
None.
Supplementary data
The important pathways in HCC progression
Pathway
Count
p-value
q-value
Valine, leucine and isoleucine
22
1.09E-29
4.60E-28
degradation
Cell cycle
29
1.18E-28
4.14E-27
Complement and coagulation
21
2.25E-23
5.27E-22
cascades
Propanoate metabolism
15
5.67E-20
9.21E-19
Fatty acid metabolism
16
2.21E-19
2.74E-18
Butanoate metabolism
13
9.71E-16
8.20E-15
Lysine degradation
13
7.33E-14
4.83E-13
Table 1, continued
Pathway
Count
p-value
q-value
Beta-Alanine metabolism
10
1.17E-13
7.46E-13
Tryptophan metabolism
11
5.95E-12
3.40E-11
Caprolactam degradation
5
7.23E-09
2.40E-08
DNA polymerase
8
1.61E-08
5.01E-08
PPAR signaling pathway
10
2.35E-08
7.18E-08
Homologous recombination
7
5.38E-08
1.55E-07
Limonene and pinene degradation
6
7.32E-08
2.09E-07
Fatty acid elongation in
5
8.40E-08
2.35E-07
mitochondria
Pyrimidine metabolism
10
3.67E-07
9.05E-07
Gap junction
10
3.67E-07
9.05E-07
Purine metabolism
12
7.53E-07
1.69E-06
Pyruvate metabolism
7
8.83E-07
1.94E-06
Benzoate degradation via CoA
5
9.49E-07
2.00E-06
ligation
Bile acid biosynthesis
6
1.66E-06
3.27E-06
Retinol metabolism
8
1.94E-06
3.68E-06
Ubiquitin mediated proteolysis
11
2.12E-06
3.96E-06
p53 signaling pathway
8
3.07E-06
5.53E-06
Citrate cycle (TCA cycle)
6
3.70E-06
6.46E-06
Drug metabolism – cytochrome
8
4.25E-06
7.28E-06
P450
Base excision repair
6
6.30E-06
1.05E-05
Ribosome
9
7.37E-06
1.20E-05
3-Chloroacrylic acid degradation
4
1.79E-05
2.73E-05
Alanine and aspartate metabolism
5
3.23E-05
4.77E-05
Histidine metabolism
5
4.53E-05
6.41E-05
Non-small cell lung cancer
6
6.87E-05
9.47E-05
Adherens junction
7
7.46E-05
1.00E-04
Pathogenic Escherichia coli
6
7.63E-05
1.02E-04
infection – EHEC
Pathogenic Escherichia coli
6
7.63E-05
1.02E-04
infection – EPEC
Synthesis and degradation of
3
1.69E-04
2.10E-04
ketone bodies
Methionine metabolism
4
2.00E-04
2.39E-04
Mismatch repair
4
2.00E-04
2.39E-04
Glycosylphosphatidylinositol
4
2.80E-04
3.16E-04
(GPI) – anchor biosynthesis
Metabolism of xenobiotics by
6
2.93E-04
3.27E-04
cytochrome P450
Ascorbate and aldarate metabolism
4
3.27E-04
3.60E-04
Tyrosine metabolism
5
3.45E-04
3.77E-04
Parkinson’s disease
8
4.07E-04
4.36E-04
Chronic myeloid leukemia
6
4.26E-04
4.54E-04
Urea cycle and metabolism of
4
4.39E-04
4.66E-04
amino groups
Drug metabolism – other enzymes
5
5.06E-04
5.31E-04
Starch and sucrose metabolism
5
5.54E-04
5.74E-04
Inositol phosphate metabolism
5
7.19E-04
7.25E-04
Table 1, continued
Pathway
Count
p-value
q-value
Pantothenate and CoA biosynthesis
3
8.66E-04
8.52E-04
Glyoxylate and dicarboxylate
3
8.66E-04
8.52E-04
metabolism
Small cell lung cancer
6
8.84E-04
8.63E-04
ErbB signaling pathway
6
9.39E-04
9.09E-04
Geraniol degradation
2
9.80E-04
9.42E-04
Prostate cancer
6
9.97E-04
9.52E-04
Glycolysis/Gluconeogenesis
5
0.001432
0.001337
Glioma
5
0.001535
0.001421
Pancreatic cancer
5
0.002567
0.002257
Androgen and estrogen
4
0.002689
0.002354
metabolism
GnRH signaling pathway
6
0.002709
0.002362
Systemic lupus erythematosus
7
0.002765
0.002401
Phosphatidylinositol signaling
5
0.00306
0.002614
system
Pentose and glucuronate
3
0.003979
0.003241
interconversions
Selenoamino acid metabolism
3
0.004455
0.003581
Pentose phosphate pathway
3
0.004455
0.003581
Arachidonic acid metabolism
4
0.006315
0.005009
Phenylalanine, tyrosine and
2
0.006983
0.005508
tryptophan biosynthesis
Regulation of actin cytoskeleton
8
0.007736
0.006068
Alzheimer’s disease
7
0.008414
0.006409
Tight junction
6
0.008636
0.006462
Folate biosynthesis
3
0.008746
0.006521
Oxidative phosphorylation
6
0.009558
0.007101
Primary immunodeficiency
3
0.010295
0.007465
1- and 2-Methylnaphthalene
2
0.011797
0.008368
degradation
1, 4-Dichlorobenzene degradation
1
0.012902
0.008985
Melanoma
4
0.013498
0.009368
Bladder cancer
3
0.016901
0.011394
Glycine, serine and threonine
3
0.017998
0.012018
metabolism
Nucleotide excision repair
3
0.019134
0.012696
Type II diabetes mellitus
3
0.020309
0.01337
Notch signaling pathway
3
0.022775
0.014786
Monoterpenoid biosynthesis
1
0.025638
0.016343
Glutathione metabolism
3
0.026765
0.01701
Biosynthesis of unsaturated fatty
2
0.032374
0.020032
acids
Phenylalanine metabolism
2
0.032374
0.020032
Insulin signaling pathway
5
0.033766
0.020772
Nicotinate and nicotinamide
2
0.038035
0.023068
metabolism
Tetrachloroethene degradation
1
0.03821
0.023068
Galactose metabolism
2
0.044042
0.025671
Melanogenesis
4
0.046053
0.02655
Wnt signaling pathway
5
0.047776
0.027319
