Abstract
Introduction
Head and neck squamous cell carcinoma (HNSCC) is the sixth most common cancer worldwide, but its pathogenic mechanisms remain unclear. This study aimed to identify the potential biomarkers underlying the diagnosis and treatment of HNSCC.
Methods
Weighted gene co-expression network analysis (WGCNA) followed by pathway enrichment analysis, analysis of infiltrating immune cells, survival analysis, and methylation analysis were applied to identify the potential hub genes underlying the prognosis of HNSCC. The expression of hub genes was validated by immunofluorescence staining.
Results
A total of 10,274 differentially expressed genes (DEGs) were identified. Through WGCNA, the yellow module (R2 = 0.33, P = 2e-14) was confirmed to be the most significantly associated with the histological grade of HNSCC, and the “Cell Cycle” proved to be the most enriched signaling pathway. Based on the results of survival analysis and immune cell infiltration, 10 hub genes (HMMR, CENPK, AURKA, CDC25C, FEN1, CKS1B, MAJIN, PCLAF, SPC25, and STAG3) were identified. Eight of these (excluding MAJIN and STAG3) were confirmed by performing survival analysis using another dataset (GSE41613). Further, we identified 4 methylation loci in 3 hub genes (cg15122828 and cg20554926 in HMMR, cg12519992 in CDC25C, and cg2655739 in KIAA0101/PCLAF) as being significantly related to survival. Finally, we demonstrated the high mRNA and protein expression of HMMR and CDC25C in HNSCC patients.
Conclusion
Two real hub genes (HMMR and CDC25C) and 3 methylation loci were identified that could potentially serve as prognostic and therapeutic targets for HNSCC, which is significant for studying the pathological mechanisms underlying HNSCC and for developing novel therapies for this disease.
Introduction
Head and neck squamous cell carcinoma (HNSCC) is the sixth most common cancer worldwide, with more than 300,000 patients dying from it annually.1,2 It is characterized by malignancies originating from the oral cavity, oropharynx, hypopharynx, and larynx. Many factors have been identified as important risk factors for developing HNSCC, such as smoking, alcohol consumption, and infection with human papillomavirus (HPV) or Epstein-Barr virus (EBV). 3 The optimal treatment strategy for patients with advanced HNSCC should be discussed within a multidisciplinary team (MDT). Patients should be treated at high-volume facilities. 4 Nowadays, MDT is an essential component for oncologic disease management. Its benefit is also extensively recognized in head and neck cancer (HNC) community, due to tumor rarity and complex treatment. 5 Although advances have been made in many treatments for HNSCC, including surgery, radiotherapy, and chemotherapy,3,6 the 5-year survival rate remains poor at only approximately 50%. 7 There is thus an urgent need to identify potential biomarkers for the diagnosis and treatment of HNSCC, with the aim of improving the prognosis of affected patients.8,9
Tumorigenesis is a complex process whereby a group of genes are synergistically modulated. With the rapid development of high-throughput sequencing technology, substantial biological data about genomics have been generated. 8 Bioinformatic analysis has become a popular tool for comprehensively analyzing big data, which can provide insights into the mechanisms behind diseases and aid the discovery of promising biomarkers for the diagnosis and treatment of diseases, including HNSCC.10-12 Weighted gene co-expression network analysis (WGCNA) is an effective method of exploring the gene function and genome-wide associations, which can be used to detect synergistically expressed gene clusters related to diseases and identify the hub genes of each cluster associated with clinical phenotypes.13,14
Recently, WGCNA has been widely applied in multiple types of cancers.15-20 In HNSCC, Li et al. identified 10 hub genes downregulated in HNSCC tissues, of which CSTA is the only 1 related to progression and is a candidate biomarker for the future diagnosis and treatment of HNSCC. 21 In addition, Zhang et al. identified 12 hub genes related to the perineural invasion of HNSCC. 22 Moreover, Cao et al. found 6 hub genes that were downregulated in HNSCC and related to its prognosis and immune cell infiltration. 23 Furthermore, Jin et al. found 10 hub genes related to the lymph node metastasis of HNSCC. 24 These studies identified different hub genes related to HNSCC through performing WGCNA with little overlap. Moreover, the genetic network in HNSCC remains largely unknown.
In this study, as shown in Figure 1, we first downloaded HNSCC data from The Cancer Genome Atlas (TCGA) database. After identifying the differentially expressed genes (DEGs) between HNSCC patients and healthy controls, these genes were used to build WGCNA networks. Then, the key modules related to the clinical stage and histological grade of HNSCC were determined. After conducting enrichment analysis based on the genes in the top ranked module, we obtained the hub genes related to HNSCC. Furthermore, by performing survival analysis and evaluating infiltrating immune cells, we obtained the hub genes that can be used as potential biomarkers for the prognosis of HNSCC, which was further confirmed using another independent Gene Expression Omnibus (GEO) dataset. We further identified the methylation loci of the hub genes with HNSCC survival. Finally, we validated the mRNA and protein expression of the real hub genes in HNSCC patients. Work flow of this study.
Materials and Methods
Data Collection
The HNSCC data were obtained from TCGA data portal (https://portal.gdc.cancer.gov/), including 546 samples (502 tumor samples, 44 normal samples). In this dataset, 85.66% of patients were White American, 9.43% were Black or African-American, 2.1% were Asian, and for 2.83% the ethnicity was unknown. The dataset contains mRNA expression counts and clinical data such as clinical stage, histological grade, survival, outcome, sex, and age. Gene expression profiles and clinical traits of GSE41613 were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/), including 97 tumor samples. GPL10558 Illumina HumanHT-12 V4.0 expression beadchip was used to extract the expression profile information of the GSE41613 dataset. The HNSCC data from TCGA were used to perform WGCNA, while the dataset GSE41613 was used to validate the hub genes that we identified.
Sample Size Determination
The sample size was determined through power analysis. We conducted a sample size calculation using the following parameters with an independent samples t test: an effect size (Cohen’s d) of 0.8, a significance level (alpha) of 0.05, and a statistical power of 0.8. Thus, approximately 25 samples are needed per group, resulting in a total sample size of 50.
DEG Screening
DEGs from the HNSCC TCGA dataset were identified by using the “DESeq2” R package. |Log2FC| > 1 and P adj < 0.05 were set as the cut-offs to screen DEGs, which were then shown in a volcano plot.
Establishment of Weighted Co-Expression Network
The selected DEGs were used to construct a weighted gene co-expression network by using the “WGCNA” package in R. First, we used “hclust” in the “WGCNA” package to cluster the samples and removed the outliers. Next, we set the soft-threshold as β = 4 (scale-free R2 = 0.94). Subsequently, we transformed the adjacency matrix into a topological overlap matrix (TOM). The TOM matrix is a tool for to quantitatively describing the similarity in nodes by comparing the weighted correlation between 2 nodes and other nodes. 25 Finally, we performed hierarchical clustering to identify significant modules, each containing at least 30 genes (minModuleSize = 30). We hierarchically clustered the modules and merged ones (mergeCutHeight = 0.25).
Identification of Key Module and Annotation of Module Function
The co-expression module is a set of genes with high topological overlap similarity. To identify the key module related to HNSCC in this study, we calculated the correlations between the modules and the clinical data. Then, we uploaded the list of all genes in the most significant module to the Metascape website (https://metascape.org) for functional annotation analysis. P < 0.01 was used as the threshold to identify enriched GO terms.
Identification of Hub Genes Related to Survival in Key Module
After functional annotation analysis, the hit genes in the top 20 most enriched ontology clusters were selected as candidate hub genes. Next, we conducted survival analysis with the HNSCC data from TCGA to verify the candidate hub genes. R package “Survival” was used for survival analysis.
Hub Genes and Analysis of Infiltrating Immune Cells
Tumor Immune Estimation Resource (TIMER; https://cistrome.shinyapps.io/timer/) is an online tool to comprehensively analyze the immune cells infiltrating into tumors based on TCGA. 26 It was applied here to explore the correlation among the infiltration of 6 kinds of immune cells (B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells) and expression levels of hub genes. Pearson’s correlation coefficients between the infiltration scores and expression levels of hub genes were calculated.
Validation of Hub Genes for HNSCC Prognosis
The GSE41613 dataset was used to confirm the validity of the hub genes in the key module related to HNSCC. By performing survival analysis with the clinical data and gene expression, the hub genes consistently found to be associated with overall survival (OS) were defined as the candidate real hub genes reflecting the prognosis of HNSCC. The R package “ggsurvplot” was used for survival analysis.
Methylation Analysis of Hub Genes and HNSCC Prognosis
An analysis of the association between survival and methylation of the candidate real hub genes was performed by using the MethSurv database (https://biit.cs.ut.ee/methsurv/), 27 a web tool for survival analysis based on the patterns of CpG methylation utilizing the methylome data from TCGA. All methylation sites in these hub genes were employed to perform survival analysis.
Expression of HMMR and CDC25C in HNSCC
The UALCAN (https://ualcan.path.uab.edu/index.html) is an online tool facilitating tumor subgroup gene expression analyses. 28 We applied UALCAN to perform mRNA expression analysis of HMMR and CDC25C, which not only provides mRNA expression based on sample types, but also shows the mRNA expression in different tumor grades.
To further validate the expression of these 2 genes in tumor cells, we conducted immunofluorescence staining experiment. Briefly, the tissue microarray slide contains 48 HNSCC samples and 5 normal samples (HOraC060PG01, Shanghai Outdo Biotech Co., Ltd.) was placed in xylene and gradient ethanol. Then, antigen retrieval was carried out in Citric acid buffer followed by washing with PBS for 5 min. After blocking with 3% BSA for 30 min at room temperature, the primary antibody against HMMR (1:200, Proteintech) and CDC25C (1:400, Abcam) in 1% BSA (dissolved in PBST) were incubated at 4°C overnight. On the next day after washing, the slide was incubated with biotinylated secondary antibody (Flare520, green and Flare570, red) for 30 min at room temperature. Finally, the slide was counterstained with DAPI and mounted. Images were captured and analyzed with SlideViewer software (3DHISTECH Ltd.). For the quantification of the expression level, AIpathwell software (Servicebio, Wuhan, China) was applied and the protein level was defined as Positive Area Density = Positive Mean Optical Density/Positive Area.
Statistical Analysis
DEGs were identified by using the “DESeq2” R package, |Log2FC| > 1 and Padj <0.05 were set as the cut-offs to screen DEGs. A Kaplan-Meier curve for overall survival was presented for each gene using the R package “ggsurvplot”. The follow-up cutoff was set at the median quartile, and a A P-value of <0.05 was considered statistically significant. For the immunofluorescence staining experiment, an independent samples t test was used to statistically analyze the expression of HMMR and CDC25C proteins based on positive area density, a A P-value of <0.05 was considered statistically significant.
Results
Screening of Differentially Expressed Genes
Through comparing the gene expression between HNSCC patients and the healthy control group in TCGA dataset using the “DESeq2” R package, a total of 10,274 DEGs (6367 upregulated and 3907 downregulated) were screened under the thresholds of FDR <0.05 and |Log2FC| > 1. The heatmap of DEGs is shown in Figure 2A and all DEGs are listed in Table S1. Identification of the differentially expressed genes (DEGs) and modules between HNSCC patients and healthy controls in TCGA dataset. (A) Volcano plot of the DEGs between HNSCC patients and healthy controls. FDR <0.05, and |Log2FC| > 1. (B) Cluster dendrogram of all DEGs based on the 1-TOM matrix. (C) Heatmap of the correlation between the module and clinical traits of HNSCC.
Establishment of Weighted Co-Expression Network
The R package “WGCNA” was used to construct the weighted co-expression network. First, by performing sample clustering and Pearson’s correlation analysis, 3 outlier samples were removed (Figure S1). Next, the 10,274 DEGs found as described above were classified into modules by cluster analysis (Figure 2B). As shown in Figure S2A-C, the power of β = 4 (scale-free R2 = 0.94) was selected as soft-thresholding to ensure a scale-free network. Finally, we obtained 31 modules associated with HNSCC clinical traits (Figure 2C). The corresponding relationships between genes and modules are shown in Table S2, and the correlations between the module eigengenes and the gene expression profiles are shown in Table S3. The yellow module (including 622 genes) was found to be highly related to the pathological grade (R2 = 0.33, P = 2e-14), so it was selected as a clinically valuable module related to HNSCC and used for further analysis.
Functional Annotation of Genes in Yellow Module
To further understand the functional roles of genes in the identified module, we performed enrichment analysis of the biological significance of the genes that it contained. We uploaded the list of all genes in the yellow module to the Metascape website (https://metascape.org) for functional annotation. As shown in Figure 3A and Table S4, the enrichment analysis revealed that the yellow module was particularly enriched in the biological processes of cell cycle, nuclear division, DNA replication, regulation of cell cycle process, mitotic cell cycle process, and DNA repair. We then selected a subset of representative enriched terms and converted them into a network layout. Terms with a similarity score >0.3 were linked by an edge (with the thickness of the edge representing the similarity score). The network was visualized with Cytoscape (v3.1.2) with “force-directed” layout and with edges bundled for clarity (Figure 3B). Functional enrichment analysis. (A) Enriched ontology clusters of genes in the yellow module. (B) The network of enriched ontology clusters colored by cluster.
Identification of Hub Genes Related to Survival in the Key Module
To identify the hub genes related to HNSCC in the key module, 198 hit genes in the top 20 most enriched ontology clusters were selected as candidate hub genes. To further screen for significant hub genes, we performed survival analysis of these 198 hub genes with the HNSCC data from TCGA. Kaplan-Meier analysis was performed to estimate the survival curves and the log-rank test was used to compare the curves. We finally screened 10 genes (HMMR, CENPK, AURKA, CDC25C, FEN1, CKS1B, MAJIN, PCLAF, SPC25, and STAG3) that were significantly related to the progression-free survival (PFS) of HNSCC, which can be used to predict the prognosis of HNSCC (Figure 4). Progression-free survival (PFS) analysis of 10 hub genes in head and neck squamous cell carcinoma (HNSCC). (A-J) Survival analysis of HMMR, CENPK, AURKA, CDC25C, FEN1, CKS1B, MAJIN, PCLAF, SPC25, and STAG3 hub genes in the yellow module. Kaplan-Meier curves showing that patients with different expression levels of the 10 hub genes had different progression-free survival. Data are shown based on the HNSCC samples from TCGA database.
Correlation Between Hub Genes and Infiltration of Immune Cells
Immune cells play significant roles in the tumor microenvironment
29
and their infiltration is related to the clinical outcomes of cancer.
30
By applying the TIMER online resource, we found that 8 of the 10 hub genes (excluding MAJIN and STAG3) were positively correlated with tumor purity (Figure 5 and Figure S3), which was in turn correlated with immune cell infiltration. Although STAG3 showed a less significant correlation with tumor purity (P = 2.44e−01; Figure 5C), all 6 immune cells (B cells, CD8+ cells, CD4+ cells, macrophages, neutrophils, and dendritic cells) showed positive correlations of their infiltration levels with STAG3 (P < 1.00e−10). The correlations between 5 hub genes (HMMR, CENPK, STAG3, CDC25C, and KIAA0101/PCLAF) and the infiltration of B cells, CD8+ cells, CD4+ cells, macrophages, neutrophils, and dendritic immune cells.
Validation of Hub Genes for HNSCC Prognosis
To confirm the validity of the significant associations with HNSCC of hub genes in the yellow module, we used another independent dataset (GSE41613) to perform survival analysis. R package “ggsurvplot” was used to analyze overall survival (OS). Meanwhile, Kaplan-Meier analysis was performed to estimate the survival curves and the log-rank test was used to compare the curves. A P-value of <0.05 was considered to indicate statistical significance. As shown in Figure 6, for 8 of the 10 hub genes (excluding MAJIN and STAG3), their high expression was significantly associated with overall survival of HNSCC. Validation of the hub genes by survival analysis in head and neck squamous cell carcinoma (HNSCC). (A-H) Survival analysis of hub genes (HMMR, CENPK, AURKA, CDC25C, FEN1, CSK1B, PCLAF, and SPC25) based on the HNSCC samples from GSE41613. Kaplan-Meier analysis was performed to estimate the overall survival (OS) curves and log-rank test was used to compare the curves. A P-value of <0.05 was considered to indicate statistical significance.
Relationship of Hub Gene Methylation with HNSCC Prognosis
To further demonstrate the roles of the hub genes in the prognosis of HNSCC, we analyzed the methylation levels of the loci in hub genes by using the MethSurv tool. The results showed that hypermethylation of cg15122828 and cg20554926 in HMMR (P = 0.034 and P = 0.041, respectively), hypermethylation of cg12519992 in CDC25C (P = 0.008), and hypomethylation of cg26155739 in KIAA0101/PCLAF (P = 0.048) were significantly associated with the probability of survival of HNSCC patients (Figure 7). No significant loci of the other hub genes were found to be significantly correlated with HNSCC survival. Survival plot for the CpGs of hub genes in head and neck squamous cell carcinoma (HNSCC). (A, B) Survival analysis of (A) cg15122828 and (B) cg20554926 CpG sites in HMMR. (C) Survival analysis of cg12519992 CpG site in CDC25C. (D) Survival analysis of cg26155739 CpG site in KIAA0101/PCLAF.
Demonstration of HMMR and CDC25C Expression in HNSCC
Finally, to further demonstrate the mRNA and protein expression of HMMR and CDC25C in HNSCC, we performed UALCAN analysis and immunofluorescence staining. Based on the UALCAN analysis, as shown in Figure 8(A) and (C), HMMR and CDC25 C mRNA expression were significantly higher in tumor tissues compared with the normal control (P = 1.6 × 10−12 and P < 1 × 10−12, respectively), and they were also showed significance in different tumor grades (Figure 8(B) and (D)). Based on the immunofluorescence staining experiment, as shown in Figure 8E, the fluorescence signal intensity of the HMMR and CDC25C in tumor cells is significantly higher than that in normal tissue, indicating a higher expression of these genes in tumor cells. The protein expression of HMMR (Figure 8F) and CDC25C (Figure 8G) were remarkedly upregulated in HNSCC patients compared with the normal control (P < 0.05 and P < 0.01, respectively). These results demonstrated that HMMR and CDC25C were the real hub genes, which were highly expressed in HNSCC, and can be selected as the candidate biomarkers for HNSCC prognosis. Validation of HMMR and CDC25C expression in head and neck squamous cell carcinoma (HNSCC). (A-D) The expression of HMMR and CDC25C in HNSCC based on sample types and tumor grade from UCLCAN. Tumor grade definition: Grade 1indicates well differentiated (low grade), Grade 2 indicates moderately differentiated (intermediate grade), Grade 3 indicates poorly differentiated (high grade), and Grade 4 indicates undifferentiated (high grade). 
Discussion
In this study, by comparing the gene expression between HNSCC patients and healthy controls, we identified 10,274 DEGs. These genes were further subjected to WGCNA, and we found 31 modules associated with the histological grade of HNSCC. Among these identified modules, the yellow module (including 622 genes) was the most significant one. The genes within this module were particularly associated with the cell cycle signaling pathway. Ten hub genes (HMMR, CENPK, AURKA, CDC25C, FEN1, CKS1B, MAJIN, PCLAF, SPC25, and STAG3) were identified to be correlated with HNSCC patient survival, 8 of which (excluding MAJIN and STAG3)were correlated with immune cell infiltration. These 8 genes were also confirmed to be related to HNSCC patient’ survival using another dataset (GSE41613). Through methylation analysis, the methylation of 3 candidate hub genes (HMMR, CDC25C, and PCLAF) were found to be significantly associated with HNSCC survival. Finally, by performing public data and experimental validation, we demonstrated HMMR and CDC25C were highly expressed in HNSCC patients. In summary, we identified 2 real hub genes that can potentially be used as biomarkers of the prognosis of HNSCC.
The co-expression network analysis has proven to be a useful method for identifying modules of highly correlated genes related to diseases. It has been increasingly used for discovering candidate biomarkers or therapeutic targets of diseases.13,14 WGCNA is a systems biology approach used to explore patterns of gene co-expression in gene expression data. It constructs a co-expression network by calculating correlations between genes, typically using Pearson correlation coefficients. These correlations form an undirected network of connections between genes. Within the co-expression network, WGCNA uses unsupervised clustering algorithms (such as hierarchical clustering or dynamic tree cutting) to group genes with high correlations into modules. Each module consists of a set of highly co-expressed genes, which often share biological functions such as involvement in common metabolic pathways or cellular processes. Once genes are organized into modules, WGCNA can further analyze the correlation between each module and external traits. Through WGCNA of HNSCC, the genes in the identified module were shown to be particularly associated with immune response-related pathways or the immune microenvironment, and most of them can be used for predicting the prognosis of HNSCC.23,31-33 By applying this tool in the present study, the genes in the most significant module that we identified were shown to be particularly associated with the cell cycle signaling pathway, and the hub genes were also correlated with patient survival and immune cell infiltration.
The aberrant cell proliferation of tumors is usually accompanied by dysregulation of the cell cycle. 34 The cell cycle is an important checkpoint in various cancers, which further contributes to cancer development. 35 Genes in the cell cycle pathway, such as cyclin-dependent kinase inhibitor 2A (CDKN2A), TP53, and Cyclin D1 (CCND1), have been reported to participate in HNSCC. 36 In this study, we found a group of genes in the yellow module enriched in the “cell cycle” pathway, which was the top ranked ontology cluster related to HNSCC. Among these genes in the cell cycle pathway, 10 genes were significantly associated with HNSCC survival, and 8 of these 10 were also confirmed to be associated with survival using another dataset from GEO (GSE41613). In the HNSCC dataset from TCGA, 67.3% of patients had oral squamous cell carcinoma (including of the tongue, cheek mucosa, floor of the mouth, gums, etc.), and 25% had laryngeal squamous cell cancer (including of the larynx, hypopharynx and pharynx). The validation dataset GSE41613 was on patients with oral squamous cell carcinoma. This suggests that the findings of the present study mainly apply to in oral squamous cell carcinoma. Moreover, considering that 85.7% of the patients in the HNSCC dataset were White Americans, these hub genes may mainly apply to the Caucasian population.
In recent years, hyaluronan mediated motility receptor (HMMR), also known as CD168 and RHAMM, was reported mediating cell motility, 37 which functions as a regulator of homeostasis, mitosis, and meiosis. 38 HMMR has also been reported to be expressed in various cancers and to be associated with cancer risk, prognosis, and progression, 38 for example, in lung cancer, 39 malignant pleural mesothelioma, 40 hepatocellular carcinoma, 41 bladder cancer, 42 pancreatic cancer, 43 and colorectal cancer. 44 In HNSCC, HMMR mRNA and protein were also found to be highly expressed in tumor tissues, highlighting HMMR as a potential biomarker for a poor prognosis of HNSCC. 45 Consistent with this, we also found HMMR, as 1 of the DEGs in HNSCC located in the yellow module, to be significantly associated with HNSCC clinical traits through WGCNA. Moreover, we found that the expression of HMMR was related to the infiltration of immune cells, such as CD4+ T cells, macrophages, neutrophils, and dendritic cells. It has been reported that HMMR might induce macrophage-related immune response by activating M2 subsets. 46 M2 macrophages were regarded as “renegade” immune cells which contributed to poor prognosis in liver hepatocellular carcinoma and promote cancer invasiveness. 47 Additionally, higher expression of HMMR was found to be related to poor survival, as was HMMR hypomethylation (cg15122828 and cg20554926). This suggests that the hypomethylation-induced elevation of HMMR expression may provide a clue for the poor prognosis of HNSCC.
CDC25C (Cell division cycle 25C) is 1 of the members of the CDC25 family, and the proteins in this family are regulators of cell cycle progression. 48 CDC25C regulates the whole cell cycle through a CDK1‐Cyclin B‐CDC25C feedback loop driving cells to enter mitosis. 49 CDC25C was also reported as potential prognostic marker for hepatocellular carcinoma, 50 lung cancer, 51 colon cancer, 51 and breast cancer,52,53 with its aberrant expression being associated with carcinogenesis. Notably, accumulating evidence suggested that inhibition of CDKs not only contributes to cell cycle arrest but also triggers anti-tumor immunity54,55 and enhances the efficiency of anti-PD-1 therapy, 56 thus CDC25C might also have potential immunoregulatory effect via interaction with CDKs. In our study, CDC25C was significantly associated with the infiltration levels of multiple immune cells, these findings indicated that CDC25C promotes an immunosuppressive TME, which could impair the efficacy of immunotherapy.
KIAA0101/PCLAF (Proliferating cell nuclear antigen binding factor) was first reported in 2001 upon its discovery via a yeast two-hybrid experiment. KIAA0101 regulates DNA synthesis and repair during DNA replication,57,58 suggesting that it acts as a processivity factor regulating PCNA. 59 KIAA0101 is overexpressed in tumor tissues and involved in cell cycle regulation and cell cycle proliferation. 60 It has been identified as a prognostic biomarker in lung cancer,61,62 kidney cancer, 63 and hepatocellular carcinoma.64,65 To the best of our knowledge, this is the first study reported dysfunction of CDC25C and PCLAF in HNSCC correlated with patients’ survival and immune cell infiltration.
Gene expression can be regulated by DNA methylation. Numerous studies have identified many aberrant genes and methylation loci as prognostic biomarkers for HNSCC. 66 Recently, Ma et al. 67 identified a radiotherapy-related methylation signature with 4 genes through WGCNA, which can be used to predict the survival of HNSCC. In addition, Cao et al. 23 identified 6 hub genes with the potential to act as biomarkers for the diagnosis of HNSCC by WGCNA, of which 4 genes showed dysregulated methylation. In our study, of the 10 hub genes, only 4 methylation loci in 3 genes (HMMR, CDC25C, and PCLAF) were found to have significant associations with the probability of survival of HNSCC patients. These were designated as real hub genes and as potential biomarkers for HNSCC prognosis.
In this study, we not only validated a previously identified HNSCC-related gene, HMMR, as indicated by Cao et al, 23 but also identified another novel HNSCC-related hub gene CDC25C. These 2 genes were particularly associated with the biological processes of the cell cycle, nuclear division, DNA replication, regulation of cell cycle process, mitotic cell cycle process, and DNA repair, which are in turn closely related to the progression, treatment and prognosis of cancer. Moreover, we also found that HMMR and CDC25C are related to immune cell infiltration, and revealed that methylation sites regulating these hub genes are significantly related to HNSCC survival at the molecular level. Most importantly, we validated the high expression of HMMR and CDC25C in HNSCC patients by immunofluorescence staining experiment. To further check if the expression of these 2 genes is associated with the malignant grade in this HNSCC cohort from TCGA, we compared the expression level of HMMR and CDC25C across different histological grades of HNSCC and found that the overall differences between the groups were significant (P = 0.03 and P = 5.13e-12, respectively; Figure S4). We also found the high expression of HMMR and CDC25C was significantly associated with overall survival of this HNSCC cohort from TCGA (Figure S5), as well as validated in GSE41613 (Figure 6). To the best of our knowledge, this is the first time in HNSCC that multiple methods have been comprehensively applied to identify hub genes that can predict HNSCC survival.
Although our study offers valuable insights, there are several limitations that should be acknowledged. One limitation of our study is the relatively small sample size. Although our power analysis indicated that the sample size was sufficient to detect a significant effect, a larger sample size could provide more robust and generalizable results. Second, given that 85.7% of the patients in the HNSCC dataset were White Americans, this may limit the generalizability of our findings to other populations or settings. Future research should include larger sample sizes and be conducted across multiple centers. Expanding the study to include a more diverse population in terms of demographics, geography, and clinical characteristics will help to understand the broader applicability of our findings and identify any population-specific effects. Despite these limitations, our study offers important contributions to the understanding of prognostic molecular markers in HNSCC and provides a foundation for future research in this area.
Conclusion
In this study, through a co-expression network created by WGCNA and subsequent comprehensive analysis, we identified 2 real hub genes (HMMR and CDC25C) related to cell cycle processes correlating with HNSCC survival, of which CDC25C is a novel potential biomarker for the prognosis and treatment of HNSCC.
Supplemental Material
Supplemental Material - Comprehensive Analysis Identifies HMMR and CDC25C as Potential Prognostic Biomarkers in Head and Neck Squamous Cell Carcinoma
Supplemental Material for Comprehensive Analysis Identifies HMMR and CDC25C as Potential Prognostic Biomarkers in Head and Neck Squamous Cell Carcinoma by Hongrui Zhang, Yi Xu, Haijun Han, Xiongwei Ye, Lu Cheng, Yueshuang Shen, and Xiaochen Wan in Cancer Control
Footnotes
Acknowledgments
None.
Author Contributions
Hongrui Zhang and Yi Xu participated in data collection and analysis. Haijun Han, Xiongwei Ye, Lu Cheng and Yueshuang Shen participated in data collection. Hongrui Zhang, Yi Xu participated in writing and editing of the paper. Xiaochen Wan conceived of the study. All authors approved the final manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by Zhejiang Medical and Health Science and Technology Plan (Grant No. 2024KY603) to Hongrui Zhang.
Ethical Statement
Data Availability Statement
The raw sequencing data supporting the conclusions of this paper were obtained from TCGA data portal (https://portal.gdc.cancer.gov/) and the GEO database (
). All data generated or analyzed during this study are included in this article and its supplementary information files.
Supplemental Material
Supplemental material for this article is available online.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
