Abstract
Objectives
This study aims to investigate the significance of tumor microenvironment (TME)-related genes and signal transduction pathways in head and neck cancer (HNC).
Methods
Gene expression and clinical data of HNC patients were obtained from the Cancer Genome Atlas (TCGA) database. Differentially expressed genes (DEGs) were screened through a multi-step filtration approach to obtain candidate predictors. The biological role of COL5A1 in HNC was verified through rigorous bioinformatic analysis, experimental validation using quantitative real-time PCR (qRT-PCR), immunohistochemical (IHC) analysis from HNC samples, and IHC data from the Human Protein Atlas (HPA) database.
Results
COL5A1 was significantly upregulated in HNC tissues and cell lines. High COL5A1 expression was significantly associated with advanced tumor grade (P < .05) and shorter survival (TCGA: P < .001; GSE42743: P = .004). COL5A1 was an independent prognostic indicator (univariate analysis: HR = 1.324, P = .001; multivariate analysis: HR = 1.326, P = .005). It was enriched in pathways related to tumor invasion and immune responses, and its expression was associated with decreased levels of CD8+ T cells and increased levels of macrophages and neutrophils. Spatial distribution analysis revealed higher expression at the tumor's leading edge (vs. tumor core: P < .001). COL5A1 expression was also associated with tumor stage, with more pronounced expression in advanced-stage tumors.
Conclusion
COL5A1 may represent a novel prognostic indicator and potential therapeutic target in HNC, as its expression in public database cohorts is closely linked to tumor progression, immune cell infiltration, and adverse clinical outcomes. These findings, primarily derived from squamous cell carcinoma-dominated cohorts, warrant further functional validation.
Introduction
Globally, head and neck cancer (HNC) ranks tenth in terms of incidence and seventh in terms of cancer-related mortality. 1 Tobacco use, alcohol consumption, and HPV infection are the primary risk factors for HNC. 2 Despite significant advancements in clinical diagnosis and treatment, HNC continues to be a major worldwide health concern. 3 HNC is routinely treated with surgical resection, chemotherapy, and radiotherapy. These treatments still have certain limitations in improving the survival rate of patients.4,5 A deeper understanding of HNC pathogenesis will help in the design of more effective treatment strategies.
The extracellular matrix (ECM) and stromal cells surround the tumor cells, collectively forming the tumor microenvironment (TME), which is critical for tumorigenesis. 6 TME refers to the complex network of immune cells, cytokines, chemokines, and other immune-related components present within the tumor tissue. The TME plays a central role in modulating tumor progression, immune evasion, and response to therapy, making it a key focus for therapeutic development. Other components of the TME include fibroblasts, myofibroblasts, neuroendocrine cells, adipocytes, inflammatory cells, and vascular and lymphatic networks. 7 Various molecules produced in the TME, such as chemokines, growth factors, and matrix-degrading enzymes, promote tumor cell proliferation and invasion. 8 Tumor growth promotes immune tolerance by suppressing immune activity, mediated by these signaling molecules and cellular components.9–11
In the TME of HNC, immunogenicity is repressed. 12 Studies have shown a significant relationship between tumor-infiltrating immune cells (TICs) in the TME and the prognosis of HNC patients. 1 An elevated degree of T-cell or B-cell infiltration is linked to a greater chance of cancer survival. 13 Eosinophils are involved in angiogenesis and metastasis, so a higher level of eosinophils is associated with a worse prognosis.14,15 Some tumor-infiltrating lymphocytes (TILs) have been shown to be independent prognostic factors in head and neck squamous cell carcinoma (HNSCC). 16 The infiltration of immune cells in the TME is a prognostic indicator in HNC, suggesting that future therapies may target TME remodeling. Therefore, identifying key TME-related drivers in HNC remains critical for novel therapeutic development.
Given the frequent aggregation of HNCs from various anatomical subsites (e.g., oral cavity, larynx) into a broad HNC classification in large public databases like TCGA and GEO, our analysis adopted a pan-HNC approach. This strategy was employed to ensure a sufficient sample size for robust statistical power in bioinformatic analyses and to mitigate the risks of sample exclusion due to inconsistent or incomplete pathological subtyping. Furthermore, it facilitates the investigation of common oncogenic pathways and TME features that may transcend anatomical boundaries, an approach consistent with established practices in translational oncology research. 17
Materials and methods
Raw data and experimental materials
The clinical information and gene expression profiles of 1114 HNC patients (102 patients in the normal group and 1012 patients in the tumor group) were retrieved from the TCGA database (https://portal.gdc.cancer.gov/). Another HNC cohort (GSE42743 18 ) of 103 patients with expression profiles and clinical information was obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). All datasets were subjected to a rigorous harmonization procedure in subsequent processing, including log2 transformation, quantile normalization, and ComBat batch correction using the “sva” R package (version 3.5.3).
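The first two harmonization steps named above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study, which relied on R packages: the pseudocount of 1, the toy expression matrix, and the tie-breaking rule are assumptions made for illustration, and ComBat batch correction is deliberately left out.

```python
import math

def log2_transform(matrix, pseudocount=1.0):
    """Log2-transform a genes x samples matrix (the pseudocount is an assumed offset)."""
    return [[math.log2(value + pseudocount) for value in row] for row in matrix]

def quantile_normalize(matrix):
    """Give every sample (column) the same empirical distribution.

    Ties are broken by order of appearance, which is a simplification.
    """
    n_genes, n_samples = len(matrix), len(matrix[0])
    columns = [[matrix[g][s] for g in range(n_genes)] for s in range(n_samples)]
    sorted_columns = [sorted(column) for column in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [sum(column[k] for column in sorted_columns) / n_samples
                 for k in range(n_genes)]
    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for s, column in enumerate(columns):
        order = sorted(range(n_genes), key=lambda g: column[g])  # stable sort
        for rank, g in enumerate(order):
            normalized[g][s] = reference[rank]
    return normalized

# Hypothetical 4-gene x 3-sample expression matrix (not study data).
counts = [[5.0, 4.0, 3.0],
          [2.0, 1.0, 4.0],
          [3.0, 4.0, 6.0],
          [4.0, 2.0, 8.0]]
normalized = quantile_normalize(log2_transform(counts))
```

After this step every sample column contains the same set of values, so between-sample differences in overall intensity are removed while the within-sample ranking of genes is kept.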
Evaluation of the ImmuneScore, StromalScore, and ESTIMATEScore
Data were initially processed using the ESTIMATE package in R (version 3.5.3) to estimate the relative levels of immune and stromal components within the TME of each sample. The resulting scores (ImmuneScore, StromalScore, and ESTIMATEScore) reflect the relative abundance of immune cells, stromal cells, and their combined presence in the TME, respectively, and are positively correlated with the abundance of these components. The ESTIMATEScore is a scoring system based on gene expression profiles that integrates immune cell infiltration and stromal components to provide a comprehensive assessment of the TME.
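As a rough illustration of how such scores can be combined and dichotomized, the Python sketch below sums an immune and a stromal score per sample and splits samples at the median. The additive combination reflects our understanding of how the ESTIMATE package derives the ESTIMATEScore and should be treated as an assumption here; the sample IDs and score values are hypothetical, and assigning samples that sit exactly on the median to the low group is an arbitrary choice.

```python
from statistics import median

def estimate_score(immune_score, stromal_score):
    """Combined score, assumed here to be the sum of the two component scores."""
    return immune_score + stromal_score

def median_split(scores):
    """Label each sample 'high' or 'low' relative to the cohort median."""
    cutoff = median(scores.values())
    return {sample: ("high" if value > cutoff else "low")
            for sample, value in scores.items()}

# Hypothetical per-sample scores (not study data).
immune = {"S1": 500.0, "S2": -200.0, "S3": 1200.0, "S4": 50.0}
stromal = {"S1": 100.0, "S2": -300.0, "S3": 400.0, "S4": -150.0}

combined = {sample: estimate_score(immune[sample], stromal[sample]) for sample in immune}
groups = median_split(combined)
```

The resulting high and low groups are the kind of strata that can then be compared in survival or differential expression analyses.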
Survival analysis
Survival analysis was performed using the survival and survminer packages in R (version 3.5.3). Kaplan–Meier curves were generated, and differences between groups were evaluated using the log-rank test, with statistical significance defined as P < .05.
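For readers unfamiliar with these methods, the following self-contained Python sketch shows the Kaplan–Meier product-limit estimator and a two-group log-rank test. It is an illustration only and does not reproduce the R survival and survminer workflow used here; the follow-up times and event indicators are invented, and the P-value relies on the chi-square approximation with one degree of freedom.

```python
import math

def kaplan_meier(times, events):
    """Return [(event time, survival probability)] for one group.

    events[i] is 1 if the event was observed at times[i] and 0 if censored.
    """
    curve, survival = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value)."""
    times = list(times_a) + list(times_b)
    events = list(events_a) + list(events_b)
    observed_a = expected_a = variance = 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        n_a = sum(1 for u in times_a if u >= t)   # at risk in group A
        n = sum(1 for u in times if u >= t)       # at risk overall
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of the event count in group A
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return chi2, p_value

# Hypothetical follow-up in months for a high- and a low-expression group.
high_times, high_events = [5, 8, 12, 20, 24], [1, 1, 1, 0, 1]
low_times, low_events = [10, 22, 30, 36, 40], [1, 0, 1, 0, 0]

high_curve = kaplan_meier(high_times, high_events)
chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
```

With real cohorts, the same comparison would be run on the groups defined by the chosen cutoff, and a P-value below .05 would be reported as significant under the threshold stated above.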
Bioinformatic and spatial analyses
Tumor samples (n = 1012) were stratified into high/low ImmuneScore and StromalScore groups using median cutoffs. Differentially expressed genes (DEGs) were identified with the LIMMA R package (version 3.5.3) using a threshold of |log₂(fold change)| > 1 and an adjusted P-value (FDR) < .05. Functional enrichment of overlapping DEGs was performed via GO 19 and KEGG 20 analyses using the clusterProfiler R package (version 3.5.3) with a significance cutoff of P < .05 and q-value (FDR) < .05. A protein–protein interaction (PPI) 21 network was constructed using the STRING database (version 11.5; https://string-db.org) with a minimum interaction confidence score >0.95 and analyzed in Cytoscape software (version 3.9.1; https://cytoscape.org). Gene Set Enrichment Analysis (GSEA) 22 was performed using the GSEA software (version 4.3.2; from the Broad Institute, http://www.gsea-msigdb.org/gsea/) with the Hallmark gene sets (v7.2). Significantly enriched gene sets were defined by a normalized enrichment score (NES) and a nominal P-value (NOM P-value) < .05 with an FDR q-value < .06. Immune cell fractions were profiled using the CIBERSORT algorithm (https://cibersort.stanford.edu/) to estimate 22 immune cell types. Only samples with a deconvolution P-value < .05 were included in subsequent analyses. For spatial transcriptomics, the oral squamous cell carcinoma (OSCC) dataset GSE208253 23 (10x Visium, 55-μm resolution) was preprocessed using the Seurat R package (version 3.5.3). Spots were retained if they had >500 genes detected and a mitochondrial gene content <20%. Data were normalized using SCTransform. Pathological annotations (tumor core, leading edge, transition region, normal) from the original study were integrated for region-specific analysis. Statistical significance was defined as P < .05 throughout, with specific tests indicated in figure legends.
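The DEG thresholds and the spot-level quality filter described above can be expressed compactly. The Python sketch below assumes that per-gene log₂ fold changes and raw P-values are already available (it does not reproduce the moderated statistics of LIMMA), applies a Benjamini–Hochberg adjustment as one common way of obtaining FDR values, and uses placeholder gene names and invented numbers.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def call_degs(genes, log2_fc, p_values, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Split genes into up- and downregulated DEGs using |log2FC| > 1 and FDR < .05."""
    fdr = benjamini_hochberg(p_values)
    up = [g for g, lfc, q in zip(genes, log2_fc, fdr)
          if lfc > lfc_cutoff and q < fdr_cutoff]
    down = [g for g, lfc, q in zip(genes, log2_fc, fdr)
            if lfc < -lfc_cutoff and q < fdr_cutoff]
    return up, down

def keep_spot(n_genes_detected, mito_fraction, min_genes=500, max_mito=0.20):
    """Spot-level filter: more than 500 detected genes and under 20% mitochondrial content."""
    return n_genes_detected > min_genes and mito_fraction < max_mito

# Hypothetical differential expression results (not study data).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [2.3, -1.8, 0.4, 1.5]
p_values = [0.001, 0.004, 0.0005, 0.30]
up, down = call_degs(genes, log2_fc, p_values)
```

In this toy input, GENE_C has a small P-value but fails the fold-change cutoff, and GENE_D passes the fold-change cutoff but fails the FDR cutoff, which shows why both criteria are applied jointly.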
Cell culture
The human HNC cell lines TU212 (RRID:CVCL_4915) and TU686 (RRID:CVCL_4916) (Xiangya Central Experiment Laboratory, Hunan, China), FaDu (RRID:CVCL_1218), CAL-27 (RRID:CVCL_1107) (Sunncell Biotech Co., Ltd., Wuhan, China), and human oral keratinocytes (HOK) (RRID:CVCL_YE19) (Bioleaf Biotech Co., Ltd., Shanghai, China) were cultured in RPMI-1640 (Gibco-BRL, Gaithersburg, MD, United States) supplemented with 10% heat-inactivated fetal bovine serum (Hyclone Laboratories Inc., Logan, UT, United States), 100 U/mL penicillin, and 100 U/mL streptomycin at 37 °C in a 5% CO2 atmosphere.
RNA extraction and quantitative real-time PCR
Total RNA was extracted using TRIzol reagent, and complementary DNA was synthesized using PrimeScript™ RT Master Mix according to the manufacturer's protocol. Quantitative real-time PCR (qRT-PCR) was performed with a SYBR Green PCR kit, using GAPDH as the internal control. Relative gene expression was calculated using the 2^−ΔΔCT method. All qRT-PCR reactions were performed in technical triplicate, and the experiment was independently repeated three times (n = 3). The primer sequences were as follows: COL5A1 forward, 5′-TACCCTGCGTCTGCATTTCC-3′; COL5A1 reverse, 5′-GCTCGTTGTAGATGGAGACCA-3′; GAPDH forward, 5′-ACAACTTTGGTATCGTGGAAGG-3′; GAPDH reverse, 5′-GCCATCACGCCACAGTTTC-3′.
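The relative quantification step can be written out as a short calculation. The Python sketch below is illustrative only: the Ct values are invented, technical replicates are simply averaged, and the formula assumes roughly 100% amplification efficiency for both the target and the reference gene.

```python
from statistics import mean

def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene by the 2^-ddCt method.

    Each argument is a list of technical-replicate Ct values.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)      # dCt, test cell line
    delta_control = mean(ct_target_control) - mean(ct_ref_control)   # dCt, calibrator
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)

# Hypothetical triplicate Ct values: target gene and GAPDH in a tumor cell line
# versus a normal keratinocyte calibrator.
fold_change = relative_expression(
    ct_target_sample=[24.0, 24.0, 24.0],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[27.0, 27.0, 27.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
```

Here the target reaches threshold three cycles earlier in the tumor line after normalization to GAPDH, which corresponds to an eightfold higher relative expression.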
Immunohistochemical analysis
Formalin-fixed, paraffin-embedded (FFPE) tissue specimens from 14 HNC cases (retrospectively collected between 2016 and 2018) were subjected to immunohistochemical (IHC) analysis. Written informed consent was obtained from all participants, and the use of these samples was approved by the Institutional Review Board (IRB) of the First Affiliated Hospital with Nanjing Medical University (Nanjing, China; Approval No.: 2023-SR-288; Date of Approval: May 31, 2023). This study was conducted in accordance with the ethical principles of the Helsinki Declaration of 1975, as revised in 2024. Following heat-induced epitope retrieval (HIER) in Tris-EDTA buffer (pH 9.0), sections were incubated with COL5A1 rabbit monoclonal antibody (Cell Signaling Technology, #86903, clone E6U9W, Lot #1; diluted 1:50 in Tris-EDTA buffer) at 4 °C overnight. Subsequent detection was performed with appropriate secondary antibodies and diaminobenzidine (DAB) chromogenic substrate. Hematoxylin counterstaining was applied, followed by dehydration, clearing, and mounting. Whole-slide digital scanning was conducted using a 3DHISTECH Pannoramic™ slide scanner. Intergroup comparisons across different tumor stages were established throughout the experiment to validate result specificity.
Verification of protein expression of hub genes and comparison with TICs
The Human Protein Atlas (HPA) is a database that accumulates a large amount of human protein data and information, providing a valuable resource for clinicians and researchers. 24 Protein expression patterns based on immunohistochemistry are widely used to detect the relative position and abundance of proteins. 25 By employing the HPA database for IHC analysis, we observed and compared the variations in protein expression of the target genes and the differences in TICs between HNC and normal tissues.
Results
DEGs in HNC
The CIBERSORT and ESTIMATE algorithms were used to analyze transcriptome RNA-seq data from 1114 HNC patients to assess TIC ratios and immune and stromal cell levels in the TME. A PPI network was constructed based on DEGs with consistent immune and stromal scores. COL5A1 was identified as the only gene among the top 30 hub genes that was a significant prognostic factor in the univariate Cox regression analysis. Its molecular mechanism, particularly its relationship with TICs in HNC progression, was further validated.
Correlations of TME immune and stromal cell levels with survival and clinicopathological stage in HNC
Kaplan–Meier analysis showed that higher ImmuneScore and ESTIMATEScore were associated with poorer survival (Figure 1(a) and (b)), whereas StromalScore showed no significant association with survival (Figure 1(c)). Further analysis of the TCGA data revealed positive correlations of ImmuneScore (Figure 1(d)), StromalScore (Figure 1(e)), and ESTIMATEScore (Figure 1(f)) with tumor stage, suggesting a link between TME components and HNC progression.

Figure 1. Correlation of scores with survival and tumor stage in head and neck cancer (HNC) patients. (a) Kaplan–Meier survival analysis of HNC patients stratified by high or low ESTIMATEScore relative to the median score (P = .002 by log-rank test). (b) Kaplan–Meier survival curve for the ImmuneScore (P = .002 by log-rank test). (c) Kaplan–Meier survival curve for the StromalScore (P = .427 by log-rank test). (d–f) Correlation of the ImmuneScore (d), StromalScore (e), and ESTIMATEScore (f) with tumor stage, illustrating the distribution of each score across different tumor stages.
Immune-related DEGs
Gene expression data were compared between high- and low-score groups. Based on the median ImmuneScore, 1584 DEGs were identified (1022 upregulated and 562 downregulated; Figure 2(a), (c) and (f)), while 817 DEGs were identified using the median StromalScore (779 upregulated and 38 downregulated; Figure 2(b), (c) and (f)). Cross-analysis revealed 551 upregulated and 30 downregulated overlapping DEGs. GO enrichment indicated that these DEGs were primarily associated with immune functions, including T-cell activation, immune response regulation, and lymphocyte activation (Figure 2(d)). KEGG analysis showed enrichment in pathways related to hematopoietic cell lineage, chemokine signaling, and cytokine–receptor interactions (Figure 2(e)), suggesting their involvement in HNC progression.

Figure 2. Heatmaps, Venn diagrams, and GO and KEGG enrichment analyses for differentially expressed genes (DEGs). (a) Heatmap of DEGs generated by comparing the high-score group vs. the low-score group based on the ImmuneScore. Rows represent genes and columns represent samples; sample IDs are not shown in the plot. DEGs were determined via the Wilcoxon rank sum test with q < .05 and a fold change > 1 after log2 transformation as the significance threshold. (b) Heatmap of DEGs based on the StromalScore, similar to (a). (c, f) Venn diagrams showing common upregulated and downregulated DEGs shared by the ImmuneScore and StromalScore, with q < .05 and a fold change > 1 after log2 transformation as the DEG significance filtering threshold. (d, e) GO and KEGG enrichment analyses of 581 DEGs; terms with P and q < .05 were considered significantly enriched.
COL5A1 as a prognostic gene for HNC
A PPI network of DEGs was constructed using Cytoscape, linking 581 genes (Figure 3(a)). The top 30 hub genes were identified and visualized in a bar chart (Figure 3(b)). Intersection analysis of the top 30 nodes with univariate Cox regression for 1035 HNC patients revealed that among the top 30 hub genes, only COL5A1 exhibited significant prognostic value in this analysis (Figure 3(c)).

Figure 3. Protein–protein interaction (PPI) network and univariate Cox analysis. (a) Interaction network constructed with nodes with interaction confidence values >0.95. (b) Bar graph showing the top 30 genes ordered by the number of nodes in the PPI network. (c) Venn diagram showing the common factors shared by the 30 nodes in the PPI network and the top 30 main factors in the univariate Cox analysis.
Correlation of COL5A1 expression with survival, tumor grade, and modulation of the tumor microenvironment in HNC
The correlation of COL5A1 expression with survival, tumor grade, and TME modulation in HNC was assessed by stratifying patients into low- and high-expression groups. COL5A1 expression was significantly higher in tumor tissues than in normal tissues (Figure 4(a) and (b)). Survival curves showed that high COL5A1 expression was significantly associated with poor overall survival in both the TCGA and GSE42743 cohorts (Figure 4(c)). Correlation analysis with clinical features showed that COL5A1 expression was positively correlated with tumor stage (Figure 4(d)). Moreover, univariate and multivariate Cox regression analyses adjusted for TNM stage, age, and gender identified COL5A1 as an independent prognostic factor. In the multivariate model, elevated COL5A1 expression remained significantly associated with poorer overall survival (HR = 1.326, 95% CI: 1.054–1.520; P = .005), indicating its prognostic value independent of TNM stage.

Figure 4. Differential expression of COL5A1 and its impact on survival, tumor stage, and enriched pathways in head and neck cancer (HNC). (a) Differential analysis of COL5A1 expression between normal and tumor samples (P < .001 by the Wilcoxon rank sum test). (b) Paired analysis of COL5A1 expression in matched normal and tumor samples (P = 1.0e-12 by the Wilcoxon rank sum test). (c) Kaplan–Meier survival analysis of HNC patients in both TCGA and GSE42743 cohorts stratified into high- and low-expression groups based on the median COL5A1 expression level (P < .005 by the Wilcoxon rank sum test). (d) Correlation between COL5A1 expression and tumor stage, evaluated using the Wilcoxon rank sum test. (e) Multivariate analysis and univariate analysis. Red and blue squares represent factors associated with poor prognosis. (f) Gene set enrichment analysis (GSEA) of highly expressed COL5A1, illustrating the enriched gene sets in the HALLMARK collection. Each line represents a unique gene set, with upregulated genes on the left and downregulated genes on the right. Only gene sets with NOM P < .05 and FDR q < .06 are considered significant, and representative leading gene sets are shown.
Correlation of COL5A1 with the ratios of different types of TICs
The relationship between COL5A1 and TME immune status was investigated using the CIBERSORT algorithm to profile 22 immune cell types in HNC samples (Figure S1). Correlation analysis (Figure S2) showed that COL5A1 expression was negatively correlated with memory B cells, naïve B cells, and CD8+ T cells, but positively correlated with M0 macrophages, neutrophils, and activated mast cells.
Spatial expression distribution of COL5A1
To characterize the spatial distribution of COL5A1 expression, we obtained a spatial transcriptomic dataset (GSE208253) from the GEO database. Figure S3(a) shows three samples subjected to H&E staining. On the basis of the pathologists' annotations from the original study, 23 we first generated a spatial plot of the annotated regions (Figure S3(b)). According to the region information provided with GSE208253, we identified four areas: the tumor core, transition region, leading edge, and normal region. Among the genes examined, COL5A1 was strongly associated with tumor invasion and metastasis26,27 (Figure S3(c)). Furthermore, we explored the spatial distribution of COL5A1 and found that it was highly expressed in the leading edge region (Figure S3(d) and (e)).
Validation of COL5A1 expression in cell lines and tumor tissues
Using qRT-PCR, we confirmed significantly elevated COL5A1 mRNA levels in HNC cell lines versus human oral keratinocytes (HOK) (P < .001, Figure 5(a)), consistent with a pro-tumorigenic role. IHC analysis of 14 HNC specimens revealed COL5A1 positivity in 12/14 cases (85.7%). Notably, COL5A1 protein levels were significantly elevated in advanced-stage tumors (AJCC stage III). Correlation analysis further demonstrated a positive association between COL5A1 expression intensity and AJCC tumor stage, indicating progressively higher protein expression with increasing disease severity (Figure 5(b)). IHC data from the HPA database likewise showed that COL5A1 protein was markedly overexpressed in tumors, in agreement with the cell line data. Additionally, we examined CD8+ T-cell and macrophage infiltration in HNC samples, which showed generally lower infiltration of both immune cell types in tumor tissues than in normal tissues, with infiltration levels negatively correlated with COL5A1 expression (Figure 5(c)).

Figure 5. Verification of COL5A1 expression in head and neck cancer (HNC) cell lines and tumor tissues. (a) Verification of COL5A1 mRNA expression levels in normal and tumor cells using qPCR. TU212, TU686, FaDu, and CAL-27 are HNC cell lines, while HOK is a human oral keratinocyte cell line (***P < .001). (b) Tissue sections from 12 HNC patients grouped by TNM stage: Stage I (Patients 1–4): Focal weak-to-moderate cytoplasmic staining; Stage II (Patients 5–8): Diffuse moderate staining with stromal involvement; Stage III (Patients 9–12): Strong pan-cellular staining and peritumoral matrix deposition. * All images captured at 25× magnification; scale bars = 50 μm. Brown DAB signal indicates COL5A1 positivity; blue hematoxylin counterstain. (c) Immunohistochemical staining of COL5A1 protein in HNC and normal tissues from the Human Protein Atlas (HPA) database. The protein level of COL5A1 in normal salivary gland tissue (antibody HPA030769) showed low staining, weak intensity, and 25–75% quantity, while in HNC tissues, COL5A1 protein expression was high, with strong intensity and 25–75% quantity. Additionally, the infiltration levels of CD8+ T cells and macrophages in HNC tissues (antibodies CAB075722 and HPA048982, respectively) were not detected, with negative intensity and no detectable quantity.
Discussion
In this study, we screened HNC-related genes from the TCGA database. Given the predominance of oral/laryngeal squamous cell carcinoma in our cohort, these findings primarily support COL5A1 as a prognostic biomarker in these subtypes. Bioinformatic analysis and experimental validation showed that COL5A1 is linked to the TME immune status in HNC and may serve as a prognostic indicator. The disproportionate number of tumor and normal samples (1012 vs. 102) in our analysis may introduce a potential bias. To address this, paired differential expression analysis was performed where applicable, and the findings were further validated using independent tumor and normal cell lines. These complementary approaches mitigate the impact of sample size imbalance and support the robustness of our conclusions regarding COL5A1 expression. The TME immune status significantly impacts tumor cell activity and patient prognosis, suggesting that reshaping the TME could inhibit tumor progression. 28 On the basis of these findings, we calculated the immune scores according to the HNC data from the TCGA and found longer survival in the low-score group than in the high-score group. These results indicate that immune-related genes and cells in the HNC TME play vital roles in HNC progression.
HNC development is strongly associated with the levels of immune and stromal components in the TME. HNC cells evade host immunity by altering proinflammatory cytokine production and impairing effector cell function. 29 To evade the host immune system, HNC cells can reduce their own immunogenicity and produce immunosuppressors and many other negative immunomodulators. Most HNC immunotherapies, whether approved or under investigation, act by activating T cells. 30 T cells express both positive and negative costimulatory receptors, called checkpoints, which are crucial for the immune response. Programmed cell death receptor 1 (PD-1) is a checkpoint expressed on T cells. Immune checkpoint inhibitors (ICIs) can restore suppressed T-cell signaling and antitumor immunity, thereby inhibiting tumor development. 31 The FDA-approved anti-PD-1 monoclonal antibodies pembrolizumab and nivolumab are available as second-line therapies for patients with recurrent or metastatic HNSCC. These ICIs have achieved satisfactory outcomes but are also associated with adverse events. 32
In previous studies, several genes related to HNC have been identified, such as TP53, PIK3CA, and EGFR. TP53 is closely associated with human tumors; it was initially regarded as an oncogene and is now recognized as a tumor suppressor gene. Research has found that somatic mutations in TP53 contribute to the occurrence and development of HNC. 33 PIK3CA mutations activate the PI3K/AKT pathway, promoting the occurrence of HNC. However, the clinical application of PI3K inhibitors with anti-tumor activity may be limited by their inherent high toxicity and the emergence of treatment resistance. 34 The epidermal growth factor receptor (EGFR) is overexpressed in many types of HNCs, leading to increased cell proliferation. Although high EGFR protein levels in HNC are associated with reduced survival rates, patients’ responses to the EGFR inhibitor cetuximab usually decline rapidly after a brief period of effectiveness. 35
As a result, new HNC immunotherapies need to be created. The discovery and validation of new therapeutic targets will expand the available treatment options. The future of HNC research and treatment, through precision medicine, new therapeutic targets, and the integration of technology and data science, will help reduce overdiagnosis and overtreatment, improve survival rates, and enhance the quality of life for patients with HNC.
In the present study, we discovered that COL5A1 overexpression was strongly associated with a high tumor grade and a poor prognosis, suggesting its potential for use in the design of new immunotherapies for HNC.
COL5A1 (collagen type V alpha 1 chain) has gained considerable attention in HNC research due to its impact on the progression of this disease. Studies have consistently shown that COL5A1 levels are notably higher in HNC tissues than in normal tissues, with this overexpression being a marker of increased tumor malignancy and a less favorable patient prognosis. The correlation between COL5A1 expression and tumor invasiveness in HNC is well established. Through immunohistochemistry and a series of statistical analyses, the study by Chen et al. found that COL5A1 is an adverse factor for tumorigenesis, clinicopathological outcomes, and prognosis in tongue squamous cell carcinoma. 36 This aligns with our findings. We extended the scope of investigation to a broader range of HNC, which enhances the generalizability of our conclusions. Additionally, through cell line experiments and more comprehensive analytical methods, we examined the role of COL5A1 in the occurrence and development of HNC, as well as the associated pathways, thus addressing gaps in previous studies. Although Zhu et al. identified high expression of COL5A1 in HNC through pancancer analysis and noted a certain association with prognosis, we further explored the expression of COL5A1 in different HNC cell lines. The results revealed that it was highly expressed in HNC cells and that its expression was higher in the invasive cell lines (FaDu and CAL-27), suggesting that it may promote HNC progression by regulating the invasive capacity of HNC cells. In addition, we analyzed the spatial expression distribution of COL5A1 in HNC samples on the basis of spatial transcriptome data (GSE208253). COL5A1 was highly expressed in the leading edge region of HNC samples, an area that has been closely associated with tumor invasion in recent years.26,27
Transcriptomic analysis by Shi et al. demonstrated that COL5A1 upregulation promotes HNC invasiveness through epithelial–mesenchymal transition (EMT), a process critical for tumor metastasis that enhances cellular migration and invasion by reducing intercellular adhesion. 37 Furthermore, proinflammatory factors produced by cancer cells can stimulate EMT. 38 COL5A1 may amplify this process by stabilizing proinflammatory factor secretion. Recent tumor studies have indicated that EMT can serve as a prognostic indicator for immunotherapy efficacy. 39 In addition, COL5A1 expression in HNC is correlated with the infiltration of immune cells within the TME. For instance, single-cell sequencing analysis by Tsai et al. 40 revealed a positive correlation between COL5A1 expression and the infiltration of M1-type macrophages in HNC. This correlation suggests a potential regulatory role of COL5A1 in modulating immune cell activity, which could subsequently affect tumor immune evasion and responsiveness to immunotherapeutic strategies.
COL5A1 is also associated with metabolic reprogramming in the HNC microenvironment. Li et al. 41 reported that elevated COL5A1 expression is correlated with increased aerobic glycolysis in tumor cells, which supports rapid cell proliferation. This metabolic reprogramming not only provides the energy and biosynthetic precursors needed for tumor growth but may also modulate the function of immune cells within the TME by altering metabolic byproducts.
Most solid tumors are hypoxic. Hypoxia, as a main feature of the TME, can promote the differentiation of tumors into a malignant or even metastatic phenotype. 42 In many cancer types, hypoxia is linked to a poor prognosis. 43 Research has indicated that a variety of cancers, such as colorectal, prostate, and hepatocellular carcinoma, activate the IL6-JAK-STAT3 pathway.44–46 Previous studies have found that cathepsin L (CTSL) can up-regulate autophagy in laryngeal cancer cells by activating the IL6-JAK-STAT3 signaling pathway. 47 In the present study, COL5A1 was enriched in many pathways related to cancer progression, suggesting its oncogenic role in HNC.
CIBERSORT analysis showed that COL5A1 expression was negatively correlated with memory B cells, naïve B cells, and CD8+ T cells, and positively correlated with M0 macrophages, neutrophils, and activated mast cells. 48 B cells, key mediators of humoral immunity, inhibit tumor growth by secreting immunoglobulins, activating T cells, and eliminating cancer cells, and their infiltration level is associated with better cancer survival. Similarly, higher CD8+ T-cell levels are correlated with improved prognosis, highlighting their critical role in antitumor immunity.49–52 Neutrophils and activated mast cells in the TME promote tumor growth and metastasis by modulating vascular permeability. 53 COL5A1 may therefore contribute to HNC progression by influencing the infiltration of immune cells, including B cells, CD8+ T cells, neutrophils, and mast cells. Our data show only a correlation, however, and further mechanistic studies are required to confirm a causative role for COL5A1 in immune modulation.
Moreover, our spatial transcriptomic analysis revealed elevated COL5A1 expression at the tumor invasive front, together with spatially resolved immunomodulatory properties. In hepatocellular carcinoma, the critical distinction between recurrent and non-recurrent patients has been shown to lie not in the total abundance of immune cells but in their precise spatial distribution within tumor tissues. Specifically, patients with enrichment of natural killer (NK) cells at the tumor-invasive margin showed a significantly reduced recurrence risk. This finding indicates that the spatial architecture of immune infiltration reflects the underlying tumor immune status more accurately than absolute cell quantities. 54 In HNSCC, Kulasinghe et al. further substantiated that the phenotype and localization of immune cells are pivotal factors in deciphering antitumor immune responses and their impact on clinical outcomes. 55 A limitation of the current study is its inability to resolve cellular-level interactions; future work could leverage higher-resolution platforms (e.g., Visium HD with 2-μm resolution) to map direct contact networks between COL5A1-expressing tumor cells and immune cells.
High COL5A1 expression is closely linked to the aggressiveness and poor prognosis of HNC. Building on this finding, potential future therapeutic strategies could include gene therapy, RNA interference (RNAi), and CRISPR/Cas9 gene editing, all of which could reduce tumor aggressiveness and metastasis by inhibiting COL5A1 expression. However, these approaches are still at the experimental or preclinical stage, and further validation is required before they can be translated into clinical practice. The development of COL5A1-targeted inhibitors, or of strategies that modulate its interaction with the ECM, could also offer a promising avenue for inhibiting tumor growth. Several technical challenges remain, including delivery efficiency, target specificity, off-target effects, and tumor heterogeneity. Future research should therefore focus on optimizing delivery systems for COL5A1-targeted therapies and on further validating their clinical applicability.
While this study provides valuable insights into the role of COL5A1 in HNC, several limitations should be acknowledged. First, although the pan-HNC approach enhances statistical power and explores common biology, it inherently limits our ability to investigate the role of COL5A1 within specific anatomical (e.g., oropharyngeal vs. hypopharyngeal) or etiological (e.g., HPV-positive vs. HPV-negative) subtypes. Variation in tumor biology across these subtypes may influence the prognostic and mechanistic relevance of COL5A1, and future studies with larger, well-annotated cohorts for each subtype are essential to validate and refine our findings. Second, most of our data are correlative (bioinformatics and IHC correlation), which limits the ability to draw definitive conclusions about the direct causal mechanisms by which COL5A1 influences the TME or tumor progression. Third, the primary validation was based on established databases (TCGA, GEO, and HPA) and cell line analyses; functional experiments (e.g., COL5A1 knockdown or overexpression in vitro and in vivo) are necessary to establish causality and to further elucidate the underlying molecular mechanisms. Fourth, although COL5A1 expression was validated in cell lines, HPA, and immunohistochemistry of HNC samples, the sample size for our qRT-PCR validation (cell lines) was modest. In the IHC analysis, two of the fourteen HNC samples, both AJCC stage I, showed no COL5A1 expression. Including matched normal tissue samples would have made the comparison more informative. Finally, the therapeutic implications suggested by our findings require comprehensive preclinical evaluation in appropriate animal models.
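As an illustration of the kind of correlation analysis discussed above (gene expression against CIBERSORT-estimated immune cell fractions), the following minimal sketch computes a Spearman rank correlation in plain Python. It is not the analysis pipeline used in this study: the choice of Spearman correlation, the function names, and the toy values are all assumptions made for illustration, and the numbers are not study data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values for six samples (NOT data from this study).
col5a1_expression = [2.1, 3.4, 5.0, 6.2, 7.8, 9.1]
immune_fractions = {
    "T cells CD8": [0.21, 0.18, 0.15, 0.12, 0.10, 0.05],
    "Macrophages M0": [0.05, 0.09, 0.08, 0.14, 0.20, 0.25],
}

rho = {
    cell_type: spearman(col5a1_expression, fractions)
    for cell_type, fractions in immune_fractions.items()
}

for cell_type, value in rho.items():
    print(f"{cell_type}: rho = {value:.3f}")
```

In this toy example the CD8+ T-cell fraction falls as expression rises (negative rho) and the M0 macrophage fraction rises (positive rho), mirroring the direction of the associations reported above. A coefficient of this kind describes association only and does not address causality.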
Conclusion
In summary, our study demonstrates that elevated COL5A1 expression is a robust indicator of advanced tumor stage and poor survival in HNC. Its association with specific TME features underscores its potential as a prognostic biomarker. Future work is needed to experimentally validate its functional role and therapeutic relevance.
Supplemental Material
sj-docx-1-sci-10.1177_00368504251385413 - Supplemental material for COL5A1 in the tumor microenvironment predicts the prognosis of head and neck cancer
Supplemental material, sj-docx-1-sci-10.1177_00368504251385413 for COL5A1 in the tumor microenvironment predicts the prognosis of head and neck cancer by Shikun Dong, Jiahang Song, Zuoquan Zhu, Xi Chen, Xuerong Wang, Lei Cheng and Liqing Zhang in Science Progress
Footnotes
Acknowledgments
We would like to express our gratitude to all the participants for their contributions to this study and to the TCGA database platform for their support.
Ethical approval
The Institutional Review Board (IRB) of the First Affiliated Hospital with Nanjing Medical University reviewed and approved the study protocol (Approval No.: 2023-SR-288).
Author contributions
Conceptualization: all authors. Data curation: Shikun Dong and Liqing Zhang. Formal analysis: all authors. Funding acquisition: Lei Cheng, Xi Chen and Liqing Zhang. Methodology: Jiahang Song and Zuoquan Zhu. Project administration: Shikun Dong and Liqing Zhang. Visualization: Xuerong Wang. Writing—original draft: Shikun Dong and Liqing Zhang. Writing—review and editing: all authors.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Clinical Capacity Enhancement Project of Jiangsu Province Hospital (JSPH-MC-2020-5) and the Jiangsu Province Capability Improvement Project through Science, Technology and Education (JSDW202203) of China.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
The datasets generated and/or analysed during the current study are available from the corresponding authors on reasonable request.
Supplemental material
Supplemental material for this article is available online.
References
