Abstract
Background:
The immunosuppressive tumor microenvironment in colorectal cancer is a key factor in its progression and treatment resistance, yet the coordinated mechanisms of its cellular composition and signaling networks are not fully understood. This study aims to explore the regulatory foundation of this microenvironment and construct a clinically relevant prognostic model through the integration of single-cell and multi-omics analyses.
Methods:
We first analyzed single-cell RNA sequencing data from 9 samples, mapping cellular landscapes and inferring intercellular communication networks. By integrating extensive transcriptomic data from public databases, we conducted weighted gene co-expression analysis to identify core functional modules. Additionally, we employed machine learning methods to integrate multi-omics features and construct a prognostic risk scoring model.
Results:
Analysis suggested that macrophages and fibroblasts in the tumor microenvironment play a dominant role in intercellular communication through specific chemokine signaling axes. Co-expression network analysis further identified the MAPK signaling pathway as a core module associated with immunosuppressive microenvironmental phenotype. Based on these findings, we developed and preliminarily verified a prognostic risk score comprising 17 genes.
Conclusion:
This score may independently predict predicts patient survival and effectively distinguishes tumor subtypes with distinct immune microenvironmental characteristics: the high-risk group shows an immunosuppressive, stroma-rich “cold tumor” phenotype, suggesting higher predictive sensitivity to certain chemotherapies and targeted drugs.
Keywords
Introduction
Colorectal cancer (CRC) remains a leading cause of cancer-related mortality worldwide, 1 with its complex and heterogeneous tumor microenvironment (TME) playing a pivotal role in disease progression, 2 therapeutic resistance, 3 and patient outcomes. 4 The TME is a dynamic ecosystem composed of malignant cells, 5 immune cells, 6 cancer-associated fibroblasts (CAFs), endothelial cells, and various signaling molecules. In CRC, an immunosuppressive TME characterized by dysfunctional cytotoxic T cells, 7 enriched regulatory T cells (Tregs), M2-polarized tumor-associated macrophages (TAMs), 8 and activated stromal components is often associated with poor prognosis and limited response to immunotherapy. Understanding the cellular composition, intercellular communication, and underlying molecular circuits that shape this immunosuppressive niche is crucial for developing novel therapeutic strategies.
Recent advances in single-cell RNA sequencing (scRNA-seq) have revolutionized our ability to deconvolute the cellular heterogeneity and transcriptional states within the TME at unprecedented resolution.9,10 Many studies have either focused solely on cellular cataloging using scRNA-seq or on deriving prognostic signatures from bulk omics data, often lacking a direct and systematic integration across these scales. This disconnect limits our ability to trace how specific cellular interactions and states observed at the single-cell level translate into coherent, module-level molecular programs detectable in bulk tissues, and ultimately, how these programs determine clinical outcomes. Specifically, the key signaling axes that dominate cell-cell communication within the CRC immunosuppressive TME, and the core transcriptional modules that encapsulate and drive this pathological ecosystem, are not fully delineated. Moreover, whether such modules can be harnessed to construct clinically actionable tools for prognosis prediction and therapy guidance requires further investigation. 11
Therefore, this study aimed to address these gaps by asking: What are the fundamental cellular communication networks that sustain the immunosuppressive TME in CRC, and can we identify and validate a core signaling module derived from this network as a robust prognostic biomarker? To answer this, we performed an integrative multi-omics analysis. 12 We first applied scRNA-seq to profile the cellular landscape and intercellular communication in CRC, identifying dominant signaling pathways. We then employed WGCNA on bulk transcriptomic data to identify gene modules correlated with pathway activity, bridging single-cell observations with tissue-level expression programs. The overlapping genes from these analyses were used to construct and validate a prognostic risk score model via machine learning algorithms. Finally, we comprehensively characterized the biological features, immune context, and therapeutic implications associated with the risk model.
We anticipate that this work will provide a deeper, systems-level understanding of the wiring of the immunosuppressive TME in CRC, highlighting the MAPK pathway as a central coordinator. The derived risk score model is expected to serve as a valuable independent prognostic tool and a potential indicator for tailoring therapeutic strategies, thereby offering both theoretical insights into TME biology and practical value for clinical management.
Materials and Methods
Quality Control, Dimensionality Reduction Clustering, Cell Type Annotation, and Inter-Sample Cell Identity Comparison Analysis
Single-cell RNA sequencing data from the GSE277905 dataset, were analyzed using the Seurat (v4) package in R. Raw count matrices were imported and merged into a single Seurat object. Low-quality cells were filtered by removing cells expressing fewer than 50 genes or exhibiting mitochondrial gene content greater than 15%. Gene expression data were normalized using the LogNormalize method (scale factor = 10 000), and 2850 highly variable genes were identified using the variance-stabilizing transformation (vst) method. Cell clustering was conducted using a shared nearest neighbor (SNN) graph-based approach with a resolution of 0.6, and clusters were visualized by t-distributed stochastic neighbor embedding (t-SNE). Cluster-specific marker genes were identified using the FindAllMarkers function (|log2 fold change| > 1, adjusted P < .05). Cell type annotation was performed with SingleR using multiple reference datasets and Monaco Immune Data, enabling comparison of cell-type distributions between normal and tumor samples.
Ligand-Receptor Interaction Analysis, Cell Communication Network Construction, and Signal Pathway Enrichment Visualization
Cell–cell communication analysis was performed using annotated single-cell RNA-seq data. Standardized gene expression matrices and corresponding cell type annotations were used as input. Cells were grouped by cell type, and ligand–receptor interactions between cell populations were inferred using the CellChat package (v1.6.0) with curated human ligand–receptor pairs from CellChatDB. Ligand or receptor genes expressed in fewer than 10% of cells were excluded, and interactions with an inferred communication probability of P < .05 were considered significant. Communication probabilities between cell types were quantitatively estimated based on gene expression levels and cell numbers, retaining only interactions between populations containing at least 10 cells to reduce small-sample bias.
Visualization and Differential Expression of Single-Cell Gene Set Variation Analysis (GSVA)
We performed single-sample gene set enrichment analysis (ssGSEA) using the GSVA package. We obtained the “MAPK” pathway gene sets from the MSigDB database, which were derived from public databases. The ssGSEA algorithm was applied to the log-normalized expression matrix of the integrated Seurat object, with a minimum gene set size of 3 and a maximum of 500. For each gene set, the pathway activity score in every cell was stored as a new assay in the Seurat object for downstream visualization and analysis. We visualized pathway activity across all cells using FeaturePlot and VlnPlot. For all populations annotated as tumor cells, we classified them into “Tumor_High” and “Tumor_Low” groups according to the median value of their MAPK pathway ssGSEA scores. Using the FindMarkers function in Seurat (Wilcoxon rank-sum test), we identified differentially expressed genes between the Tumor_High and Tumor_Low groups, with thresholds set at |log2FC| > 1 and adjusted P-value <.05.
Weighted Gene Co Expression Network Analysis (WGCNA)
Pathway activity scores for each cell were stored as a new assay in the Seurat object for downstream analyses. Pathway activity was visualized using FeaturePlot and VlnPlot across all cells and annotated clusters. Tumor cells were stratified into high- and low-activity groups based on the median MAPK ssGSEA score, designated as Tumor_High and Tumor_Low. Differentially expressed genes (DEGs) between the 2 groups were identified using the FindMarkers function in Seurat with the Wilcoxon rank-sum test, applying thresholds of |log2 fold change| > 1 and adjusted P < .05.
Machine Learning Constructs and Validates Tumor Prognosis Prediction Models and Survival Analysis
Gene expression profiles and corresponding clinical survival data from the TCGA database were used as the training cohort, while an independent GEO dataset served as the validation cohort. Samples with survival times ⩾30 days were retained. Expression profiles of the intersecting genes common to both datasets were extracted and standardized. Univariate Cox proportional hazards regression analysis (P < .05) was performed on the training set to identify prognosis-related feature genes. Multiple machine learning approaches (Supplemental Table 7) and Support Vector Machine (SVM) were subsequently applied to construct prognostic models using a two-step strategy: first, variable selection was conducted using each algorithm; second, multivariate Cox regression models were built based on the selected features to calculate individual risk scores. Model performance was evaluated using the concordance index (C-index) in both the TCGA training cohort and the GEO validation cohort. C-index values of all models were visualized in a Heatmap and ranked by their mean performance to identify the optimal model. Patients were stratified into high- and low-risk groups according to the median risk score of the optimal model, and survival differences were assessed using Kaplan–Meier analysis with the log-rank test.
Analysis of the Independent Prognostic Value, Predictive Performance, and Clinical Feature Distribution of Risk Scoring
Univariate and multivariate Cox regression analysis were used to evaluate their independent prognostic value, and the results were visualized through forest plots; Using time-dependent ROC curves to compare the predictive performance of risk scores with the above clinical pathological features and calculate AUC values, while evaluating the predictive ability of risk scores for 1-, 3-, and 5-year overall survival; The distribution differences of clinical pathological features between high and low-risk groups were shown using a Circos plot, and the statistical significance of the differences between groups was evaluated using chi square test.
GO, KEGG, and GSEA Enrichment Analysis
Based on the significantly different gene set identified in the previous step, official gene symbols were converted to Entrez Gene IDs, and Gene Ontology (GO) enrichment analysis was performed using the enrichGO function, covering biological process, cellular component, and molecular function categories. Significance thresholds were set at P < .05 and q < 0.05 after false discovery rate correction. Enrichment results were visualized using bubble plots, bar charts, and circular plots. Using the same gene set, KEGG pathway enrichment analysis was conducted with the enrichKEGG function, specifying the species as Homo sapiens (“hsa”), and the top 30 significantly enriched pathways were visualized under the same significance criteria. In addition, gene expression profiles were compared between high- and low-risk groups to calculate and rank log2 fold changes across the whole transcriptome. GSEA was performed using publicly available GO gene sets as the reference, with 1000 permutations. The top 5 significantly enriched pathways in both the high- and low-risk groups were selected for visualization.
Analysis of Tumor Immune Microenvironment
The tumor microenvironment was quantified using the ESTIMATE algorithm to calculate stromal scores, immune scores, and ESTIMATE composite scores for each sample. Immune cell infiltration was further assessed using the CIBERSORT deconvolution algorithm with the LM22 signature matrix, estimating the relative proportions of 22 immune cell subsets. Only samples with CIBERSORT output P < .05 were retained to ensure estimation reliability. Differences in ESTIMATE scores and immune cell proportions between high- and low-risk groups were compared using the Wilcoxon rank-sum test. Spearman correlation analysis was performed to evaluate the relationships between continuous risk scores and immune-related indicators. Prior to analysis, expression data were uniformly preprocessed by retaining tumor samples only, merging duplicate samples from the same patient, and standardizing gene symbols. Visualization was conducted using the ggplot2 and ggpubr packages, generating violin plots for score distributions, box plots for immune cell proportion differences, and scatter plots for correlation analyses.
Prediction and Differential Analysis of Drug Sensitivity
Drug sensitivity of TCGA samples to compounds in the GDSC2 database was predicted using the oncoPredict package in R. Gene expression data were obtained from TCGA and subjected to standardization and batch effect correction using the limma empirical Bayes method, after which low-expression genes (row mean ⩽ .5) and normal samples were removed. Drug response scores for each sample were estimated using the calcPhenotype function based on the GDSC2 training datasets (GDSC2-Expr.rds and GDSC2-Res.rds). For each drug, sensitivity differences between high- and low-risk groups were evaluated using the Wilcoxon rank-sum test, with a significance threshold of P < .001. Effect size was defined as the difference in median predicted sensitivity scores between the high- and low-risk groups. The top 20 drugs with the largest absolute effect sizes were selected and visualized using an integrated dumbbell plot generated with ggplot2.
Statistical Analysis
All statistical analyses were performed using R software (v4.4.2). Unless otherwise stated, a two-tailed P-value <.05 was considered statistically significant (*P < .05, ** P < .01, ***P < .001). A log2 fold change threshold of |logFC| > 1 was applied where indicated. Categorical variables were analyzed using the chi-square test or Fisher’s exact test, as appropriate.
Results
Single-Cell Transcriptome Analysis Reveals Alterations in Immune Cell Composition in the Colorectal Cancer Tumor Microenvironment
In this study, we systematically investigated the association between the immunological microenvironment of colorectal cancer and the MAPK pathway (Figure 1). Firstly, we conducted a systematic single-cell transcriptome analysis on colorectal cancer (CRC) samples (Supplemental Figure 1). After strict quality control and batch effect evaluation, the data quality is good and suitable for in-depth analysis (Supplemental Figure 2). Through unsupervised clustering, we identified 35 unique cell subpopulations in the tumor microenvironment (TME; Supplemental Figure 2A), with marker genes for each subpopulation as shown in Supplemental Table 1. To confirm the authentic biological properties of the cell clusters, we presented heatmaps of marker genes for each cluster (Supplemental Figure 2B). The expression of markers was clearly segregated between different clusters, providing a basis for subsequent cell type annotation. After annotating these subgroups with classical biomarkers, it was found that tumor tissue exhibited apparent cellular remodeling compared to normal tissue: enriched with cell types such as tumor associated macrophages, cancer-related fibroblasts, and endothelial cells that promote immune suppression and matrix remodeling; However, the initial T cells and B cells with immune surveillance function were appeared relatively reduced (Supplemental Figure 2C, Supplemental Table 2). This raises a central question: what functional roles do these distinct cell types undertake within the tumor microenvironment, and how do they collaborate?

The flowchart of this study.
Cell Communication Network and Functional Pathway Analysis Reveal Key Regulatory Axes in the Tumor Microenvironment
To answer the above questions, we systematically analyzed the ligand receptor interaction network between cells in TME. Analysis revealed that cell communication appears to be highly dependent on secreted signals (Figure 2A), with macrophages and fibroblasts serving as the hubs of the network, sending a large number of signals outward (Figure 2B and C). It is worth noting that multiple pathways closely related to immune cell recruitment and functional regulation, such as MIF/(CD74 + CXCR4) and MIF/(CD74 + CXCR2) signaling axis, are abnormally active in TME (Figure 2D). The analysis of cell communication shows that MIF and its receptor complexes (CD74 + CXCR4) and (CD74 + CXCR2) are abnormally active in the data of this study. Literature has confirmed that CD74 is the main receptor of MIF, and the binding of MIF-CD74 classically activates the MAPK (ERK1/2) and PI3K signaling pathways.13,14 At the functional level, the MAPK pathway is directly related to multiple key features of cancer (such as proliferation, anti-apoptosis, invasion, and angiogenesis), and is the core mediator by which MIF exerts its oncogenic effect. Supplemental Figure 3 illustrates the communication networks when each cell type acts as a signal transmitter (Supplemental Figure 3A-N). This result suggests that the TME of CRC appears not to be a simple accumulation of cells, but an ecosystem precisely regulated by key cell types through specific signaling pathways, which may be potential intervention targets. To further explore the functional characteristics of distinct cell types, we conducted cell-level enrichment analysis of specific functional pathways using ssGSEA. The ssGSEA profile (Figure 3A) showed that certain cell populations exhibited heightened pathway activity, particularly in pathways related to inflammation, immune activation, and antigen presentation. The distribution of ssGSEA scores across different cell types (Figure 3B and Supplemental Table 3) further confirms that immune effector cells such as macrophages and dendritic cells exhibit higher scores in pathways associated with immune responses; fibroblasts show significantly elevated activity in pathways related to ECM remodeling; and T cells show apparent differential activity in immune regulatory pathways.

Analysis of cellular communication networks reveals the strength of communication between different cell types within the tumor microenvironment and key signaling pathways. (A) Distribution diagram of Communication Database Category, showing the sources of ligand–receptor pairs used for inferring cellular communication, including various database categories such as Secreted Signaling, ECM–Receptor, and Cell–Cell Contact. (B and C) Cell communication networks constructed per cell type: (B) Interaction Count network; (C) Interaction Weight network. Line thickness and color denote communication intensity, illustrating interactions among immune cells, fibroblasts, epithelial cells, and others within the tumor microenvironment. (D) A bubble plot of ligand–receptor pairs displays communication probability between key cells. Color indicates communication intensity, while dot size represents statistical significance (P-value).

The ssGSEA algorithm was used to evaluate the activity levels of the MAPK signaling pathway in samples from different groups. (A) Distribution of single-sample Gene Set Enrichment Analysis (ssGSEA) scores for each cell within the t-SNE space. The blue-white-red gradient indicates increasing enrichment scores. (B) Demonstrates gene expression differences across cell types. The y-axis represents gene expression levels, while the x-axis denotes distinct cell types. This box plot illustrates gene expression distributions across cell types, revealing expression disparities between populations.
Integrate and Analyze the MAPK Pathway as the Core Module for Coordinating TME Phenotype
Based on the active cellular communication in TME, we further explored the core molecular modules that may contribute to its overall phenotype. First, the optimal soft threshold for network construction was determined via scale-free topological fitting of the power-law graph (Supplemental Figure 4A). At this threshold, the network exhibited scale-free topological properties while maintaining stable average node connectivity. Subsequently, hierarchical clustering based on gene expression correlations identified multiple co-expression modules using the dynamic clipping method (Figure 4A). Further calculations of the correlation between each module’s characteristic genes and MAPK pathway scores revealed that the pink and tan modules showed significant positive correlations with MAPK pathway scores, with the pink module exhibiting the highest correlation coefficient (r = .83; Figure 4B), the genes contained within each color module are shown in Supplemental Table 4. Sample clustering indicated no outliers (Supplemental Figure 4B). The MAPK pathway is known to be a core pathway that regulates cell proliferation, stress response, and inflammation. This discovery links the heterogeneous cellular states observed at the single-cell level with a clear and targeted signaling pathway, suggesting that abnormal activation of the MAPK pathway may be a key molecular driving force in coordinating the formation of CRC immunosuppressive TME.

WGCNA constructs co-expression modules and identifies key modules associated with MAPK trait scores. (A) Gene dendrogram constructed based on selected soft thresholds and module partitioning results obtained via DynamicTreeCut. Different colors denote distinct co-expression modules. (B) Heatmap of correlations between modules and MAPK signaling pathway scores (trait). Each cell displays the Pearson correlation coefficient (bottom) and its P-value (in parentheses) between the module’s characteristic gene (Module eigengene, ME) and the MAPK pathway score. The blue-white-red gradient indicates increasing correlation intensity toward red.
Construct and Validate a Prognostic Risk Scoring Model Based on the MAPK Pathway
In order to translate the above biological findings into practical clinical tools, we integrated single-cell differential genes and MAPK related module genes screened by WGCNA, obtaining 253 intersecting genes (Figure 5A), the genes included are as shown in Supplemental Table 5. Through survival analysis, 17 key genes that showed an association with poor patient prognosis were identified (Figure 5B and Supplemental Table 6). To construct an optimal MAPK pathway feature risk scoring model, we systematically tested multiple machine learning Cox modeling approaches and evaluated their predictive consistency (C-index) across the TCGA and GEO cohorts. Results indicated that the model combining StepCox (backward) with Enet (alpha = .1) achieved the highest C-index in TCGA cohorts (Figure 5C), and was therefore selected as the final prognostic model. This model can effectively distinguish high-risk and low-risk patients with different prognoses in the TCGA training set (Figure 5D). The specific risk scores are shown in Supplemental Table 8. This represents an exploratory step toward translating a discovery from TME based biological research into a concrete prognostic indicator.

Key gene screening and construction and validation of a MAPK-feature-related prognostic model (A) Venn diagram of the intersection between scRNA-seq differentially expressed genes (scRNADEG) and WGCNA module genes. The red region represents scRNA differentially expressed genes, while the blue region denotes key module genes identified by WGCNA. A total of 253 genes were recognized by both methods, serving as candidate genes for subsequent modeling. (B) Univariate Cox regression forest plot of intersecting genes. Displays 17 genes significantly associated with overall survival. The x-axis represents hazard ratios, with red squares indicating hazard ratio estimates and blue lines denoting 95% confidence intervals. (C) C-index heatmaps of multiple modeling approaches (including StepCox, RSF, CoxBoost, etc.) across TCGA and GEO cohorts. Yellow denotes higher C-index values, blue denotes lower C-index values; red represents GEO, blue represents TCGA. (D) Survival curves (Kaplan–Meier) for high-risk versus low-risk groups stratified by risk score in the TCGA cohort. High-risk patients exhibit significantly poorer overall survival (P < .001). (E) Survival curves (Kaplan–Meier) for high-risk versus low-risk groups stratified by risk score in the GEO cohort (P = .433).
Risk Score is a Prognostic Biomarker Independent of Clinical Staging
In order to better apply this model to clinical research, we systematically evaluated the clinical efficacy of this risk score. Multivariate Cox regression analysis illustrated that this score is an independent prognostic predictor of age, gender, and the most important clinical stage (Figure 6A and B and Supplemental Tables 9 and 10). Its predictive accuracy (AUC) is comparable to clinical staging and has stable long-term predictive ability (Figure 6C and D). More importantly, high-risk scores were showed a correlation with later TNM staging, deeper tumor infiltration, and more metastatic events (Figure 6E). These pieces of evidence collectively indicate that our risk score is not an isolated molecular label, but a comprehensive indicator that may be associated with the degree of malignant progression of tumors.

Correlation analysis between independent prognostic value of MAPK-specific risk scores and clinical characteristics. (A) Univariate Cox regression analysis demonstrating the impact of age, sex, clinical stage, and risk score on overall survival (OS). Green boxes denote hazard ratios (HR), with horizontal lines representing 95% confidence intervals. (B) Multivariate Cox regression analysis adjusting for age, gender, clinical stage, and risk score demonstrates their impact on overall survival (OS). Red squares denote adjusted HRs, with horizontal lines representing 95% confidence intervals. (C) ROC curve comparison between traditional clinical characteristics and risk score, displaying ROC curves for risk score, age, gender, and clinical stage. (D) Time-dependent ROC curves for the risk model’s 1-, 3-, and 5-year survival prediction. The risk model demonstrates stable predictive capability at 1 year (AUC = 0.725), 3 years (AUC = 0.713), and 5 years (AUC = 0.703). (E) Risk score correlation analysis with clinical characteristics. The ring diagram illustrates distribution differences between high-risk and low-risk groups for age, gender, stage, T stage, M stage, and N stage.
High Risk Score Characterization of “Cold Tumor” Phenotype: Matrix Remodeling and Immune Suppression
To elucidate the biological basis of the scoring system, we compared the functional differences between high and low-risk groups. Analysis suggested that the high-risk group appeared to be characterized by enrichment in pathways such as extracellular matrix remodeling, collagen formation, and muscle contraction, exhibiting typical epithelial mesenchymal transition and pro fibrotic features (Figure 7A-C and Supplemental Tables 11 and 12). The KEGG enrichment analysis in Supplemental Figure 5 yielded similar results, with detailed data presented in Supplemental Table 13. On the contrary, the low-risk group appeared to be enriched in immune cell activation and adhesion related pathways. Immune infiltration analysis confirmed that the high-risk group TME was filled with M2 macrophages and had a high proportion of matrix components, while CD8+ T cells with killing function and activated NK cells were scarce (Figure 7D and E). These observations mean that the risk score may distinguish between 2 TME states: high risk represents an immunosuppressive, matrix rich “cold tumor” phenotype; Low risk represents the immune active “hot tumor” phenotype. This may help explain its potential association with prognosis.

Differences in GO enrichment analysis, GSEA analysis, and immune infiltration assessment between high- and low-risk groups. (A) GO functional enrichment analysis based on differentially expressed genes (|log2FC| > 1, FDR < 0.05). The top pathways showing significant enrichment across the 3 major categories—Biological Process (BP), Cellular Component (CC), and Molecular Function (MF)—are displayed. Bubble size indicates the number of enriched genes, while color denotes the q-value. (B) GSEA analysis reveals biological pathways significantly enriched in the high-risk group (NES > 0, P < .05). The ranking curve displays the top 5 pathways, with the enrichment score on the Y-axis and gene expression data ranking on the X-axis. (C) GSEA analysis revealing biological pathways significantly enriched in the low-risk group (NES < 0, P < .05), showing the top 5 enriched pathways. (D) Box plots illustrating the infiltration status of 22 immune cell types based on the CIBERSORT algorithm. Significant differences exist in immune cell proportions between high- and low-risk groups (*P < .05, **P < .01, ***P < .001). (E) Differences in tumor microenvironment (TME) scores. X-axis: Stromal Score, Immune Score, ESTIMATE Score. Y-axis: TME score (*P < .05, **P < .01, ***P < .001).
Risk Score Indicates Immune Microenvironment Characteristics and Potential Treatment Strategies
To better verify the predictive performance of the model in clinical settings, we utilized it to analyze and predict various clinical parameters. The risk score showed a correlation with the infiltration level of specific immune cells (such as plasma cells; Figure 8A-C). When combined with tumor mutation burden, it could stratify patient prognosis (Figure 8D and E). The drug analysis scores are shown in Supplemental Table 14. Of particular importance, drug sensitivity analysis showed that high-risk patients exhibited higher sensitivity to various chemotherapy and targeted drugs, including oxaliplatin and dalafenib (Figure 8F and Supplemental Figure 6). This suggests that the model can not only predict prognosis, but may also be used in the future to identify which patients may benefit from stronger therapeutic interventions or specific targeted/immune combination strategies.

Correlation analysis between risk scores and tumor mutational burden, immune microenvironment, and drug sensitivity. (A-C) Correlation analysis between risk scores and immune cell infiltration. (A) Risk scores positively correlated with B cell naïve; (B) Risk scores negatively correlated with activated CD4+ T cells; (C) Risk scores negatively correlated with plasma cells. (D) Patients in the high TMB group exhibited marginally poorer overall survival, though this did not reach statistical significance (P = .058). (E) After grouping patients based on combined TMB and risk scores, high-risk + high-TMB patients demonstrated the poorest prognosis, while low-risk + low-TMB patients showed the best survival (P < .001). (F-H) Differences in drug sensitivity scores between high-risk and low-risk groups for 3 agents: (F) Oxaliplatin, (G) Dabrafenib, (H) Temozolomide. The P-values in the figure are based on the Wilcoxon test, indicating significant differences in drug sensitivity between groups (P < .001).
Discussion
Colorectal cancer ranks third in global incidence and second in cancer-related mortality, 15 and its intrinsically heterogeneous tumor micro-environment acts as the central hub that governs tumor evolution, 16 therapeutic resistance and, ultimately, patient survival. Although the development of single-cell RNA sequencing (scRNA-seq) and systems biology methods has enhanced our understanding of the composition and transcriptional status of TME cells, 17 there are still significant knowledge gaps regarding the key signaling axes of intercellular communication 10 that play a dominant role in CRC immunosuppressive TME, 18 the core transcriptional modules that drive this pathological ecosystem, and how to transform single-cell scale observations into tools that can be used for clinical prognosis prediction and treatment guidance. 19 Therefore, this study aims to explore the regulatory basis of CRC immunosuppressive TME by integrating single-cell and multi-omics analysis, 20 and construct a clinically relevant prognostic model. 21 Specifically, we explored the following core question: What is the basic cellular communication network that maintains CRC immunosuppressive TME? Can a core signal module be identified and validated from this network as a robust prognostic biomarker?
By analyzing scRNA seq data from 9 samples, we have created a detailed cell map of CRC TME, 22 bespeaking the enrichment of immune suppression and matrix remodeling related cells (such as TAMs, CAFs) in tumor tissues, as well as the decrease of immune monitoring cells (such as naive T cells). Cell communication analysis 23 further suggested that macrophages 24 and fibroblasts appear to be core hubs of signal transmission in TME, and contribute to immune regulation through signaling axes such as MIF/(CD74 + CXCR4) and MIF/(CD74 + CXCR2). 25 Subsequently, by integrating single-cell findings with WGCNA of the TCGA cohort data, 26 we identified the MAPK signaling pathway as a core molecular module associated with this immunosuppressive TME phenotype. The MAPK pathway, as a core pathway regulating cell proliferation, stress response, and inflammation, its abnormal activation is likely to be a key molecular driving force in coordinating the formation of CRC immunosuppressive TME. Based on the intersection of single-cell differential genes and WGCNA MAPK related module genes, we constructed and validated a prognostic risk scoring model consisting of 17 genes using machine learning methods. 27 This model can effectively distinguish high-risk and low-risk patients with significantly different prognoses in both the training set (TCGA) and independent validation set (GEO). Multivariate analysis signaled that the risk score may be an independent prognostic factor independent of age, gender, and clinical stage. Functional analysis showed that high-risk scores were associated with extracellular matrix remodeling, epithelial mesenchymal transition (EMT), and enrichment of pro fibrotic features. The TME exhibited an immunosuppressive “cold tumor” phenotype with M2 macrophage infiltration as the main factor and functional CD8+ T cells and NK cells scarce. On the contrary, the low-risk group is enriched in immune cell activation related pathways and displayed a “hot tumor” phenotype. 28 In addition, drug sensitivity prediction suggests that high-risk patients may be more sensitive to chemotherapy and targeted drugs such as oxaliplatin and dalafenib. These findings point to the risk scoring model is not only a prognostic indicator, but may also provide potential guidance for patients’ treatment strategy choices, such as intensified chemotherapy or specific targeted/immune combination regimens.
The main contribution of this study lies in the progressive integrated analysis strategy of “single-cell atlas cell communication core module prognostic model,” which attempts to connect the bridge between microscopic cell states and macroscopic clinical phenotypes, 29 filling the gap of previous studies that were mostly limited to single scale analysis. Firstly, compared with early studies that only used scRNA seq for cell classification or batch transcriptome for prognostic marker screening such as 10 ; suggested that the heterogeneous cellular states observed at single-cell resolution 30 (such as specific subpopulations of macrophages and fibroblasts) and their communication networks were directly associated with a detectable and coherent molecular program at the tissue level (ie, MAPK core module). This provides a more mechanistic paradigm for biomarker discovery. Secondly, we identified that the MAPK pathway is the core signaling hub that coordinates the immunosuppressive TME of CRC. This discovery highlights the previous understanding that the MAPK pathway mainly drives cancer cell self-proliferation in CRC such as, 30 underscores its global regulatory role in shaping the overall immunosuppressive ecosystem, and is consistent with recent research directions that emphasize tumor matrix interactions. 31 Finally, the value of the 17 gene risk scoring model we constructed includes not only its independent prognostic ability, but also in its profound biological foundation - it can be considered a quantitative indicator of the degree of TME immune suppression and matrix remodeling. 32 This makes it more biologically interpretable and potentially therapeutic than many gene labels obtained through “black box” big data mining, such as the ability to distinguish between “cold/hot” tumors and associate them with the sensitivity of specific drugs.
However, this study also has several limitations. Firstly, the initial sample size (9 cases) used for scRNA seq analysis is relatively small. Although we validated it by integrating a large amount of public transcriptome data, certain details of cell subpopulations and communication networks may vary in larger and more representative cohorts. Secondly, this study is mainly based on transcriptomic data. Although the core role of MAPK is inferred through cell communication and pathway analysis, there is a lack of direct experimental verification of key molecules upstream and downstream of the MAPK pathway (such as phosphorylation protein levels) and broader proteomics and metabolomics levels. 33 For example, how the MAPK pathway precisely regulates the expression of specific ligands (such as MIF/CD74 family members) in TAMs or CAFs, thereby affecting T cell function, requires further in vitro co culture or in vivo experiments to clarify. Thirdly, although independent public datasets were used for the construction and validation of prognostic models, these data were all sourced from retrospective cohorts. The potential of the model in predicting treatment response, especially the efficacy of immune checkpoint inhibitors, has not been validated in prospective clinical cohorts or clinical trial data. In addition, the specific functions and roles of the 17 genes included in the model in the MAPK pathway and TME interaction are not fully studied, and further mechanism exploration is needed for some genes.
Conclusions
In summary, through integrated analysis of single-cell and multi-omics data, suggests that the MAPK signaling pathway is associated with the formation of an immunosuppressive microenvironment in colorectal cancer. Based on these findings, a risk scoring model comprising 17 genes was constructed. This model demonstrated independent predictive capability for patient survival within the available datasets and generally reflected differences in the immune infiltration status—namely, cold versus hot tumor phenotypes—within the tumor microenvironment. Additionally, the analysis indicated that patients with high-risk scores might exhibit greater sensitivity to certain chemotherapies or targeted agents. However, given the limitations of this study regarding single-cell sample size and experimental validation, these conclusions require further investigation in subsequent research.
Supplemental Material
sj-docx-1-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-docx-1-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-2-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-2-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-3-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-3-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-4-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-4-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-5-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-5-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-6-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-6-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-7-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-7-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-8-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-8-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-9-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-9-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-10-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-10-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-11-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-11-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-12-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-12-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-13-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-13-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-14-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-14-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-15-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-15-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-16-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-16-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Supplemental Material
sj-xlsx-17-cix-10.1177_11769351261451845 – Supplemental material for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model
Supplemental material, sj-xlsx-17-cix-10.1177_11769351261451845 for The MAPK Pathway Coordinates an Immunosuppressive Microenvironment in Colorectal Cancer: A Single-Cell Guided Prognostic Model by Yao Zhang, Xizheng Zhang, Jiayu Wei, Yelei Wang, Ying Shen, Anqi Jiang and Qian Liu in Cancer Informatics
Footnotes
Author Contributions
All authors have read and approved the final manuscript. The individual contributions of each author are as follows: Yao Zhang: conceptualization, formal analysis, investigation, writing – original draft. Xizheng Zhang: data curation, software, validation. Jiayu Wei: methodology, resources, data curation. Yelei Wang: investigation, validation, data curation. Ying Shen: investigation, validation. Anqi Jiang: investigation, validation. Qian Liu*: supervision, project administration, funding acquisition, writing – review & editing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research project was funded by the National Natural Science Foundation of China [grant number 81872275]; by Changzhou High-Level Medical Talents Training Project [No: 2022CZBJ110]; by Open Project of Jiangsu Provincial Key Laboratory of Xuzhou Medical University [XZSYSKF2023034].
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data used in this study were sourced from the TCGA database (https://portal.gdc.cancer.gov/), the GEO database (https://www.ncbi.nlm.nih.gov/geo/), and the GSEA database (
). These databases are publicly available. This study adheres to the respective data use and publication policies of each database.
Clinical Trial Number
Not applicable.
Supplemental Material
Supplemental material for this article is available online.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
