Abstract
Background
Lactate shapes the tumor microenvironment and modulates immunity. Investigating lactate metabolism genes in renal cell carcinoma (RCC) could elucidate therapeutic targets.
Methods
Single-cell RNA sequencing (GSE242299), bulk RNA sequencing (GSE102101), and spatial transcriptomics (GSE175540) data for RCC were retrieved from GEO. Data processing included quality control, normalization, and dimensionality reduction (R packages). RCTD and CellChat were used for spatial deconvolution and cell-cell communication analysis. CIBERSORT and GSEA evaluated immune infiltration and pathway enrichment. Lactylation scores were derived via single-sample gene set enrichment analysis (ssGSEA) of lactate metabolism genes. Mendelian randomization (MR) assessed gene-cancer risk associations.
Results
Macrophages demonstrated a higher potential for interactions with other cell types due to their extensive receptor-ligand relationships. Lactylation score and MR analysis identified six pivotal genes associated with renal cancer risk: C4A and SERPINA1 were correlated with an elevated disease risk, whereas CD70, FXYD2, SERPINE1, and TUBB6 were associated with a reduced risk. These genes are linked to the degree of immune cell infiltration and can influence the disease process through diverse mechanisms. We also explored the expression profiles of primary genes involved in lactate metabolism in RCC and compared the metabolic pathways between different groups. Notably, experimental validation via tissue microarray immunofluorescence confirmed that the risk-associated genes C4A and SERPINA1 were significantly overexpressed in RCC tumor tissues.
Conclusion
Lactic acid metabolism regulates RCC progression by modulating metabolic activity and immune cell infiltration. Key lactate metabolism genes present novel targets for RCC treatment.
Keywords
Introduction
Renal cell carcinoma (RCC) is a prevalent malignant tumor of the urinary system, with its incidence steadily increasing over the years. RCC is often resistant to conventional systemic therapies such as chemotherapy and radiotherapy. While surgical resection remains the primary treatment for localized disease, advanced RCC frequently develops resistance to these non-surgical interventions, necessitating novel therapeutic strategies. 1 Recent advancements in the molecular biology of RCC have provided profound insights into the underlying metabolic pathways and immune mechanisms, offering crucial guidance for the development of targeted treatment strategies. 2 Immunotherapy has emerged as a promising approach, significantly extending the survival of kidney cancer patients and reducing relapse rates to a certain extent. 3 However, the heterogeneity of the tumor microenvironment, influenced by cancer cell metabolism, has led to resistance to immunotherapy in some kidney cancer patients.4,5 Lactate metabolism stands out as a pivotal pathway within the tumor microenvironment, exerting influence beyond immune status. 6 Recent meta-analyses confirm lactate's dual role in promoting tumor progression while suppressing anti-tumor immunity across multiple cancer types.7,8 Its involvement in renal cancer has been well-documented, making its impact on immune characteristics a critical area of investigation.
Single-cell RNA sequencing (scRNA-seq) enables high-resolution profiling of gene expression at individual cell resolution, facilitating cell type classification and elucidation of transcriptional dynamics during cellular development. 9 However, its requirement for tissue dissociation irreversibly disrupts the native three-dimensional (3D) microenvironment, resulting in critical loss of spatial information about cellular organization and contextual interactions. Spatial transcriptomics overcomes this fundamental limitation by preserving architectural context during RNA profiling. 10 The integration of scRNA-seq with spatial transcriptomics creates a powerful analytical framework that maps molecular profiles onto tissue topography, enabling unprecedented investigation of cell-cell communication networks, microenvironmental niches, and structural determinants of tissue function. 11
While single-cell and spatial transcriptomics provide unprecedented resolution into the cellular heterogeneity and spatial organization of the RCC microenvironment, and their integration with lactate metabolism gene sets reveals strong associations with immune dysregulation, these approaches primarily identify correlative relationships. Establishing causal links between specific lactate metabolism genes and RCC pathogenesis remains a significant challenge, as observational studies are susceptible to confounding and reverse causation. To address this critical gap and robustly infer causality, we employed Mendelian Randomization (MR). MR leverages genetic variants as instrumental variables (IVs), exploiting their random allocation to minimize confounding, thereby providing stronger evidence for causal effects of gene expression on disease risk.12,13 Recent advances have integrated MR with single-cell RNA sequencing (scRNA-seq) to elucidate cell-type-specific causal mechanisms and phenotypic traits.14,15 However, while studies on cell signaling mechanisms are prevalent, they often overlook the spatial localization of cells and their interactions. Here, we pioneer the integration of single-cell RNA sequencing, spatial transcriptomics, and MR to establish causal links between lactate metabolism genes and immune dysregulation in RCC. This multi-dimensional approach overcomes key limitations: (1) Spatial transcriptomics resolves cellular architecture lost in scRNA-seq dissociation; (2) MR provides causal inference beyond correlative analyses; (3) Their integration enables spatially anchored causal gene discovery—a paradigm shift in tumor microenvironment research.
Methods and materials
Retrieval of datasets
Data were downloaded from public databases. Single-cell analysis: 31 samples from GSE242299 (GEO; https://www.ncbi.nlm.nih.gov/geo/), comprising 19 disease cases (histologically confirmed clear cell RCC) and 12 controls (adjacent non-tumor tissues from RCC patients). Spatial transcriptomics: 12 disease samples from GSE175540 (all histologically confirmed RCC). RNA-seq: GSE102101 (20 patients: 10 RCC cases and 10 matched adjacent normal tissues) with annotation file GPL11154. MR analysis: We employed a two-sample MR (TSMR) design to investigate the causal associations between lactate metabolism gene expression and RCC risk. This design utilizes summary statistics from two independent, non-overlapping sample sets for the exposure (gene expression) and outcome (disease). Exposure Data (Gene Expression): IVs for lactate metabolism gene expression were derived from cis-eQTL summary statistics obtained from the eQTLGen consortium (https://www.eqtlgen.org). This dataset provides genome-wide significant associations between genetic variants (SNPs) and gene expression levels measured in whole blood, primarily based on individuals of European ancestry. We focused on cis-eQTLs (SNPs located within 1 Mb upstream or downstream of the gene transcription start site (TSS)) for each lactate metabolism gene. Outcome Data (Disease Risk): Summary statistics for the association between the same SNPs and RCC risk were obtained from the FinnGen consortium (https://www.finngen.fi/en/access_results). Specifically, we used results from the kidney clear cell carcinoma analysis (finngen_R11_C3_KIDNEY_CLEAR_CELL_CARCINOMA_EXALLC), comprising 1354 cases and 345,118 controls. This dataset is derived exclusively from the Finnish population.
Case/Control Criteria: Disease groups included samples with histopathological confirmation of RCC. Control groups consisted of adjacent non-tumor tissues (for GSE102101 and GSE242299) or healthy renal tissues (for FinnGen controls). All sample groupings were verified against original GEO metadata and clinical annotations.
MR analysis
We focused on cis-acting expression quantitative trait loci (cis-eQTLs) for each lactate metabolism gene. Variants were restricted to those located within ±1 Mb of the gene TSS, as cis-eQTLs exhibit stronger biological interpretability and reduced pleiotropy compared to trans-eQTLs. On this basis, we chose the SNPs with the total significance level of 1×10−6 for each gene in the whole locus as the potential IVs. We also tested for the LD among these SNPs, and thus, only those SNPs that had R² more than 0.001 were removed using a clumping window size of 10,000. In this study, when selecting weak IVs, the expected F-value was used with a condition that it should be more than 10. Other statistical tests were also conducted in the analysis of the data: (1) IVW: This meta-analysis method is a weighted Wald statistics of an SNP to give an average. (2) MR Egger: It has a principle that the instrument strength does not rest on direct affectation (InSIDE). (3) Weighted Median: In this method of estimation, it is possible to make valid causal estimations even if as much as 50% of IVs are invalid. (4) Weighted Mode: The main advantage of this method is that it does not have a high risk of bias and type I error when compared to MR-Egger regression. If one statistical technique could be used in a particular SNP within the causal association, then only the Wald ratio was used for evaluation. These were to establish the stability of the causal effects, which would quantify the sum of the cis- and some of the cross-region gene expressions in the peripheral blood on renal cancer lesions. In the end, screened causal relationships were also verified, and leave-one-out was also used in the analysis. MR assumptions: (1) Relevance: Ensured by F-statistic > 10. (2) Independence: IVs are not associated with confounders (assessed via FinnGen covariates). (3) Exclusion restriction: IVs affect the outcome only through the exposure (tested via MR-Egger intercept test, P > 0.05 indicating no horizontal pleiotropy).
Sensitivity analysis
To answer the question as to which genetic variants can lead to kidney cancer, we conducted an MR leave-one-out analysis. It also helps in omitting variants of large effects on the total estimate by systematically removing each SNP and redistributing the SE of the rest of the SNPs. In each case, it was possible to obtain the new point estimate with 95% CI, thus minimizing the limitations of the method and determining the impact of every SNP individually. We have also provided these estimates and the estimate of the time when all the SNPs were taken into consideration. The above estimates will help in assessing the impact of omitting any particular SNP for proper assessment of the stability of the conclusions that have been made on genetic variants and kidney cancer risks.
Gene set enrichment analysis (GSEA)
Samples were stratified by gene expression into high/low groups. GSEA was performed using MSigDB v7.0 gene sets. Lactate metabolism-related genes were curated from Huang et al. (2024), 16 which established a comprehensive lactate metabolism signature comprising 332 genes through literature mining and pathway enrichment analysis. The following annotated gene list was used to perform the differential expression on the pathways to consider the subtype pathways of the two groups. The genes were then ranked based on the adjusted p-value, and only genes with an adjusted P-value of < 0.05 were considered significantly enriched genes of interest. This is because the GSEA approach is often used in works that incorporate disease classification with the directionality of gene sets to their functions, which gives a further understanding of molecular pathways.
Tissue microarray (TMA) immunofluorescence staining
For immunostaining, the sections were put in 3% H2O2 for 20 min followed by deparaffinization and heat-mediated antigen retrieval treatment and incubated in blocking buffer (3% BSA in PBS supplemented with 0.1% Triton X-100) at room temperature for 1 h. Sections were incubated with primary antibody for 2 h, followed by detection using HRP-conjugated secondary antibody and TSA-fluorophores. The primary and secondary antibodies were eliminated by heating the slides. Alpha-1-Antitrypsin (Proteintech, Cat# 16382-1-AP, 1:200), and C4 Alpha Chain (Proteintech, Cat# 22233-1-AP, 1:200) were sequentially detected. Vectashield containing DAPI nuclear counterstain was used to mount the sections. The slices were imaged using the Olympus SLIDEVIEW VS200 slide scanner. TMA of human RCC and paired adjacent normal tissues (HKidE180Su04) were purchased from Shanghai Outdo Biotech Company (Shanghai, China). The study was approved by the Ethics Committee of Shanghai Outdo Biotech Company (SHYJS-CP-240303).
Statistical analysis
For statistical analysis, the inverse-variance weighted (IVW) method under a random-effects model was designated as the primary analysis approach for estimating causal effects between lactate metabolism gene expression and RCC risk. The statistical methods employed included the random-effects IVW regression as the primary method for main causal estimates, supplemented by MR-Egger regression to assess and adjust for directional pleiotropy, weighted median estimator robust to ≤50% invalid instruments, weighted mode estimator robust to invalid instruments with similar causal estimates, MR-PRESSO to detect and remove outliers, Wald ratio for single-SNP analyses, and multivariable MR to adjust for potential pleiotropic pathways. We performed comprehensive heterogeneity tests to assess the consistency of causal estimates across individual genetic variants, including Cochran's Q statistic to quantify heterogeneity among IVW estimates, with a significance threshold of P < 0.05 indicating substantial heterogeneity; I2 statistic to measure the percentage of variation across studies due to heterogeneity rather than chance, where I2 values of 25%, 50%, and 75% represented low, moderate, and high heterogeneity, respectively; and leave-one-out analysis to identify potential outlier variants driving heterogeneity. In the presence of significant heterogeneity (Cochran's Q P < 0.05), we utilized random-effects models instead of fixed-effects models and applied MR-PRESSO to detect and remove pleiotropic outliers. Given the testing of multiple lactate metabolism genes against RCC risk, we applied Bonferroni correction to account for multiple comparisons, with the significance threshold adjusted to P < 0.05/[number of independent tests]. For sensitivity analyses, consistency across multiple methods (IVW, MR-Egger, weighted median) was required to declare significance. Key thresholds and parameters used in the analyses can be found in Supplemental Table 1. All the statistical tests were done using R software with a P-value of 0.05.
Results
Single-cell RNA-sequencing processing
We analyzed single-cell expression profiles from 31 renal cancer samples using Seurat. Cells were filtered based on UMI counts, gene numbers, and mitochondrial gene percentage (using ±3 MAD from median as outlier threshold). DoubletFinder removed doublets, leaving 45,602 cells for analysis (Supplemental Figure S1A & B). The 10 genes with largest standard deviations are shown in Supplemental Figure S1C.
ElbowPlot selected 20 PCs (Supplemental Figure S1D). PCA revealed batch effects (Supplemental Figure S1E), corrected using Harmony (Supplemental Figure S1F). UMAP identified 12 subtypes (Figure 1(A)), annotated into 10 cell types: Macrophages, Monocytes, T cells, NK cells, Epithelial cells, B cells, Proximal Tubular Cells, Vascular Smooth Muscle Cells, Mast Cells, and Endothelial Cells. Cell categorizations are shown in Figure 1(B), with marker bubble charts (Figure 1(C)) and sample proportions (Figure 1(D)). Using 332 literature-derived lactate-related genes, 16 single-sample gene set enrichment analysis (ssGSEA) scoring revealed significantly elevated lactate levels in disease samples for macrophages, T cells, epithelial cells, vascular smooth muscle cells, and endothelial cells (Figure 1(E)).

Single-cell RNA data processing and annotation. (A) The scatter plot displays 12 cell clusters derived from dimensionality reduction using the UMAP tool, with each cluster represented by a different color. (B) The plot illustrates the annotation results of 12 cell clusters, ultimately identifying 10 distinct cell types. (C) Bubble plot of 10 types of cells and cell markers. (D) The left plot shows the proportion of 10 cell types in the single-cell RNA dataset, the middle plot depicts the proportion of these 10 cell types in normal and tumor tissues, and the right plot displays the cell count for each of the 10 cell types. (E) The left plot shows the lactate score of 10 cell types, and the right plot illustrates the differential expression of lactate scores across 10 different cell types in normal and tumor tissues.
Spatial transcriptome sequencing data analysis
We analyzed spatial transcriptomes from 12 renal cancer samples. UMI counts were higher in epithelial regions (Supplemental Figure S2). After normalization, we performed PCA, UMAP, and Louvain clustering, identifying 15 subgroups (Figure 2(A)). Differentially expressed genes (DEGs) for each cluster were identified using Wilcoxon rank-sum test (expression significantly higher in-cluster than others, expressed in ≥10% of spots). The top 5 DEGs per subgroup were presented (Figure 2(B)). Furthermore, functional analysis of the genes was performed using the R package name “ClusterProfiler.” Analysis revealed that these DEGs were significantly enriched in biological processes such as antibacterial humoral response and oxygen transport. Their subcellular localization was primarily associated with structures like blood microparticles and the IgG immunoglobulin complex, while their molecular functions were related to oxygen carrier activity and antigen binding (Figure 2(C)).

Spatial transcriptome sequencing data processing and subpopulation segmentation. (A) The left plot displays 15 subgroups derived from the analysis of 12 spatial transcriptomics samples, achieved through nonlinear dimensionality reduction using UMAP and Louvain clustering. The right plot illustrates the grouping of these 12 spatial transcriptomics samples. (B) The heatmap shows the differential expression of the top 5 ranked genes across spots in 15 subgroups. (C) Bar plots and bubble plots of GO enrichment analysis for differentially expressed genes based on ClusterProfiler.
RCTD deconvolution and spatial ligand-receptor analysis
Using spacexr, we deconvolved spatial transcriptome data with single-cell data to identify dominant cell types per spot (Figure 3(A)). Deconvolution accuracy was verified by identifying marker genes per cell type using FindAllMarkers (logfc.threshold = 0.585, min.pct = 0.25, only.pos = FALSE; Figure 3(B)). CellChat visualized ligand-receptor interactions, revealing complex cell subtype networks (Figure 3(C) and (D)). Subsequently, ssGSEA scored lactate gene set enrichment in spatial data, dividing macrophages into high/low lactate groups. Differential analysis (|LogFC|>0.585, p_val_adj < 0.05) between these groups identified 496 DEGs (DiffGene.csv; Figure 3(E)). Further, we performed the GO pathway analysis for these DEGs, and the results revealed that these genes are involved in wound healing, humoral immune response, and response to steroid hormones biological processes (Figure 3(F)). The latter refers to any process that results in a change in state or activity of a cell or an organism as a consequence of a steroid hormone stimulus.

RCTD deconvolution and spatial ligand-receptor analysis. (A) Deconvolution analysis was performed on 12 spatial transcriptomic samples to determine the predominant cell type in each spot. (B) The scatter plot shows the most differentially expressed marker genes in each cell type. (C) A network of cellular interactions between nine types of cells, with the edge width indicating the probability and intensity of communication between cells. (D) A comparison of the total number of interactions in the communication network between nine cell species, decreasing from left to right, with the strongest being macrophages. (E) Volcano map of differentially expressed genes, with red indicating up-regulated differential genes and blue indicating down-regulated differential genes. (F) GO pathway enrichment analysis of the differential genes.
MR analysis
To further identify the key genes affecting kidney cancer development from the set of differential genes, we obtained outcome identifiers through summary statistics from 346,472 kidney cancer-related samples (Controls: 345,118; Cases: 1354): finngen_R11_C3_KIDNEY_CLEAR_CELL_ CARCINOMA_EXALLC. Through MR analysis, we identified a causal relationship involving six pairs of differential genes associated with eQTL positive outcomes (Figure 4(A)–(F), IVW P-value < 0.05). The corresponding genes are: C4A, CD70, FXYD2, SERPINA1, SERPINE1, and TUBB6. Among these, CD70 (0.554; 0.352–0.872; P = 0.011), FXYD2 (0.604; 0.422–0.864; P = 0.006), SERPINE1 (0.598; 0.365–0.981; P = 0.042), and TUBB6 (0.851; 0.737–0.984; P = 0.029) are associated with a decreased risk of disease, while C4A (1.265; 1.002–1.598; P = 0.048) and SERPINA1 (1.205; 1.025–1.417; P = 0.024) are linked to an increased risk of disease. To ensure the reliability of these causal relationships, we conducted sensitivity analyses on the six pairs of genes. The results indicated that the overall error bars exhibited minimal change when excluding any single SNP, demonstrating the robustness of the six identified causal relationships (Supplemental Figure S3A–F). Consequently, C4A, CD70, FXYD2, SERPINA1, SERPINE1, and TUBB6 have been established as key genes for our subsequent research (Supplemental Table 2).

Mendelian randomization analysis. (A–F) Scatter plot of MR analysis of key genes, different colors indicate different statistical methods, and the slope of the lines respectively indicate the causal effect of each method.
Immune infiltration analysis
The microenvironment is primarily composed of fibroblasts, immune cells, extracellular matrix components, various growth factors and inflammatory mediators. We illustrated the distribution of immune infiltration levels and the correlations among immune cell types in different contexts (Figure 5(A) and (B)). In our analyses, we observed that the levels of M1 macrophages, CD8T cells, follicular helper T cells, resting NK cells, and activated memory CD4T cells were significantly elevated in the disease group compared to the control group. Conversely, levels of naive B cells, activated NK cells, and plasma cells were significantly decreased (Figure 5(C)). Further investigation into the relationships between key genes and immune cells revealed that C4A exhibited a significant positive correlation with activated memory CD4T cells, while demonstrating a significant negative correlation with activated NK cells and plasma cells (Figure 5(D)). CD70 showed a significant positive correlation with activated memory CD4T cells and resting NK cells, along with a notable negative correlation with naive B cells and activated NK cells (Figure 5(E)). FXYD2 was found to have a significant positive correlation with naive B cells and activated NK cells, while exhibiting a significant negative correlation with activated memory CD4T cells and CD8T cells (Figure 5(F)). SERPINA1 showed strong positive correlations with activated memory CD4T cells and follicular helper T cells, but significant negative correlations with naive B cells and activated NK cells (Figure 5(G)). SERPINE1 exhibited a significant positive correlation with activated memory CD4T cells and a significant negative correlation with resting mast cells (Figure 5(H)). TUBB6 was notably positively correlated with resting NK cells and activated memory CD4T cells, while being significantly negatively correlated with naive B cells and activated NK cells (Figure 5(I)). Additionally, we analyzed the correlations between key genes and various immune factors, including immunosuppressive and immunostimulatory factors, chemokines, and receptors (Supplemental Figure S4A–E, Supplemental Table 3). These analyses underscore that key genes are closely associated with immune cell infiltration levels and play a critical role in shaping the immune microenvironment.

Analysis of immune infiltration. (A) The vertical stack bar chart shows the proportions of different immune cells in the disease group and control group samples. (B) The heatmap displays Pearson correlations between immune cells, negative in blue and positive in red. (C) The box plot shows differences in immune cell content between control and tumor samples. (D–I) The dumbbell charts show a correlation between key gene immunity and epidemic cells.
Signaling pathways involving key genes and their relationship with disease-associated genes
Therefore, we wanted to know which identified genes are involved in the specific signaling pathways and how these genes might affect disease progression at the molecular level. The GSEA revealed that C4A is involved in some signaling pathways such as NOD-like receptor signaling pathway, cytosolic DNA-sensing, and Toll-like receptor signaling pathway (Figure 6(A)). Likewise, the targets of CD70 were also observed to be involved in the NOD-like receptor signaling pathway, cytoplasmic DNA-sensing signaling pathway, and RIG-I-like receptor signaling pathway (Figure 6(B)). The gene FXYD2 was also identified to be significantly associated with retinol metabolism, tyrosine metabolism, and butanoate metabolism (Figure 6(C)). Finally, SERPINA1 was associated with the TNF signaling pathway, the cytosolic DNA-sensing pathway, and the chemokine signaling pathway (Figure 6(D)).

Signaling pathways involving key genes and their relationship with disease-associated genes. (A–F) KEGG pathway analysis for genes C4A, CD70, FXYD2, SERPINA1, SERPINE1, and TUBB6. (G) The box plot shows differences in the expression of tumor regulatory genes, with blue representing control patients and yellow representing tumor patients. (H) The middle figure shows Pearson correlation analysis of key genes and tumor genes, with a negative correlation in blue and a positive correlation in red. The left graph shows the correlation between CD70 and HLA-B, and the right graph shows the correlation between FXYD2 and CASP10.
Meanwhile, the activated TNF signaling pathway and NOD-like receptor signaling pathway were correlated with SERPINE1 and the NF-kappa B signaling pathway (Figure 6(E)). At the same time, increased expression of TUBB6 was observed in the TNF signaling pathway, chemokine signaling pathway, and p53 signaling pathway (Figure 6(F)). Further, gene set variation analysis (GSVA) was used to determine the genes contributing to the disease progression through these pathways (Supplemental Figure S5A–F).
The disease regulation genes were identified from the GeneCards database. This paper concentrated on analyzing 20 genes out of the total 1530 genes, which were identified as being most relevant and included in the transcriptome based on the Relevance*Score. We found that the genes, including ATM, BRCA2, CASP10, and CCND, significantly differ between the two patient groups (Figure 6(G)). Further, we have carried out cross-correlation by comparing the above-identified genes with the disease-regulated genes. The changes in the expression of the key genes were found to be statistically significant with the changes in the disease-regulated genes. In particular, CD70 and HLA-B showed a very significant direct association (r = 0.829, P = 6.3×10−6), whereas FXYD2 and CASP10 had a significant negative relationship (r = −0.705, P = 0.00052) (Figure 6(H)).
We finally used the mircode database to perform reverse prediction analysis of the key genes, which revealed 64 miRNAs and 142 total mRNA/miRNA interaction pairs. These relationships were shown in Cytoscape (see Supplemental Figure S6A). In the present study, we were interested in the direction of the key genes as the gene set and found out that several common processes, such as transcription factors control them. To pursue these regulation mechanisms, cumulative recovery curves of the transcription factors were performed on the enrichment analysis (Supplemental Figure S6B). Using the Motif-TF annotation and selection, it is evident that the highest value of NES = 5.52 belongs to cisbp__M4723. All the enriched motifs and the transcription factors related to the essential genes have been included here (Supplemental Figure S6C).
Expression profiles of key genes in spatial transcriptome data and developmental trajectories in spatial transcriptome data
This study analyzed the expression of key genes within the spatial transcriptome, highlighting the expression abundance of C4A, CD70, FXYD2, SERPINA1, SERPINE1, and TUBB6 across 12 spatial transcriptome datasets (Supplemental Figure S7A–F). We extracted key cellular macrophages from the spatial transcriptome in our quasi-chronological analysis. Initially, we calculated the similarity between cells to construct a cell differentiation trajectory. In this way, we created a diagram of cell differentiation throughout time, which helps when studying cell differentiation and active genes at a certain time. Therefore, we generated pseudo-time values (pseudotime is computed by Monocle2 using cell gene expression data, which represents the order of cell developmental stages) and Seurat clusters of cells and created images of the same (Figure 7(A) and (B)). These images represent the dynamics of the expression of those genes during the pseudotime from the start to the end (Figure 7(C)). Additionally, we incorporated pseudotime mapping into the images presented in H&E (Supplemental Figure S8 and Figure 7(D)). This integration of spatial and pseudotemporal data provides crucial insights into how the local microenvironment spatially orchestrates macrophage functional programming, potentially influencing disease progression and response at specific anatomical sites.

Correlation between cell development trajectories and metabolic pathways. (A–B) The scatter plots show a Quasi-temporal analysis of cells and developmental trajectories. (C) The images show the relationship between key gene expression and cell development locus. (D) The pseudotime is mapped onto the H&E images in 12 spatial transcriptomics samples.
Validation using a TMA
To validate the contribution of key genes to RCC risk, we selected C4A and SERPINA1 from a pool of six candidate genes, as they were the only two significantly associated with an increased risk of RCC. Immunofluorescence staining was performed on a TMA (HKidE180Su04), which included samples from 93 patients, among which 87 cases contained both cancerous and adjacent non-cancerous tissues. Comparative analysis of C4A expression levels in these 87 patients revealed significantly higher expression in tumor tissues compared to adjacent normal tissues (Figure 8(A) and (B)). A similar pattern was observed for SERPINA1, showing consistent overexpression in cancerous regions (Figure 8(C) and (D)).

Analysis of C4A and SERPINA1 expression by immunofluorescence staining on tissue microarrays (TMA) (A and C) Representative images of C4A or SERPINA1 immunofluorescence staining on the TMA. (B and D) Comparison of C4A or SERPINA1 expression between tumor and adjacent normal tissues on the TMA.
Co-expression network of key genes and disease genes in single cells and activity differences between metabolic pathways
We obtained kidney cancer-related genes from the GeneCards database. From this dataset, we selected three genes (HLA-DRB1, PTEN, and TP53) with the highest Relevance scores. We explored the co-expression network of these key genes, as well as their correlation with kidney cancer-related genes, through correlation analysis (Supplemental Figures S9, S10, S11, S12, and S13). Our study employed the GSVA software package to analyze the single-cell expression profiles using the default Gaussian parameters. After normalization, we calculated the ssGSEA scores to construct the expression profile's metabolic pathway matrix (ssgseaOutAll.txt). The pheatmap package was then utilized to visualize this metabolic pathway matrix, creating a heat map based on the activity grouping of the key genes expressed in single cells (Supplemental Figure S14). Our findings indicated that the FXYD2-positive group exhibited higher activity in the amino acid metabolism-related markers, specifically Arginine Biosynthesis and the Urea Cycle.
Discussion
The global incidence of RCC has risen over recent decades, with smoking, obesity, and hypertension established as significant risk factors. 17 Surgical resection remains the primary treatment for early-stage RCC. 18 However, many patients present with advanced disease at diagnosis. While current therapies modestly extend survival in advanced stages, prognosis remains poor, evidenced by persistently low five-year survival rates. 19 The metabolic alterations in RCC, including Warburg effect and lipogenesis, create a biochemical environment that supports uncontrolled tumor cell proliferation by meeting the high bioenergetic and biosynthetic demands of cancer cells. 20 This reprogramming stems from inactivation of tumor suppressors and activation of oncogenes. 21 Consequently, identifying and characterizing RCC-associated metabolic genes holds significant promise for improving disease management.
Lactate mediates multiple cancer-promoting processes, including metabolic reprogramming of the TME, protein lactylation, immunosuppression, chemoresistance, epigenetic modulation, and metastasis.22,23 Clinically, elevated lactate levels correlate strongly with poor prognosis across cancer types. For instance, lactate suppresses T-cell and dendritic cell activity, facilitating immune evasion by cervical cancer cells—thereby increasing tumor invasiveness and reducing survival rates. 24 In colorectal cancer, lactate drives tumorigenesis and accelerated progression, positioning inhibition of its transporter MCT1 as a promising therapeutic strategy. 25 Lactate metabolism is equally critical in RCC pathogenesis. It enhances RCC aggressiveness by inducing phenotypic changes in TME-resident normal cells, promoting malignant transformation. 26 Mechanistically, FKBP10 directly binds to and phosphorylates LDH, amplifying the Warburg effect and histone lactylation to fuel RCC progression. 27 Conversely, downregulation of HSPA12A—associated with lymph node metastasis—reduces lactate production and glycolysis, ultimately suppressing RCC cell migration and positioning HSPA12A as a novel metastasis inhibitor. 28 Despite these insights, systematic characterization of lactate metabolism genes in RCC remains limited, warranting deeper investigation.
Transcriptome sequencing has historically been fundamental for identifying tumor-driving genes. 29 However, conventional bulk RNA-seq analysis primarily captures aggregate gene expression patterns, limiting resolution for cellular heterogeneity studies. ScRNA-seq overcomes this constraint by enabling gene expression profiling at individual-cell resolution, facilitating characterization of cellular composition, activity states, and stochastic gene expression within populations. 30 Despite scRNA-seq's transformative impact on delineating tumor heterogeneity, establishing causal relationships between co-expressed genes remains challenging. MR—a genetic epidemiology approach leveraging genetic variants as IVs—integrates with scRNA-seq to infer causal genotype-phenotype relationships.31,32 This synergy elucidates how specific genetic variants causally regulate gene expression, enabling identification of causal genes and their mechanistic pathways.
This study distinguishes itself from previous work through three key innovations. First, we pioneer the integration of single-cell RNA sequencing, spatial transcriptomics, and MR to establish causal links between lactate metabolism genes and immune dysregulation in RCC. While prior studies combined scRNA-seq with MR analysis, they lacked spatial resolution.33–35 Our addition of spatial transcriptomics addresses this gap by preserving tissue architecture and enabling spatially anchored causal gene discovery—a paradigm shift in tumor microenvironment research.36–39 Second, unlike conventional transcriptome analyses that focus on overall gene expression, 29 our multi-omics approach dissects cell-type-specific lactate metabolism and spatially resolves interactions within the tumor niche. Third, we identify six novel lactate metabolism genes (C4A, CD70, FXYD2, SERPINA1, SERPINE1, TUBB6) with causal roles in RCC risk. We also confirmed that C4A and SERPINA1 are overexpressed in RCC tumor tissues using our existing cohort. Subsequent investigation demonstrated their significant correlations with distinct immune cell types, such as T cells, B cells, and NK cells, highlighting their importance in immune regulation. However, our approach has important limitations. First, our MR analysis relies on peripheral blood eQTL data to infer gene effects in tumor tissues. This introduces potential bias, as eQTLs can exhibit tissue-specificity due to differences in chromatin accessibility, transcription factor activity, and cellular composition between blood and kidney tumors. Second, our spatial transcriptomics data lacks single-cell resolution, potentially obscuring rare cell populations or subcluster-specific effects. Future studies should integrate higher-resolution spatial techniques with scRNA-seq to refine cell-type annotations. Third, the absence of a prospective sample size calculation is a limitation. The sample size was constrained by the available public data. Although FinnGen provides a large control group, the relatively small number of RCC cases (n = 1354) may reduce power to detect small causal effects, increasing the risk of false-negative findings. 40
Conclusion
Lactate metabolism plays a significant role in RCC pathogenesis by modulating tumor metabolic activity and immune cell infiltration. Through integration of single-cell sequencing, spatial transcriptomics, and MR, we identified six lactate metabolism genes (C4A, CD70, FXYD2, SERPINA1, SERPINE1, TUBB6) causally associated with RCC risk. These genes demonstrate distinct correlations with immune cell populations, suggesting their involvement in immune dysregulation within the tumor microenvironment. This study provides a foundation for translating lactate metabolism insights into clinical applications, though further validation is required to establish definitive therapeutic targets.
Supplemental Material
sj-docx-1-sci-10.1177_00368504251408094 - Supplemental material for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study
Supplemental material, sj-docx-1-sci-10.1177_00368504251408094 for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study by Jiangbo Li, Chen Liu, Zhiguang Fu, Qi He, Xu Liu and Zhijia Sun in Science Progress
Supplemental Material
sj-pdf-2-sci-10.1177_00368504251408094 - Supplemental material for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study
Supplemental material, sj-pdf-2-sci-10.1177_00368504251408094 for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study by Jiangbo Li, Chen Liu, Zhiguang Fu, Qi He, Xu Liu and Zhijia Sun in Science Progress
Supplemental Material
sj-docx-3-sci-10.1177_00368504251408094 - Supplemental material for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study
Supplemental material, sj-docx-3-sci-10.1177_00368504251408094 for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study by Jiangbo Li, Chen Liu, Zhiguang Fu, Qi He, Xu Liu and Zhijia Sun in Science Progress
Supplemental Material
sj-docx-4-sci-10.1177_00368504251408094 - Supplemental material for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study
Supplemental material, sj-docx-4-sci-10.1177_00368504251408094 for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study by Jiangbo Li, Chen Liu, Zhiguang Fu, Qi He, Xu Liu and Zhijia Sun in Science Progress
Supplemental Material
sj-docx-5-sci-10.1177_00368504251408094 - Supplemental material for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study
Supplemental material, sj-docx-5-sci-10.1177_00368504251408094 for Lactate metabolism orchestrates immune dysregulation in renal cancer: A multi-omics and causal inference study by Jiangbo Li, Chen Liu, Zhiguang Fu, Qi He, Xu Liu and Zhijia Sun in Science Progress
Footnotes
Acknowledgments
We are deeply grateful to Professor Kun Liu from the Department of Epidemiology, Faculty of Preventive Medicine, Air Force Medical University for his expert guidance on statistical methodology in this study. During the preparation of this work, the authors used ChatGPT-5 to assist with language polishing and grammar improvement. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the final content of the publication.
Ethics approval and consent to participate
The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study adhered strictly to the principles outlined in the Declaration of Helsinki (as revised in 2013). As the anonymous patient data is from the publicly available database, individual consent was not required for this research. This study was conducted with the approval of the Ethics Committee of the Air Force Characteristic Medical Center (Approval No. 2024-90-PJ31, Date: 2024-12-15). The tissue samples were collected with informed consent from the donors and the study was approved by the Ethics Committee of Shanghai Outdo Biotech Company (SHYJS-CP-240303).
Author contributions
Conceptualization: Zhijia Sun. Data curation: Jiangbo Li, Chen Liu, and Zhiguang Fu. Methodology: Zhijia Sun, Jiangbo Li, Chen Liu, and Qi He. Project administration: Zhijia Sun. Resources: Zhiguang Fu, Xu Liu, and Qi He. Software: Jiangbo Li. Supervision: Zhijia Sun. Validation: Zhijia Sun. Visualization: Zhijia Sun and Chen Liu. Writing—original draft: Zhijia Sun and Jiangbo Li. Writing—review & editing: Zhijia Sun.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Availability of data and materials
The original data supporting the conclusions of this study will be provided by the authors without reservation. For further inquiries, please contact the corresponding author directly.
Supplemental material
Supplemental material for this article is available online.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
