Abstract
Objective
To test whether exposure to bisphenol A is related to immune modulation and prognosis in breast cancer using an integrative framework.
Methods
Bisphenol A targets from toxicology resources were intersected with breast cancer genes. Protein–protein interaction networks were constructed, and hubs were prioritized by centrality. Survival was evaluated in The Cancer Genome Atlas breast cancer cohort using Cox and Kaplan–Meier analyses. The tumor microenvironment was profiled using ESTIMATE scores and CIBERSORT deconvolution. Consistency was examined in the independent METABRIC cohort. Docking was used to estimate bisphenol A–protein binding free energies.
Results
Filtering converged on four immunoregulatory genes—
Conclusions
This integrative analysis connected environmental toxicology to tumor immunity and nominated CCL19, CD40LG, IGLL5, and KLRB1 as candidate biomarkers for exposure–risk assessment. The findings are correlative and require mechanistic validation.
Introduction
Bisphenol A (BPA) is an endocrine disruptor with epidemiologic and experimental links to breast cancer (BC). However, the association between BPA and immune evasion in BC remains unclear. Immune surveillance and antitumor activity are strongly influenced by the tumor microenvironment (TME), in which chemokine signaling and costimulatory pathways govern lymphocyte trafficking and activation.
1
Using multidatabase target prediction, intersecting disease genes, STRING-based protein interactions, and survival modeling, we prioritized a four-gene immune axis (
In recent years, the TME has attracted increasing attention as a critical ecological niche that shapes tumor progression and therapeutic response. 2 The composition, activation status, and functional imbalance of immune-infiltrating cells within the TME are now recognized as key determinants of tumor immune escape. 3 For instance, CD8+ T cell exhaustion, accumulation of regulatory T cells, and suppression of antigen presentation pathways can collectively compromise immune surveillance, thereby facilitating tumor evasion and metastasis. 4 Although BPA has been shown to modulate immune function, it remains largely unexplored at the mechanistic level whether it contributes to TME remodeling through specific immunoregulatory mechanisms, ultimately promoting immune escape in BC.5,6
Previous studies have demonstrated that prolonged exposure to low-dose BPA at nanomolar concentrations can disrupt immune homeostasis at multiple levels. 7 Regarding innate immunity, BPA suppresses major histocompatibility complex (MHC) class II and interleukin (IL)-12 expression in dendritic cells and attenuates the cytotoxic activity of natural killer (NK) cells.8,9 In the context of adaptive immunity, chronic exposure over 8 weeks has been shown to skew CD4+ T cell differentiation toward Th2/Th17 phenotypes, reduce the interferon (IFN)-γ/IL-4 ratio, and enhance B cell activating factor receptor (BAFF-R) signaling, thereby promoting B cell activation and autoantibody production.10,11 Epidemiological data further support these findings, with positive correlations reported between urinary BPA concentrations and elevated serum levels of C-reactive protein and IL-6, indicating chronic inflammatory activation. 12
At the epigenetic level, BPA induces hypermethylation of the FOXP3 promoter, contributing to regulatory T cell dysfunction. 13 It also enriches H3K27ac modifications in chemokine gene clusters in mammary epithelial cells and suppresses the STING–type I IFN (IFN-I) pathway via the upregulation of miR-146a-5p.14,15 These findings collectively suggest that BPA exerts its immunomodulatory effects through DNA methylation, histone modification, and noncoding RNA pathways.16,17 In summary, long-term, low-dose BPA exposure establishes a permissive immune landscape that increases BC susceptibility via immune imbalance and epigenetic remodeling. However, its precise mechanistic interactions with specific immunoregulatory axes within the TME remain to be fully elucidated.
A review published in 2023 noted that environmental toxicological factors and the TME–immune axis interact through a complex multitarget network, with mechanisms that may transcend traditional hormone–receptor pathways and involve immune pathways such as chemokines, costimulatory molecules, antigen presentation, and complement signaling.
18
Additionally, BPA-induced “immune axis disruption” may influence immune cell recruitment, polarization, and functional states by regulating specific immune genes (such as
Building upon these findings, the present study proposes a novel “environmental exposure–immune target–immune escape” framework, integrating network toxicology, bioinformatics, and multiomics data analysis to construct an immunoregulatory network linking BPA exposure to BC progression. The study aims were as follows: (a) to intersect BPA‐predicted targets with BC genes; (b) to identify interaction hubs and survival-relevant candidates; and (c) to contextualize them with TME enrichment and immune cell infiltration in TCGA-BRCA, with directional consistency verified in METABRIC. All analyses were observational/computational and therefore predictive/correlative; mechanistic interpretations were hypothesis-generating and required functional validation. By elucidating the immunomodulatory effects of BPA within the BC microenvironment, this study provides a theoretical basis and identifies candidate biomarkers for exposure–risk assessment and immune-based intervention strategies. We hypothesize that BPA disrupts the CCL19–CD40LG–IGLL5–KLRB1 immune axis, reshaping the TME to favor immune evasion and tumor progression. This hypothesis offers a mechanistic framework to guide subsequent multiomics validation and experimental investigation. Following network toxicology analysis, the study further explored the effects of BPA on immune regulation and tumor progression, and the intervention effects of modulating key immunoregulatory genes on BPA toxicity were also assessed. The methods and procedures are outlined in Figure 1, aiming to provide deeper insights into the toxicity mechanisms of BPA in BC. Concurrently, the clinical role of immunotherapy in BC has expanded, particularly in triple-negative disease; however, patient selection remains challenging and hinges on biomarkers such as programmed death-ligand 1 (PD-L1), tumor mutational burden (TMB), and tumor-infiltrating lymphocytes (TILs). Authoritative reviews emphasize that intact antigen processing and presentation support durable responses to checkpoint blockade, encouraging mechanistic studies that link environmental exposures to TME biology. Building on these insights, we integrated network toxicology and multiomics to connect BPA exposure to immunoregulatory circuits in the BC TME. 20

Overview of the research design and data sources used in this study. This schematic illustrates the stepwise methodology used to investigate the impact of bisphenol A (BPA) on immune evasion in breast cancer. The study employed several data sources and analytical tools, Continued.including PharmMapper, Super-Prediction Model (PRED), and GeneCards for target prediction, STRING and Cytoscape for network analysis, and PubChem, UniProt, and University of California, San Francisco (UCSF) Chimera for molecular docking validation. TCGA database was utilized for clinical data analysis, including ImmuneScore and StromalScore, and further correlation with tumor-infiltrating immune cells using CIBERSORT. TCGA: The Cancer Genome Atlas.
Materials and methods
Network toxicology analysis
Screening of potential targets for BPA and BC
In this study, potential targets of BPA were predicted using multiple publicly available databases, including PharmMapper, SwissTargetPrediction, BATMAN-TCM, and SuperPred. The chemical structure and molecular formula of BPA were entered into these platforms to perform in silico target prediction. Duplicate and invalid entries were removed, and the resulting targets were consolidated to generate a list of putative toxicological targets.
Simultaneously, BC-related genes were retrieved from several disease-associated databases, including the Therapeutic Target Database, PharmGKB, GeneCards, and OMIM (https://www.omim.org/), using “Breast Cancer” as the primary keyword. After merging and deduplicating the retrieved gene sets, a curated list of BC-associated genes was compiled. These two datasets were then compared to identify intersecting targets, which served as the basis for subsequent network construction and functional analyses. The intersecting set of 179 genes formed the discovery network that was subjected to topological screening (as described in the subsection “Hub gene prioritization in the discovery PPI”).
Discrepancies between databases were resolved by retaining targets predicted by ≥2 platforms. To exclude nonspecific interactions, a protein–protein interaction (PPI) network was constructed using STRING, followed by Cytoscape analysis for hub prioritization. STRING settings for the discovery network were as follows: organism =
Regarding the rationale for the 0.700 cutoff, we prespecified a high-confidence threshold (STRING combined score ≥0.700) for the discovery-scale network of 179 intersecting genes to balance interaction specificity against network connectivity for downstream topological screening.
Regarding sensitivity to STRING thresholds, to assess robustness, discovery PPI analyses were repeated at alternative cutoffs (0.400 and 0.900/0.950). Hub identities and centrality ranks were qualitatively stable across thresholds, and the downstream conclusions remained unchanged. General procedures (databases, deduplication, STRING settings, and hub metrics) were conducted as described in the subsection “Statistical and bioinformatics analysis”; URLs and versions are listed in Table S1.
Overlap identification and Venn visualization
Candidate targets related to BPA and BC were aggregated from the curated resources described above, following the data source–specific inclusion thresholds defined in the subsection “Screening of potential targets for BPA and BC.” To ensure comparability across lists, identifiers were harmonized to HUGO Gene Nomenclature Committee (HGNC) gene symbols (
Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis
To explore the functional roles of the core targets, GO and KEGG pathway enrichment analyses were performed. The core targets were entered into the DAVID database, with the human genome (
Hub gene prioritization in the discovery PPI
Building on the intersecting targets defined in the subsection “Screening of potential targets for BPA and BC,” we prioritized hub genes within the discovery-scale network to guide downstream analyses. The discovery-scale PPI network was constructed from the 179 intersecting genes using STRING (organism =
TCGA database analysis
Level 3 RNA sequencing (RNA-seq) data and clinical data for the TCGA-BRCA cohort were retrieved from the Genomic Data Commons (GDC) Portal (Data Category: Transcriptome Profiling; Data Type: Gene Expression Quantification). We did not reprocess raw FASTQ/BAM files. For differential expression analyses, we used HTSeq-Counts (raw gene-level counts) and analyzed them using DESeq2, which applies internal size-factor normalization. For analyses requiring normalized continuous expression (e.g. visualization and immune-related scoring/deconvolution), we used GDC-provided upper-quartile (UQ)–normalized fragments per kilobase of transcript per million mapped reads (FPKM) (FPKM-UQ) values. This strategy leveraged the GDC-harmonized pipeline to minimize processing variability and maintain cross-sample comparability. To avoid confusion, we clarified that all expression inputs were used as provided by GDC (HTSeq-Counts for DESeq2; FPKM-UQ for tasks requiring normalized continuous data). For differential expression, DESeq2 was applied directly to raw HTSeq-Counts without prior batch correction, with internal size-factor normalization; no ComBat correction was applied to raw counts. When batch correction was applicable to continuous matrices (FPKM-UQ or DESeq2 VST), the ComBat policy outlined in subsection “Statistical and bioinformatics analyses” was followed. Discovery analyses were performed using TCGA-BRCA, although an independent METABRIC cohort was used solely to verify the external consistency of survival directionality for the four core genes (detailed information is presented in the subsection “PPI network and Cox regression analysis” and statistical settings in the subsection “Statistical and bioinformatics analyses”).
From TCGA-BRCA RNA-seq (transcriptome profiling), we retained primary tumor specimens (sample-type code “01”) and excluded nontumor tissues (e.g. “11” solid tissue normal) and metastases (e.g. “06”). When multiple tumor aliquots were available for a patient, they were collapsed to a single entry by TCGA barcode, retaining the lexicographically first aliquot to ensure reproducibility. We further required complete overall survival (OS) time/status for downstream analyses, yielding 1081 primary tumors. Subtype-stratified differential expression analyses were not performed; all DESeq2 models were fitted to the overall TCGA-BRCA primary tumor cohort. When referenced for context, triple-negative BC (TNBC) refers to immunohistochemistry (IHC)-based estrogen receptor–negative (ER−)/progesterone receptor–negative (PR−)/human epidermal growth factor receptor 2–negative (HER2−) status, as defined in TCGA clinical annotations.
To prevent conflation of technical and biological variation, biological subtypes (e.g. ER+/HER2−, TNBC) were not treated as batch effects. When ComBat was applied, the batch factor corresponded to the sequencing plate/center (plateID) derived from the TCGA barcode. Detailed batch correction policy is shown in the subsection “Statistical and bioinformatics analyses.”
Data availability statement
The datasets generated and analyzed during this study are available in publicly accessible repositories, ensuring transparency and reproducibility of the findings. The data sources, accession numbers, and repository links are provided below. Regarding gene expression data, the gene expression profiles analyzed in this study were obtained from the TCGA database. Regarding PPI data, the PPI network analysis was conducted using data from the STRING database (https://string-db.org/) and further processed in Cytoscape. The results and images of KEGG enrichment analysis were obtained from https://www.kegg.jp/kegg/kegg1.html.
Transcriptomic data acquisition and processing
RNA-seq and clinical data for TCGA-BRCA were retrieved from the GDC Portal (Transcriptome Profiling → Gene Expression Quantification). We did not reprocess raw FASTQ/BAM files. For differential expression analyses, we used HTSeq-Counts (raw gene-level counts) and fitted DESeq2, which applies internal size-factor normalization; no batch correction was applied to raw counts. For tasks requiring normalized continuous expression (e.g. visualization and immune-related scoring/deconvolution), we used GDC-provided FPKM-UQ. Batch correction, when applicable, followed the ComBat policy detailed in the subsection “Statistical and bioinformatics analyses.” Acquisition and preprocessing were performed as described in the subsection “Statistical and bioinformatics analysis”; resources and versions are listed in Table S1.
Data sources, analysis lock, and rationale
We analyzed TCGA-BRCA RNA-seq (HTSeq-Counts; GDC-harmonized, GRCh38 pipeline) data and the METABRIC cohort (“Breast Cancer (METABRIC, Nature 2012 & Nat Commun 2016)” release on cBioPortal). We utilized a frozen local copy of these datasets to ensure exact reproducibility. TCGA and METABRIC were selected because they provide mature follow-up and harmonized processing, which support consistent survival modeling and cross-cohort validation. General processing and statistical settings (DESeq2 design/thresholds, BH false discovery rate (FDR), gene set enrichment analysis (GSEA) criteria, CIBERSORT/ESTIMATE options, and docking readouts) are consolidated in the subsection “Statistical and bioinformatics analyses”; software/database versions and releases are listed in Table S1.
To evaluate the TME, the ESTIMATE algorithm was used to compute ImmuneScore, StromalScore, and ESTIMATEScore, whereas immune cell deconvolution was performed using CIBERSORT. Overlap visualization of differentially expressed gene (DEG) sets was performed using jvenn, as described in the subsection “Overlap identification and Venn visualization.” Subtype-stratified differential expression analyses were not performed; all DESeq2 models were fitted to the overall TCGA-BRCA primary tumor cohort. When referenced for context, TNBC refers to IHC-based ER−/PR−/HER2− status according to TCGA clinical annotations.
ImmuneScore and StromalScore analysis
The composition of the TME was evaluated using the ESTIMATE algorithm to calculate ImmuneScore, StromalScore, and ESTIMATEScore for each sample. The “estimate” package in R was utilized, with higher scores indicating greater proportions of immune or stromal components within the TME. The correlation between these scores and patient outcomes was subsequently analyzed.
Functional enrichment of TME-associated DEGs
Functional enrichment (GO and KEGG) was performed using clusterProfiler; enrichment visualizations were generated with enrichplot, and ggplot2 was used solely for figure rendering. The results were visualized using bubble plots and bar charts to illustrate enriched BP and pathways, with a significance threshold set at p < 0.05.
Clinical correlation and survival analysis
This study examined the association between TME scores, clinicopathological characteristics, and survival in BC patients. The Wilcoxon rank-sum and Kruskal–Wallis tests were applied to analyze the relationships between TME scores and clinical features, including disease stage. Samples were divided into high- and low-score groups based on the median TME score, and survival analysis was conducted using the ‘survival’ and ‘survminer’ R packages. Statistical significance was set at p < 0.05, providing valuable insights into the prognostic relevance of TME scores in BC.
PPI network and Cox regression analysis
The PPI network of the 349 TME-associated DEGs was constructed using STRING (organism =
For topology-based prioritization, degree, betweenness, and closeness were computed for each node; per-metric ranks were min–max normalized to [0,1] and averaged to yield a composite rank. The top 30 genes by composite rank were considered PPI hubs (with ties broken by higher degree).
For integrative prioritization, univariable Cox proportional hazards models for OS (two-sided; implemented using the ‘survival’ R package) were fitted to each DEG, and genes with p < 0.05 were advanced. Univariable Cox significance served as the sole survival-based filter; no rescreening was performed later. Genes meeting both network and Cox criteria were carried forward to expression/survival presentation (as described in the subsection “Core gene expression and clinical validation”) and to the corresponding Results subsections (including “Network–survival intersection identifies four core genes (
Core gene expression and clinical validation
Survival analyses and Cox modeling followed the procedures described in the subsection “Statistical and bioinformatics analyses.” Briefly, for each core gene (
GSEA
GSEA (version 4.1.0 or later, Broad Institute) was conducted to identify pathways enriched in samples with high or low core gene expression. Gene sets from MSigDB (Hallmark and Immunologic Signature, v6.2) were utilized. Significance was evaluated by FDR q-values, with q < 0.25 considered significant (nominal p-values were also reported for completeness).Enrichment plots were generated to highlight the pathways associated with immune regulation and metabolic reprogramming.
Immune cell infiltration analysis
Immune cell deconvolution was performed using CIBERSORT (absolute mode; 21 immune cell subtypes), and TME scoring was conducted using ESTIMATE. The ggplot2 package was used solely for visualization. ImmuneScore and StromalScore, derived from the ESTIMATE algorithm, were used to quantify infiltration levels, whereas Spearman correlation analysis was used to evaluate the associations between core gene expression and immune subpopulations, including memory B cells, CD8+ T cells, monocytes, activated NK cells, and M0 macrophages. Heatmaps and correlation matrices, generated using ggplot2, revealed significant relationships, highlighting the immunoregulatory roles of core genes and their therapeutic potential in BC. CIBERSORT outputs were analyzed as relative immune cell fractions per sample; no aggregated TME score was derived from CIBERSORT.
External validation in METABRIC
Expression and clinical data for the METABRIC cohort were obtained from cBioPortal (study: METABRIC; Illumina HT-12 v3 expression with OS). We included primary invasive BCs with nonmissing OS time/status and nonmissing expression for the four core genes (
PPI docking
Protein–protein docking was conducted using the ClusPro 2.0 web server (https://cluspro.bu.edu/; PIPER algorithm). Three-dimensional structures of the core proteins—CCL19 (PDB: 2MP1), CD40LG (PDB: 3LKJ), IGLL5 (PDB: 7T17), and KLRB1 (PDB: 5MGS)—were retrieved from the RCSB Protein Data Bank (RCSB PDB). The results were expressed as ClusPro balanced energy scores, where more negative values indicate more favorable interactions; the general criterion for interpreting docking stability is described in the subsection “Molecular docking validation of BPA binding to target proteins.” Representative complexes were visualized in PyMOL (The PyMOL Molecular Graphics System, v2.0; Schrödinger, LLC) to highlight binding interfaces and key residues. These analyses provide structural insights into potential cooperative functions among the core proteins and underscore their therapeutic relevance in BC. These protein–protein docking analyses complement the small-molecule docking of BPA described in the subsection “Molecular docking validation of BPA binding to target proteins.” Docking parameterization and ΔG interpretation were conducted as described in the subsection “Statistical and bioinformatics analyses”; software and PDB IDs are listed in Table S1.
Molecular docking validation of BPA binding to target proteins
Molecular docking analysis was performed using AutoDock Vina to simulate interactions between BPA and BC-related target proteins. The three-dimensional structures of target proteins were retrieved from the PDB (PDB IDs as listed above) and preprocessed using PyMOL. During docking, a grid resolution of 0.375 Å was applied, and 100 docking simulations were conducted to ensure reliability. Docking stability was evaluated based on predicted binding free energies (ΔG) from AutoDock Vina, where more negative ΔG values indicate stronger predicted affinity.
The three-dimensional structure of the hub protein was retrieved from the PDB database, and the BPA structure was obtained from the PubChem database. Structural files were preprocessed using AutoDock by adding hydrogen atoms, removing water molecules, and converting them to PDBQT format. The binding site of the hub protein was analyzed to identify the active docking pocket. The prepared structural files of the protein and ligand were then imported into AutoDock Vina for docking validation. Finally, the docking results were visualized using PyMOL to examine the potential binding interaction between BPA and the target protein and generate visual representations of the docking conformation. Protein–protein docking and reporting (ClusPro ranking; visualization) were described in the subsection “Statistical and bioinformatics analyses”; the resources are listed in Table S1.
Statistical and bioinformatics analysis
To minimize redundancy, all statistical and bioinformatics methods (algorithms, thresholds, and software) are consolidated in this section. URLs and software versions are listed in Table S1 (“Software and Web Resources”), with canonical citations provided in the References. All statistical tests were two-sided, with a nominal significance threshold of p < 0.05 unless otherwise specified. For multiple comparisons, BH correction was applied; for GO/KEGG over-representation analyses, BH-adjusted p (Padj) < 0.05 was considered significant. For GSEA, significance was assessed using FDR q-values, with q < 0.25 considered significant; nominal p-values were also reported for completeness. Analytical conventions defined in this section (e.g. BH-adjusted p and docking ΔG interpretation) are referenced throughout the Results section to prevent redundancy. Student’s
Functional enrichment analyses were conducted using clusterProfiler (GO/KEGG over-representation and GSEA), whereas enrichment visualizations were generated with enrichplot, and general plots were created using ggplot2. TME scoring was performed using ESTIMATE, and immune cell deconvolution was conducted using CIBERSORT. Tool versions and canonical citations are provided in the References, where applicable.
In this study, “TME scores” refer exclusively to ESTIMATE-derived ImmuneScore, StromalScore, and ESTIMATEScore. CIBERSORT outputs were treated as immune cell fractions (relative abundances) and were not included under the term “TME scores.” No composite index was derived from CIBERSORT data.
Associations between gene expression and CIBERSORT-derived immune cell fractions were assessed using two-sided Spearman’s correlation tests. For families of correlations across multiple cell types, BH adjustment was applied, and BH-adjusted p-values were reported.
OS was evaluated using KM curves and log-rank tests. Cox proportional hazards models (univariable and multivariable) were fitted using the survival R package, and proportional hazards assumptions were verified using Schoenfeld residuals. HRs and 95% CIs were reported. Multiple testing was controlled using BH adjustment where applicable. Cox models were initially used as a univariable selection filter during hub identification (as described in the subsection “PPI network and Cox regression analysis”), and the same HRs were then reported with KM plots for the final four genes (as described in the subsection “Core gene high expression is associated with better survival and lower tumor expression”) to visualize effect direction. These KM displays do not constitute a second screening or independent validation.
TME scores (ImmuneScore, StromalScore, and ESTIMATEScore) were analyzed as continuous variables. For two-group comparisons (e.g. ER-positive vs. ER-negative, PR-positive vs. PR-negative, or early vs. late stage), we applied the Wilcoxon rank-sum test. For multilevel factors with ≥3 groups (e.g. American Joint Committee on Cancer (AJCC) stages I–IV and T stages T1–T4), we used the Kruskal–Wallis test. When the omnibus p was significant, we performed Dunn’s post-hoc pairwise comparisons with BH adjustment. Distributional assumptions were assessed using Shapiro–Wilk tests and Q–Q plots. As the TME scores frequently deviated from normality, nonparametric tests were chosen. Unless otherwise specified, all tests were two-sided. For multiple comparisons across clinical variables, we controlled the FDR using BH and reported the BH-adjusted p-values.
The ComBat function (
For each ESTIMATE metric (ImmuneScore and StromalScore), samples were dichotomized at the cohort median (high: ≥median; low: <median). Differential expression was analyzed using DESeq2 on raw HTSeq-Counts (size-factor normalization only), designed as ∼ score_group. Genes with |log2FC| > 1 and FDR < 0.05 (BH-adjusted) were considered significant. We named the resulting sets as ImmuneScore-associated DEGs and StromalScore-associated DEGs, and their intersection defines the shared TME-DEGs used in downstream analyses.
The thresholds for parameter sensitivity analyses were as follows: 1.
Results
A 179-gene BPA–BC intersection forms a dense interaction network
This study performed an in-depth analysis of the potential toxicity mechanisms linking BPA (4,4′-isopropylidenediphenol; PubChem CID: 6623) to BC using various bioinformatics tools. We initially screened databases and identified 382 potential toxicity targets associated with BPA and 2509 disease-related genes associated with BC. Using a designated intersection gene analysis platform (https://www.bioincloud.tech/task-meta), a Venn diagram analysis identified 179 intersecting genes between BPA targets and BC-related genes (Figure 2(a)). These genes constituted the preliminary target library for downstream analyses. To further process these targets, we used the UniProt database to convert protein names into standardized gene symbols, thereby facilitating consistent and accurate annotation. The intersection set comprised 179 genes, representing 46.9% of BPA’s predicted toxic targets and 7.1% of BC-related genes. This notable overlap suggests a strong potential mechanistic link between BPA exposure and BC pathogenesis. To investigate protein-level interactions, we constructed a PPI network of the 179 intersecting genes using the STRING database (Figure 2(b)) and imported the resulting data into Cytoscape for further analysis. Based on this, we constructed a comprehensive compound–target–disease network (Figure 2(c)), in which core targets were visually highlighted in dark purple, indicating their potential central roles in mediating the biological relationship between BPA and BC. After defining 179 shared targets and constructing the candidate network, we next examined functional enrichment to contextualize their roles (as described in the subsection “Shared targets are enriched for endocrine and immune signaling pathways”).

BPA–BC interaction network analysis via network toxicology. (a) Shared target genes between BPA and BC were identified, forming the intersection set. (b) STRING database–generated PPI network of intersection targets and (c) core target network visualized in Cytoscape, where color intensity indicates node connectivity. STRING combined-score cutoff: ≥0.700. BC: breast cancer; BPA: bisphenol A; PPI: protein–protein interaction.
During network toxicology analysis, the network analysis plugin in Cytoscape was used to perform a detailed evaluation of the STRING-derived TSV data. Core genes were identified by comparing each gene’s degree (Degree), BC, and CC, and selecting those with values greater than or equal to the average. Based on this screening, we constructed a refined network diagram (Figure 2(c)), highlighting the core genes. The color intensity of each node was adjusted according to its degree value, thereby visualizing the strength of interconnection among different targets. This network further reveals the key relationships among core targets, BPA-related targets, and disease-associated genes.
Regarding key results, we prioritized four immunoregulatory hub genes—
Shared targets are enriched for endocrine and immune signaling pathways
Enrichment procedures and thresholds are described in the subsection “Statistical and bioinformatics analyses”). Here, we first summarized GO over-representation results, followed by KEGG pathway enrichment for the 179 shared BPA–BC targets, highlighting the top terms/pathways (complete lists in Supplementary Tables). After identifying the intersecting genes, this study performed GO and KEGG enrichment analyses on the 179 shared targets to investigate their potential biological functions in the context of BPA exposure and BC. The prominence of immune and signaling terms motivated hub prioritization within the PPI, as described in the subsection “Topological prioritization nominates 14 candidate core genes”). Collectively, these enrichment results implicated membrane-proximal immune signaling and endocrine–immune crosstalk as potential mechanisms through which BPA perturbs BC biology.
As shown in Figure 3(a), GO enrichment analysis revealed that the intersecting genes were primarily involved in BP, such as aliphatic compound metabolism, cytokine metabolism, pain perception, unsaturated fatty acid metabolism, arachidonic acid metabolism, hormone metabolism, steroid metabolism, neurotransmitter metabolism, fatty acid derivative metabolism, and cAMP-mediated signal transduction. These processes highlight metabolic routes implicated in the initiation and progression of BC.

GO and KEGG pathway enrichment analysis of BPA–BC intersection targets. Functional annotation (GO) and pathway enrichment (KEGG) analyses of the 179 BPA–BC intersection targets are displayed. Each bubble denotes an enriched term, with its diameter proportional to the number of mapped genes and its color intensity reflecting statistical significance. The plot highlights BPA-linked processes central to BC biology, including metabolic reprogramming, hormone–receptor signaling, immune-inflammatory modulation, and prosurvival pathways. BC: breast cancer; BPA: bisphenol A; GO: gene ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes.
For CC, the enriched terms included membrane rafts, intrinsic components of synaptic and presynaptic membranes, organelle outer membranes, and neuronal cell body membranes. These localizations are consistent with TME remodeling processes linked to invasion and immune evasion. CC enrichments (e.g. plasma membrane, membrane rafts, and extracellular region) are interpreted in conjunction with MF and BP terms to indicate membrane-proximal chemokine and costimulatory processes relevant to immune cell communication within the TME (as shown in the subsection “Statistical and bioinformatics analyses”).
Regarding MF, key enriched terms included nuclear receptor activity, ligand-activated transcription factor activity, steroid dehydrogenase activity, steroid binding, and oxidoreductase activity. These functions represent mechanistic nodes relevant to immunoregulation and tumor promotion in BC.
As shown in Figure 3(b), KEGG enrichment analysis of the intersecting genes revealed several critical signaling pathways, including the PI3K-Akt signaling pathway, TNF signaling pathway, chemokine receptor interaction, prolactin-mediated oocyte maturation, insulin resistance pathways, and multiple pathogen-related pathways such as hepatitis B, hepatitis C,
Topological prioritization nominates 14 candidate core genes
To further elucidate the key regulatory factors linking BPA and BC, we analyzed the intersecting targets using the CytoHubba plugin in Cytoscape. Based on the degree centrality, betweenness centrality, and closeness centrality, we identified 14 candidate core genes, listed by degree ranking as follows: CCL19, CD40LG, CTNNB1, HSP90AB1, JAK2, TNF, IL6, PIK3R1, IGLL5, IGF1R, GRB2, KLRB1, KDR, and AR (Figure S1(A)). These prioritizations remained robust across reasonable STRING cutoffs (0.400/0.700/0.950), yielding largely overlapping hub sets and concordant centrality ranks. Given the immune relevance of the prioritized genes, we next evaluated the associations with OS in the TCGA-BRCA cohort (as shown in the subsection “Higher ImmuneScore/StromalScore predicts longer OS”).
As shown in Figure S1(B), GO enrichment analysis of the core genes revealed significant enrichment in BP associated with immune regulation, signal transduction, and cell proliferation, including positive regulation of interleukin-1β production, PI3K–Akt signaling, insulin-like growth factor receptor signaling, positive regulation of smooth-muscle cell proliferation, and activation of the MAPK cascade. At the CC level, the gene products were primarily localized to the outer plasma membrane surface, membrane rafts and microdomains, extracellular membrane constituents, and signal–receptor complexes, underscoring their involvement in structural and functional remodeling of tumor-cell membranes. At the MF level, the genes were significantly enriched in protein tyrosine kinase activity, insulin–receptor binding, hormone binding, cytokine activity, and receptor binding.
As shown in Figure S1(C), KEGG pathway enrichment analysis indicated involvement in several cancer-related pathways, including BC, PI3KAkt signaling, prolactin signaling, chemical carcinogenesis-receptor activation, proteoglycans in cancer, prostate cancer, fluid shear stress and atherosclerosis, and endometrial cancer. These pathways are associated with key tumorigenic processes, including cell proliferation, survival, apoptosis, and endocrine signal regulation. Notably, the coenrichment of the estrogen-signaling pathway and the BC pathway further supports a potential mechanism through which BPA may contribute to BC development by disrupting hormone regulation and intracellular signaling cascades.
Higher ImmuneScore/StromalScore predicts longer OS
KM survival analysis demonstrated that higher ImmuneScore and StromalScore values were significantly associated with prolonged OS in BC patients (Figure 4(a) to (c), p < 0.05). Differential expression for ImmuneScore (high vs. low, median split) identified 1122 ImmuneScore-associated DEGs (|log2FC| > 1, FDR < 0.05). In this section, “TME scores” specifically denote ESTIMATE ImmuneScore, StromalScore, and ESTIMATEScore. These findings suggest that increased immune cell infiltration and stromal cell abundance within the TME correlate positively with improved clinical outcomes. Notably, ImmuneScore and StromalScore emerged as independent prognostic biomarkers for BC, underscoring their clinical relevance in predicting patient survival and guiding therapeutic decision-making. To contextualize these survival associations clinically, we next examined the relationships between TME scores and key clinicopathologic variables (as shown in the subsection “TME score tracking with stage and hormone–receptor status”).

Kaplan–Meier survival analysis of ImmuneScore, StromalScore, and ESTIMATEScore in breast cancer patients. (a) Kaplan–Meier survival curve comparing high and low ImmuneScore groups based on the median value. Patients with a high ImmuneScore demonstrated significantly longer OS (p = 0.014, log-rank test). (b) Kaplan–Meier survival curve for StromalScore, showing that higher scores were associated with improved OS (p = 0.039, log-rank test) and (c) Kaplan–Meier survival curve for ESTIMATEScore, highlighting a positive correlation between higher scores and longer OS (p = 0.026, log-rank test). OS: overall survival.
TME score tracking with stage and hormone–receptor status
We next evaluated the associations between TME scores and key clinicopathologic variables (e.g. AJCC stage, T stage, and ER/PR/HER2 status). Unless otherwise stated, two-group comparisons of TME scores were tested using the Wilcoxon rank-sum test, whereas ≥3-group comparisons (e.g. AJCC I–IV) were tested using the Kruskal–Wallis test with Dunn’s post-hoc comparisons; BH-adjusted p-values are reported to account for multiple comparisons across clinical variables. Descriptively, TME scores demonstrated systematic variation across hormone–receptor strata and AJCC stages, consistent with the direction of effects described in the subsection “Higher ImmuneScore/StromalScore predicts longer OS,” and detailed test choices and multiplicity control are specified in the subsection “Statistical and bioinformatics analyses.” BH-adjusted p-values are reported. We then delineated transcriptomic correlates of TME variation by comparing high- versus low-score samples and characterizing shared DEGs (as shown in the subsection “TME-linked DEGs are enriched for leukocyte activation and chemokine signaling”).
TME-linked DEGs are enriched for leukocyte activation and chemokine signaling
Building on the survival associations described in the subsection “Higher ImmuneScore/StromalScore predicts longer OS,” we compared high versus low ESTIMATE scores (median split) to identify transcriptional programs associated with the TME. ImmuneScore high vs. low yielded 1122 DEGs, and StromalScore high vs. low yielded 1299 DEGs (|log2FC| > 1, FDR < 0.05)—as visualized in Figure 5(a), (c), and (d) (ImmuneScore) and 5B–D (StromalScore). The intersection (n = 349), hereafter referred to as shared TME-DEGs, is shown in the Venn diagram (Figure 5(d)) and was carried forward for functional enrichment and network analyses.

Heatmaps, Venn diagrams, and GO/KEGG enrichment analysis of TME-related DEGs. (a) Heatmap illustrating DEGs identified between high and low ImmuneScore groups.(b) Heatmap of DEGs identified between high and low StromalScore groups, analogous to (a). (c, d) Venn diagrams displaying shared upregulated (c) and downregulated (d) DEGs between ImmuneScore and StromalScore groups. (e–g) GO enrichment analysis for 349 shared DEGs, highlighting their involvement in biological processes such as leukocyte proliferation, T cell activation, and inflammatory response and (h) KEGG pathway enrichment analysis showing significant enrichment in chemokine signaling pathways and cytokine–cytokine receptor interactions for the shared DEGs. DEGs: differentially expressed genes; GO: gene ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes.
This analysis included 1081 TCGA-BRCA primary tumor samples after applying predefined filters. To avoid redundancy in results, the inclusion criteria are detailed in the subsection “TCGA database analysis.”
To prevent redundancy, enrichment parameters and thresholds are detailed in the subsection “Statistical and bioinformatics analyses.” We summarized GO over-representation and KEGG pathway enrichment for the shared TME-DEGs, highlighting the top-ranked terms/pathways (complete lists are provided in Supplementary Tables). GO terms were dominated by leukocyte proliferation and T cell activation/immune regulation (Figure 5(e) and (g)), whereas KEGG pathways highlighted chemokine signaling and cytokine–cytokine receptor interaction (Figure 5(h)).
Collectively, these TME-linked enrichments converge on leukocyte activation and cytokine–chemokine axes, consistent with the membrane-proximal immune signals described in the subsection “Shared targets are enriched for endocrine and immune signaling pathways.”
Network–survival intersection identifies four core genes (CCL19, CD40LG, IGLL5, and KLRB1)
To further elucidate the potential mechanisms of TME-associated DEGs in BC, a PPI network comprising 349 DEGs was constructed using Cytoscape software, based on data from the STRING database (Figure 6(a)). Key hub genes were systematically identified through CytoHubba analysis, employing a multiparameter approach that prioritized genes with high degree centrality, betweenness centrality, and closeness centrality. This comprehensive analysis identified the top 30 hub genes (Figure 6(b)). Subsequently, univariate Cox proportional hazards regression analysis was performed to identify DEGs significantly associated with BC patient survival (p < 0.05; Figure 6(c)). Intersection of the top hub genes from the PPI network with survival-associated genes revealed four core genes—

Protein–protein interaction (PPI) network and univariate Cox regression analysis of TME-related DEGs. (a) PPI network of 349 TME-related DEGs constructed using STRING database, visualized in Cytoscape. (b) Top 30 hub genes ranked by node centrality metrics (degree, betweenness, and closeness centralities) within the PPI network. (c) Univariate Cox regression analysis of 349 DEGs for BC patient survival, with the top 16 prognostic genes visualized in a forest plot and (d) Venn diagram illustrating the intersection of the top 30 PPI hub genes and the top 16 prognostic DEGs identified via Cox regression analysis, leading to the identification of four core genes: CCL19, CD40LG, IGLL5, and KLRB1. BC: breast cancer; DEGs: differentially expressed genes.
Core gene high expression is associated with better survival and lower tumor expression
For the four core genes prioritized in the subsection “Network–survival intersection identifies four core genes (CCL19, CD40LG, IGLL5, and KLRB1),” we present KM curves to visualize survival differences between high and low expression groups. The reported Cox HRs correspond to the selection models described in the subsection “PPI network and Cox regression analysis” and are presented here for completeness rather than as a separate analysis.
Wilcoxon rank-sum tests further revealed that the expression levels of these genes were significantly lower in tumor tissues than in adjacent normal tissues (Figure S2(A) to (D)). This observation was corroborated by paired analyses of normal and tumor tissues from the same patients (Figure S2(E) and (H)).
To explore the pathways associated with each core gene, we conducted GSEA (as shown in the subsection “High expression of core genes aligns with immune activation signatures and low expression with metabolic programs”) (Figure S2(I) to (L) for survival curves). In the independent METABRIC cohort, high expression of the four core genes showed directionally consistent associations with improved OS relative to TCGA-BRCA. KM curves mirrored TCGA patterns, and univariable Cox models yielded HRs below one for the high expression groups. As METABRIC served solely for consistency verification, effect-size details are not retabulated here; these results are presented qualitatively to avoid implying a second discovery analysis.
High expression of core genes aligns with immune activation signatures and low expression with metabolic programs
Given the demonstrated negative correlation between core gene expression and BC patient survival outcomes, GSEA was performed to elucidate the underlying molecular mechanisms. GSEA parameters and significance thresholds are detailed in the subsection “Statistical and bioinformatics analyses”); here, we highlight the principal enriched Hallmark and C7 signatures. Hallmark pathway analysis (Figure S3(A) to (D)) revealed significant enrichment of immune-related pathways in the high expression group, including allograft rejection, complement activation, and interferon responses—critical pathways for immune defense and antiviral immunity. In contrast, the low expression group exhibited enrichment of metabolic pathways, particularly glycolysis and oxidative phosphorylation (Figure S3(E) and (H)). C7 immunologic signature analysis (Figure S3(I) to (L)) demonstrated that the high expression group was significantly enriched for T cell, B cell, and NK-cell activation signatures, whereas the low expression group displayed minimal immune-related pathway enrichment (Figure S3(M) to (P)). These findings collectively highlight the essential roles of CCL19, CD40LG, IGLL5, and KLRB1 in shaping the tumor immune microenvironment in BC. Given the immune-skewed signatures, we next examined the correlations between immune cell fractions estimated by CIBERSORT (as shown in the subsection “Core gene expression is correlated with the levels of CD8+ T cells and memory B cells”). Immune activation signatures remained significant under a stricter FDR threshold (q < 0.10), confirming the robustness of these enrichment themes.
Core gene expression is correlated with the levels of CD8+ T cells and memory B cells
Here, we analyzed CIBERSORT immune cell fractions (relative abundances) rather than global ESTIMATE “TME scores.” Immune cell proportions were estimated as described in the subsection “Statistical and bioinformatics analyses,” and their associations with core gene expression are summarized below. Expression of CCL19, CD40LG, IGLL5, and KLRB1 displayed significant positive correlations with memory B cells, CD8+ T cells, and monocytes, and negative correlations with activated NK cells and M0 macrophages (Figure 7(a) and (b); Figure S4). These patterns indicate that the four-gene axis reflects an immune-inflamed microenvironment in BC. Finally, to generate structural hypotheses consistent with these functional associations, we performed protein–protein and ligand docking analyses (as shown in the subsections “Docking predicts favorable interfaces among core proteins” and “BPA shows predicted binding to core targets”).

Tumor immune cell profile and correlation analysis. (a) Bar plot illustrating the proportions of 21 tumor-infiltrating immune cell (TIIC) subtypes across BC tumor samples and (b) heatmap depicting the correlations among 21 TIIC subtypes in BC tumor samples. BC: breast cancer.
Docking predicts favorable interfaces among core proteins
To investigate the potential interactions among the core proteins in BC, protein–protein docking was conducted using the ClusPro server (PIPER) for CCL19 (PDB ID: 2MP1), CD40LG (PDB ID: 3LKJ), IGLL5 (PDB ID: 7T17), and KLRB1 (PDB ID: 5MGS). For each pair, we report the top-ranked ClusPro balanced scores (unitless; more negative denotes a more favorable score): −738.7 (CCL19–CD40LG), −1073.7 (CCL19–IGLL5), and −910.1 (CCL19–KLRB1) (Figure S5; Table S2). Docking stability interpretation is detailed in the subsection “Molecular docking validation of BPA binding to target proteins”; accordingly, these in silico PPI predictions support potential physical associations among the core proteins and are hypothesis-generating, pending experimental validation. Building on potential PPI interfaces, we next evaluated whether BPA could engage the same proteins at plausible binding sites (as shown in the subsection “BPA shows predicted binding to core targets”).
BPA shows predicted binding to core targets
Molecular docking of BPA to the core immune-related targets was performed using AutoDock Vina 1.1.2 for CCL19 (PDB ID: 2MP1), CD40LG (PDB ID: 3LKJ), IGLL5 (PDB ID: 7T17), and KLRB1 (PDB ID: 5MGS), following the parameterization described in the subsection “Molecular docking validation of BPA binding to target proteins.” Across all four proteins, the top-scoring binding poses exhibited negative predicted binding free energies (ΔG) (Table S3; Figure 8), consistent with potential ligand–target interactions. The general criterion for interpreting ΔG is outlined in the subsection “Molecular docking validation of BPA binding to target proteins.” These computational findings are predictive and warrant further empirical validation (e.g. biophysical binding assays or mutational probing of the predicted interfaces).

Three-dimensional molecular docking interactions of BPA with core target proteins. The molecular docking analysis visualized the interactions between BPA and the 14 core target proteins: (a) CCL19, (b) CD40LG, (c) CTNNB1, (d) HSP90AB1, (e) JAK2, (f) TNF, (g) IL6, (h) PIK3R1, (i) IGLL5, (j) IGF1R, (k) GRB2, (l) KLRB1, (m) KDR, and (n) AR. The docking results highlight multiple hydrogen bonds and hydrophobic interactions at the binding sites, which contribute to the high binding affinity and stability of BPA with the target proteins. BPA: bisphenol A.
Discussion
Our integrative analysis identified an exposure-anchored immune axis—CCL19, CD40LG, IGLL5, KLRB1—associated with immune activation signatures and improved survival in BC across TCGA and METABRIC cohorts. Molecular docking predicted favorable BPA–protein binding (negative ΔG) for all four proteins, providing a structure-based interaction model compatible with interference in immune signaling. This work provides a novel exposure-anchored integrative framework connecting environmental toxicology to tumor immunology, identifying a concise four-gene immune axis with prognostic relevance and extending prior multiomics and immunotherapy findings.20–23
Mechanistic context
CCL19 and CD40LG facilitate leukocyte homing and costimulation.24,25 Reduced CD40LG weakens dendritic cell–T cell costimulation and attenuates cytotoxic T cell activation.26,27 IGLL5 is associated with humoral immunity and antigen recognition, 28 whereas KLRB1 (CD161) modulates T/NK-cell regulation via inhibitory signaling.29–31 Enrichment analyses linked the four-gene axis to interferon-γ response, allograft rejection, complement, and endocrine-related signaling (this study), positioning these genes within immune activation programs in the TME. CC enrichment at plasma membrane/extracellular compartments is consistent with chemokine trafficking and immune-synapse costimulation (this study), providing spatial plausibility. These findings are consistent with recent multiomics studies delineating stromal–immune paracrine networks and T cell dysfunction in BC and related carcinomas.21,22
Receptor-proximal frameworks
BPA has been reported to engage ERRγ and activate AhR, with downstream effects on antigen-presenting cell maturation and Th1 polarization.32–38 These receptors interface with NF-κB/IRF transcriptional modules in immune cells3.6,39,40 Thus, a plausible pathway from BPA → ERRγ/AhR → NF-κB/IRF → CCL19/CD40LG is consistent with our observations, although direct receptor-to-promoter control in BC remains to be demonstrated.
Metabolic–immune crosstalk
Accumulation of lactate, fatty acids, and tryptophan-derived metabolites (e.g. kynurenine) can induce immunosuppression through AhR/PXR signaling.41,42 Consistent with this, enrichment analyses implicated tryptophan and lipid metabolism as well as the PI3K–Akt axis in immune evasion and tumor progression. These processes may intersect with microbiota-dependent circuits. 43
Clinical relevance
Higher expression of the four-gene set was associated with improved OS, with stronger effects in ER+/HER2− disease,44,45 and may help distinguish inflamed from cold tumors; related work proposes this profile as an “immunogenic ignition” signature for TME remodeling.46–48 Authoritative reviews have synthesized BC immunotherapy and emphasized antigen processing/presentation as central to durable checkpoint responses, aligning with our observation that exposure-anchored immune modulation converges on cytotoxic T cell function.20,23
Limitations and future directions
All findings are observational and correlative. Docking provides interaction models but not functional proof. Future priorities should include the following: (a) targeted perturbations (ligand/receptor assays, clustered regularly interspaced short palindromic repeats (CRISPR) modulation, and cytokine profiling) with dose–response and temporal designs; (b) subtype-specific analyses (ER/HER2 phenotypes, TNBC); (c) evaluation of coexposures with other endocrine-disrupting chemicals (EDCs) (e.g. di(2-ethylhexyl) phthalate (DEHP) and polychlorinated biphenyls (PCBs)) within an exposome framework; and (d) validation in additional multiethnic cohorts and model systems (patient-derived xenograft (PDX) and spatial transcriptomics).
Conclusion
We identified an exposure-anchored immune axis (CCL19, CD40LG, IGLL5, and KLRB1) associated with immune activation programs and improved survival in BC. These findings are observational and hypothesis-generating. Next, we will pursue targeted functional assays (ligand/receptor perturbation, CRISPR modulation, and cytokine profiling) and subtype-specific analyses and extend validation to additional cohorts and model systems. This framework may guide future exposure-aware immuno-oncology studies and aid in patient stratification.
Footnotes
Acknowledgments
The authors would like to express their gratitude to Wang Zhang for providing assistance in editing and refining the manuscript.
Author contributions
Yu Fang: Conceptualization, data curation, formal analysis, investigation, methodology, writing–original draft.
Jia-Li Yan: Data curation, formal analysis, visualization, and validation.
Wang Zhang: Project administration, funding acquisition, resources, software, supervision, validation, writing–review & editing.
Availability of data and materials
The datasets generated and analyzed during this study are available in publicly accessible repositories, ensuring transparency and reproducibility of the findings. Detailed information, including data sources, accession numbers, repository links, and corresponding URLs and software versions, is provided in Table S1.
Competing interests
The authors declare that they have no competing interests.
Funding
This work was supported by the Health research project of Anhui Province (No. AHWJ2023BAa20017), the Health research project of Anhui Province (No. AHWJ2023BAc20042), and the Applied Medical Research Project of Hefei Municipal Health Commission (No. Hwk2023yb006).
