Abstract
Background
The abnormal expression patterns of genes associated with cell cycle control are essential contributors to the initiation and progression of bladder cancer (BC).
Methods
In this research, transcriptomic profiles sourced from The Cancer Genome Atlas (TCGA) were examined to pinpoint cell cycle-related genes with significant expression differences between early-stage (I–II) and advanced-stage (III–IV) BC. The identified BC samples were then randomly split into training and validation groups. A prognostic model containing nine key genes was subsequently constructed from the training dataset using the least absolute shrinkage and selection operator (LASSO) regression algorithm, and its predictive reliability was independently confirmed in the validation dataset. Based on individual risk scores derived from this gene signature, patients were separated into high- and low-risk groups. Functional enrichment analyses (GO and KEGG), immune cell infiltration (CIBERSORT), and gene set variation analysis (GSVA) were performed to explore the biological functions and immune landscape associated with the risk signature.
Results
Survival analysis revealed substantially worse overall survival (OS) in patients categorized as high-risk. The prognostic efficacy of the constructed gene signature was further corroborated by receiver operating characteristic (ROC) curve analyses. Functional enrichment analyses revealed that these genes were chiefly associated with critical biological processes such as cellular differentiation, proliferation, and migration, involving key signaling pathways like PI3K-Akt signaling, positive modulation of the MAPK cascade, and cell-matrix interactions. Immune infiltration analysis showed that tumors from patients in the high-risk group had increased infiltration by M2 macrophages. Furthermore, GSVA results indicated that stromal and angiogenic functions were enriched in the high-risk group.
Conclusions
These results highlight the potential clinical utility of the newly developed cell cycle-associated gene signature, providing effective biomarkers to facilitate personalized therapeutic decision-making for BC patients.
Introduction
BC, ranked tenth among malignancies globally, constitutes a notable proportion of cancers involving the urinary tract. In 2022, bladder cancer remained a significant global health burden, with an estimated 613,791 new cases and 220,349 deaths. 1 Epidemiological data demonstrate a marked gender discrepancy, with males exhibiting incidence rates roughly three to four times higher than females, a phenomenon closely associated with specific risk factors, such as tobacco use and exposure to carcinogenic agents (especially aromatic amines). Currently acknowledged risk factors encompass cigarette smoking (contributing to 55.1% of BC cases), obesity characterized by elevated body mass index (BMI), occupational exposure to carcinogens, and persistent inflammation of the bladder (chronic cystitis). 2 Treatment modalities are primarily selected based upon clinical staging; transurethral resection of bladder tumor (TURBT) in conjunction with intravesical chemotherapy is currently the accepted standard for non-muscle-invasive BC. In contrast, muscle-invasive BC requires more aggressive management, including radical cystectomy combined with urinary diversion or a multimodal therapeutic approach (maximal endoscopic resection combined with radiotherapy and chemotherapy). These treatments aim to control tumor metastasis and reduce disease-specific mortality. 3 The prognosis of BC patients is influenced by numerous molecular and clinical factors. Patients with non-muscle-invasive BC have a relatively favorable 5-year survival rate (80–95%), although the recurrence rate among high-risk patients remains as high as 50–70%. 4 Conversely, patients with muscle-invasive BC have significantly poorer outcomes, with corresponding five-year mortality reported at approximately 30–50%, and these individuals often experience lymph node involvement or distant metastasis. 5 Although the existing tumor staging and grading systems partly guide therapeutic decisions and prognosis, they fail to fully capture the tumor heterogeneity and biological complexity of individual patients. Therefore, identifying effective biomarkers capable of accurately predicting treatment response and prognosis in BC is urgently required.
Abnormal regulation of the cell cycle is a fundamental characteristic of malignant tumor initiation and progression. Under physiological conditions, the cell cycle is strictly controlled to ensure orderly cell division; however, cancer cells acquire unlimited proliferative capacity by disrupting these precise regulatory mechanisms. At present, several critical signaling pathways have been recognized to govern the cell cycle, notably: (1) The integrated MAPK/ERK signaling cascade critically regulates tumor cell fate by coordinating key processes like proliferation and survival; 6 and (2) the PI3K-AKT-mTOR axis, controlling cellular metabolism, growth, and resistance to apoptosis; 7 (3) TGF-β pathway, promoting epithelial-to-mesenchymal transition; (4) NF-κB pathway, mediating inflammation, immune responses, and cell survival; 8 (5) JAK-STAT pathway, mediating cytokine signaling and influencing proliferation and immunity; (6) The p53 pathway not only maintains genomic stability and regulates apoptosis but also broadly suppresses tumors by remodeling the tumor microenvironment and activating anti-tumor immunity; 9 (7) Apoptosis pathway; and (8) DNA damage repair pathway. In-depth investigation of these cell cycle-associated genes and their regulatory networks can help elucidate molecular mechanisms underlying tumor progression, thereby providing a theoretical foundation for developing novel diagnostic biomarkers and targeted therapeutic strategies.
In this study, transcriptomic and clinical data derived from 402 BC cases within TCGA were examined to uncover differentially expressed genes (DEGs) linked to cell cycle regulation, distinguishing between early-stage and advanced-stage cancers. Utilizing the LASSO regression model, a nine-gene prognostic signature was formulated from the training cohort (n = 269) and subsequently validated through analysis of the independent verification cohort (n = 133). This signature correlated with classical cancer signaling pathways and reflected differences in immune cell infiltration (ICI), demonstrating its potential as an effective prognostic predictor in BC.
Materials and methods
Data collection and processing
All data for this analysis, encompassing clinical annotations and gene expression profiles, were obtained from the TCGA-BLCA dataset. To guarantee analytical rigor, samples without complete RNA sequencing or follow-up survival data were omitted from this study. The entirety of statistical analyses was performed using R software.
Screening for cell cycle-related genes
This research focused on 770 key genes directly or indirectly involved in cell cycle regulation. 10 To investigate the molecular mechanisms underlying BC advancement, patients in the TCGA-BLCA dataset were divided into early (stages I–II) and advanced (stages III–IV) stage subgroups. Differences in gene expression profiles between these categories were identified using the DESeq2 package. 11 Overlapping genes identified through this analysis were displayed visually using the online Venny tool.
GO and KEGG analysis
Functional enrichment analyses were performed to systematically characterize the DEGs between early- and advanced-stage BC. Differential expression profiles were determined using criteria of |log2FC| ≥ 1.0 and P < 0.05. The enriched biological processes and pathways were analyzed using the clusterProfiler package. 12 GO enrichment analysis, a principal bioinformatics tool, identifies gene sets with significant alterations under pathological conditions and clarifies potential biological processes involving target genes. 12 KEGG analysis, a database resource linking genomic information with higher-order functional data, was also employed. 13 To strengthen result validity, adjusted P-values derived from the Benjamini-Hochberg correction method were applied, considering values below 0.05 as indicative of statistical significance.
Protein–protein interaction (PPI) network analysis
Additionally, key regulatory genes and functional modules were identified using PPI analyses. Protein interaction networks for DEGs were constructed by extracting data from the STRING database (version 11.0). 14 Cytoscape software (version 3.10.3) 15 was subsequently employed to visualize networks and perform functional analyses, adopting a consistent significance level of P < 0.05 throughout. 16
Identification of prognostic cell cycle-related gene signatures
Initially, the prognostic significance of differentially expressed cell cycle-related genes was assessed using univariate Cox regression analysis.
17
Prognostic genes were selected, and a predictive risk score model was developed using LASSO Cox regression analysis via the “GLMNET” R package.18,19 The optimal cutoff values for gene-expression stratification were determined using the surv_cutpoint function from the survminer R package. Kaplan-Meier survival curves were subsequently generated based on these classifications. The optimal penalty parameter (λ) was determined using a ten-fold cross-validation procedure. Each patient's risk score was derived using the normalized gene expression (Expi) values and their corresponding regression coefficients (Coei) according to the following calculation formula:
The prognostic capability of the gene signature was assessed through Kaplan-Meier analyses and ROC curve assessments. 20 Furthermore, to confirm the clinical applicability and robustness of this prognostic model, both univariate and multivariate Cox regression analyses were implemented to determine if the risk score independently predicted patient outcomes relative to established clinical and pathological features.
ICI estimation
To systematically characterize the immune microenvironment (IME) in BC, the abundance of ICI was quantified by employing the CIBERSORT algorithm. 21 RNA transcript expression data of 22 distinct immune cell populations, including naive and memory B cells, CD8+ T cells, naive CD4+ T cells, resting/activated memory CD4+ T cells, Tregs, TFH cells, gamma-delta (γδ) T cells, DCs, macrophages, neutrophils, eosinophils, and mast cells (activated/resting), were comprehensively analyzed. Subsequently, Spearman correlation analyses were conducted to investigate associations between ICI and patient risk scores, and results were visualized using the “ggstatsplot” package. These investigations aimed to clarify the underlying immunological mechanisms associated with the prognostic signature. 22 Gene set variation analysis (GSVA) was performed using the GSVA R package with a custom gene set collection derived from immune-cell marker genes identified by CIBERSORT. The analysis was executed using a Gaussian kernel with default parameters.
Statistical analysis
All statistical analyses were conducted in R. Student's t-test and chi-square tests were utilized for comparisons among variables. Kaplan-Meier survival analyses and log-rank tests were performed to evaluate OS differences. Additionally, univariate and multivariate Cox regression analyses were applied to recognize independent prognostic factors, with P < 0.05 indicating statistical significance.
Results
Identification of dysregulated cell cycle-related genes in BLCA
In accordance with AJCC staging criteria, the TCGA-BLCA samples were subdivided into early-stage and advanced-stage groups to facilitate differential gene expression analysis (Figure 1). Comparative analysis between these two groups revealed a total of 1111 significantly dysregulated genes (DEGs, P < 0.05). Their expression patterns are illustrated using a volcano plot (Figure 2A) and heatmap (Figure 2B). Further analysis revealed that among 770 candidate cell cycle-related genes, 61 (7.92%) were differentially expressed between advanced and early BC, as clearly presented in a Venn diagram (Figure 2C).

Flowchart of data collection, analysis, model construction, and validation.

Identification of DEGs in TCGA-BLCA dataset: (A) Volcano plot and (B) heatmap comparing gene expression profiles between advanced- and early-stage groups. (C) Venn diagram showing cell cycle-related DEGs.
PPI network analysis
Using the STRING database, we established a PPI network comprising 61 cell cycle-related genes identified as differentially expressed (Figure 3A). Subsequently, Cytoscape software was applied to further examine and visually represent the interaction network, ranking the genes according to their connectivity within the network (Figure 3B).

(A) PPI network of DEGs. (B) Cytoscape-ranked connection degrees of DEGs.
Construction of prognostic gene signature from classical cancer pathways in the TCGA training cohort
Using the training cohort from TCGA (n = 269), this research systematically evaluated the prognostic relevance of genes involved in classical cancer-associated pathways. Initially, prognostic correlations were explored using univariate Cox regression analysis, which revealed that 33 cell cycle-related genes were significantly associated with patient survival outcomes (Figure 4A). These candidate genes were subsequently refined using LASSO penalized Cox regression analysis, yielding an optimized prognostic model consisting of nine critical genes selected at the optimal penalty parameter (λ) (Figure 4B). Kaplan-Meier survival analyses were conducted for each of these nine genes (Figure 4C-K). The regression coefficients obtained from the LASSO algorithm were then employed to establish a risk-score calculation formula:

Subsequently, a nine-gene prognostic model was constructed from the TCGA-BLCA training dataset: (A) The forest plot illustrates the univariate Cox regression results of the 33 cells cycle-related genes linked to OS. (B) Regression coefficients for the nine selected genes related to cell cycle regulation were determined through LASSO regression analysis. (C-K) Kaplan-Meier curves demonstrated significantly distinct OS outcomes associated with each of the nine critical genes (P < 0.05). (L) Kaplan-Meier curves demonstrated significantly distinct OS between the high-risk and low-risk patient cohorts (P < 0.05). (M) Predictive performance of the risk model for OS within the training dataset was evaluated using time-dependent ROC curve analysis.
Patients from the TCGA training dataset were categorized into high-risk or low-risk subgroups based on the median cutoff value of the computed risk scores (−0.0082). Kaplan-Meier curves demonstrated that patients classified in the high-risk subgroup exhibited notably poorer OS relative to those in the low-risk subgroup (P < 0.0001, Figure 4L). Furthermore, the prognostic performance of the gene-based signature was validated through time-dependent ROC analyses, which yielded robust AUC values of 0.655, 0.668, and 0.687 for 1-, 3-, and 5-year survival predictions, respectively (Figure 4M).
The validation of nine-gene signature in the validation cohort
To confirm the predictive reliability and generalizability of the established gene signature, the risk-scoring formula and identical threshold (−0.0082), initially developed in the training set, were subsequently validated using the TCGA internal validation set (n = 133). Consistent with findings from the original analysis, patients assigned to the high-risk category exhibited poorer OS (Figure 5A). ROC curve analyses further validated the prognostic accuracy of the model, with corresponding AUC values presented in Figure 5B.

Validation of the prognostic signature in the internal TCGA cohort: (A) Kaplan-Meier curves contrasting OS between high- and low-risk groups. (B) Time-dependent ROC curves evaluating OS predictions in the validation set.
Independent prognostic value of the nine-gene signature
Clinical data derived from the TCGA-BLCA dataset were analyzed, revealing statistically significant associations between the established risk groups and clinical characteristics (Table 1). Univariate Cox regression indicated significant correlations between OS and variables (Table 2). Importantly, Cox regression analyses conducted after adjusting for multiple clinical and pathological confounding variables confirmed the risk score's independent prognostic relevance (HR = 2.05, 95% CI = 1.27–3.3, P = 0.003; Figure 6A). These results underscore the potential clinical utility of the developed prognostic signature in categorizing BC patients according to their risk of mortality.

(A) Forest plot illustrating multivariate Cox regression analysis of prognostic factors in the TCGA-BLCA cohort.
Comparison of baseline clinical characteristics between high-risk and low-risk groups in the TCGA-BLCA cohort.
Univariate cox regression analysis of clinical variables and risk score for overall survival in the TCGA-BLCA cohort.
GO and KEGG pathway analyses of DEGs
GO analysis demonstrated that these genes were predominantly implicated in biological processes such as connective tissue development, ossification, epithelial morphogenesis regulation, and positive modulation of the MAPK cascade. Additionally, GO analysis revealed significant enrichment within extracellular matrix components and collagen fiber structures (Figure 7A). KEGG pathway enrichment further indicated notable involvement of these DEGs in key cancer-related pathways, including the MAPK pathway and Rap1 pathway (Figure 7B).

GO and KEGG pathway enrichment analyses: (A) GO enrichment analysis showing significantly upregulated and downregulated pathways among DEGs. (B) KEGG enrichment analysis illustrating significantly altered signaling pathways among DEGs.
ICI analysis in high- and low-risk groups
Using the CIBERSORT computational method, ICI assessments were performed to explore correlations between risk scores and tumor immune microenvironment (TIME) composition. The analyses revealed distinct immune infiltration profiles between low- and high-risk patient subgroups (Figures 8A-B). Specifically, risk scores exhibited significant negative associations with the abundance of eosinophils, TFH cells, activated DCs, and Tregs (R ≤ −0.2, P < 0.05). Conversely, macrophage populations, including the M0, M1, and M2 subtypes, showed positive correlations with increased risk scores (R ≥ 0.2, P < 0.05, Figure 8C). Based on these results, GSVA was performed. The findings indicated that stromal and angiogenic functions were enriched in the high-risk group (Figure 8D).

Immune infiltration characteristics in the high- and low-risk groups: (A-B) Comparative analysis of infiltration levels of 22 immune cell types between high-risk and low-risk groups, calculated by the CIBERSORT algorithm. (C) Correlation between risk score and infiltration levels of the 22 immune cell subsets. (D)GSVA enrichment scores between high- and low-risk groups.
Conclusion
BC, ranked tenth in global malignancy prevalence, continues to pose significant clinical challenges due to its propensity for local recurrence and metastasis, leading to adverse patient outcomes even after conventional therapeutic interventions.1,5 To improve prognostic assessment, we established a novel prognostic signature utilizing the TCGA training dataset and subsequently confirmed its validity in an independent validation cohort. Risk scores derived from the LASSO regression model accurately differentiated patient prognosis, clearly demonstrating that patients assigned to the high-risk group had markedly reduced OS in both cohorts. ROC analyses further verified the reliability and predictive strength of this nine-gene signature across the analyzed datasets. These findings indicate that our cell cycle-related gene-based scoring system can serve as an essential molecular marker for evaluating prognosis in patients with BC.
Among 770 cell cycle-related genes analyzed, 61 (7.92%) showed significant dysregulation in advanced-stage BC, suggesting their potential involvement in tumor progression. Based on their functional properties, the 33 core genes used to construct the prognostic model were categorized into six groups: (1) extracellular matrix and structural genes (COL11A1, COL1A1, COL1A2, COL3A1, COL5A1, COL5A2, COMP, TNN, THBS4, SPP1); 23 (2) calcium channel and signal transduction-related genes (CACNA2D1, CACNG6, GNG4, PTPRR, DUSP2); 24 (3) growth factors and receptor-associated genes (FGF1, FGF7, FGF14, FGFR1, HGF, IGF1, PDGFRB, TGFB3, GAS1); 25 (4) development and morphology-regulating genes (LEFTY2, RELN, SFRP2, SFRP4, SOST); 26 (5) cytoskeleton and muscle-function-related genes (FLNC, TSPAN7); 27 and (6) other functional genes (PLA2G5, CCNA1).
Functional enrichment analyses (GO and KEGG) highlighted the fundamental biological significance of these selected genes in BC progression. Notably, enriched pathways such as MAPK and PI3K-Akt signaling were closely linked to essential cellular processes. Additionally, the activation of genes related to extracellular matrix remodeling, collagen fiber structures, and the Rap1 signaling pathway may facilitate tumor cell metastasis and invasion by regulating cellular migration, adhesion, and polarity. This investigation provides comprehensive insights into the molecular mechanisms driving BC progression, establishing a robust theoretical foundation for subsequent research on the signaling pathways underlying BC pathogenesis.
In this study, nine genes with significant prognostic value were ultimately identified through LASSO regression analysis. Previous studies have demonstrated that these genes significantly influence multiple aspects of cancer biology, including tumorigenesis, malignant progression, metastatic dissemination, and resistance to treatment. COL5A1, an integral component of the extracellular matrix (ECM), contributes to modulating the biomechanical properties of tissues and influences the cellular microenvironment via its participation in constructing heterotypic collagen fibers. 28 Its overexpression in BC potentially promotes ECM fibrosis, increases tissue stiffness, and subsequently activates the integrin-FAK signaling pathway, enhancing tumor cell invasion and metastasis. 29 COL5A2, along with COL5A1, constitutes type V collagen, and both genes exhibit significantly elevated expression in high-grade and invasive BC, correlating strongly with reduced disease-specific survival. 30 As crucial negative regulators, DUSP2 and PTPRR suppress signaling transduction by dephosphorylating members of the MAPK pathway (ERK/JNK/p38) and MAPK itself, respectively. 31 FGF1 initiates downstream signaling cascades such as PI3K-AKT and MAPK through interactions with FGFR1-4 receptors. Notably, FGFR3 is frequently altered genetically in BC, thereby influencing crucial cellular processes encompassing proliferation, survival, angiogenesis, and migration. 32 GNG4, an essential component of the G-protein signaling pathway, significantly modulates the tumor IME, promoting cancer progression and negatively affecting patient survival, 33 highlighting its potential as a novel biomarker for chemotherapy and immunotherapy in BC. PTPRR, belonging to the protein tyrosine phosphatase family, negatively regulates MAPK activity through dephosphorylation. Emerging evidence demonstrates that PTPRR functions as a critical tumor suppressor. Its upregulation enhances chemosensitivity and significantly inhibits cancer cell proliferation and clonogenicity, whereas its silencing promotes these malignant phenotypes. 34 SOST, a key regulator of bone metabolism, is critically involved in cancer progression. Upregulated SOST expression is associated with bone metastasis and poor survival in breast cancer patients. Functionally, SOST interacts with STAT3 to enhance the TGF-β/KRAS signaling axis, thereby promoting tumor growth and bone metastasis, an effect that can be therapeutically targeted. 35 THBS4, a pivotal glycoprotein in stromal tissues, is involved in multiple biological activities, including extracellular matrix remodeling, cell adhesion, angiogenesis, and cell proliferation. 36 In BC specifically, THBS4 induces expression of MMP-2 by activating AKT signaling, thus promoting invasive characteristics in tumor cells. 37 Although limited evidence is currently available regarding the function of TSPAN7 in tumor biology, existing studies underscore its importance. Research in non-small cell lung cancer (NSCLC) confirms its association with aggressive clinicopathological features and demonstrates its role in promoting metastasis through the induction of EMT. 38 Complementary work in osteosarcoma elucidates a key mechanism, showing that TSPAN7 drives this process via activation of the FAK-Src-Ras-ERK1/2 signaling pathway. 39 Collectively, these observations offer critical insights to deepen our understanding of the molecular pathways involved in BC progression. 38
In summary, the current investigation performed a detailed characterization of IME features in BC patients classified into low- and high-risk groups based on their gene signature-derived scores using CIBERSORT analysis. The findings confirmed robust associations between patient risk stratification and ICI patterns. Specifically, the TME of high-risk individuals exhibited markedly elevated infiltration by macrophages, particularly those of the M2 subtype, which positively correlated with the assigned risk score. This observation has critical clinical implications, as M2 macrophages facilitate tumor progression and metastatic spread via mechanisms such as promoting EMT and inducing immunosuppressive conditions within tumors, 40 factors that correlate closely with unfavorable outcomes in BC. 41 Conversely, infiltration by immune cells possessing anti-tumor activities, including activated DCs, TFH cells, eosinophils, and Tregs, was significantly greater among low-risk patients and exhibited negative correlations with risk scores. Moreover, GSVA revealed that the high-risk group was predominantly enriched with pathways associated with stromal activation and angiogenesis. These findings indicate that the established cell cycle-related gene model effectively estimates the immune status of bladder cancer patients. The increased infiltration of M2 macrophages in high-risk patients may promote tumor progression by enhancing stromal activation and angiogenesis, thereby explaining the worse prognosis observed in this subgroup.
Limitations
Despite the successful development and validation of this cell cycle-associated gene signature for BC prognosis, several limitations of the current study should be acknowledged. First, the model was constructed and validated solely on retrospective data from public databases (TCGA). While internal validation demonstrated robustness, the lack of validation in prospective, multi-center, and ethnically diverse cohorts limits the generalizability and immediate clinical translatability of our findings. Second, as a bioinformatics-based study, the precise biological functions and mechanistic roles of the identified key genes (e.g., COL5A1, FGF1) in BC pathogenesis and progression remain to be experimentally verified through in vitro and in vivo studies. Third, the predictive value of the risk score for patient responses to specific therapies, such as immunotherapy or chemotherapy, was not investigated, which is crucial for its clinical utility in guiding personalized treatment decisions. Future studies should address these limitations to facilitate the translation of this signature into clinical practice.
Conclusion
In conclusion, we developed and validated a novel prognostic signature comprising nine cell cycle-related genes for bladder cancer. This signature effectively stratified patients into distinct risk groups with significant differences in overall survival and demonstrated robust predictive performance. Furthermore, the high-risk score was associated with an immunosuppressive tumor microenvironment characterized by increased M2 macrophage infiltration. Our findings not only highlight the critical role of cell cycle-related genes in BC progression and prognosis but also provide a potential tool for risk stratification. Targeting these genes may represent a promising strategy for modulating the tumor immune landscape and developing combination therapies for BC.
Footnotes
Acknowledgments
The authors acknowledge The Cancer Genome Atlas (TCGA) program for providing the genomic data used in this study.
Ethical considerations and consent to participate
TCGA is a public database where the patients involved have obtained ethical approval. Users can freely download relevant data for research purposes and publish articles based on these data. Since our study relies on open-source data, there are no ethical concerns or conflicts of interest involved.
Author contributions
Author 1: Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Writing – review & editing.
Author 2: Methodology, Writing – original draft, Writing – review & editing.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
