Abstract
Introduction:
Many germline associations have been reported for urinary bladder cancer (UBC) outcomes and prognostic characteristics. It is unclear whether there are overlapping genetic patterns for various prognostic endpoints. We aimed to review contemporary literature on genetic associations with UBC prognostic outcomes and to identify potential overlap in reported genes.
Methods:
EMBASE, MEDLINE, and PubMed databases were queried for relevant articles in the English language without date restrictions.
Results:
The initial search identified 1346 articles. After exclusions, 112 studies were summarized. Cumulatively, 316 single-nucleotide polymorphisms (SNPs) were reported across prognostic outcomes (recurrence, progression, death) and characteristics (tumor stage, grade, size, age, risk group). The genetic associations differed considerably between the studied outcomes. The most commonly reported SNPs were located in OGG1, TP53, and MDM2. For outcomes with the highest number of reported associations (ie, recurrence and death), functional enrichment annotation yielded different terms, potentially indicating separate biological mechanisms.
Conclusions:
Our study suggests that all UBC prognostic outcomes may have different biological origins with limited overlap. Further validation of these observations is essential to target a phenotype that could best predict patient outcome and advance current management practices.
Introduction
Urothelial bladder cancer (UBC) requires considerable clinical input and necessitates ongoing research to reduce the burden on patients and health care providers. 1 The current era of genomics offers new insights into UBC pathogenesis. 2 However, due to the complex nature of genetics, many studies are difficult to summarize into clear recommendations for future research and clinical practice.
Urothelial bladder cancer is most frequently diagnosed as a non-muscle-invasive bladder cancer (NMIBC), accounting for 70% to 80% of all new cases. 3 Management of NMIBC is complex with appropriate treatment dependent on multiple clinical and pathological components. Importantly, a significant proportion of patients are prone to tumor recurrence and/or progression, both events being difficult to predict. Previously developed multifactorial prognostic NMIBC tools 4 have been useful to describe populations, but lack accuracy for individual outcomes and require further advances. 5 Muscle-invasive bladder cancer (MIBC) cases are equally complex to treat with various permutations of chemotherapy, radiotherapy, and cystectomy, 6 with an addition of recent initiatives in molecular-genomic subtyping. 2
Although multiple studies have addressed the potential role of genetic variation in UBC prognosis, the findings are yet to be implemented into clinical practice. Genetic associations are often reported in small samples, and their validity is difficult to establish. 7 In addition, interpreting the biological relevance across many reports is challenging. Furthermore, it is not yet clear whether genetic associations overlap within and between the groups of direct (recurrence, progression, survival) and indirect (stage, grade, tumor size, age at the time of diagnosis) prognostic endpoints. Identifying existing genetic similarities between prognostic outcomes could help decipher underlying pathological mechanisms and guide promising future directions in UBC research.
In this review, our objective is to summarize genetic associations for UBC prognostic phenotypes and to describe any overlap or existing patterns that would clarify their pertinence for future clinical practice.
Methods
The systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 8 (Supplementary Table 1).
We queried EMBASE, MEDLINE, and PubMed with the following search string: ((“urinary bladder neoplasms” OR “bladder cancer” OR “urothelial carcinoma”) AND (prognosis OR survival OR recurrence OR progression OR grade OR stage OR “tumour size” OR age) AND (polymorphi* OR SNP OR germline)). The search was limited to articles published prior to November 13, 2018, written in English, and describing human research only. A detailed flowchart of the search and selection process is presented in Figure 1. Reference lists of included manuscripts were checked for potentially missing reports. Study eligibility was determined by 2 authors (N.L., A.W.).

Figure 1. Flow diagram of study selection used in evidence synthesis. SNP indicates single-nucleotide polymorphism.
Inclusion criteria were as follows:
Studies assessing single germline single-nucleotide polymorphism (SNP) variants (not somatic mutations, insertion/deletions, microsatellites, haplotype analyses, dinucleotide polymorphism associations, multiple SNP prediction models);
Original reports (not meta-analyses, reviews, letters, case reports, others);
Studies focused on UBC or where UBC data are described distinctly from a broader urothelial carcinoma (UC) cohort;
Studies reporting an effect size;
Studies reporting significant associations (for characteristics or prognosis);
DNA sequence level variation described;
The described SNPs could be identified.
Studies describing diagnostic and methodological procedures and gene-gene and gene-environment interactions were excluded.
Each study was assessed for quality by evaluating reporting adequacy. Inconsistency was regarded as mismatching data within the study (eg, different SNP IDs reported in article sections). Data completeness was verified if all relevant data fields for a genetic association study 9 were available to extract from the report. The quality criteria were part of the study selection process (eg, studies stating variant relevance for an outcome without providing an effect size were regarded as having low quality and excluded from further evaluation).
Data extraction
The following information was extracted from each eligible study: year of publication, first author, patient subgroup (UBC, MIBC, NMIBC, or other), cancer subtype (UC or other), ethnicity, sample size, SNP ID, locus, gene, effect allele, reference allele, effect allele frequency, effect size, corresponding 95% confidence intervals, and P value.
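As an illustration only, the extracted fields could be held in a single record type. The sketch below is a minimal Python example; the class name, field names, the added `outcome` field, and the placeholder values are all assumptions made for illustration and are not taken from the review's extraction sheet.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SnpAssociation:
    """One extracted SNP-outcome association (hypothetical field names)."""
    year: int
    first_author: str
    patient_subgroup: str            # "UBC", "MIBC", "NMIBC", or other
    cancer_subtype: str              # "UC" or other
    ethnicity: str
    sample_size: int
    snp_id: str
    locus: str
    gene: str
    effect_allele: str
    reference_allele: str
    effect_allele_frequency: Optional[float]  # None when not reported
    effect_size: float
    ci_95: Tuple[float, float]       # (lower, upper) bounds
    p_value: float
    outcome: str                     # assumed extra field: the endpoint studied

    def ci_contains_estimate(self) -> bool:
        """Basic consistency check: the point estimate lies inside its CI."""
        lower, upper = self.ci_95
        return lower <= self.effect_size <= upper


# Placeholder record; every value below is invented for illustration.
example = SnpAssociation(
    year=2000, first_author="Placeholder", patient_subgroup="NMIBC",
    cancer_subtype="UC", ethnicity="not reported", sample_size=100,
    snp_id="rs0000000", locus="0p00", gene="GENE_A",
    effect_allele="A", reference_allele="G", effect_allele_frequency=None,
    effect_size=1.5, ci_95=(1.1, 2.0), p_value=0.01, outcome="recurrence",
)
```

A check such as `ci_contains_estimate` is one simple way to flag the kind of within-study inconsistency described in the quality assessment.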
Summarizing overlap in genetic associations and outcomes
To investigate whether previously reported genes may play a role across multiple UBC outcomes, results were put in a ranked table. Genes associated with many UBC endpoints were ranked high, whereas genes that were reported for one or few of the outcomes were ranked low. As such, we were able to suggest genes that are important for UBC prognosis overall (ie, ranked high) and which genes are more likely to be outcome specific (eg, only associated with cancer recurrence and ranked low).
The resulting ranking acted as a guideline for identifying genes that were commonly observed for most of the prognostic outcomes and characteristics. Outcomes with at least 20 reported genes were chosen, and their functional roles were described in additional detail.
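The ranking described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the function names are hypothetical, the gene labels are placeholders, and only the 20-gene threshold comes from the text.

```python
from collections import defaultdict


def rank_genes_by_outcome_count(associations):
    """Rank genes by the number of distinct outcomes they were reported for.

    `associations` is an iterable of (gene, outcome) pairs; duplicates are
    ignored. Ties are broken alphabetically by gene name.
    """
    outcomes_by_gene = defaultdict(set)
    for gene, outcome in associations:
        outcomes_by_gene[gene].add(outcome)
    return sorted(
        ((gene, sorted(outcomes)) for gene, outcomes in outcomes_by_gene.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


def outcomes_with_min_genes(associations, min_genes=20):
    """Return {outcome: gene count} for outcomes with at least `min_genes` genes."""
    genes_by_outcome = defaultdict(set)
    for gene, outcome in associations:
        genes_by_outcome[outcome].add(gene)
    return {
        outcome: len(genes)
        for outcome, genes in genes_by_outcome.items()
        if len(genes) >= min_genes
    }


# Placeholder input; gene labels are invented for illustration.
toy_associations = [
    ("GENE_A", "recurrence"), ("GENE_A", "stage"), ("GENE_A", "grade"),
    ("GENE_B", "recurrence"),
    ("GENE_C", "death"), ("GENE_C", "recurrence"),
    ("GENE_A", "recurrence"),  # a duplicate report does not raise the rank
]
ranking = rank_genes_by_outcome_count(toy_associations)
```

Genes at the top of `ranking` are candidates for relevance to prognosis overall, whereas genes at the bottom are more likely to be outcome specific.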
Functional annotation
After summarizing the overlap, some outcomes were found to be associated with multiple genes. Every biological process is polygenic, and larger sets of identified genes help elucidate the biological pathways behind the studied phenotype. We chose the outcomes with the largest number of reported genes and submitted those sets to the DAVID Functional Annotation Tool. 10 This tool groups genes by their functional similarity, using information from well-known databases, such as Gene Ontology (GO) for biological mechanisms and KEGG (Kyoto Encyclopedia of Genes and Genomes) for pathways, among others. Gene clustering was performed with the classification stringency set to the highest level. A high level of stringency generates fewer clusters, but the genes within them are more tightly associated. Moreover, to reduce the likelihood of describing false-positive clusters, only gene groups containing pathways with false discovery rates (FDRs) < 5% were interpreted as valid results.
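To make the 5% FDR filter concrete, the sketch below applies the Benjamini-Hochberg adjustment to a list of enrichment P values and keeps the terms that pass. This is an assumption-laden illustration: DAVID reports its own FDR values, and the Benjamini-Hochberg procedure is used here only as a stand-in; the function names, term labels, and P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def terms_passing_fdr(terms, threshold=0.05):
    """Keep (term, P) pairs whose adjusted P value is below `threshold`."""
    adjusted = benjamini_hochberg([p for _, p in terms])
    return [term for (term, _), q in zip(terms, adjusted) if q < threshold]


# Placeholder enrichment results; labels and P values are invented.
toy_terms = [("term_a", 0.001), ("term_b", 0.2), ("term_c", 0.03), ("term_d", 0.8)]
kept = terms_passing_fdr(toy_terms)
```

In this toy input only `term_a` survives: `term_c` has a nominal P value below .05, but its adjusted value does not pass the threshold, which is the situation the filter is meant to catch.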
Statistical analysis
Overall, our search identified multiple genes corresponding to various outcomes, which makes the resulting data difficult to describe comprehensively. To reduce the dimensionality of the data, we performed a principal component analysis (PCA). Principal component analysis can be seen as a form of exploratory analysis to identify group-level correlations in the sample. It is a useful tool for improving the interpretation of data, as it allows visualizing similarities between groups regarding chosen characteristics. In our analysis, we aimed to investigate the similarity between UBC outcomes regarding their genetic background. For example, clinical outcomes that share genes would plot more closely, whereas an outcome that does not share any genes with other endpoints would plot far from other groups. In the currently reviewed literature, some outcomes have been investigated more often (eg, recurrence and death), and hence we adjusted the size of data points in the PCA plot to be relative to the number of associated genes. The first 2 principal components were plotted for all studied endpoints.
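The idea can be sketched with a small outcome-by-gene presence matrix. The example below is a minimal pure-Python PCA (power iteration on the centred outcome-by-outcome Gram matrix) and is not the software or encoding used in the review; the binary encoding, function names, and toy matrix are assumptions made for illustration.

```python
import math


def center_columns(matrix):
    """Subtract each column (gene) mean so PCA describes variation between outcomes."""
    n_rows = len(matrix)
    means = [sum(column) / n_rows for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def gram_matrix(rows):
    """Outcome-by-outcome inner products of the centred rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def top_eigenpair(sym, iterations=200, tol=1e-12):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix (power iteration)."""
    n = len(sym)
    vec = [1.0 + 0.1 * i for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        nxt = [sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm < tol:  # no variance left to explain
            return 0.0, [0.0] * n
        vec = [v / norm for v in nxt]
    value = sum(vec[i] * sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n))
    return value, vec


def pca_scores(matrix, n_components=2):
    """Return one list of principal component scores per row (outcome)."""
    sym = gram_matrix(center_columns(matrix))
    n = len(sym)
    scores = [[] for _ in range(n)]
    for _ in range(n_components):
        value, vec = top_eigenpair(sym)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            scores[i].append(vec[i] * scale)
        # Deflate so the next pass finds the following component.
        sym = [[sym[i][j] - value * vec[i] * vec[j] for j in range(n)] for i in range(n)]
    return scores


# Placeholder matrix: rows are outcomes, columns are genes (1 = reported).
outcomes = ["outcome_A", "outcome_B", "outcome_C"]
presence = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
scores = pca_scores(presence)
point_sizes = [sum(row) for row in presence]  # marker size ~ number of genes
```

Here `outcome_A` and `outcome_B` share all of their genes and receive identical scores, whereas `outcome_C` shares none and lands far from them on the first component, mirroring the interpretation given in the text.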
Results
For this review, 373 full-text articles were evaluated in depth, resulting in a final set of 112 articles for further summary (Figure 1). In total, 316 associations were extracted across all investigated outcomes as follows: age (N = 12) [11-22], stage (N = 79) [11,13,18,20,23-66], tumor size (N = 2) [67,68], grade (N = 49) [11,18,27,29,32,33,35-37,42,47,50,54,58,62,66,67,69-75], risk groups (N = 15) [12,14,17,24,25,30,31,33,40,76,77], recurrence (N = 81) [14,23,25,30,31,40,43,49,50,52,53,69,71,78-107], progression (N = 24) [26,33,46,87-89,107-112], and death (N(cancer-specific) = 12 [34,43,46,101,107,113-117], N(overall) = 42 [34,43,50,56,89,90,111,112,118-122]).
There was considerable heterogeneity across all associations, including assumed patterns of inheritance, studied ethnic populations, and outcome definitions.
Age was investigated using multiple year cut-offs, namely, 50 [13], 56 [17], 60 [11,16,22], and 65 [12,15,18-21], and once as a continuous variable [14] (Supplementary Table 2).
Tumor size was investigated either using a cut-off of ⩾3 cm [67] or defined as a large tumor, corresponding to stages T1 to T4 [68] (Supplementary Table 3).
Tumor stage was analyzed using multiple combinations. Broadly, we have differentiated between stages corresponding to NMIBC and MIBC cases. For studies reporting on NMIBC, the following endpoints were used: tumors of Tis [61,65], T1 [37], Ta + T1 [13,18,36,39,54,58,62-64,66], and Ta + T1 + Tis [32,59,60] (Supplementary Table 4). As for MIBC, most studies have defined the primary outcome as T2+ staged tumors [11,20,23-32,34-57]. However, some associations have been reported for a merged group of T2+ and T1 stages [33,34].
Most reports on grade can be roughly categorized as covering either low- or high-grade UBC cases. Low-grade UBC definitions were as follows: G1 [18,54,62,70], G2 [35,58,62,70], G1 + G2 [36], low-grade [69,71], and G1 + G2 + papilloma [32]. High-grade UBC was usually defined as grade 3 UBC [11,27,29,32,36,37,47,50,67,70,72-75] or as a combination of G2 and G3 NMIBC [54,66], and some studies have reported estimates for grade 4 tumors without a reference for the grading system used (G3 + G4 [42] and G2 + G3 + G4 [33]) (Supplementary Table 5).
It was common for studies to classify UBC as a disease of low or high risk that corresponded to various combinations of clinical stage and grade. For low-risk tumors, researchers used the following definitions: TaG2 [33,34], TaG1 [33], and TaG1-2 [12,14,17]. In contrast, high-risk tumors were defined as TaG2-3 + T1G1-3 [25,30,31,40,76], TaG3 + T1G2-3 [24], G2-3 with T1-4 [77], and TaG3 + T1 [33] (Supplementary Table 6).
For genetic associations with tumor recurrence, studies mostly focused on NMIBC cases (except for a few reports considering the UBC group overall [49] or MIBC [50,71]). The NMIBC recurrence was investigated as an overall outcome [49,69,78,89,90,94,97,104,106] or in specific groups: patients younger than 64 years [98]; patients not treated with Bacillus Calmette-Guérin (BCG) therapy [40,43]; BCG-treated patients [23,25,30,31,40,43,52,53,79-83,85-87,93,95,96,99,100,102,103,107]; patients treated only with transurethral resection of bladder tumor (TURBT) [88,101]; patients who received both TURBT and BCG treatments [91,92]; patients who received TURBT and epirubicin [105]; and recurrence only among low-risk NMIBC [14,84] (Supplementary Table 7).
Progression was defined as an increase of stage within NMIBC [33,109,110] or UBC overall [108]. Also, transition from NMIBC to MIBC or metastatic disease [87-89] was considered disease progression, sometimes expanding the latter definition to include cancer-specific death [107,112]. In other cases, alternative definitions were considered, namely, occurrence of metastases [26,46] and a confirmed relapse among MIBC [111] (Supplementary Table 8).
Regarding death outcomes, there were 2 broad groups of survival endpoints: overall [33,43,56,89,90,112,118-121] and cancer-specific [34,43,46,101,107,113-117] (Supplementary Table 9).
Retrieved data and detailed study characteristics, including outcome definition for each study, are presented in Supplementary Tables 2 to 9.
Overlap between outcomes
A summary table of existing overlap between outcomes and associated genes is presented in Table 1.
Table 1. Overlap between reported outcomes and mapped genes.
OGG1 was the most commonly reported gene, having been associated with patient age [19], tumor stage [54], grade [54], recurrence [40,78], and risk group [40]. Associations between OGG1 and UBC did not cluster within a clearly defined subgroup and instead showed relationships with various characteristics: increased age at diagnosis (>65 years) [19] and elevated risks of the following: non-muscle-invasive and invasive UBC [54], low- and high-grade tumors [54], rate of recurrence [40,78], and high-risk tumors [40].
Two genes (TP53 and MDM2) have also been reported for multiple endpoints, specifically UBC grade [42,69], stage [25,41,42], recurrence [25,69,101], survival [101,117], and risk group [25,76]. For most outcomes (death, risk category, grade, stage), the associations for MDM2- and TP53-related variants were in opposite directions.
Regarding the number of genes corresponding to a single endpoint, tumor recurrence was the outcome with the highest sum of genes (N = 28) showing associations, followed by death (N = 21) (Table 1).
To elucidate any unifying pathways between these genes, gene sets for recurrence and death were submitted to the functional annotation tool DAVID. 10
For recurrence, DAVID identified 2 gene clusters of similar functions that contained pathways with acceptable FDR values (Supplementary Table 10). The first group (enrichment score = 2.72) was formed entirely of RGS family genes (RGS10, RGS13, RGS16). The second cluster (enrichment score = 2.42) was formed by GLI2, GLI3, and SHH genes. Out of 10 functional terms within the cluster, 1 term, “hindgut morphogenesis,” had a satisfactory FDR and reached Bonferroni-adjusted statistical significance (P < .05).
For individual enriched pathways, 20 yielded FDR < 5% and are listed in Supplementary Table 11. Three functional terms (“hindgut morphogenesis,” “pathways in cancer,” and “positive regulation of transcription from RNA polymerase II promoter”) showed low FDRs and were also below the conventional level of statistical significance (P < .05) after multiple-comparison adjustment.
For genes associated with UBC survival, the submitted set retrieved 6 functional clusters in total; however, no individual terms had acceptable FDR values.
Nonetheless, multiple individual functional pathways with FDR < 5% were identified instead (Supplementary Table 12). One term, “pancreatic cancer,” has reached a Bonferroni-adjusted statistical significance (P = .05).
Finally, the PCA of previously reported genetic associations showed UBC recurrence to be the most distinct outcome (Figure 2), with tumor stage and grade also showing marked deviations from other endpoints.

Figure 2. Principal component analysis for genetic associations with urinary bladder cancer outcomes.
Discussion
In this review, we have summarized existing evidence for single SNP genetic associations with UBC characteristics (tumor size, stage, grade, patient’s age) and prognostic outcomes (recurrence, progression, survival). There were multiple associations for considered endpoints with limited overlap. Based on these data, we have made several observations.
It is widely accepted that complex disease genetic architecture is highly polygenic. 123 However, the currently summarized list of associations for UBC outcomes and characteristics is far from exhaustive. It is essential to note that future studies with higher per-study power will contribute additional associations and will clarify the validity of those already reported.
Importantly, our review underscores the sensitivity of genetic studies to outcome definition. It has been demonstrated that genetic variants for UBC risk are unlikely to be relevant for prognosis, 124 and our report implies that prognostic outcomes demonstrate further within-group heterogeneity. Interestingly, the PCA revealed differences for direct prognostic outcomes: UBC death and progression showed similar characteristics, whereas UBC recurrence deviated markedly from the group. From a biological perspective, cancer recurrence is not equivalent to progression or death, and it is likely that the mechanisms involved are triggered and organized via different pathways. Similarly, tumor characteristics (grade, stage, size) and patient characteristics (age) are likely different entities in genetic contribution.
When trying to elucidate unifying pathways for multiple genes involved in certain outcomes, UBC recurrence was found to be associated with terms that relate to the formation of a new tissue (eg, “hindgut morphogenesis”). In contrast, functional pathway terms were different for death as an outcome and may indicate a separate biological mechanism. Interestingly, the most promising associated term for death was “pancreatic cancer,” which exhibits very low survival rates in comparison with cancers of any other site. 125
In the light of our analyses, UBC prognosis may represent a complex phenotype, and this review indicates that different outcomes imply distinct genetic associations. The genetic relationships may overlap but, nonetheless, should be treated as independent endpoints.
Importantly, the review identifies a number of commonly reported genes, specifically OGG1, TP53, and MDM2. OGG1 encodes a protein involved in base excision repair (BER) pathways to protect cells from oxidative stress. 126 Despite this clear role in mutagenic processes, OGG1-null mice showed only moderate increases in malignancy rate, likely due to effective alternative damage repair pathways. 127 Evidence from multiple meta-analyses 128-130 of OGG1 involvement in UBC carcinogenesis is contradictory, and OGG1, if it has a genuine effect, is more likely to play a supporting role in a multi-stage process than to be the main cause of it. 127 It is also probable that establishing the type and direction of genetic associations requires larger populations (with sufficient sample sizes for different ethnicities) that are not yet available to researchers.
In addition, the link between TP53 and MDM2 genes has been extensively reported in the literature, offering an attractive pharmacologic target in cancer treatment. 131 The p53 protein acts as a tumor suppressor and is negatively regulated by the MDM2 oncoprotein. The pattern is somewhat mirrored in the observed associations, where variation in SNPs of the 2 genes seemed to correspond to effects in opposite directions (eg, SNPs in TP53 increased the risk of T2+ stage, whereas alterations in MDM2 showed reduced risk of invasive tumors).
Collectively, OGG1, TP53, and MDM2 are relevant for multiple essential DNA-preserving cellular mechanisms and hence would be expected to have importance for a variety of UBC characteristics and outcomes, as observed in our review.
The limitations of our study are important to acknowledge. Many reports have analyzed different ethnicities, which alone does not undermine the reported associations, but makes inter-population relevance improbable due to differing allele frequencies. 132 Moreover, assumed genetic patterns of inheritance (eg, recessive, dominant, additive) differed widely between the studies, without a clear preference for the chosen model. Usually, the reported model was chosen ad hoc as a consequence of being statistically significant, making it difficult to be confident that the reported model reflected the true genetic architecture of the association. Because associations were highly heterogeneous, we were unable to perform a meta-analysis (which would have provided a preferred summary of these data). Furthermore, most of the included studies were of candidate-gene design; we would expect different results if all studies followed an agnostic genome-wide association approach. Finally, sample sizes were limited, and it is difficult to establish whether all reported associations are robust. The lack of external replication studies for genetic associations is detrimental to translating science into practice, as many genetic findings are likely to be false positives. 7 Optimally, only validated variants would be included in review studies. We underscore the importance of validation efforts so that future reviews can summarize only unambiguous variants.
Finally, we interpret this report as exploratory in nature, and although no clear guidelines can be drawn due to the heterogeneity of previous studies, it is nonetheless an important exercise in drawing research directions. First, as more independent research groups gain access to genotype data of bladder cancer patients, this article can prove a useful resource to replicate already reported associations in a time-efficient manner. Second, reporting those results will help identify which associations and corresponding genetic regions are most promising to pursue in other studies. As such, genetic investigations of bladder cancer may be accelerated by collaborative contributions from the wider research community, preserving resources for studies with a higher likelihood of producing meaningful results.
To conclude, we have summarized existing genetic associations for tumor and patient characteristics and disease prognosis for UBC. Multiple loci have been identified that demonstrate little consensus and highlight the possibility of UBC prognostic outcomes being unique entities in the context of genetic contribution. We recommend that further replication of previously identified SNPs should be undertaken. Consecutive formal reviews of existing associations will help facilitate their potential use in clinical practice.
Supplemental Material
Supplemental material, Supplementary_Tables_1_12_Sep24_xyz268777d1d25d8 for Systematic Review: Genetic Associations for Prognostic Factors of Urinary Bladder Cancer by Nadezda Lipunova, Anke Wesselius, Kar K Cheng, Frederik J van Schooten, Jean-Baptiste Cazier, Richard T Bryan and Maurice P Zeegers in Biomarkers in Cancer
Footnotes
Funding:
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Author Contributions
NL designed the study, organized the data, performed statistical analyses, and wrote the first draft of the manuscript. All authors contributed to revising the manuscript and study design, and read and approved the submitted version.
Supplemental Material
Supplemental material for this article is available online.
