Abstract
Background:
Shared decision-making (SDM) in urology faces challenges including limited health literacy, language barriers, and time constraints that can compromise informed consent and treatment adherence. Generative artificial intelligence (GAI), particularly large language models, offers opportunities to personalise patient education and enhance SDM.
Objective:
To evaluate the role of GAI applications in SDM for patients with urological conditions.
Eligibility criteria:
Peer-reviewed observational studies, validation studies, or mixed-methods studies evaluating GAI (e.g., large language models, AI chatbots) in patient communication, education, counselling, or SDM for urological conditions were included. Editorials, opinion pieces, conference abstracts, and non-English language publications were excluded.
Source of evidence:
PubMed, Embase, Cochrane Library, and Web of Science databases were comprehensively searched through June 2025. Study quality was assessed using the Newcastle-Ottawa Scale, STROBE, or AGREE II, according to study type.
Charting methods:
Data charting was performed using a standardised form. Outcomes of interest included accuracy of GAI-generated information, patient understanding, satisfaction, and decisional conflict.
Results:
Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidelines, 18 observational studies (2023–2025) were included, comprising 310 patients in real-world settings plus hundreds of simulated queries across diverse urological conditions. GAI demonstrated moderate to high accuracy (52%–95%) for guideline-based information, with optimal performance in disease-specific patient education. A prospective comparative study showed a 27% reduction in consultation time and improved patient understanding with ChatGPT-4 assistance. Limitations included poor performance in emergency scenarios and complex oncological counselling, as well as readability concerns, with content written at college level (mean Flesch-Kincaid Grade Level 13.5). Most studies evaluated ChatGPT versions, limiting generalizability.
Conclusions:
GAI could enhance and potentially transform SDM in urology with appropriate clinical oversight and human-in-the-loop governance. Currently, GAI is useful for consultation preparation and patient education, while maintaining physician expertise for complex scenarios. Future implementation should prioritise patient safety, equitable access, and environmental sustainability while developing speciality-specific models and clinician education programmes.
Plain language summary
We reviewed 18 studies examining how generative artificial intelligence (GAI) can help doctors explain urological conditions and treatment options to patients. We found that these GAI tools provide accurate information for common pathologies and can help save consultation time. However, GAI should not be used alone for complex scenarios, where urologists’ expertise remains essential for patient safety.
Introduction
The landscape of patient-physician communication in urology has evolved dramatically over the past decade, driven by advances in digital health technologies and changing patient expectations for personalised healthcare. Shared decision-making (SDM), characterised by collaborative diagnostic and therapeutic planning between patients and physicians, has emerged as a cornerstone of patient-centred urological care.1,2 SDM is fundamental, particularly where treatment pathways significantly affect quality of life, such as in prostate cancer, kidney transplantation, or sexual health. However, traditional counselling approaches often fail to adequately address individual patients’ needs, and other factors such as patients’ limited health literacy, language differences, and time constraints can hinder patient understanding, potentially compromising the quality of informed consent and treatment adherence.3–11
In this scenario, generative artificial intelligence (GAI), especially large language models (LLMs) such as ChatGPT (OpenAI, San Francisco, CA, USA), Claude (Anthropic, San Francisco, CA, USA) or Gemini (Google DeepMind, London, UK), has demonstrated remarkable capabilities in language processing and information synthesis. These technologies offer unprecedented opportunities to personalise patient education materials, providing 24/7 access to medical information, and enhancing the efficiency of patient–physician interactions.12,13
Previous studies have demonstrated the effectiveness of integrating AI into urological practice through multimedia presentations, 3D anatomical models, and virtual reality platforms in improving patients’ understanding and satisfaction.14–16 However, these approaches often lack the personalisation and adaptability that GAI can provide. Additionally, significant questions remain regarding accuracy, reliability, and clinical integration, and the real-world evidence for LLMs’ role in supporting SDM in urology is limited and often speculative, with studies yielding mixed results.17–19
This scoping review aims to evaluate the feasibility and safety of using GAI applications as a supportive tool in facilitating shared decision-making for patients with urological conditions, to evaluate their effectiveness compared to traditional approaches, and provide guidance for future research and clinical implementation with emphasis on human–AI collaboration and appropriate governance framework.
Evidence acquisition
Search strategy
This scoping review was conducted and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines (Supplemental Table 1 shows the complete checklist). 20 A comprehensive search was conducted on PubMed, Embase, Cochrane Library, and Web of Science databases up to June 2025. The search strategy combined terms related to: artificial intelligence and machine learning; patient counselling and education; shared decision-making; and urology or urological procedures. Boolean operators (AND, OR) were used to refine the search. The complete search string is provided in Supplemental Material 1. No language or date restrictions were applied during the initial search. Additional studies were identified through manual review of reference lists and citation tracking from published reviews.
Study selection
Inclusion criteria were as follows: (1) Types of participants: patients receiving urological care, healthcare providers in urology, or simulated patient scenarios involving urological conditions; (2) Concept: use of GAI (e.g., LLMs, AI chatbots) in patient communication, education, counselling, or SDM; (3) Comparator: standard care, other digital tools, or none; (4) Context: accuracy of information, patient understanding, satisfaction, decisional conflict, usability, accessibility; (5) Type of evidence sources: observational studies (cohort, cross-sectional, case-control), validation studies, or mixed-methods studies. Exclusion criteria comprised (1) non-generative AI interventions (e.g., predictive algorithms only); (2) studies in non-urological settings; (3) editorials, opinion pieces, letters, and conference abstracts without full data; (4) non-peer-reviewed studies; and (5) non-English language publications. The primary objective was to assess whether a generative AI tool improves patients’ understanding of treatment options and increases engagement in SDM. Secondary objectives were: to compare AI-assisted consultations with standard of care in terms of patient satisfaction, decisional conflict, and communication quality; to evaluate the readability, cultural sensitivity, and accuracy of LLM-generated outputs across literacy levels and languages; and to identify patients’ perceptions of, trust in, and experiences with GAI in the SDM process.
Charting methods, quality and risk of bias assessment
Title and abstract screening were performed independently by two reviewers (CC and FR). Full-text articles were subsequently assessed for eligibility. Data charting was performed independently using a standardised form that included: study characteristics (design, sample size, setting, country), population demographics, GAI intervention details (platform, version, prompting strategy), outcome measures, and study quality indicators. Discrepancies were resolved through consultation with a third researcher (BKS).
Study quality and risk of bias were assessed using the Newcastle-Ottawa Scale (NOS) for observational and cross-sectional studies, 21 the STROBE for observational reporting quality, 22 or the AGREE II for guideline adherence evaluations. 23 Risk of bias was assessed independently by two reviewers (CC and FR), with disagreements resolved by consensus or consultation with a third reviewer (BKS). Studies were categorised as high quality (green), medium quality (yellow), or low quality (red; Supplemental Table 2). Regarding risk of bias, one study showed low risk of bias, demonstrating only minor concerns and strong methodology; medium-risk studies had limitations in two or more domains that might affect the reliability of their conclusions; and studies with high bias for comparability and/or another domain were deemed at high risk of bias (Supplemental Table 3). Consistent with scoping review methodology, meta-analysis was not performed given the heterogeneity in study design, interventions, comparators, and outcome measures across the included studies, which is expected in an emerging field.
Evidence synthesis
Study characteristics
Figure 1 presents the PRISMA-ScR flow diagram detailing the study selection process. The initial search identified 1426 articles. Following removal of duplicates (n = 23), title and abstract screening excluded 1370 studies. Full-text assessment of the remaining 33 studies led to exclusion of 15 papers focused on traditional technology and clinical algorithm studies, yielding 18 papers for inclusion in the final review.17–19,24–38

The PRISMA-ScR flow chart for the inclusion of studies evaluating generative AI in urological patient counselling and shared decision-making.
All the included studies were published from 2023 onwards, reflecting the recent emergence of and increasing interest in GAI in clinical practice. Table 1 shows a summary of the included studies. Studies originated from multiple countries, including the United States (n = 5), Europe (n = 3), China (n = 1), Canada (n = 1), Turkey (n = 3), Australia (n = 1), and multinational collaborations (n = 4). The total patient population directly evaluated across all the studies included 310 patients from real-world clinical settings, with additional evaluation of hundreds of simulated patient queries and clinical scenarios.
Characteristics of the GAI studies included in the review.
GAI, Generative artificial intelligence; LLM, large language model.
The GAI platforms evaluated were predominantly ChatGPT versions 3.5 and 4.0, with additional assessment of Perplexity, Microsoft Bing AI, ChatSonic, and NeevaAI. Studies focused on diverse urological conditions, including prostate cancer (n = 5), bladder cancer (n = 2), urolithiasis (n = 2), general urological malignancies (n = 2), paediatric urology (n = 1), vasectomy counselling (n = 1), and mixed urological conditions (n = 5). Most studies employed validated assessment tools, including DISCERN scores, Brief DISCERN questionnaires, PEMAT-P tools, and readability formulae.18,19,31,35
Generative AI application in urological patient counselling
GAI has been utilised across different SDM settings, from initial patient education to supporting complex treatment decisions. With respect to information exchange (the first element of SDM), studies identified three primary applications. First, disease-specific patient education showed moderate to high performance, with accuracy rates ranging from 52% to 95% depending on condition complexity. Gabriel et al. found 92.9% global accuracy for robotic prostatectomy patient education, 31 while another study reported 94.6% accuracy for urolithiasis FAQs. 32 Second, treatment counselling applications demonstrated variable success, with better performance for structured procedural information than for complex medical decision-making.34,35 Third, one study demonstrated effective generation of accessible medical literature summaries for lay audiences. 30
The most comprehensive real-world GAI utilisation was described in a prospective study of 292 patients interacting directly with a GPT-4-powered chatbot in clinical settings, where patients valued AI as a powerful complement to human expertise, particularly appreciating the additional consultation time it provided. 24 Notably, age was not associated with confidence in AI tools, suggesting broad accessibility across patient demographics, which is important for equitable SDM implementation.
Emergency urology scenarios revealed critical limitations for SDM applications. Davis et al. found very low rates of appropriate responses for emergency conditions, 18 highlighting current GAI unsuitability for time-sensitive clinical decisions requiring immediate intervention, where rapid and accurate SDM is essential.
Patient outcomes and satisfaction
Direct patient interaction studies revealed nuanced perspectives on GAI integration in the SDM process. The prospective clinical trial by Rodler et al. of 466 patients found good patient comfort with the technology, with AI trust correlating weakly but significantly with overall technology acceptance (r = 0.094, p = 0.04). Traditional factors like age, education, and illness perception showed no significant relationship with AI trust. However, patients demonstrated significantly higher trust in physicians compared to AI for individualised diagnosis communication (a key SDM component) and understandable explanation. Critically for SDM governance, patients’ trust in AI-generated diagnoses more than doubled when physician-controlled (4.31 ± 0.88) versus unsupervised AI (1.75 ± 0.93), underscoring the importance of human-in-the-loop approaches. 27 The single randomised prospective comparative trial by Chung et al. 38 examining pre-vasectomy counselling demonstrated measurable clinical benefits for SDM efficiency. The intervention group receiving ChatGPT-4 assistance showed significantly higher provider perception of patient understanding (8.8 ± 1.0 vs 6.7 ± 2.8, p = 0.04) and a 27% reduction in consultation time (7.7 ± 2.3 vs 10.6 ± 3.4 min, p = 0.05). Patient satisfaction scores for AI-assisted counselling were high (quality of information: 8.3 ± 1.9; ease of use: 9.1 ± 1.5 on a 10-point scale), though privacy concerns scored 4.9 ± 2.9, indicating moderate apprehension requiring attention in SDM implementation.
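The headline figures above can be verified with simple arithmetic (an illustrative sketch using the published means; not part of either study's own analysis):

```python
# Chung et al.: mean consultation time without vs with ChatGPT-4 assistance (minutes)
baseline, ai_assisted = 10.6, 7.7
reduction = (baseline - ai_assisted) / baseline  # fractional time saved

# Rodler et al.: mean trust scores for physician-controlled vs unsupervised AI diagnoses
trust_ratio = 4.31 / 1.75

print(f"Consultation time reduction: {reduction:.0%}")            # ~27%
print(f"Trust ratio (supervised vs unsupervised): {trust_ratio:.2f}")  # ~2.46, i.e. "more than doubled"
```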
Readability emerged as a possible barrier to effective SDM. Even though Eppler et al. demonstrated ChatGPT’s ability to create accurate summaries of urological research for a general audience without a medical background when properly prompted, 30 several studies found that AI-generated content exceeded the recommended sixth-grade reading level, with responses usually written at college level (mean Flesch-Kincaid Grade Level 13.5 ± 1.72). 18 Musheyev et al. evaluated quality, understandability, actionability, misinformation and readability across multiple platforms, including ChatGPT, Perplexity, ChatSonic, and Microsoft Bing AI. While AI chatbot responses had moderate to high information quality with minimal misinformation, understandability was moderate (median Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P) understandability score 66.7%, range 44.4%–90.9%) and actionability was moderate to poor, with responses written at a reading level too difficult to act upon. 33 These findings suggest that literacy level, rather than age, may limit GAI utility and access for diverse populations in SDM contexts.
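The Flesch-Kincaid Grade Level cited above is a simple function of average sentence length and average syllables per word. A minimal sketch (the vowel-group syllable counter is a rough heuristic, not a validated implementation):

```python
import re

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59.
    Syllables are approximated by counting vowel groups per word."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(max(1, len(re.findall(r"[aeiouyAEIOUY]+", w))) for w in words)
    return 0.39 * (n_words / sentences) + 11.8 * (syllables / n_words) - 15.59

simple = "The kidney makes urine. Urine leaves the body."
complex_ = ("Percutaneous nephrolithotomy necessitates comprehensive perioperative "
            "anticoagulation management and meticulous radiological evaluation.")
assert flesch_kincaid_grade(simple) < flesch_kincaid_grade(complex_)
```

The heuristic overestimates slightly for words with silent final vowels, but it reproduces the key point: long, polysyllabic medical prose scores well above the sixth-grade target.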
Clinician perspective and AI governance
Healthcare professionals evaluating GAI responses demonstrated cautious optimism alongside some concerns. A blinded assessment by Ayers et al. found that healthcare professionals preferred chatbot-generated responses over physician responses for quality and empathy, 26 suggesting potential for AI to enhance patient communication when properly implemented. Minolitsis et al. demonstrated that training with specialty-specific guidelines yielded more precise and prompt answers, highlighting the value of domain-specific training and effective prompt engineering. 28 Szczesniewski et al. queried ChatGPT 4.0 on five common urological conditions (prostate cancer, bladder cancer, renal cancer, BPH and urolithiasis). Two independent urologists deemed the overall information well-balanced, covering anatomical location, affected population, and symptom descriptions, but assigned moderate quality scores (DISCERN 3/5) for treatment-related answers, which are central to SDM. 35 Caglar et al. queried ChatGPT with the 137 most frequently asked questions in paediatric urology; 92% of the answers were judged completely correct, with 93.6% concordance with the EAU Guidelines’ strong recommendations. 36
Clinicians consistently emphasised the need for professional oversight and human-in-the-loop governance, with multiple studies recommending AI use only under direct supervision. Whiles et al. found concerning variability, with 25% of question sets showing discordant appropriateness between repeated prompts, 19 raising reliability concerns for SDM applications where consistency is paramount. The consensus emerging from these studies positions GAI as a drafting and preparatory tool for physicians rather than an autonomous patient interaction system, with clinicians serving as essential intermediaries who contextualise and validate AI-generated information.
Accuracy and safety considerations
Accuracy varied substantially by clinical context and question complexity. Overall appropriateness ranged from 52% to 95%, with structured, guideline-based questions achieving higher accuracy than open-ended clinical scenarios.18,19,31,32 Prostate cancer information showed particularly variable results, a concern for SDM reliability. Coskun et al. reported F1 scores of only 0.426 (range 0–1) and precision of 0.349 (range 0–1), 29 indicating substantial inaccuracy risk for complex oncological counselling. Zhu et al. found above 90% accuracy for basic information questions with definite answers across most of the analysed LLM responses; however, accuracy decreased for questions requiring synthesis and analysis (e.g. ‘Why is the PSA still high after surgery?’), and responses did not exhibit the humanistic care elements essential for SDM. 34 Guo et al. identified concerns regarding bladder cancer information completeness, particularly for stage-specific cure rates and treatment-related side effects, reinforcing the importance of urologist involvement in SDM. 25 Safety assessments revealed multiple concerns relevant to patient counselling. Lack of clear information sources emerged as a universal issue, with several platforms failing to disclose references, preventing verification of medical claims critical for informed decision-making.19,35 Thalyshinskii et al. found incorrect data in 8 of 11 clinical answers despite all containing some EAU guideline-corresponding information. 37 Emergency scenario performance was particularly concerning, with one study reporting missed diagnoses of acute urinary retention in classic presentations, representing potential patient harm in the absence of clinical oversight, 18 and another cross-sectional study scoring only 11.1% appropriateness for emergency clinical scenarios. 17
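For context on the Coskun et al. metrics: F1 is the harmonic mean of precision and recall, so the recall implied by the reported precision and F1 can be recovered algebraically (an illustrative sketch; recall itself was not quoted above):

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def implied_recall(precision: float, f1: float) -> float:
    """Invert F1 = 2PR/(P+R) for recall, given precision and F1."""
    return f1 * precision / (2 * precision - f1)

# Values reported by Coskun et al. for prostate cancer answers
precision, f1 = 0.349, 0.426
recall = implied_recall(precision, f1)  # ~0.55: barely half of relevant content retrieved
assert abs(f1_score(precision, recall) - f1) < 1e-9
```

The point of the exercise: even with recall around 0.55, a precision of 0.349 means roughly two of every three retrieved statements were inaccurate or irrelevant, which is why the authors flag substantial risk for oncological counselling.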
Discussion
This scoping review provides evidence that GAI holds potential as a supportive tool for urological counselling and SDM, though with important limitations regarding implementation (Figure 2). The evidence base reveals a technology poised between promising potential and significant limitations that must be addressed through appropriate governance frameworks.

Summary of benefits and limitations of generative AI in urological shared decision-making.
A critical finding across studies is the necessity of human-in-the-loop governance for GAI deployment in SDM. 39 This framework positions clinicians as essential intermediaries who contextualise, validate and communicate AI-generated information. The substantial increase in patient trust when physicians control AI input 15 provides empirical support for this governance model. Implementation requires establishing clear protocols identifying when AI assistance is appropriate versus when direct physician expertise is essential, particularly for complex oncological decisions and emergency scenarios where GAI performance is inadequate.7,8,13
Equally important is the need for clinician training in prompt engineering. The finding that speciality-specific training improved GAI response quality 16 highlights how carefully constructed prompts can enhance AI utility. In SDM contexts, effective prompts must incorporate patient-specific factors including medical history, treatment preferences, cultural background, and literacy level. This represents a new competency for urologists that training programmes like the EAU Talent Incubator Programme are beginning to address, where urologists are trained as ‘knowledge brokers’ capable of effectively using AI tools while maintaining clinical judgement. 40 The EAU has recognised GAI’s potential through initiatives including the EAU Patient chatBOT integrated with ChatGPT capabilities for patient and family education.41,42 Importantly, all the information released through the EAU Patient Offices website is reviewed by experts in the field, either guideline panellists or members of the Young Academic Urology.
This review identifies several evidence-supported integration pathways for GAI in urological SDM (Figure 3). Pre-consultation preparation emerged as an optimal application, with patients using AI to formulate questions and understand basic concepts before appointments.12,26 This approach supports rather than replaces the physician’s role in SDM while improving consultation efficiency and quality. The 27% reduction in consultation time demonstrated in the vasectomy study suggests significant workflow optimisation potential 38 when AI handles routine information provision, freeing clinicians for complex decision-making and relationship building. Post-consultation reinforcement represents another valuable application, with AI helping patients recall and understand discussed information, addressing the documented problem of patients retaining only 40%–80% of verbally provided medical information. 43

Implementation, challenges, integration and future directions.
Conversely, GAI is currently unsuitable for emergency scenarios, complex oncological counselling requiring nuanced risk-benefit discussions, and situations requiring understanding of individual patient values and preferences. In these contexts, direct physician expertise remains essential for safe and effective SDM.
The development of specialty-specific AI models trained on urological guidelines and literature shows promise for improving accuracy and clinical relevance. 28 GAI might also support workflow redesign, including AI-assisted documentation, where chatbots draft consultation summaries for clinician review, reducing administrative burden; structured patient history taking, where AI gathers past medical histories before appointments, improving consultation efficiency; and multi-language and multi-cultural support, where translation and language adaptation could improve SDM accessibility for populations with diverse backgrounds and education levels. 22 Integration with electronic health records could enable personalised AI responses based on individual patient histories and treatment plans, though privacy considerations require careful attention.
Based on our evidence synthesis, we therefore recommend GAI implementation for pre-consultation patient education with appropriate content at accessible reading levels; mandatory clinician oversight for all patient-facing GAI applications; explicit warnings against GAI use for emergency scenarios; institutional development of prompt engineering guidelines; regular accuracy auditing against current guidelines; and patient education about limitations and the continued primacy of physician expertise in SDM.
Future research should prioritise head-to-head comparative studies evaluating traditional counselling and informed consent against GAI-assisted approaches across diverse urological conditions, incorporating validated patient-reported outcome measures such as decisional conflict scales and knowledge assessments to establish clinical effectiveness and cost-benefit profiles. Long-term studies examining patient outcomes, satisfaction, and potential harms from AI-assisted counselling are needed to complement the short-term efficacy data currently available. Research into optimal human-AI collaboration models should identify which tasks benefit from AI assistance versus human expertise. Health equity research must determine whether GAI narrows or widens existing disparities in access to medical information, particularly across populations with varying digital literacy levels. 34 Finally, the environmental impact of GAI deployment represents an increasingly urgent consideration, with the computational demands of LLMs contributing substantial CO2 emissions44,45; healthcare systems must balance AI’s clinical benefits against environmental costs, potentially restricting use to high-value applications. Given the rapid pace of technological evolution, living review methodologies and continuous outcome monitoring will be essential to keep evidence aligned with current model capabilities.
Limitations
This scoping review has limitations that should be considered when interpreting its findings. First, restriction to English language publications may have excluded relevant studies from non-English speaking regions where GAI implementation may differ. Second, exclusion of grey literature may have introduced publication bias toward positive results. Third, most included studies evaluated ChatGPT 3.5 and 4.0, limiting generalisability to other platforms or future iterations. Fourth, the rapid evolution of AI technology means published studies may not reflect current capabilities, with newer models potentially addressing identified limitations.13,46 Fifth, most studies employed simulated queries rather than real patient interactions, potentially missing important contextual factors affecting real-world performance. 26 Sixth, while heterogeneity in study design, interventions, and outcomes is expected and appropriate within a scoping review framework, it precluded quantitative synthesis. Finally, the limited number of prospective studies evaluating the actual SDM process rather than isolated GAI capabilities constrains conclusions about real-world effectiveness.
Conclusion
This scoping review provides comprehensive evidence that GAI holds transformative potential for urological counselling and SDM, with greatest utility as a complementary tool enhancing rather than replacing human clinical expertise. GAI demonstrates strength in structured patient education and information exchange but shows limitations for complex deliberation and emergency scenarios. Effective implementation requires human-in-the-loop governance, clinician training in prompt engineering, and careful attention to readability and accessibility. Future directions should prioritise patient safety and health equity while addressing environmental sustainability concerns. Our evidence highlights the need for continued research to optimise human–AI collaboration in patient-centred care.
Supplemental Material
sj-docx-1-tau-10.1177_17562872261441968 – Supplemental material for Generative AI in urology: rethinking patient counselling and shared decision-making – a scoping review from the European Association of Urology Patient Office
Supplemental material, sj-docx-1-tau-10.1177_17562872261441968 for Generative AI in urology: rethinking patient counselling and shared decision-making – a scoping review from the European Association of Urology Patient Office by Clara Cerrato, Francesco Ripa, Michael R van Balken, Eamonn T. Rogers and Bhaskar K. Somani in Therapeutic Advances in Urology
