Abstract
Generative artificial intelligence (AI) is a powerful class of machine learning that moves beyond simply analysing data to actually creating new and original content, such as medical images or clinical text. Its uses in orthopaedic surgery are varied. Generative AI moves us from one-size-fits-all surgical planning to highly personalised surgical blueprints for each patient’s unique anatomy and condition. While generative AI in surgery is new, it can provide real-time intelligent support for a surgeon’s skill and decision-making. Most practitioners see AI as a tool to improve diagnosis and treatment, although some are concerned that it will instead worsen them. Despite this potential, the use of generative AI should currently be supervised and validated, as generated content has been shown to sometimes lack any actual source. Policy and economic factors also hinder the integration of AI technologies in clinical orthopaedics. Ethical issues, practitioners’ views and perspectives, and the high overall cost of AI technology are among the barriers that may emerge. This comprehensive review addresses the opportunities, challenges, and future directions of integrating generative AI in orthopaedic surgery.
Introduction
Generative artificial intelligence (AI) is a powerful class of machine learning that moves beyond simply analysing data to actually creating new and original content, such as medical images, clinical text or 3D anatomical models. Orthopaedics is a particularly promising field for its application, as clinical practice relies heavily on imaging, surgical planning, decision making and patient-specific solutions. These are all areas where generative AI may add value. For instance, creating 3D models from standard X-rays, improving the resolution of CT or MRI images, or generating virtual reality images from a patient’s data are applications relevant to diagnosis, surgical preparation, and rehabilitation. 1
This generative power is harnessed through several sophisticated technologies. 1 A worldwide study by Familiari et al. found that 25.1% of orthopaedic practitioners used AI in their practice. Most viewed it as beneficial, especially for referencing literature and aiding communication. 2 AI and natural language processing have also been used to evaluate treatment success. Floyd et al. developed an AI model that analyses clinical notes to evaluate outcomes after treatment of proximal humerus fracture, assessing each patient’s outcome against their flexible and responsive individual treatment goals. Because AI models help practitioners accomplish tasks as intended, the modifiable nature of this technology allows flexibility; however, guidelines and standards, as well as accuracy and validity testing, still need to be implemented to ensure the quality of outcomes. 3 Adoption of AI nevertheless remains limited by uncertainty about accuracy, ethical considerations, and cost. To fully realise its promise, generative AI requires validation and standardisation before routine application or integration into clinical practice.
Technically, generative AI employs methods such as adversarial training, probabilistic encoding, iterative noise reduction, and large-scale language modelling. These methods are implemented in models known as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Diffusion Models and Large Language Models (LLMs), respectively, which must be trained to generate the desired content. While the details differ, their common feature is the ability to synthesise new and realistic data rather than solely predicting outcomes. This capability distinguishes generative models from predictive AI, which is primarily concerned with forecasting clinical risks or the prognosis of a procedure or disease.4,5
This creative ability highlights a critical difference from traditional predictive AI. Predictive models look to the past to forecast the future, answering the clinical question, “What is the most likely outcome?” In contrast, generative models invent new possibilities, asking, “What is an optimal solution we could create?” This is more than a technical distinction; it signals a potential evolution in medicine from a reactive stance on risk management to a proactive approach centred on personalised solution design. 6
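Of the methods named above, iterative noise reduction (the idea behind diffusion models) is the easiest to illustrate in a few lines. The following is a minimal sketch, not a method taken from any study cited here: it assumes a DDPM-style linear noise schedule, uses a random 8×8 array as a stand-in for an image patch, and substitutes the true noise for the prediction that a trained network would normally supply. The function names, step count and schedule values are all illustrative assumptions.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Forward process: blend the clean image x0 with Gaussian noise at step t."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def estimate_clean(xt, t, alpha_bar, eps_hat):
    """Invert the forward blend given a noise estimate eps_hat.

    In a real diffusion model eps_hat comes from a trained neural network;
    here it is passed in directly so the arithmetic can be checked.
    """
    return (xt - np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # hypothetical stand-in for an image patch
alpha_bar = make_alpha_bar()

xt, eps = add_noise(x0, 500, alpha_bar, rng)       # heavily noised version of x0
x0_hat = estimate_clean(xt, 500, alpha_bar, eps)   # "oracle" denoising with the true noise

print("max reconstruction error:", np.abs(x0_hat - x0).max())
```

With the true noise supplied, the clean patch is recovered up to floating-point error. A trained model only approximates that noise, which is why generation proceeds over many small denoising steps rather than one.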
Key terms in artificial intelligence with examples in orthopaedic surgery.
Opportunities in orthopaedic clinical practice
Generative artificial intelligence use in orthopaedics.
AI: artificial intelligence; RCI: rotator cuff injury; cGAN: conditional generative adversarial networks; PEM: patient educational materials; CDC: Centers for Disease Control and Prevention; NIH: National Institutes of Health; GAC: Gwet’s AC1; ML: machine learning; DL: deep learning; PHF: proximal humerus fracture; HA: hip arthroplasty.
Preoperative phase: Enhancing diagnostic precision and surgical planning
Generative imaging is changing how orthopaedic surgeons see and understand patient anatomy. One of the most powerful applications is the ability to create a “digital twin” of a patient’s anatomy, a rich interactive tool for diagnosis and planning. This starts with improving and transforming standard medical images. 5 Models such as GANs and diffusion models can turn low-resolution or low-dose scans into high-quality, detailed images. This improves diagnostic accuracy by revealing subtle problems that would otherwise be missed, and it reduces patient radiation exposure without sacrificing image quality. These models can also algorithmically reconstruct parts of an image that are missing or unclear, giving a more complete picture for clinical evaluation. 18
A major innovation with substantial clinical benefit is the ability to automatically create 3D anatomical models from standard 2D images. For example, studies have used GAN-based models to generate accurate 3D models of the human spine from a pair of regular X-rays. This gives surgeons a detailed volumetric view of complex anatomy that was previously only possible with more expensive, higher-radiation methods such as CT scans. This could make advanced anatomical visualisation more widely available, especially in places with only basic X-ray equipment. 19 Beyond improving existing images, generative models can create entirely new synthetic X-rays. Research has shown that GANs can generate high-resolution synthetic knee X-rays that show the stages of arthritis and are so realistic that even surgeons and expert radiologists cannot tell them apart from real images, with an identification accuracy of between 34% and 50% according to Ahn et al. 10 These synthetic images can be used to expand limited datasets for research, train new clinicians, and develop and test other AI diagnostic tools without risking patient privacy.10,20
Personalised surgical blueprints
Generative AI moves us from one-size-fits-all surgical planning to highly personalised surgical blueprints for each patient’s unique anatomy and condition. By processing a patient’s CT or MRI, AI algorithms can quickly and automatically create detailed, interactive 3D models of bones and surrounding tissues. These models let surgeons see complex fractures, tumours or deformities from any angle, measure important parameters and develop a precise individual surgical plan. 21 The 3D models created by AI are the foundation for surgical simulations. Using virtual reality (VR) or augmented reality (AR), surgeons can “rehearse” a procedure on a patient’s digital twin. They can practise surgical techniques, try different approaches, anticipate problems and optimise implant placement, all in a risk-free virtual environment. This is especially valuable for complex procedures such as joint replacement, bone tumour removal and corrective spinal surgery, where it has been shown to increase surgical success and reduce operating time.22,23 Generative AI can also create custom orthopaedic implants. By analysing a patient’s specific anatomical data (bone density, geometry and movement patterns), AI algorithms can design prosthetics and implants that closely match the patient’s anatomy. 24 This level of personalisation is expected to improve implant lifespan, patient comfort and functional outcomes.24–26
A conditional GAN (cGAN) uses deep learning: the model is trained to generate a target image from an input image, constructing a realistic visualisation of better quality than traditional raw data-based approaches.13,15 A study by Kim et al. showed that a trained cGAN model produced better quality, in terms of resolution and performance, than the untrained GAN model. However, the database used for training affects output quality, and unseen data may be a weakness when generating the intended images. Producing high-quality scans with this technology provides a better understanding of the patient’s clinical anatomy, which in turn will result in better assessment and surgical outcomes. 13 In a study by Zhao et al., virtual reduction of femoral fractures by a trained cGAN achieved a higher satisfaction rate than manual reduction by orthopaedic surgeons. A virtual fracture reduction prepared in advance may serve as a target for the actual open reduction, and may be embedded in virtual reality glasses to assist the operation in real time. 15
Intraoperative phase: Augmenting surgical execution
While generative AI in surgery is new, it can provide real-time intelligent support for a surgeon’s skill and decision-making. Here we focus on dynamic adjustments and better navigation. AI-powered systems can improve surgical navigation by combining preoperative 3D models with real-time intraoperative imaging (such as fluoroscopy or ultrasound). This creates a live augmented view of the surgical field, so the surgeon can track instruments more precisely relative to the patient’s anatomy. In robotic-assisted surgery, which is becoming more common in orthopaedics, AI algorithms can analyse data from sensors and cameras to adjust the surgical plan in real time. This allows the system to account for small changes in patient position or tissue movement, making the robot safer and more accurate in actions such as cutting bone or placing screws. These systems can also provide real-time decision support, for example by analysing intraoperative data to predict the stability of an implant and suggest adjustments to the surgeon.21,27
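Combining a preoperative 3D model with intraoperative imaging requires, at some stage, bringing the two into a common coordinate frame. The cited work does not specify how this is done, so the following is only a sketch of one classical building block: least-squares rigid registration of paired landmarks (the Kabsch method). The landmark coordinates, the 10-degree rotation and the translation are hypothetical values chosen so that the result can be checked.

```python
import numpy as np

def rigid_register(source, target):
    """Least-squares rotation R and translation t so that target ≈ source @ R.T + t.

    source, target: (N, 3) arrays of paired landmark coordinates.
    """
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # Cross-covariance of the centred point sets
    H = (source - src_c).T @ (target - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t

rng = np.random.default_rng(42)
# Hypothetical landmarks (mm) picked on the preoperative model
preop_landmarks = rng.uniform(-50.0, 50.0, size=(6, 3))

# Simulate the intraoperative pose: 10 degree rotation about z plus a shift
angle = np.deg2rad(10.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([5.0, -3.0, 12.0])
intraop_landmarks = preop_landmarks @ R_true.T + t_true

R_est, t_est = rigid_register(preop_landmarks, intraop_landmarks)
aligned = preop_landmarks @ R_est.T + t_est
rms_error = np.sqrt(np.mean(np.sum((aligned - intraop_landmarks) ** 2, axis=1)))
print("RMS landmark error (mm):", rms_error)
```

In this noise-free example the simulated pose is recovered almost exactly. Real intraoperative data are noisy and tissue can deform, so a rigid fit of this kind would be at most one component of a navigation pipeline.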
At present, AI is not widely regarded as significantly improving surgical outcomes, although AI robotics for surgical procedures is emerging; more commonly, practitioners use AI to support image-based diagnosis and intraoperative assessment. 2 Hernigou et al. successfully used AI technology to analyse 3D models and determine each patient’s precise, individual real axis of ankle motion, which aids axial alignment of the talar implant in total ankle arthroplasty. 9 Kim et al. suggested using cGAN to increase the image quality of the recently emerging C-arm CT, whose resolution and performance are often inadequate for clinical use. 13
Postoperative phase: Personalising recovery and rehabilitation
After surgery, generative AI can create more personalised and closely monitored recovery plans to optimise outcomes and reduce complications. Instead of a one-size-fits-all protocol, AI can create a dynamic schedule of exercises tailored to an individual’s needs, range of motion and muscle strength. By drawing on patient-specific data, such as the type of surgery, preoperative functional levels and real-time data from wearable sensors, generative AI can create custom rehabilitation programmes. AI-guided gait analysis using cameras or wearable sensors can track a patient’s progress.28,29 However, these rehabilitation sequences still require supervision by physicians. 30 By analysing movement data, these systems can predict functional problems and recovery speed, so therapists can adjust the rehabilitation plan in real time.28,29
Generative language models such as ChatGPT can help answer patients’ postoperative concerns. Dubin et al. found that ChatGPT and arthroplasty-trained nurses provided similarly appropriate answers to common postoperative questions about total hip arthroplasty. Most patients did not trust ChatGPT to answer their postoperative questions; however, when the answers were presented blinded, they were more comfortable with the answers given by ChatGPT. 31 Another study by Gök et al. compared the performance of ChatGPT-4.0 and DeepSeek-V3 in answering commonly asked questions about total knee arthroplasty and found that ChatGPT-4.0 provided more descriptive and detailed content, although the two were similar in scientific accuracy and understandability. 32
Transforming the landscape of orthopaedic research and education
LLMs are now trained to generate sophisticated human-like answers to prompts.17,33 ChatGPT, when compared with the Google search engine, generates more academic responses (93% vs 46%). 17 When given prompts based on the American Academy of Orthopaedic Surgeons clinical practice guidelines (AAOS CPGs), Nian et al. found that the newest LLMs at that time delivered responses that did not differ significantly from the existing guidelines. 16 Holland et al. evaluated abstracts generated by ChatGPT-4.0 and ChatGPT-3.5 against those written by a surgical resident and a senior author; surgeon reviewers subjectively ranked the AI-generated abstracts above the human-written ones. Trustworthy uses of these chatbots are currently limited: they have shown ability in translation and in organising hospital electronic medical records, but they often spread misinformation drawn from falsified or unsourced data and produce biased analyses. These LLMs lack creativity, as they depend on the given prompt, and often require multiple prompt modifications before a satisfactory answer is generated. Validation should be a compulsory step if LLMs are used to generate scientific text. 33
Navigating the challenges and limitations
Generative AI may produce artificial hallucinations while generating content, raising the possibility that these technologies provide users with fabricated information. Generative LLMs, for example, often deliver answers as definitive statements, misleading users when the answer is not actually based on any reference material. 12 The quality of the learning source and the choice of learning model determine the quality of the analysis and, eventually, of the generated output, which often falters when the model encounters unseen data. Different intended uses of generative AI require different algorithmic models, which in turn affects the type of source material, usually comprising datasets. Regularisation techniques, improved datasets, and transfer learning are potential solutions to this problem. 13
Ethical concerns about AI application arise from its social and moral aspects. The tendency of generative AI language models to give incorrect answers that sound plausibly correct is one of the factors hindering the use of AI, making it unreliable in practice. 34 Different AI models, even when analysing the same dataset, can produce outcomes of different quality, as shown by Floyd et al.; hence algorithmic bias, data quality, and interpretation should also be considered when integrating AI into clinical practice.3,35
Policy and economic factors also hinder the integration of AI technologies in clinical orthopaedics. 34 The availability of AI technologies is affected by cost and regulation. Insurance policies commonly do not cover the use of AI technologies. 2 The market value of AI technologies keeps expanding year on year, mirroring the constant pace of development. 12
The reliability of generative AI technology is inconsistent, which is unacceptable if it is to be implemented in daily practice. 11 Using ChatGPT, a popular generative AI model, Agharia et al. found that the model’s ability to generate decisions for orthopaedic patients was inconsistent compared with that of practitioners.11,36 A similar study by Isleem et al. found that ChatGPT correctly answered only 60.8% of 301 orthopaedic board exam-style questions, similar to the scores of interns and junior residents, and was limited in answering complex questions that require intuition. 12 Practitioners’ willingness to use AI is also a factor in AI adoption. Integrating novel technology into clinical practice requires training, which is often expensive and time-consuming, possibly making it unappealing to older practitioners. AI technologies are not well understood by some, with practitioners showing confusion about the differences between AI, machine learning (ML) and deep learning (DL). 2
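For scale, 60.8% of 301 questions corresponds to about 183 correct answers. The short sketch below makes that arithmetic explicit and attaches an approximate 95% Wilson score interval to convey how much uncertainty a sample of this size carries. The count of 183 is inferred here by rounding, and the interval is an illustration added for this purpose, not part of the cited study’s analysis.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1.0 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

total_questions = 301
reported_accuracy = 0.608
# Inferred count of correct answers (rounded); not stated in the text above
correct = round(reported_accuracy * total_questions)

low, high = wilson_interval(correct, total_questions)
print(f"{correct}/{total_questions} correct; approx. 95% interval {low:.3f} to {high:.3f}")
```

The interval covers only sampling variability across questions. It does not address run-to-run inconsistency of the model’s answers, which is the concern raised above.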
Future directions: Toward an autonomous and ethical orthopaedic ecosystem
Most practitioners see the use of big data as a tool to improve diagnosis and treatment, while a minority have expressed concern that it will instead worsen them. The outlook for AI in orthopaedics remains ambiguous: its potential in image-based diagnosis, especially in osteoarthritis, and in patient outcome prediction is eagerly anticipated, while regulation, cost–value concerns, and scarcity of evidence-based applications remain barriers. 2 Questions of legal liability remain unresolved, as accountability for poor outcomes involving AI-assisted planning is unclear, and bias in datasets risks perpetuating inequities if underrepresented populations are not adequately included. AI learning and algorithms can be embedded in various types of technology. Implementing generative AI in low-quality imaging machines has been suggested as a way to enhance image quality, potentially lowering the cost of the repeat scans needed for diagnostic confirmation. 13 Together, these challenges emphasise why enthusiasm for generative AI is tempered by caution within the orthopaedic ecosystem.
Conclusion
Generative AI is frequently used in orthopaedic settings. With continuous development of the technologies, generative AI could aid practice, even during surgical procedures. Robotic-assisted surgery, predictive measurements, and communication aids are the generative AI technologies currently applicable in clinical practice. The potential of generative AI to synthesise new imaging data is promising; however, supervision and validation are still needed. There are also concerns regarding the ethics, legality, reliability, and cost of generative AI. Future studies should focus on enhancing the accuracy, quality, and clinical applicability of generative AI.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
