Abstract
Artificial intelligence (AI) is rapidly transforming surgical care, with growing integration across all phases from preoperative planning to postoperative recovery. The role of AI in postoperative care represents a particularly promising frontier. Applications such as AI-generated discharge instructions, conversational chatbots, and computer vision–based wound monitoring have the potential to improve comprehension, enhance patient satisfaction, and reduce unnecessary health care utilization. While the economic impact remains underexplored, these innovations could substantially lower health care costs through safe, responsible implementation. This review explores current and emerging uses of AI in surgical aftercare, emphasizing its capacity to simplify complex information, provide accessible guidance beyond clinician availability, and enable early detection of complications. Thoughtful adoption of these tools may help bridge health literacy gaps, advance equitable care delivery, and optimize outcomes, particularly for underserved and marginalized populations.
Introduction
The transition from hospital to home is one of the most vulnerable phases in the surgical care continuum.1 Patients frequently leave the hospital with standardized postoperative instructions but limited access to real-time support, which can lead to confusion, anxiety, and avoidable health care utilization. Uncertainty about pain, wound appearance, activity restrictions, or medications often prompts patients to seek reassurance in the emergency department (ED)—a setting that is costly, resource-intensive, and rarely necessary for minor postoperative concerns.2,3 This pattern not only highlights gaps in current postoperative education and communication, but also underscores the need for more effective patient-centered solutions.
Emergency department visits after surgery are common, with 30-day ED visit rates ranging from 6.2% to 11% and 90-day rates up to 21.1% for certain procedures such as total joint arthroplasty.4-7 A significant proportion of these encounters are classified as “avoidable,” suggesting that a more appropriate, lower-acuity setting could have addressed the patient’s needs.4,5 Flood et al found that 11.4% of patients who underwent elective thoracolumbar surgery returned to the ED within 6 months for potentially avoidable reasons.7 A study by Smith et al of patients who had undergone bariatric surgery demonstrated that triaging potentially avoidable ED visits to urgent care centers (UCCs) would yield cost savings of $4238 per patient (an aggregate savings of $1.6 million for the entire study cohort).2 Importantly, patient concerns are often addressed in discharge instructions, yet patients remain unsure of or unable to apply the guidance, reflecting persistent communication gaps.8,9 Such missed opportunities to reinforce or personalize instructions contribute to unnecessary ED utilization, increased health care costs, and patient dissatisfaction.
Artificial intelligence (AI) (particularly large language models (LLMs) and computer vision (CV) applications) offers an opportunity to transform this phase of care. AI-driven chatbots can provide patients with continuous access to understandable, guideline-based answers, while CV tools can assist in monitoring wound healing and surgical sites remotely. By bridging communication gaps and supporting timely triage, AI has the potential to reduce avoidable ED visits, improve patient confidence in recovery, and lessen the financial and clinical strain on an already overburdened health care system.
Postoperative ED Visits and Health Care Burden
Emergency department overutilization is both a clinical and financial challenge. Treating a non-emergent condition in the ED costs, on average, more than $2,000, which is roughly 10 to 12 times the cost of a physician office or urgent care visit.10 Studies suggest that 13% to 27% of ED visits could be safely redirected to lower-acuity settings, generating billions of dollars in annual cost savings.11 As noted earlier, in surgical patients specifically, redirecting even a single avoidable visit can save thousands of dollars per patient and millions at the population level.2 Yet the ED remains a default destination for many postoperative concerns, serving as an unintended gateway to readmission.
Beyond cost, postoperative ED utilization exposes disparities rooted in social determinants of health.12 Patients with limited support systems, lower health literacy, or public insurance are disproportionately likely to return to the ED.12 For uninsured patients, the financial impact is particularly severe: the median charge for an uninsured treat-and-release ED visit grew by 141% between 2006 and 2017, leaving nearly 1 in 5 at risk of catastrophic health expenditure from a single encounter.13 These inequities underscore the urgency of developing scalable, accessible solutions that can support vulnerable populations in the immediate recovery period.4
Finally, health literacy further complicates the issue. Nearly one-third of adults in the United States (US) have limited health literacy, yet most postoperative instructions are written at or above a 10th-grade reading level, well above the sixth-grade level recommended by the American Medical Association (AMA) and National Institutes of Health (NIH).14-17 This mismatch increases the risk of misunderstood instructions, preventable complications, and avoidable health care use. Effective interventions must therefore have a dual focus: improving accessibility and comprehension while simultaneously tailoring support to individual patient needs.
Potential for AI to Impact Postoperative Care and Instruction
This review explores 3 promising applications of AI in postoperative care that address the challenges highlighted above. First, AI can enhance patient education by generating personalized instructions that are clear, actionable, and appropriately tailored to health literacy levels. Second, AI-driven chatbots can provide round-the-clock, patient-specific support, improving communication, triage, and confidence in recovery. Finally, AI-powered computer vision can remotely monitor surgical sites for infection or complications, enabling earlier intervention while reducing unnecessary clinic or ED visits. Collectively, these applications have the potential to improve patient outcomes and satisfaction, lower health care costs, and reduce disparities in postoperative care.
AI-Generated Postoperative Instructions
Health literacy plays a pivotal role in surgical recovery.18 Patients with higher health literacy experience fewer complications and shorter hospital stays, while those with lower literacy face increased risks of readmission, complications, and poor adherence to care plans.18 Even when patients report understanding their discharge instructions, recall of key details is often limited, likely due to the stress and emotional burden surrounding surgery.9 While literature directly linking health literacy to recall of postoperative instructions is scarce, studies have consistently shown that better discharge instructions reduce complications and improve outcomes, highlighting the need for more accessible and actionable communication.9,19
Large language models offer a promising avenue to bridge this gap. These systems can transform complex, jargon-filled postoperative instructions into content that is clearer, more understandable, and aligned with patients’ comprehension levels.20,21 Early studies show that LLMs can answer postoperative questions at a knowledge level comparable to surgical trainees, and patients often report perceiving these responses as both accurate and understandable.20 Seth et al in 2023 published a study evaluating AI’s ability to answer postoperative patient questions regarding breast augmentation surgery.21 ChatGPT-4 was asked to answer 6 commonly asked questions pertaining to breast augmentation, and the quality of the generated responses was evaluated by specialist plastic and reconstructive surgeons. An additional cross-check of the relevancy of the responses was performed through a literature search in the PubMed and Cochrane databases.21 The authors noted that AI-generated answers were well written and understandable; however, they lacked individualization and occasionally relied on outdated references.21 Studies that apply readability, understandability, and actionability metrics reinforce this pattern. Shaari et al used the Flesch-Kincaid grade-level readability formula and the Patient Education Materials Assessment Tool (PEMAT) understandability score (where a score above 70% indicates easily understandable reading material) to analyze postoperative instructions for 4 separate rhinology procedures.22 The authors reported that ChatGPT produced material at an 8th-grade reading level and Google Gemini at a 9th-grade level, both of which remain above the 6th-grade level recommended by the AMA/NIH but nonetheless improve on the reading level of postoperative material previously described.15,16,22 Both LLMs exceeded the PEMAT understandability benchmark of 70% used for medical handouts, with scores ranging from 91% to 100%.22,23 Similarly, Dihan et al (2024) found PEMAT understandability scores of greater than 70% when analyzing patient education materials for pediatric cataract generated by multiple LLMs (ChatGPT 3.5, ChatGPT-4, and Google Bard), while Mohan et al (2025) found an even higher understandability score, over 80%, when evaluating postoperative instructions for 4 common facial trauma procedures.23,24
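For readers unfamiliar with the metric, the Flesch-Kincaid grade level used in these studies is a fixed formula over word, sentence, and syllable counts. The sketch below is illustrative only, using a rough vowel-group heuristic for syllables (published readability tools use dictionary-based syllable counters), not the tooling any cited study used:

```python
import re

def count_syllables(word):
    # Rough heuristic: count vowel groups, keeping consonant-"le" endings
    # and trimming an otherwise silent trailing "e".
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1 and not word.endswith(("le", "ee")):
        n -= 1
    return max(n, 1)

def flesch_kincaid_grade(text):
    # FK grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words)) - 15.59)
```

Because the formula rewards short sentences and short words, simply prompting an LLM toward shorter constructions mechanically lowers the reported grade level, which is why prompt wording matters so much in these evaluations.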
Future integration of LLMs into electronic medical records (EMRs) could allow tailoring of discharge instructions to each patient’s surgical procedure, comorbidities, medications, and social context. This would be especially valuable for vulnerable populations with limited health literacy, who face disproportionate barriers to recovery. By making instructions individualized and comprehensible, AI-powered interventions have the potential to improve adherence, reduce complications, and decrease avoidable health care utilization.
Hybrid Interactive Models for Postoperative Care
Beyond static instructions, conversational AI chatbots represent an important step toward interactive postoperative care. These systems allow patients to access real-time, 24/7 guidance in the absence of immediate clinician availability, while incorporating medically vetted knowledge and physician oversight to minimize errors.25,26 When paired with AI-generated, patient-centric discharge instructions, chatbots could create a continuous safety net, reduce unnecessary ED visits, and improve patient confidence during recovery.
The capability of general-purpose LLMs to answer postoperative questions has recently been explored within plastic surgery. Gomez-Cabello (2024) found that ChatGPT 3.5, ChatGPT-4, and Gemini produced reasonably accurate responses, with scores of 4.19, 4.16, and 4.09, respectively, on a 5-point Likert scale when graded by plastic surgeons.20 Similarly, Abi-Rafeh et al (2024) reported that ChatGPT identified the most likely diagnosis with 91% accuracy when providing postoperative medical support following aesthetic surgery.27 Despite these promising results, limitations remain for general-purpose LLMs. Seth et al (2023) emphasized that LLM responses tended to be accurate but were often generalized and nonspecific rather than individualized, a theme that Cox et al (2023) also observed while studying ChatGPT responses to common postoperative blepharoplasty questions.22,28 The Cox et al study further noted that the LLM it tested was limited to pre-2021 data, so its information sometimes drew on outdated resources.28 Readability was also a notable concern; for example, when not specifically instructed to answer questions at an AMA/NIH-acceptable reading level, ChatGPT-3.5, ChatGPT-4, and Gemini struggled to produce responses below a 10th-grade reading level, with the majority of ChatGPT-4 and Gemini responses requiring a college reading level to comprehend.20 Collectively, these findings suggest that while general-purpose LLMs can provide broadly correct information, serving as dependable postoperative chatbots will require both technological refinement and careful prompting.
Unlike general-purpose LLMs, clinically oriented chatbots are trained on medical guidelines and often employ rule-based safeguards. For example, a 2025 study of an AI follow-up system for patients with coronary heart disease combined AI with physician oversight, yielding high patient satisfaction and significant reductions in time and workload, saving more than 13 clinician hours per 100 patients.25 The system incorporated physician input, field-specific literature, and rule-based safeguards to enhance accuracy and prevent the generation of fabricated information, a well-documented issue with LLMs.25 Smaller studies echo these findings: one pilot trial in hip arthroplasty patients reported 80% satisfaction and 79% accuracy across more than 100 questions (Dwyer et al, 2023), with nearly half of patients reassured enough by chatbot responses to forgo further medical contact.26 Such data underscore the potential of AI chatbots to reduce unnecessary health care utilization and enhance the patient experience.
Despite the immense potential highlighted above, reliability remains a barrier to adoption. Accuracy rates below 100% in the absence of a supervising clinician carry clear risks, and while no adverse outcomes were reported in early studies, scaling to larger populations demands caution.25,26 Continuous refinement of model accuracy, incorporation of real-time physician oversight, and robust safety checks will be essential before widespread deployment. With these safeguards in place, hybrid interactive models could meaningfully reduce costs, increase efficiency, and support more patient-centered postoperative care.
Surgical Site Infection (SSI) Monitoring and Computer Vision Applications
Computer vision (CV) represents another frontier in AI-enhanced postoperative care, particularly for surgical site monitoring. Surgical site infections and graft complications remain leading causes of preventable morbidity, yet early detection is challenging without frequent in-person visits.29 Artificial intelligence-enabled CV offers the potential for patients to submit images of their surgical site for rapid, remote triage, allowing clinicians to prioritize follow-up and intervene before complications escalate.
Recent studies demonstrate encouraging performance. In one study of 6060 patients across multiple Mayo Clinic sites, a Vision Transformer model achieved 73% accuracy in detecting SSIs from images taken up to 30 days after surgery.30 Images were scaled to maintain aspect ratio and minimize distortion in an effort to boost model performance. A small pilot study of the RedScar app reported 100% sensitivity and 83% specificity in detecting SSIs among patients undergoing abdominal surgery.29 Other models, such as DenseNet121, have shown even higher accuracy in specialized applications. Kim et al used DenseNet121 to analyze 5506 images of free flap sites and identify abnormal flap perfusion.31 The model demonstrated strong performance in detecting signs of flap insufficiency, with 97.5% accuracy for venous insufficiency and 92.8% accuracy for arterial insufficiency.31 These findings suggest that AI can effectively interpret patient-captured images despite challenges such as inconsistent image quality. If validated in larger, diverse cohorts, such systems could streamline postoperative surveillance, reduce unnecessary clinic visits, and ensure earlier intervention when true complications arise.
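The sensitivity and specificity figures reported for these models reduce to simple ratios over a confusion matrix. The sketch below uses hypothetical counts chosen only to illustrate how a tool could post 100% sensitivity while still mislabeling some healthy wounds; these are not the studies' actual data:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP/(TP+FN): fraction of true infections the model flags.
    Specificity = TN/(TN+FP): fraction of uninfected wounds correctly cleared."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical cohort: every infection caught (fn=0), but 17 of 100
# uninfected wounds falsely flagged.
sens, spec = sensitivity_specificity(tp=12, fn=0, tn=83, fp=17)
```

For postoperative triage the asymmetry matters: high sensitivity keeps true SSIs from being missed, while imperfect specificity mainly generates extra clinician reviews or clinic visits, a far less dangerous failure mode.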
Significant barriers nonetheless remain before widespread implementation. Current models have been tested primarily in controlled research settings, not in diverse real-world populations. Notably, performance across different skin tones has not been adequately addressed, raising concerns about equity in diagnostic accuracy.31 Larger randomized trials, broader data set development, and integration with clinical workflows will be critical next steps. Despite these challenges, CV-based surgical site monitoring has strong potential to enhance safety, efficiency, and patient engagement in postoperative care.
Challenges and Future Directions
Artificial intelligence-based postoperative tools, ranging from tailored discharge instructions to conversational chatbots and CV wound monitoring, share a unifying goal: to improve patient comprehension, enhance satisfaction, and reduce unnecessary health care utilization.4-7,25,32 The early evidence is encouraging: AI-generated instructions can simplify complex medical information, chatbots provide continuous and empathetic support, and CV technology shows promise in detecting complications such as SSIs or flap insufficiency before patients present to the ED. Such advances signal a shift toward more accessible, patient-centered, and resource-efficient care.
Key challenges, however, remain. Readability and actionability scores for AI-generated instructions often fall short of AMA and NIH standards unless the models are explicitly prompted.22-24 The PEMAT defines actionability as the degree to which “consumers of diverse backgrounds and varying levels of health literacy can identify what they can do based on the information presented,” with a score of ≥70% considered the benchmark.23,33 Multiple studies discussed in the preceding sections found only marginally adequate, and in some cases subpar, results in this domain, highlighting opportunities for improvement.22-24 Zhang et al (2024) achieved high readability at a 6th-grade level when specifically prompting AI to rewrite postoperative reading material but noted decreased actionability scores with simpler text.32 In contrast, Swisher et al (2024) found significant readability gains (Flesch Reading Ease of 70.8 vs 43.9 for ChatGPT-revised vs original material) without improvement in actionability.34 These discrepancies underscore the need for physician oversight to ensure that AI output is both accurate and actionable. Few studies use rigorous designs, and there is a lack of randomized controlled trials directly comparing patient comprehension of AI-generated vs clinician-authored postoperative instructions, limiting the strength of the current evidence.22
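For context, the Reading Ease scores contrasted above come from the standard Flesch formula, which penalizes long sentences and polysyllabic words; a one-line sketch (the word, sentence, and syllable counts are inputs here, not computed from text):

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text. Roughly, 60-70
    corresponds to plain English, while scores below 50 read as difficult."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
```

Because the formula measures only surface features, a rewrite can move a document from the "difficult" band into the "plain English" band without adding any concrete next steps for the patient, which is exactly the readability-versus-actionability gap these studies describe.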
The field of research also remains in its infancy. Most studies are small feasibility trials, often measuring satisfaction or efficiency rather than clinical outcomes such as complication rates, ED visits, or readmissions. Larger, surgery-specific randomized controlled trials are needed to determine whether these tools meaningfully impact outcomes and cost. AI performance varies depending on the extent of the context used for its training and the rules implemented to limit hallucinations.25,35 Equally important, the responsibility for AI-generated content remains ambiguous, and clear delineation of ownership and accountability will be essential before these tools can be fully endorsed in clinical care. Future research must further address equity, particularly health literacy disparities and the performance of CV algorithms across diverse skin tones.30,31 Integration into electronic medical records, with physician oversight and safeguards against misinformation, will be essential for safe, scalable implementation.
Conclusion
Artificial intelligence-driven tools described in this review represent a promising frontier in postoperative care, with the potential to improve comprehension, provide postoperative support, and reduce unnecessary health care utilization. Widespread adoption requires stronger evidence from randomized trials, validation in diverse surgical populations, and careful integration into clinical workflows with appropriate oversight. Continued innovation, paired with rigorous evaluation, will be essential to translating these technologies into safe, effective, and cost-efficient improvements in surgical care while helping to bridge persistent gaps in health care disparities.
Footnotes
Author Contributions
NL and AJM—Conceptualization, draft of preliminary manuscript, revisions incorporating critical intellectual content, and approval of final version.
DL and CEC—Conceptualization, critical review of manuscript with revisions for incorporating intellectual content, and approval of final version.
AR—Senior author, conceptualization, critical review of manuscript with revision for incorporating intellectual content, and approval of final version.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
