Abstract
Background:
Few studies have evaluated whether large language models, such as ChatGPT, can provide accurate guidance to clinicians in the field of foot and ankle surgery. This study aimed to assess the accuracy of ChatGPT's responses regarding ankle cartilage repair by comparing them with the consensus statements from foot and ankle experts as a standard reference.
Methods:
The large language model ChatGPT-4 (OpenAI) was asked to answer a total of 14 questions on debridement, curettage, and bone marrow stimulation for ankle cartilage lesions, selected from those developed at the 2017 International Consensus Meeting on Cartilage Repair of the Ankle. The ChatGPT responses were compared with the consensus statements developed at this international meeting. A Likert scale (scores, 1-5) was used to evaluate the similarity of the answers by ChatGPT to the consensus statements. Four scoring categories (Accuracy, Overconclusiveness, Supplementary, and Incompleteness) were also used to evaluate the quality of the ChatGPT answers, following previous studies.
Results:
The mean Likert scale score for the similarity of ChatGPT’s answers to the consensus statements was 3.1 ± 0.8. In the 4 scoring categories, the percentages of answers rated “yes” for Accuracy, Overconclusiveness, Supplementary, and Incompleteness were 71.4% (10/14), 35.7% (5/14), 78.6% (11/14), and 14.3% (2/14), respectively.
Conclusion:
This study showed that ChatGPT-4 often provides responses that diverge from expert consensus regarding surgical treatment of ankle cartilage lesions.
Level of Evidence:
Level V, expert opinion.
Introduction
Chondral or osteochondral lesions of the ankle are common injuries in young active subjects and are often associated with trauma such as ankle sprains or fractures.21,30 Several surgical procedures for ankle cartilage lesions have been reported with favorable outcomes. Among these, bone marrow stimulation (BMS) is the most widely performed, and its favorable long-term outcomes have been reported.3,22,25 However, the literature on ankle cartilage repair is still based on lower-quality scientific evidence,23 and no superior surgical treatment for either primary or secondary osteochondral lesions of the talus exists.2,5,13 Thus, evidence-based consensus statements following the International Consensus Meeting on Cartilage Repair of the Ankle have emerged as a way to provide guidance for the surgical treatment of ankle cartilage injuries.6,10,24
The application of artificial intelligence (AI) to medicine is highly alluring and has gained a high degree of interest and popularity recently.1 Rapid advancements in large language models (LLMs), such as Chat Generative Pre-trained Transformer (ChatGPT), Google Bard, and Microsoft’s Bing Chat, have the potential to open a new era of clinical practice and research. These LLMs are freely and readily available to patients and clinicians, and their reliability and accuracy have recently been investigated in several studies on orthopaedic surgery.4,9,16,29 It has been shown that ChatGPT can perform at the same level as orthopaedic surgery residents on the Orthopaedic In-Training Examination.9,17 Some previous studies have also reported that ChatGPT can correctly provide evidence-based responses to questions frequently raised by patients.11,20 However, other studies have shown that these LLMs do not provide clinically supported answers for hip and knee arthroplasty.18,31
In the field of foot and ankle surgery, there is a lack of evidence regarding the reliability and usefulness of LLMs. Therefore, whether ChatGPT can offer adequate expertise in foot and ankle surgery remains unclear. This study aimed to assess the accuracy of ChatGPT's responses regarding ankle cartilage repair by comparing them with the consensus statements from foot and ankle experts as a standard reference.
Methods
This study was exempt from institutional review board approval because broadly available free chatbots were used and human or animal subjects were not included.
Questions and Consensus Statements
Of the consensus statements developed at the 2017 International Consensus Meeting on Cartilage Repair of the Ankle, the present study collected the questions and consensus statements on “Debridement, Curettage and Bone Marrow Stimulation.”10 These questions and statements on debridement, curettage, and bone marrow stimulation (BMS) for osteochondral lesions (OCLs) of the ankle were selected as the standard reference because BMS is one of the most commonly performed surgical procedures for ankle OCLs.26 Furthermore, BMS is among the most widely available procedures in low-resource health care systems.26 In addition, long-term outcomes of BMS are available in the literature.22,25
In this study, a total of 15 questions regarding debridement, curettage, and BMS were formalized by a working group of 7 international foot and ankle experts.10 Agreement among the experts was pursued using the modified Delphi format reported by Linstone and Turoff.15 The grades of consensus were defined as follows: consensus, 51% to 74% agreement; strong consensus, 75% to 99% agreement; and unanimous, 100% agreement. Of the 15 questions and consensus statements, 1 achieved unanimous consensus, 12 achieved strong consensus, 1 achieved consensus, and 1 question was removed. Thus, the 14 retained questions and consensus statements from this reference work (1 unanimous, 12 strong consensus, and 1 consensus)10 were used in the present study to evaluate the answers provided by ChatGPT.
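For illustration, this grading rule is a simple threshold mapping from the percentage of expert agreement to a consensus grade. A minimal sketch in Python follows; the function name and the handling of agreement below 51% are our own assumptions, not part of the published methodology.

```python
def consensus_grade(agreement_pct: float) -> str:
    """Map a percentage of expert agreement to the consensus grade
    defined at the 2017 International Consensus Meeting."""
    if agreement_pct == 100:
        return "unanimous"
    if agreement_pct >= 75:
        return "strong consensus"
    if agreement_pct >= 51:
        return "consensus"
    # Agreement below 51% is assumed to mean that no consensus was
    # reached (this case is not described in the text above).
    return "no consensus"


print(consensus_grade(82.0))  # -> "strong consensus"
```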
A new user account for the LLM chatbot ChatGPT-4 (OpenAI; https://chat.openai.com/) was created to prevent past prompts from influencing the responses of ChatGPT-4 through its memory function. The aforementioned 14 questions were fed into ChatGPT on February 6, 2025. Each answer by ChatGPT was then recorded with no clarification or follow-up. To simulate the behavior of a nonspecialized user outside of a research environment, such as a patient or a clinician in clinical practice, the questions were fed directly to the LLM without any special prompting techniques.
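The questions in this study were entered through the ChatGPT web interface. For readers who wish to reproduce a comparable single-turn query programmatically, a minimal sketch using the OpenAI Python client is shown below; this is an illustrative assumption, not the method used in this study, and both the example question and the "gpt-4" model identifier are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Hypothetical example question; not one of the 14 study questions.
question = (
    "Is bone marrow stimulation an appropriate first-line treatment "
    "for a small osteochondral lesion of the talus?"
)

# Each question is sent as a fresh, single-turn conversation with no
# system prompt, mirroring the study's no-special-prompting protocol.
response = client.chat.completions.create(
    model="gpt-4",  # assumed identifier; the study used the ChatGPT-4 web product
    messages=[{"role": "user", "content": question}],
)
print(response.choices[0].message.content)
```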
The Evaluation of Answers by ChatGPT
The answer by ChatGPT to each question was compared with the corresponding consensus statement.10 Two senior orthopaedic surgeons independently performed the comparison, and the Likert scale method and 4 scoring categories were used to evaluate the quality of the answers by ChatGPT. When discrepancies between the 2 examiners were detected, a third senior orthopaedic surgeon provided the final decision. The Likert scale was used to evaluate the similarity of the answers by ChatGPT to the consensus statements as follows: 5, completely identical; 4, almost identical; 3, neither the same nor different; 2, slightly different; and 1, completely different.27,29
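A minimal sketch of this rating-resolution rule follows; the function name and signature are our own, for illustration only.

```python
from typing import Optional


def final_rating(examiner_1: int, examiner_2: int,
                 third_examiner: Optional[int] = None) -> int:
    """Return the final rating: the shared value when the 2 examiners
    agree; otherwise the third senior surgeon's rating decides."""
    if examiner_1 == examiner_2:
        return examiner_1
    if third_examiner is None:
        raise ValueError("Discrepancy detected: a third examiner's rating is required")
    return third_examiner


print(final_rating(3, 3))     # -> 3 (examiners agree)
print(final_rating(2, 3, 3))  # -> 3 (resolved by the third examiner)
```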
Four scoring categories were used to evaluate the quality of the ChatGPT answers based on the evaluation framework used in previous studies7,19 (Table 1).
Table 1. Four Scoring Categories for the Assessment of Responses by ChatGPT.
Statistical Analyses
Statistical analyses were performed using SPSS software (version 26.0; IBM Corp, Armonk, NY). The interrater reliability of the assessments by the 2 examiners was evaluated using the κ statistic. The κ coefficient for interrater agreement was graded using the Landis classification as follows: slight, 0.0 to 0.20; moderate, 0.21 to 0.60; and substantial, 0.61 to 1.00.12,14
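Cohen's κ adjusts the observed agreement between 2 raters for the agreement expected by chance, κ = (p_o − p_e)/(1 − p_e), where p_o is the observed agreement and p_e the chance agreement. A minimal Python sketch of this computation is shown below; the Likert ratings are invented placeholders for illustration, not the study data.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical Likert scores (1-5) assigned by the 2 examiners to the
# 14 ChatGPT answers; these values are placeholders, not study data.
examiner_1 = [3, 2, 3, 4, 3, 2, 3, 4, 2, 3, 4, 3, 2, 3]
examiner_2 = [3, 2, 3, 4, 3, 3, 3, 4, 2, 3, 4, 3, 2, 3]

# Cohen's kappa: (p_o - p_e) / (1 - p_e).
kappa = cohen_kappa_score(examiner_1, examiner_2)
print(f"kappa = {kappa:.2f}")
```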
Results
Interrater Reliability of the Assessment of Answers by ChatGPT
The interrater reliabilities of the 2 examiners' assessments of the answers by ChatGPT were graded as substantial for both the Likert scale and the 4 scoring categories: Likert scale, κ = 0.77; Accuracy, κ = 0.81; Overconclusiveness, κ = 0.84; Supplementary, κ = 0.66; and Incompleteness, κ = 0.76.
The Evaluation of Responses by ChatGPT
The similarity of the answers by ChatGPT to those of the experts is shown in Table 2. The mean Likert scale score was 3.1 ± 0.8. Six answers (42.9%) were graded as Likert 3, and 3 (21.4%) were graded as Likert 2. The results for the 4 scoring categories are shown in Table 3. The percentages of answers rated “yes” for Accuracy, Overconclusiveness, Supplementary, and Incompleteness were 71.4% (10/14), 35.7% (5/14), 78.6% (11/14), and 14.3% (2/14), respectively. The questions and the answers by both ChatGPT and the experts are shown in Tables 4 and 5.
Table 2. Consistency of Responses by ChatGPT to Consensus Statements.
a Likert scale: 1, completely different; 2, slightly different; 3, neither the same nor different; 4, almost identical; 5, completely identical. The concordance rate of the Likert scale scores between the examiners was 85.7% (12/14).
b Disagreement between the Likert scale scores of the 2 examiners; resolved by a third senior orthopaedic surgeon.
Table 3. Results of the 4 Scoring Categories of Answers by ChatGPT.
a The concordance rates of the answers between the examiners: Accuracy, 92.9% (13/14); Overconclusiveness, 92.9% (13/14); Supplementary, 85.7% (12/14); and Incompleteness, 92.9% (13/14).
b Disagreement between the category assessments of the 2 examiners; resolved by a third senior orthopaedic surgeon.
Table 4. List of the 14 Questions Used in This Study.
Table 5. Questions and Answers by ChatGPT and Foot and Ankle Experts.
Discussion
The most important finding of the present study was that ChatGPT provided inaccurate answers regarding the surgical treatment of ankle cartilage lesions, and its answers were not sufficiently consistent with those of foot and ankle experts. In addition, ChatGPT provided overconclusive answers for approximately one-third of the included questions. However, ChatGPT was able to offer an answer consistent with the 2017 International Consensus Meeting on Cartilage Repair of the Ankle for approximately 70% of the questions on the surgical treatment of ankle cartilage lesions. This study also demonstrated strong interobserver reliability in evaluating the similarity between the ChatGPT-generated responses and the answers by the foot and ankle experts. The clinical relevance of this study is that improvement is still required before ChatGPT can be used reliably by clinicians. This divergence should be interpreted in the context of both the evolving nature of LLMs and the limitations of fixed expert references from earlier time points regarding how best to surgically treat patients with ankle cartilage lesions.
LLMs are broadly available to the general population and health care providers. Recently, the utility of LLMs has been investigated in various medical fields.16,28 For clinicians to use LLMs as support tools when making clinical decisions, the models must provide reliable, evidence-based answers to clinical questions. However, the usefulness and safety of LLMs in medicine remain controversial. Several authors have evaluated the accuracy of ChatGPT responses to frequently asked questions from patients.11,20,29 Villarreal-Espinosa et al29 reported that 80% (8/10) of the responses by ChatGPT to frequently asked questions regarding anterior cruciate ligament surgery were regarded as completely accurate by one of the 2 attending physicians. Mika et al20 also reported highly accurate answers by ChatGPT to common patient questions regarding total hip arthroplasty. These previous studies support the usefulness of ChatGPT for patients undergoing orthopaedic surgery. Other authors have evaluated the accuracy of responses by ChatGPT using clinical practice guidelines (CPGs) as a standard reference.7,31 Yang et al31 reported that ChatGPT encouraged the use of nonrecommended treatments for hip and knee osteoarthritis in 30% of the queries. Duey et al7 found that 92% of the answers by ChatGPT-4 regarding thromboembolic prophylaxis in spine surgery were correct.
In the present study, no answer by ChatGPT-4 was completely identical (Likert 5) to the corresponding answer by the foot and ankle experts, whereas 42.9% were graded as neither the same nor different (Likert 3) and 21.4% as slightly different (Likert 2). The findings of the present study therefore suggest that ChatGPT-4, at present, does not provide answers to clinical queries regarding the surgical treatment of ankle cartilage lesions that are sufficiently aligned with expert consensus. Although the utility of ChatGPT in clinical practice is promising, further studies are needed to clarify the role of LLMs in clinical practice. In particular, the origin of the data provided by ChatGPT has not been clarified, and its training text corpus could contain false or outdated information. In the future, if ChatGPT could correctly guide clinicians to literature with a high level of evidence to find the best answer to clinical questions, it would be greatly beneficial for clinicians.
There were several limitations to the present study. First, the consensus statements and answers by experts in this study were developed at the 2017 International Consensus Meeting on Cartilage Repair of the Ankle10; thus, these statements may potentially be outdated. This limitation has also been noted in previous studies with similar designs.7,31 Second, the ChatGPT model and the prompts used for querying could have affected the study findings.8 It could be argued that newer models, such as “reasoning” models, could have provided more accurate answers. Additionally, the model was fed the questions without any advanced prompting techniques; it is possible that “forcing” the model to adopt a persona, such as “an orthopaedic surgeon advising a colleague,” would have yielded different answers. Third, this study used only 14 questions on BMS for ankle cartilage lesions, limiting the ability to broadly investigate the accuracy of answers by ChatGPT. Fourth, the present study evaluated the answers by ChatGPT-4 only for questions concerning “Debridement, Curettage and Bone Marrow Stimulation”10; therefore, this study could not assess the accuracy of answers by ChatGPT for other surgical procedures for ankle cartilage pathologies, such as autologous osteochondral or allograft transplantation, fixation of ankle cartilage or osteochondral lesions, and autologous chondrocyte implantation. Fifth, the evaluation of ChatGPT’s responses was performed by only 2 examiners. Finally, the current study did not evaluate the accuracy of answers from previous versions of ChatGPT or from LLMs other than ChatGPT. Despite these limitations, this study has clarified the present reliability of ChatGPT regarding medical knowledge on the study topic and may contribute to enhancing interactions between contemporary AI systems and health care providers. Although ChatGPT’s answers did not consistently match the consensus statements from experts, the findings of this study suggest potential utility of ChatGPT in patient education or even for nonexpert clinicians. However, cautious implementation and further evaluation are recommended, considering the limitations of the present study and the potential LLM capabilities that were not explored in this study. Additionally, ChatGPT responses may change over time because of ongoing model updates, and thus the findings of this study may not generalize to future versions.
Conclusion
This study showed that ChatGPT-4 often generated responses that differed from the expert consensus statements developed at the International Consensus Meeting on Cartilage Repair of the Ankle.
Supplemental Material
Supplemental material, sj-pdf-1-fao-10.1177_24730114251352494 for ChatGPT-4 Responses on Ankle Cartilage Surgery Often Diverge from Expert Consensus: A Comparative Analysis by Takuji Yokoe, Giulia Roversi, Nuno Sevivas, Naosuke Kamei, Pedro Diniz and Hélder Pereira in Foot & Ankle Orthopaedics
Footnotes
Author’s Note
Pedro Diniz currently has the following affiliations: Department of Orthopaedic Surgery, Centre Hospitalier Universitaire Brugmann, Brussels, Belgium; Department of Orthopaedic Surgery, Hôpital Universitaire de Bruxelles - Hôpital Erasme, Brussels, Belgium; and Department of Bioengineering and iBB - Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal.
Ethical Approval
No ethical approval was needed for this study. No patients were included in this study.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. Disclosure forms for all authors are available online.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
