Sage Journals

Abstract

Objective

To assess the accuracy, readability, and comparative quality of five large language models (LLMs) in answering frequently asked questions related to nasoalveolar molding (NAM) in cleft care.

Design

Repeated measures study.

Setting

This study evaluated the responses of five LLMs (Google Gemini, Microsoft Copilot, ChatGPT, Meta artificial intelligence (AI), and Claude AI) to a standardized set of 28 questions related to NAM in cleft care.

Participants

None.

Intervention

The accuracy of the LLM responses was assessed using a five-point modified Likert scale. Readability was evaluated using two validated metrics: the Flesch Reading Ease score and the Flesch-Kincaid Grade Level.
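For readers unfamiliar with the two readability metrics, the following is a minimal Python sketch of the standard published formulas. It is not the tool the investigators used, which the abstract does not specify. The function names are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration; dedicated readability software counts syllables more carefully.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a validated counter):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) for a block of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(sentences: int, words: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences: int, words: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "The plate guides the gums. It is worn every day."
    counts = text_counts(sample)
    print(flesch_reading_ease(*counts), flesch_kincaid_grade(*counts))
```

On the reading-ease scale a higher score means easier text, whereas a higher grade level means harder text, so the two metrics move in opposite directions for the same response.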

Main Outcome Measure

The primary outcome variable was the response generated by the five LLMs. Two investigators independently assessed the quality of responses from the five LLMs using a five-point modified Likert scale, with the highest score (5) indicating the highest quality.

Results

Claude AI achieved the highest mean Likert score (3.71 ± 0.53), whereas Gemini had the lowest (3.29 ± 0.60). Meta AI had the highest mean readability score (79.61 ± 37.09), whereas Claude AI scored significantly lower (47.04 ± 46.29).

Conclusion

Among the five LLMs, Claude AI achieved the highest accuracy in responding to NAM-related queries, followed by Microsoft Copilot, ChatGPT, Meta AI, and Google Gemini. Claude AI's responses were the most complex and hardest to read, followed by those of ChatGPT, Copilot, Gemini, and Meta AI, whose responses were the easiest to comprehend.

Keywords

large language models, nasoalveolar molding, artificial intelligence, cleft lip and palate




Accuracy of Large Language Models in Frequently Answering Questions Related to Presurgical Nasoalveolar Molding