Abstract
This study evaluated the accuracy, consistency, and clinical appropriateness of responses generated by large language models (LLMs) to frequently asked questions (FAQs) in veterinary dentistry client communication. Six common FAQs were identified based on guidance from the American Veterinary Dental College (AVDC) and submitted under standardized conditions to multiple LLMs, including ChatGPT-5, Gemini 2.5 Flash, Claude Sonnet 4.5, Perplexity, Qwen3-Max, and DeepSeek. Artificial intelligence (AI)-generated responses were compared with expert-reviewed reference answers prepared by 2 veterinarians with academic and clinical experience in small animal dentistry. Responses were independently evaluated by 2 expert and 2 novice assessors across 4 domains (main idea coverage, information quality, consistency with expert content, and presence of inconsistencies) using a 3-point Likert scale (Yes, Neutral, No). Inter-rater agreement between expert evaluators was assessed using Cohen's kappa, and between-model comparisons were performed using McNemar's exact test after dichotomization of ratings. Inter-rater agreement was substantial (κ = 0.68). ChatGPT-5 showed the highest alignment with the expert-reviewed reference content, followed by Claude Sonnet 4.5. Differences between expert and novice evaluations were most evident for questions related to anesthesia safety and anesthesia-free dental procedures. Clinically relevant inaccuracies were identified across several models, particularly regarding the requirement for general anesthesia with a protected airway. No statistically significant differences were detected between the primary model comparisons (P = 1.00). These findings indicate that LLMs may support client education in veterinary dentistry but require expert oversight to ensure clinical accuracy and patient safety.
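As context for the statistics reported above, the two tests can be sketched in pure Python: Cohen's kappa for inter-rater agreement and the two-sided exact McNemar test on discordant counts after dichotomization. The ratings below are illustrative toy data, not the study's data, and the function names are the author's own.

```python
from math import comb

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters scoring the same items."""
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n  # observed agreement
    cats = set(r1) | set(r2)
    # chance agreement from each rater's marginal category frequencies
    p_exp = sum((r1.count(c) / n) * (r2.count(c) / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)

def mcnemar_exact(b, c):
    """Two-sided exact McNemar test on the discordant pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    p = 2 * sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(p, 1.0)

# Toy dichotomized ratings (Yes vs. not-Yes) for two raters
expert = ["Y", "Y", "N", "N"]
novice = ["Y", "N", "N", "N"]
print(round(cohens_kappa(expert, novice), 2))  # → 0.5
print(mcnemar_exact(0, 0))                     # → 1.0
```

With no discordant pairs between two models, the exact McNemar p-value is 1.0, which is consistent with the P = 1.00 reported for the primary model comparisons.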
