Sage Journals: Discover world-class research

Abstract

Aim:

This study assesses the reliability of artificial intelligence (AI) large language models (LLMs) in identifying relevant literature comparing inguinal hernia repair techniques.

Material and Methods:

We used LLM chatbots (Bing Chat AI, ChatGPT versions 3.5 and 4.0, and Gemini) to find comparative studies and randomized controlled trials on inguinal hernia repair techniques. We then compared the results with existing systematic reviews (SRs) and meta-analyses and verified the authenticity of each listed article.

Results:

The LLMs returned 22 studies published between 2006 and 2023 across eight journals, whereas the SRs encompassed 42 studies. External validation confirmed that 63.6% of the LLM-listed studies (14 of 22) were authentic: 10 identified through ChatGPT 4.0 and 6 via Bing AI, with an overlap of 2 studies between them. The remaining 36.4% (8 of 22) were fabrications generated by Google Gemini (Bard), and two of these (25.0%) were mistakenly linked to valid DOIs. Four (28.6%) of the 14 real studies appeared in the SRs, representing 18.2% of all LLM-generated studies. The LLMs missed 38 (90.5%) of the 42 studies included in the previous SRs, whereas 10 real studies found by the LLMs were not included in those SRs. Of those 10 studies, 6 were reviews and 1 was published after the SRs, leaving three comparative studies missed by the reviews.
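As a quick arithmetic check, the proportions in this paragraph can be recomputed from the stated counts. The sketch below is illustrative only: the variable names and the `pct` helper are assumptions introduced here, not part of the study's methods, and percentages are rounded to one decimal place.

```python
# Counts as stated in the Results paragraph (names are hypothetical).
total_llm = 22            # studies listed by the LLMs
chatgpt4_real = 10        # authentic studies from ChatGPT 4.0
bing_real = 6             # authentic studies from Bing AI
overlap = 2               # authentic studies found by both
fabricated = 8            # fabricated studies
fabricated_valid_doi = 2  # fabrications linked to valid DOIs
sr_total = 42             # studies included in the SRs
real_in_srs = 4           # authentic LLM studies also in the SRs
reviews_not_in_srs = 6    # reviews among the studies absent from the SRs
published_after_srs = 1   # studies published after the SRs


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


real = chatgpt4_real + bing_real - overlap            # 14 authentic studies
missed_by_llms = sr_total - real_in_srs               # 38 SR studies not found
real_not_in_srs = real - real_in_srs                  # 10 authentic, not in SRs
missed_by_srs = real_not_in_srs - reviews_not_in_srs - published_after_srs  # 3

print("authentic:", real, pct(real, total_llm))                  # 14 63.6
print("fabricated:", fabricated, pct(fabricated, total_llm))     # 8 36.4
print("valid DOI among fabrications:", pct(fabricated_valid_doi, fabricated))  # 25.0
print("authentic and in SRs:", pct(real_in_srs, real))           # 28.6
print("in SRs, of all LLM studies:", pct(real_in_srs, total_llm))  # 18.2
print("SR studies missed by LLMs:", missed_by_llms, pct(missed_by_llms, sr_total))  # 38 90.5
print("comparative studies missed by SRs:", missed_by_srs)       # 3
```

Keeping the raw counts separate from the derived percentages makes it easy to see which figures are reported and which follow from them.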

Conclusions:

This study reveals the mixed reliability of AI language models in scientific literature searches. It underscores the need for cautious application of AI in academia and for continuous evaluation of AI tools in scientific investigations.
