Artificial Intelligence Applications Versus Manual Methods For Literature Retrieval: A Comparative Analysis

Abstract

Background:

Artificial intelligence (AI), particularly generative AI and large language models, is being used in nursing education, practice, and scholarly writing. Generative AI applications have been examined specifically for conducting literature reviews, and evidence suggests that they reduce the time needed to produce scholarly work. However, their accuracy in identifying references for a literature review has received limited investigation.

Objective:

The purpose of this study was to compare the references cited in human-written literature reviews with the references generated by AI literature review applications.

Methods:

In a comparative exploratory design, references from 4 human-written literature reviews (2 published and 2 unpublished) on 4 different topics were compared with references retrieved by 2 AI literature applications, Consensus and Elicit. Three prompting strategies were used, including prompts generated with ChatGPT-4. Agreement between the AI and human references was evaluated.

Results:

Percentage agreement between AI-generated and human-generated reference lists ranged from 0% to 63.6%. Consensus had a higher overall mean match rate (21.3%) than Elicit (3.7%). Using a ChatGPT-4-generated prompt did not significantly affect results, and results did not differ between published and unpublished literature reviews.
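The abstract does not state how agreement was calculated. As a minimal sketch, assuming agreement means the share of a human review's references that also appear in the AI-generated list, and assuming references are matched on a normalized identifier such as a DOI or title, the calculation might look like the following. The function names and the sample identifiers are hypothetical and are not taken from the study.

```python
import re


def normalize(ref: str) -> str:
    """Lowercase an identifier and drop every non-alphanumeric character."""
    return re.sub(r"[^a-z0-9]+", "", ref.lower())


def percent_agreement(human_refs, ai_refs):
    """Percentage of human-review references also present in the AI list.

    Assumption: the human reference list is the denominator. The abstract
    does not specify this.
    """
    human = {normalize(r) for r in human_refs}
    ai = {normalize(r) for r in ai_refs}
    if not human:
        raise ValueError("human reference list is empty")
    return 100.0 * len(human & ai) / len(human)


# Made-up identifiers for illustration only; not data from the study.
human_list = [
    "10.1000/example.1",
    "10.1000/example.2",
    "10.1000/example.3",
    "10.1000/example.4",
]
ai_list = ["10.1000/EXAMPLE.2", "10.1000/example.9"]

print(percent_agreement(human_list, ai_list))
```

Under these assumptions, one of the four made-up human references is matched, so the sketch reports 25% agreement. A different denominator (for example, the AI list) or a looser matching rule would give a different figure.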

Conclusion:

The 2 AI literature applications examined in this study offered a glimpse of both their potential uses and their limitations. An AI literature review application may support, but not replace, human work.

Keywords

artificial intelligence; large language models; literature search; nursing; nursing research

