Abstract
Background
Limited access to reliable information is a critical challenge in occupational health. With over 180 million users, ChatGPT has become a prominent tool, swiftly answering a wide array of queries, yet its use in occupational health still requires formal validation.
Objective
This study evaluated GPT-3.5 (free version) and GPT-4 (paid version) on their ability to respond to Occupational Risk Prevention formal multiple-choice questions.
Methods
A total of 303 questions were assessed, categorized into four levels of complexity (task-specific, national, European, and global) and drawn from examinations across various Spanish regions.
Results
GPT-3.5 achieved an overall accuracy of 56.8%, while GPT-4 reached 73.9% (p < 0.001). GPT-3.5 showed particularly limited performance on domain-specific content. Both models shared similar error patterns, with incorrect response rates ranging from 18% to 24% across regions.
Conclusion
Despite GPT-4's improved performance, both models display notable limitations in occupational health applications. To enhance reliability, four strategies are proposed: formal validation, continuous training, error analysis, and regional adaptation.
