Application of Signal Detection Theory in Evaluating Trust of Information Produced by Large Language Models

Abstract

Hallucinations are a major threat to applying generative artificial intelligence technologies such as large language models (LLMs) in high-consequence domains such as nuclear electricity generation. Miscalibrated trust in an LLM can have deleterious consequences, including misuse of unreliable LLMs and disuse of reliable ones. Methods for evaluating the trustworthiness of LLMs and the resulting user trust remain an open area that demands research from both technical and human factors perspectives. In this paper, we highlight the challenges in evaluating trust in LLMs and then propose signal detection theory (SDT) as an evaluative framework that may overcome these challenges by comparing trust and trustworthiness when users interact with LLMs in high-consequence domains.

Keywords

trust, generative artificial intelligence, signal detection theory
