Abstract
The NASA Task Load Index (NASA-TLX) is a widely used subjective measure of mental workload (MWL), but its visual design and post-collection data manipulation (rounding) practices lack standardization across studies. These inconsistencies may bias results. This secondary analysis of a prior within-subjects experiment examined whether different response formats (Knob, Slider, Ask) affect TLX ratings. Friedman tests revealed significant differences between formats across all dimensions except Effort, regardless of rounding. Wilcoxon Signed Ranks tests further showed that rounding significantly alters the data distribution for each sub-scale. Parametric comparisons of the data between different task-difficulty conditions suggest that experimental findings can be influenced by the response format used to collect ratings. These findings demonstrate how rounding and interface design can meaningfully influence TLX results. Greater consistency in documentation, implementation, and reporting is therefore needed to improve reliability and reduce bias in future studies. We call for further research into these issues.