Sage Journals: Discover world-class research

Abstract

Objective

This study investigated whether individual differences in multi-tasking ability (MTa) modulate the benefit and cost of supervising imperfect automation on performance, workload, situation awareness, stress, and trust in simulated air traffic control (ATC).

Background

Automation is rarely perfectly reliable, and automation failures can have significant detrimental effects. Prior work established that MTa can modulate the benefit of perfectly reliable automation. However, it is unknown whether MTa influences the cost of supervising imperfect automation.

Methods

MTa was indexed using a latent factor from three cognitive tasks completed by 113 undergraduate students. Participants completed two ATC blocks: one without automation (manual) and one with automated assistance for accepting incoming aircraft, handing-off outgoing aircraft, and conflict detection (violations of minimum separation). Conflict detection automation was perfectly reliable. Automation highlighting aircraft needing acceptance and hand-off was imperfect, missing 30% of events (the “unreliable” trials).

Results

Lower-MTa participants obtained greater performance benefits to aircraft hand-off from reliable automation but suffered greater costs from unreliable automation compared to manual hand-off, relative to higher-MTa participants. Situation awareness was improved by automation provision, and workload was reduced, although MTa did not moderate these effects. Stress reduction with automation, compared to manual, was greater for lower-MTa than for higher-MTa participants. Higher-MTa participants calibrated their trust more effectively to the differing reliability of automation across the ATC tasks.

Conclusion

MTa can lead to differentiated effects of imperfect automation on aircraft hand-off, perceived stress and trust.

Application

MTa may warrant consideration in personnel hiring and role selection for work contexts where automation reliability is volatile.

Keywords

multi-tasking individual differences automation air traffic control situation awareness

Introduction

Many modern workplaces require workers to strategically direct and divide their attention across multiple tasks when interacting with technology, which is cognitively demanding (Gutzwiller et al., 2019). When task demands overwhelm workers’ cognitive capacity, this can result in reduced task performance (Matthews et al., 1996) and increased accident risk in safety-critical settings (Clarke, 2012). Automation can support worker decision making by performing some or all of their tasks. Reliable automation (i.e., performing accurately and as expected) can release cognitive capacity and reduce workload relative to unaided manual performance (Kaber & Endsley, 2004; Onnasch, Wickens, et al., 2014). Automation is therefore essential for managing increasing demands in workplaces such as air traffic control (ATC), where growing traffic volumes may overwhelm human ability to maintain safety without automated assistance (Trapsilawati et al., 2017).

Reliable automation typically reduces task demands on operators, resulting in improved performance, reduced stress, and reduced mental workload (Onnasch, Wickens, et al., 2014). Perfectly reliable automation is not guaranteed due to inherent limitations of complex technology and the unpredictable nature of modern work environments. Automation can be “brittle” when operating in circumstances for which it was not designed, resulting in incorrect automated actions/advice (Smith, 2017). Automation failures can be elementary, involving missing or lost functionality, or more systemic (Jamieson & Skraaning, 2024). Alternatively, automation may work as intended by the designer but not as the operator expects (Sebok & Wickens, 2017). Regardless of the cause, imperfect automation can result in poorer performance than the operator performing the same task unaided (Strickland et al., 2023; Van Acker et al., 2018). It is therefore essential for operators to be selected, trained, and supported by the system to respond appropriately to imperfect automation. The current study examines how individual differences in cognitive ability, specifically multi-tasking, affect performance when working with imperfect automation.

Multi-Tasking Ability and Automation Supervision

In general, when task demands increase, individuals with more cognitive capacity can better maintain manual task performance (Schumacher et al., 2001). Studies have also sought to identify which facets of cognitive capacity impact operator performance when working with automation. For example, greater working memory capacity has been associated with calibrated trust and reliance on automation (Rovira et al., 2017). Further, it has been suggested that individuals who can more effectively switch attention perform better in multi-tasking environments (Chen & Barnes, 2012; Chen & Terrence, 2009; Wright & Chen, 2018) and may work more effectively with imperfect automation.

Individual multi-tasking ability (MTa) varies widely (Redick et al., 2016) and is commonly defined as “the strategic direction of attention” between multiple tasks (Gutzwiller et al., 2019, p. 197). Specifically, MTa can be defined in terms of three elements: (1) concurrently performing multiple tasks, (2) consciously shifting between tasks, and (3) performing component tasks in a short duration (Oswald et al., 2007). Concurrent task performance is challenging when tasks compete for the same limited pool of cognitive/attentional resources (e.g., Multiple Resource Theory, Wickens & Boles, 1983; Cognitive Bottleneck Theory, Pashler, 1984). Performing short duration consecutive tasks and shifting between tasks can incur costs arising from the need to reconfigure perceptual and cognitive resources (Monsell, 2003; Visser et al., 1999).

Monitoring automation typically requires operators to multi-task as they switch between attending to tasks needing manual completion and monitoring automation performance. As such, supervising automation may cognitively strain lower-MTa individuals (Balfe et al., 2015). Greenwell-Barnden et al. (2025) investigated how MTa interacted with simulated ATC performance. Perfectly reliable automation used highlighting and flashing to assist participants in identifying which incoming aircraft required acceptance and which outgoing aircraft required hand-off. While higher-MTa resulted in better performance (acceptance/hand-off speed and accuracy), participants with lower-MTa received greater benefits from the provision of automation. However, as this automation was perfectly reliable, it is unknown how these findings extend to the supervision of imperfect automation.

There are several reasons why MTa may modulate the cost of supervising imperfect automation. First, MTa may impact when (or if) operators detect automation errors (Balfe et al., 2015; Strand et al., 2014). Higher-MTa individuals may rely less on automation, and/or be less complacent in following its advice, leading to better detection of automation failures (Cak et al., 2020; Cullen et al., 2014). Performing multiple tasks concurrently may bring lower-MTa individuals closer to the limit of their capacity to monitor automation, potentially leading to slower/less accurate failure detection (McGarry et al., 2003; Rovira et al., 2002). In addition, having detected the failure, MTa may impact the speed of intervening to correct the failure (Bowden et al., 2024). Therefore, lower-MTa individuals may take longer to (a) detect errors and/or (b) take back manual control (Körber et al., 2015; Wickens et al., 2005).

In addition to impacting performance, automation affects operator experience. Psychological constructs, including situation awareness (SA), workload, stress, and trust, are frequently examined in relation to automation (Durso & Alexander, 2010; Loft et al., 2023; Onnasch, Wickens, et al., 2014) and may interact with MTa, a possibility that, to the best of our knowledge, has not been examined.

SA can be defined as the perception and comprehension of the task environment and projection of its near-future states (Endsley, 1988; Vu & Chiappe, 2015). Both reliable and imperfect automation can impair operator SA, relative to manual performance (Manzey et al., 2012; Strybel et al., 2016). While Greenwell-Barnden et al. (2025) found no effect of perfectly reliable automation on SA, and no modulating effect of MTa, imperfect automation may lead to a different result. For example, lower-MTa individuals may have poorer SA with imperfect automation if they are overwhelmed by task demands and rely more on automation, compared to higher-MTa individuals.

Workload describes the relationship between task demands and available operator mental capacity, with higher workload occurring as task demands approach human cognitive capacity (Young et al., 2015). Higher workload can lead to poorer SA (Durso & Alexander, 2010; Loft et al., 2023), although the relationship is complex, and workload and SA are often not associated (Endsley et al., 2024). Lower-MTa individuals could experience higher workload (or less workload reduction) with imperfect automation compared to manual due to greater residual demands on their capacity (Kahneman, 1973; Saqer & Parasuraman, 2014). For example, automation misses could increase workload if operators are required to increase their vigilance to compensate for the system’s failure (Wickens et al., 2015; Young et al., 2015). Thus, while reliable automation may have workload benefits for lower-MTa individuals (Greenwell-Barnden et al., 2025), the workload cost of imperfect automation may be greater. This conjecture has not been previously examined.

While lower workload may reduce stress with reliable automation, imperfect automation may increase stress (if workload is increased; Matthews & Desmond, 2002). Stress may also increase with imperfect automation requiring the performance of multiple tasks at once (Sauer et al., 2012). Increased stress may therefore contribute to poorer human performance outcomes when using imperfect automation (Parasuraman et al., 2008). This could be particularly detrimental for lower-MTa individuals who need to expend more cognitive effort when performing multiple tasks due to limited cognitive resource availability (Salvucci & Taatgen, 2011; Tombu & Jolicœur, 2003).

Trust in automation varies as a function of (a) automation reliability, such that higher trust should be placed in perfectly reliable automation relative to imperfect automation (Bliss & Dunn, 2000; Chiou & Lee, 2023), (b) individuals’ perceived ability to supervise automation, and (c) individuals’ level of manual task ability (Lee & See, 2004; Moray et al., 2000). Lower-MTa individuals may trust automation more due to lower perceived manual task ability, regardless of the reliability of automation. In contrast, higher-MTa individuals may trust automation less, even when it is reliable, because they have higher confidence in their own abilities and therefore less perceived need for assistance (Pop et al., 2015).

Examining SA, workload, stress, and trust provides insight into the mechanisms underlying MTa-related performance differences with imperfect automation. These operator state measures may serve as diagnostic indicators of whether performance effects stem from capacity limitations (workload), psychological burden (stress), or inappropriate automation reliance/monitoring patterns (trust, SA), potentially informing both theoretical understanding and practical interventions to enhance task performance.

Current Study

We investigated the extent to which imperfect automation interacted with MTa to predict task performance and other outcomes in simulated ATC. Participants completed two ATC blocks: one manual, and one with automated assistance. In the manual block, participants accepted incoming aircraft, handed-off departing aircraft, and intervened to prevent loss of separation between aircraft (conflicts). In the automation block, participants were provided with aircraft color highlighting to draw their attention to aircraft requiring acceptance or handing-off (Greenwell-Barnden et al., 2025). This can be considered a ‘lower degree’ of automation (Wickens et al., 2010) as it augments information acquisition and processing, with the participant still needing to action automated advice. Acceptance and hand-off automation were imperfect (30% miss-rate). This reliability was selected based on a quantitative literature review which suggested 70% accuracy is a “tipping point” beyond which unreliable automation becomes detrimental and performance outcomes are typically poorer than unaided (i.e., manual) performance (Wickens & Dixon, 2007). For the conflict detection task, perfectly reliable automation correctly highlighted all conflicting aircraft.

MTa was indexed via three cognitive tasks: the Psychological Refractory Period task (PRP; Van Selst et al., 1999), the Dual Response Selection task (Dual task; Dux et al., 2009), and the Attentional Blink task (AB; Chun & Potter, 1995). The Dual task and the AB task capture the central processing limits associated with multi-tasking (Carrier & Pashler, 1995; Chun & Potter, 2001) and are linked to brain regions common to multi-tasking processes, including task-switching (Dux et al., 2006). These tasks have been used to represent MTa components previously (Klapp et al., 2019; Tombu & Jolicœur, 2003) and, in one study, were found to load onto a response selection latent factor, indicating commonalities in underlying processes (Bender et al., 2018).

In the automation block, we predicted that the performance benefits to aircraft acceptance and hand-off (i.e., faster/more accurate) on reliable-automation trials relative to manual trials would be greater for individuals with lower-MTa (replicating Greenwell-Barnden et al., 2025). We also predicted that the performance cost to aircraft acceptance and hand-off (i.e., slower/less accurate) on unreliable-automation trials relative to manual trials (Metzger & Parasuraman, 2005; Rovira & Parasuraman, 2010), would be greater for lower-MTa participants. For the perfectly-reliable conflict detection automation, we predicted a main effect of better automation-assisted conflict detection performance than manual performance (Bowden et al., 2025, and replicating Greenwell-Barnden et al., 2025). However, given MTa did not affect these performance gains with reliable conflict detection automation in Greenwell-Barnden et al. (2025), we predicted no interaction with MTa.

As per Greenwell-Barnden et al. (2025), higher-MTa participants were expected to have better SA (MTa main effect). Whether MTa would interact with unreliable automation to predict SA was less clear. As outlined previously, participants may have poorer SA with imperfect automation (relative to their SA during manual task completion) due to the higher resource allocation required to manage task demands (Jipp & Ackerman, 2016; Kaber et al., 2000). Therefore, we tentatively predicted better SA in the manual block, compared to imperfect automation.

Although it missed 30% of aircraft acceptance and hand-off events, automation should overall reduce perceived workload for acceptance, hand-off, and conflict detection (automation use main effect) (Bowden et al., 2025; Onnasch, Ruff, & Manzey, 2014; and replicating Greenwell-Barnden et al., 2025). Further, we predicted that lower-MTa participants would show a greater reduction in perceived workload compared to manual, even with imperfect automation, than higher-MTa participants. Higher-MTa participants may experience less stress than lower-MTa participants (MTa main effect; Sauer et al., 2012). Higher-MTa participants may trust automation more appropriately relative to its reliability (i.e., an interaction such that trust is higher for reliable conflict detection automation than for imperfect acceptance and hand-off automation). In contrast, lower-MTa participants may not calibrate their trust based on the differential reliability of automation across the three tasks.

Methods

Participants

University undergraduate students (N = 121) participated and received AUD$10 and partial course credit. Eight participants’ data were excluded: four had missing data on at least one task, and four performed below chance accuracy on the Dual task. The final sample was N = 113 (36 males, 75 females, two identified as “other,” M_age = 21.23, SD = 6.38, range = 18–47 years). The sample size and number of stimuli resulted in ∼2000–8000 observations, which exceeds recommended observations for repeated-measures designs (e.g., 40 participants with 40 stimuli each; Brysbaert & Stevens, 2018).

Eight participants had completed a previous ATC study. To estimate whether this resulted in performance differences relative to naïve participants, we conducted preliminary analyses including prior experience as a factor. Prior experience did not significantly predict performance and so we elected to include these eight participants in the analyses below. This research complied with the tenets of the Declaration of Helsinki and was approved by the Human Research Ethics Committee at The University of Western Australia. Informed consent was obtained from each participant.

Procedure

The study was completed across two sessions, with an intervening 10–15 min break to reduce fatigue.

Session 1 (60 min): Participants were informed of the procedure and provided demographics. Participants then completed three MTa tasks in counterbalanced order.

Session 2 (120 min): Participants completed a 20-min audio-visual training presentation in PowerPoint which explained the ATC task, the difference between manual and automation conditions, and the nature of the SA queries. Participants were told the automation “should be reliable, but may not be perfect” and to continue to accept/hand-off aircraft even if they noticed automation did not highlight an aircraft needing acceptance/hand-off. Participants were instructed to prioritize conflict detection among the tasks. Training concluded with 10 multi-choice questions to ensure comprehension. Participants then completed a 20-min manual practice session and were provided with task performance feedback. Participants then completed two 30-min blocks (manual, automation) in counterbalanced order. Reaction time (RT) and accuracy for the acceptance, hand-off, and conflict detection tasks, and for the SA questions, were collected during the blocks. Workload, stress, and trust measures were collected post-block (stress was also assessed before the ATC task commenced; see Figure 1).

Figure 1.

Study procedure and timeline.

Measures

For the multitasking battery measures, we employed the methodologies described in Greenwell-Barnden et al. (2025), with no modifications. In addition, readers can consult the relevant citations below for full stimulus parameters for each task.

Multi-tasking: The Dual Response Selection task (Dual task; Dux et al., 2009). On each trial participants saw a single shape (hexagon/triangle), heard a complex tone, or were presented with both simultaneously. Shapes and tones were mapped to keys on the right or left hand (one hand per task, counterbalanced between participants), and participants identified them without pressing both keys simultaneously. Participants learned the keyboard mapping corresponding to the tones and shapes during 24 counterbalanced practice trials. Experimental trials comprised 4 blocks with 36 trials each (144 total). Faster RT indicated lower dual-task processing cost.

Multi-tasking: Psychological Refractory Period (PRP; Van Selst et al., 1999). On each trial, participants heard a complex tone from a set of four for 200 ms and saw a symbol (#, &, @, or %) for 200–600 ms, with a short (200 ms) or long (1000 ms) blank interval between stimulus presentations. Each of these two types of tasks was mapped to keys on the right or left hand (counterbalanced between participants). Participants learned the keyboard mapping corresponding to the four complex tones and symbols during 128 practice trials. Experimental trials comprised 5 blocks with 32 trials each (160 total). Faster RTs reflected more efficient task-switching.

Multi-tasking: Attentional Blink task (AB; Chun & Potter, 1995). On each trial, participants identified two target letters (excluding I, L, O, Q, U, V, and X) among 18 distractor items (digits 2–9 and symbols) presented for 100 ms. Targets (T1 and T2) were separated by zero, two, or seven distractors. The task contained 120 trials, split equally across distractor conditions. The AB was calculated as the difference in accuracy between T1 and T2. A smaller AB indicates better multi-tasking ability via better working memory and distractor filtering.

Performance: Air Traffic Control Task (Fothergill et al., 2009). Participants supervised a sector of airspace through which aircraft traveled along intersecting flight paths and at varying cruising altitudes and speeds. Participants accepted aircraft by pressing the “A” key and clicking on the aircraft within 20 s of it crossing the boundary into the controlled sector. Participants handed-off aircraft by pressing the “H” key and clicking on the aircraft within 20 s of it crossing the boundary to exit the sector. Participants prevented aircraft breaching minimum safe separation (1,000 ft vertical, five nautical miles lateral) by clicking on aircraft, then selecting the other aircraft in the conflict pair. A correct intervention resulted in one of the aircraft increasing altitude by 1,000 ft to avoid the conflict.

Participants completed two blocks: one manual and one with automation (order counterbalanced). In the manual block, acceptances, hand-offs, and conflict detection were performed by the participant without automation. In the automation block (Figure 2), acceptance and hand-off automation, when reliable, used highlighting and flashing to draw participant attention towards aircraft in need of acceptance (blue) or hand-off (orange). Automation was 100% reliable for the first 5 min. Afterward, approximately every third aircraft was missed by automation. Overall, acceptance/hand-off automation was 70% reliable. Conflict detection automation highlighted all pairs of aircraft that would potentially conflict or pass close (near-misses). Conflict detection automation was 100% reliable. Adjusted RTs (mean RT divided by accuracy) were calculated separately for reliable and unreliable acceptance and hand-off trials, and for conflict detection trials, to account for potential speed-accuracy trade-offs (Liesefeld & Janczyk, 2019; Visser et al., 2015).
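The adjusted RT measure can be illustrated with a minimal Python sketch. This is not the authors' analysis code (their analysis was run in R); the function name `adjusted_rt` and the example values are hypothetical, and which trials contribute to the mean RT is left to the caller because the text does not specify it for the ATC task.

```python
from statistics import mean


def adjusted_rt(rts_seconds, accuracy):
    """Return mean RT divided by proportion correct (lower = better).

    `accuracy` is a proportion in (0, 1]. Dividing by it inflates the
    score of fast-but-inaccurate responding, which is how the measure
    guards against speed-accuracy trade-offs.
    """
    if not rts_seconds:
        raise ValueError("at least one RT is required")
    if not 0 < accuracy <= 1:
        raise ValueError("accuracy must be a proportion in (0, 1]")
    return mean(rts_seconds) / accuracy


# Hypothetical values for illustration only (not data from the study).
fast_but_sloppy = adjusted_rt([4.0, 5.0, 6.0], accuracy=0.5)
slow_but_exact = adjusted_rt([7.0, 8.0, 9.0], accuracy=1.0)
```

In this made-up example the faster responder ends up with the larger (worse) adjusted RT because half of their responses were incorrect.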

Figure 2.

ATC sector display. When automation was reliable, aircraft flashed blue when they approached the participants’ sector (light grey polygon) indicating they required acceptance (NZ17) or orange indicating they required hand-off (NZ29). Pairs of aircraft that might potentially conflict were red and bolded (AA36 and QF94). Acceptance and hand-off automation was 70% reliable, and the conflict detection automation was 100% reliable. In the manual block, all aircraft remained green throughout. Actions performed by the participant were logged in the “Events” box on the right of the screen.

Situation awareness: A modified Situation Present Awareness Method (SPAM; Durso & Dattel, 2004) measured SA. Participants were instructed to click on a “Ready for Question?” prompt within 10s. The task was then paused, but the display was not blanked. One SA query was then presented along with four responses. Participants were instructed to click a response option as quickly and accurately as possible. Queries appeared every 2–3 min (18 total). The full list of SA queries is presented in Supplemental Materials. Adjusted RT was calculated as for performance measures (as noted above).

Workload: The NASA-TLX (Hart & Staveland, 1988) measured subjective workload. Participants completed the NASA-TLX twice, once after each block. A global workload score was calculated as the mean of the six weighted subscale scores, ranging from 0 (low) to 100 (high).

Stress: The Short Stress-State Questionnaire (SSSQ; Helton & Näswall, 2015) was completed before the first block (pre-task baseline) and then after each of the two blocks. This measure comprises 24 items with responses on a 5-point Likert scale (“not at all” to “extremely”). Higher total scores indicate greater perceived stress.

Trust: Following each block, participants completed six questions with responses on a 5-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree) about their “trust” in the acceptance/hand-off automation and in the conflict detection automation (Merritt et al., 2013). Higher total scores indicate greater trust in automation.

Analyses were conducted in R (R Core Team, 2015). Linear mixed models were fitted using the lme4 package (Bates et al., 2015).

Results

Data Cleaning

The following data cleaning was conducted before calculating a multi-tasking factor. For the PRP and Dual tasks, mean target RTs were calculated for correct trials only. Trials with RTs <150 ms or >3 SD above the mean RT for each participant were excluded. For the PRP task, responses less than 50 ms apart were also removed, as these do not reflect sequential decision making (Ulrich & Miller, 2008). This resulted in the removal of 0.90% of Dual task and 13.2% of PRP data. Additionally, all of a participant’s data for a task were omitted if their overall mean accuracy on that task was less than 50%. This removed four participants’ Dual task data (none were removed from the AB or PRP tasks).
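The per-participant RT trimming rule described above could be sketched as follows. This is an illustration under stated assumptions rather than the authors' R code: the function name `trim_rts` is hypothetical, the mean and SD are assumed to be computed over all of a participant's raw RTs before any trial is dropped, and the sample SD is assumed.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, floor_ms=150.0, n_sd=3.0):
    """Drop RTs below `floor_ms` or more than `n_sd` SDs above the mean.

    The mean and (sample) SD are computed over this participant's raw
    RTs; that ordering is an assumption, not stated in the source.
    """
    if len(rts_ms) < 2:
        # An SD cannot be estimated, so only the floor is applied.
        return [rt for rt in rts_ms if rt >= floor_ms]
    upper = mean(rts_ms) + n_sd * stdev(rts_ms)
    return [rt for rt in rts_ms if floor_ms <= rt <= upper]


# Hypothetical participant: 19 typical trials, one anticipation (100 ms)
# and one very slow response (5000 ms).
raw = [500.0] * 19 + [100.0, 5000.0]
cleaned = trim_rts(raw)
```

With these made-up values, both the 100 ms anticipation and the 5000 ms response are excluded and the 19 typical trials are retained.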

For the ATC task, acceptance and hand-off data in the automation block were separated into “trial type” (reliable, unreliable). RT outliers were removed for the manual block and each trial type independently, using the outlier criteria described above. Adjusted RTs (mean RT divided by accuracy) were then calculated. See Supplemental Materials for task descriptive statistics.

MTa Factor Score

Three variables were used to formulate the latent MTa factor (following Greenwell-Barnden et al., 2025; Redick, 2016). These included: mean RT at shortest inter-target interval in the PRP task, mean RT in the dual stimulus presentation condition in the Dual task, and AB Combined T1s for all lags. Task descriptive statistics are presented in Supplemental Materials. These variables represent the most difficult condition within each task, thus better reflecting MTa. Variables were standardized (converted to z-scores) before factor analysis. A participant’s MTa factor score was created by saving the Bartlett scores from the factor analysis, which isolate the variance shared across the included tasks on a factor (DiStefano et al., 2009). Scores were transformed (multiplied by −1) to ensure higher scores represented better MTa (Bartholomew et al., 2009). There was an adequate range of MTa factor scores (SD = 0.78, min = −1.73, max = 2.22) comparable to our previous study (Greenwell-Barnden et al., 2025).
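The scoring steps (standardize, compute a Bartlett score, flip the sign) can be sketched in Python for the single-factor case. This is a hedged illustration, not the authors' R code: the loadings and uniquenesses below are HYPOTHETICAL placeholders (the fitted values are not reported in this section), and the names `zscores`, `bartlett_score`, and `mta_score` are invented for the example. For one factor, the Bartlett (weighted least squares) score is the sum of loading × z / uniqueness divided by the sum of loading² / uniqueness.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize values to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def bartlett_score(z_row, loadings, uniquenesses):
    """One-factor Bartlett score for a single participant's z-scores."""
    num = sum(l * z / u for l, z, u in zip(loadings, z_row, uniquenesses))
    den = sum(l * l / u for l, u in zip(loadings, uniquenesses))
    return num / den


# HYPOTHETICAL loadings for the three indicators (PRP, Dual task, AB);
# these are placeholders, not estimates from the study.
LOADINGS = [0.8, 0.6, 0.5]
UNIQUENESSES = [1 - l * l for l in LOADINGS]


def mta_score(z_row):
    """Sign-flipped factor score so that higher values mean better MTa."""
    return -bartlett_score(z_row, LOADINGS, UNIQUENESSES)


# A participant whose z-scores are exactly 1.5 x the loadings sits at
# +1.5 on the raw factor (slower/worse), so their MTa score is negative.
example_row = [l * 1.5 for l in LOADINGS]
example_mta = mta_score(example_row)
```

The sign flip matters because the indicators are cost measures (larger values mean poorer performance), so the raw factor runs in the opposite direction to ability.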

Linear Mixed Models on ATC Performance and Situation Awareness

To test for the effects of imperfect automation and MTa on acceptance and hand-off performance, a series of linear mixed effects models (LMM) were conducted. Table 1 presents standardized coefficients and model fit indexes. Data were entered in nested form to account for the within-subject design. The fixed factor entered was trial type. For acceptances and hand-offs, there were three levels of trial type (manual, reliable-automation, and unreliable-automation). For conflict detection, there were two levels (manual and reliable-automation). MTa (continuous) and the interaction term (trial type*MTa) were entered. Random effects entered included: participant number (intercept) and block*participant (slope) to control for within-subject variability across blocks.
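As a reading aid, the model described above can be written out for trial i of participant j. This is a sketch inferred from the verbal description, not an equation given by the authors; the symbol names are assumptions, TrialType is taken to be a dummy code within each pairwise comparison (manual vs. reliable-automation, or manual vs. unreliable-automation), and the random-effects structure is interpreted as a by-participant random intercept plus a by-participant random slope for block.

```latex
\begin{aligned}
\text{adjRT}_{ij} &= \beta_0 + \beta_1\,\text{TrialType}_{ij} + \beta_2\,\text{MTa}_j
  + \beta_3\,(\text{TrialType}_{ij} \times \text{MTa}_j) \\
&\quad + u_{0j} + u_{1j}\,\text{Block}_{ij} + \varepsilon_{ij}, \\
(u_{0j}, u_{1j})^\top &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Under this reading, the coefficient on the interaction term is the quantity of interest: it captures how the automation effect (relative to manual) changes with MTa.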

Table 1.

Linear Mixed Effects Standardized Coefficient Estimates, Standard Deviations in Parentheses and Model Fit Summaries for Acceptance, Hand-Off, and Conflict Detection Task Performance.

Parameter	Acceptance Task (Manual vs Reliable-Automation Trials Only)	Acceptance Task (Manual vs Unreliable-Automation Trials Only)	Hand-Off Task (Manual vs Reliable-Automation Trials)	Hand-Off Task (Manual vs Unreliable-Automation Trials)	Conflict task (Manual vs Reliable-Automation Trials)
Multi-tasking ability	−0.08 (0.23)	−0.10 (0.25)	−0.10* (0.25)	−0.12 (0.27)	−0.21*** (3.65)
Trial type	−0.21*** (0.09)	0.09*** (0.23)	−0.22*** (0.09)	0.19*** (0.28)	−0.13*** (4.58)
Trial type × multi-tasking ability	0.02 (0.12)	−0.04 (0.30)	0.04*** (0.11)	−0.05* (0.36)	0.05 (4.69)
Observations	8203	6193	8323	6060	2180
Log likelihood	−20301.85	−16583.50	−20069.98	−16561.95	−12236.00
AIC	40617.70	33181.00	40153.96	33137.90	24486.00
BIC	40666.78	33228.10	40203.14	33184.86	24525.80
R2 (conditional)	0.35	0.28	0.42	0.32	0.26
R2 (marginal)	0.05	0.02	0.05	0.05	0.06

Note. ***p < .001, **p < .01, *p < .05. The estimates represent standardized coefficient estimates for fixed effects in the linear mixed model on the trial level specifying a random intercept and slope per participant. P-values were computed using the Wald approximation. R2 conditional is the model’s total explanatory power. R2 marginal is the explanatory power of the fixed effects alone. Observations are the number of data points (trials) analyzed. Log likelihood is a measure of fit describing how well a parameter explains the observed data. AIC is Akaike Information Criterion, an estimator of prediction error. BIC is Bayesian Information Criterion, a measure for model comparison or selection.

ATC Performance: Separate LMM analyses were conducted for the acceptance and hand-off tasks, first comparing manual and reliable-automation trials, and then comparing manual and unreliable-automation trials.

Aircraft acceptance was better on reliable-automation trials than manual trials (i.e., negative beta indicating a decrease in adjusted RT), and poorer on unreliable-automation trials than manual trials (i.e., positive beta indicating an increase in adjusted RT). MTa was not a significant predictor, and the interaction between MTa and trial type was not significant.

Hand-off performance was better on reliable-automation trials than manual trials and poorer on unreliable-automation trials than manual trials. Higher-MTa participants performed better than lower-MTa participants. The interaction between MTa and trial type was significant for both reliable-automation and unreliable-automation trials (compared to manual). Figure 3 shows the performance of higher-MTa participants was less affected by automation compared to lower-MTa participants, regardless of automation reliability. Lower-MTa participants received a greater performance benefit (compared to manual) on reliable-automation trials, but also a greater performance cost on unreliable-automation trials.

Figure 3.

Adjusted RT (mean RT divided by accuracy in seconds) for acceptance, hand-off, and conflict detection tasks by trial type (manual, reliable-automation, and unreliable-automation) against multi-tasking (MTa) score (normed). LMM slope and intercept are provided. Lower adjusted RT reflects better performance. The blue line (dashes) represents manual trials, the red line (solid) represents unreliable-automation trials, and the green line (dots) represents reliable-automation trials.

For the conflict detection task, automation was perfectly reliable. Therefore, we compared reliable-automation and manual trials (Figure 3). Trial type was a significant predictor, such that conflict detection was better on automation trials relative to manual trials. MTa was also a significant predictor, such that higher-MTa participants had better conflict detection than lower-MTa participants. The interaction between MTa and trial type was not significant.

Situation Awareness: Table 2 presents standardized coefficients and model fit indexes for SA. Block was a significant predictor, such that SA was better in the automated block relative to manual (Figure 4). MTa was also a significant predictor, such that lower-MTa participants had poorer SA than higher-MTa participants. The interaction between MTa and block was not significant.

Table 2.

Linear Mixed Effects Standardized Coefficient Estimates, Standard Deviations in Parentheses and Model Fit Summaries.

Parameter	Block	Multi-Tasking Ability	Block × Multi-Tasking Ability	Observations	Log Likelihood	AIC	BIC	R2 C	R2 M
Situation awareness adjusted RT	−0.07** (−0.34)	−0.16*** (−0.45)	0.02 (−0.43)	3550	−12314.7	24642.4	24586.6	0.19	0.03

Note. ***p < .001, **p < .01, *p < .05. The estimates represent standardized coefficient estimates for fixed effects in the linear mixed model on the trial level specifying a random intercept and slope per participant. P-values were computed using the Wald approximation. R2 C (conditional) is the model’s total explanatory power. R2 M (marginal) is the explanatory power of the fixed effects alone. Observations are the number of data points (trials) analyzed. Log likelihood is a measure of fit describing how well a parameter explains the observed data. AIC is Akaike Information Criterion, an estimator of prediction error. BIC is Bayesian Information Criterion, a measure for model comparison or selection.
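One way to write a trial-level model of the kind described in the note, with a random intercept and a random slope per participant, is sketched below. The symbols are illustrative, and placing the random slope on block is an assumption, since the note does not state which predictor carried the random slope.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{Block}_{ij} + \beta_2\,\mathrm{MTa}_{j}
        + \beta_3\,(\mathrm{Block}_{ij} \times \mathrm{MTa}_{j})
        + u_{0j} + u_{1j}\,\mathrm{Block}_{ij} + \varepsilon_{ij}, \\
(u_{0j},\, u_{1j})^{\top} &\sim \mathcal{N}(\mathbf{0},\, \boldsymbol{\Sigma}),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0,\, \sigma^{2})
\end{aligned}
```

Here y_ij is the adjusted RT on trial i for participant j, the β terms are the fixed effects reported in the table, and u_0j and u_1j are the participant-specific deviations in intercept and slope. Under this reading, the marginal R2 reflects the β terms alone, whereas the conditional R2 also includes the variance captured by the u terms.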

Figure 4.

Adjusted RT (mean RT divided by accuracy in seconds) for situation awareness by block (manual and automation) against multi-tasking score (MTa, normed). LMM slope and intercept are provided. Lower adjusted RT reflects better situation awareness. The blue line (dashes) represents the manual block, and the black line (solid) represents the automation block.

ANOVAs on Workload, Stress, and Trust

LMM was not suitable for workload, stress, and trust measures as there were only two data points per participant (i.e., one workload score for the manual block and one for the automated block). Therefore, mixed ANOVAs were conducted. Block (manual and automated) was a within-subjects factor with two levels. Participants were separated into three approximately equal MTa groups based on relative MTa score (lower = bottom 33%, medium = middle 33%, higher = top 33%). MTa group was a between-subjects factor with three levels. A one-way ANOVA conducted to validate these groups showed a significant difference between MTa group means (M_lower = −0.85, M_medium = −0.04, M_higher = 0.91), F (2,68) = 231.04, p < .001. Simple comparisons showed all groups significantly differed, all p < .001.
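The grouping and validation steps described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions: the function names `tertile_groups` and `one_way_anova_f`, the rank-based handling of cut points, and the nine sample scores are all hypothetical and do not come from the study.

```python
from statistics import mean


def tertile_groups(scores):
    """Label each score 'lower', 'medium', or 'higher' by rank thirds."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, idx in enumerate(order):
        if rank < n / 3:
            labels[idx] = "lower"
        elif rank < 2 * n / 3:
            labels[idx] = "medium"
        else:
            labels[idx] = "higher"
    return labels


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical normed MTa scores for nine participants.
mta = [-1.2, 0.1, 0.9, -0.8, 0.0, 1.1, -1.0, -0.1, 1.0]
labels = tertile_groups(mta)
by_group = {
    name: [s for s, lab in zip(mta, labels) if lab == name]
    for name in ("lower", "medium", "higher")
}
f_value, df1, df2 = one_way_anova_f(list(by_group.values()))
```

A large F here only confirms that the split produced well-separated groups, which is expected by construction. It does not test any hypothesis about automation.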

Workload: There was an effect of block, F (1,110) = 46.31, p < .001, η² = 0.10, where workload was lower in the automated compared to manual block (Figure 5). There was no effect of MTa group, and no interaction (both F < 1).

Figure 5.

Subjective workload measure (NASA-TLX) across three groups of multi-tasking ability (MTa) and two blocks of the ATC. Error bars represent 95% confidence intervals around the estimate of marginal means (between subjects).

Stress: Pre-task stress was included as a covariate. There were no main effects of MTa or block (both F < 1). There was an interaction between MTa group and block, F (2,108) = 3.18, p < .05, η² = 0.003. As illustrated in Figure 6, the difference in reduction of stress between manual and automated blocks was greatest for the lower-MTa group. A follow-up comparison confirmed that stress was lower in the automated block (M = 2.37, SD = 0.43) compared to manual block (M = 2.49, SD = 0.48) for the lower-MTa group, t (36) = 3.55, p < .001, η² = .58. In contrast, there was no difference in stress between automated and manual blocks for medium and higher-MTa groups, t < 1.

Figure 6.

Post-block stress ratings across three groups of multi-tasking ability (MTa) and two blocks of the ATC. Error bars represent 95% confidence intervals around the estimate of marginal means (between subjects).

Trust: There was an effect of task, F (1,108) = 15.34, p < .001, η² = 0.048, where the reliable conflict detection automation was trusted more (M = 3.19, SD = 1.09) than imperfect acceptance and hand-off automation (M = 2.70, SD = 0.98). There was no effect of MTa group, F < 1. There was an interaction between MTa group and task, F (1,108) = 5.83, p < .05, η² = 0.037 (Figure 7). Trust was lower for acceptance/hand-offs (M = 2.47, SD = 0.99) compared to the conflict detection task (M = 3.50, SD = 1.08) for the higher-MTa group, t (108) = 5.05, p < .001, η² = .80. In contrast, there were no differences between trust for acceptance/hand-offs and conflict detection automation for the lower and medium groups, t < 1.

Figure 7.

Post-automated block average trust ratings across multi-tasking ability (MTa) groups and two levels of reliability for the tasks in the ATC. The acceptance and hand-off tasks were 70% reliable, and the conflict detection task was 100% reliable. Error bars represent 95% confidence intervals around the estimate of marginal means (between subjects).

Discussion

This study examined the effect of imperfectly reliable automation, and how it differentially impacted an individual’s performance as a function of their MTa. In Greenwell-Barnden et al. (2025), we established that MTa can modulate the benefit of perfectly reliable-automation on performance, in that those with lower-MTa benefited more from reliable automation. We aimed to extend these findings to imperfect automation by examining whether imperfect automation interacts with MTa to predict costs to task performance, workload, SA, stress, and trust in simulated ATC. We predicted that lower-MTa individuals would experience poorer outcomes with imperfect automation (i.e., poorer performance, higher workload and stress, and mis-calibrated trust in automation) compared to manual trials, when compared to higher-MTa individuals.

We found partial evidence supporting this prediction, as the cost of imperfect automation to aircraft hand-off, but not aircraft acceptance, was greater for participants with lower-MTa. We also found greater benefits to aircraft hand-off, but not aircraft acceptance, compared to manual trials for lower- compared to higher-MTa participants. This inconsistency across outcomes between acceptance and hand-off tasks was unexpected, and the lack of differential benefit from reliable automation to aircraft acceptance between those with lower- and higher-MTa does not replicate Greenwell-Barnden et al. (2025). The predicted main effects of MTa were observed, with lower-MTa participants performing more poorly than higher-MTa participants on the hand-off and conflict detection tasks (no significant effect for acceptances). Predicted automation use effects were also observed compared to manual, with better performance in reliable-automation trials and poorer performance in unreliable-automation trials, consistent with previous findings in the human-automation teaming literature (Bowden et al., 2024; Körber et al., 2015; Strand et al., 2014; Strickland et al., 2023).

These findings suggest lower-MTa participants took longer to detect aircraft hand-off automation errors and take action to compensate for automation failure. Conversely, higher-MTa participants possibly realized sooner that hand-off automation was imperfect, and compensated, potentially because they had greater task-switching and attentional capacity as indicated by their better trust calibration (they showed lower trust in the imperfect acceptance/hand-off automation than in the reliable conflict detection automation). Alternatively, higher-MTa participants may have relied on aircraft hand-off automation less (Cak et al., 2020). This is supported by their higher workload in the automated block relative to the other MTa groups, as they possibly realized sooner that hand-off automation was imperfect, and compensated for it, which increased their workload slightly.

There was an automation benefit to SA with imperfectly reliable automation compared to manual. This is contrary to previous studies that found poorer SA under automation than manual conditions (e.g., Onnasch, Wickens, et al., 2014; Strybel et al., 2016), and our previous work where SA was not affected by automation (Greenwell-Barnden et al., 2025). In our previous study automation was reliable, and visual-search requirements were therefore reduced. Here, the imperfect acceptance/hand-off automation may have required participants to pay more attention to their task environment, thus increasing their relative speed of SA query response, but as reported below automation provision reduced workload.

Automation did not interact with MTa as predicted for workload. However, workload was lower in the automated block than in the manual block despite the imperfect reliability of the acceptance/hand-off automation, an automation benefit consistent with that found for perfectly reliable automation by Greenwell-Barnden et al. (2025; also see meta-analysis by Onnasch, Wickens, et al., 2014). The presence of reliable conflict detection automation may explain the overall reduced workload, as conflict detection is a more challenging task than the acceptance/hand-off task. This is similar to Di Nocera et al. (2006), who found that subjective workload was not affected by automation errors in ATC.

As predicted, MTa interacted with automation such that lower-MTa participants reported a reduction in stress when provided imperfect automation compared to manual. In contrast, there were no differences with imperfect automation compared to manual for other MTa groups. This suggests that for lower-MTa participants, perfectly reliable conflict detection automation reduced stress, which outweighed the potential stress induced by imperfect acceptance/hand-off automation. This contrasts with previous research which has suggested automation can increase stress in tasks as long in duration as the current task (i.e., >30 min) by reducing motivation, concentration, and energetic arousal resulting in fatigue compared to manual task performance (McGarry et al., 2003). More broadly, it should also be noted that the current task duration (2 × 30 min blocks) certainly does not reflect full work shifts in operational environments and MTa could fluctuate over time, potentially as a function of task-fatigue, attentional lapses, or change in motivation (Rann & Almor, 2025). As such, future research should investigate the relationship between MTa and automation supervision in longer duration tasks.

Also as predicted, MTa interacted with automation such that higher-MTa participants had better trust calibration in that they trusted the perfectly reliable conflict detection automation more than the imperfect acceptance/hand-off automation. For lower and medium MTa groups, trust did not differ across perfectly and imperfectly reliable automation. This finding potentially could account for higher-MTa participants’ better conflict detection task performance, as better trust calibration may free up cognitive capacity to monitor the reliability of the hand-off automation.

Limitations, Future Directions, and Practical Implications

These findings are based on one type of automation error (misses: failing to flag a relevant event) and thus may not generalize to other failure situations. Other types of failure such as false alarms (inappropriate or unnecessary alerts) and how they interact with individual cognitive capabilities should be examined. Previous literature has suggested automation false alarms may differentially impact performance compared to misses, in part due to misses being more salient (Chen & Terrence, 2009; Wickens et al., 2010). False alarm failures may therefore have a commensurately larger impact on operators with lower MTa.

The current study showed some evidence that the benefit and cost of imperfect automation vary with MTa in novice undergraduate students. Expert operators, however, undoubtedly differ from novices in their experience, motivation, and skill (Balfe et al., 2015; Jamieson & Skraaning, 2018). As such, experts may have better recovery performance when automation fails (Roth et al., 2019), have different perceptions of workload (Matthews & Desmond, 2002), and may be more trusting of automation (Niu et al., 2018). Thus, the present pattern of findings could differ for experts who may experience different benefits and costs of automation failure. MTa effects may not apply to experts who have significant practice, which may overcome differences in cognitive ability. Alternatively, the MTa interaction may exist in experts, but to a smaller extent. Expert operators may also have different modality-based multi-tasking requirements in their task environments (auditory as well as visual). Multi-tasking modality could be investigated in future studies for its impact on performance with imperfect automation.

The key current findings are that automation may not assist everyone equally, and that automation failures also have differential effects on performance, stress, and trust determined, at least in part, by an individual’s MTa. To avoid greater costs of automation error being borne by those who are least capable of compensating for such system failures, it may be necessary to selectively hire operators for their MTa and put those with the greatest ability in roles where automation is anticipated to be particularly vulnerable to errors. This may include placing higher-MTa individuals in roles where the consequence of failure of automation is greater (e.g., critical safety monitoring), creating mixed-ability teams where higher-MTa individuals can monitor automation and provide support to lower-MTa operators, and/or providing more structured training and support for lower-MTa operators such as decision trees for dealing with automation failure. Hence, an individually tailored approach to automation may provide the best outcomes when automation reliability cannot be guaranteed.

Conclusions

Multi-tasking ability varied the impact of imperfect automation in a simulated air traffic control task. Multi-tasking ability may lead to differentiated effects of imperfect automation on performance, situation awareness, stress, and trust. Taken together with the prior findings of Greenwell-Barnden et al. (2025), it may be beneficial to consider individual cognitive ability when determining what automation should be deployed in the workplace, as those who gain a significant benefit from reliable automation may also suffer greater costs when automation is unreliable.

Key points

Participants with lower-MTa had greater performance benefits from reliable aircraft hand-off automation (compared to manual), but greater costs with imperfect hand-off automation (compared to manual), relative to those with higher-MTa.

Situation awareness was unexpectedly higher with imperfect automation (compared to manual), potentially due to increased attention required to monitor imperfect automation.

Imperfect automation reduced stress for participants with lower-MTa (compared to manual) but did not for participants with higher-MTa.

Participants with higher-MTa calibrated trust across the different reliability of ATC tasks more effectively.

Supplemental Material

Supplemental Material for How Multi-Tasking Ability Impacts Performance, Workload, Situation Awareness, Stress and Trust with Simulated Imperfect Automation by Jayden N. Greenwell-Barnden, Troy A. W. Visser, Shayne Loft, Susannah J. Whitney, Vanessa K. Bowden in Human Factors.

Footnotes

Ethical Considerations

This research complied with the tenets of the Declaration of Helsinki and was approved by the Human Research Ethics Committee at The University of Western Australia.

Consent to Participate

Informed consent was obtained from each participant.

Consent for Publication

Participants were informed in the Participant Information Form that deidentified data may be included in published research before consenting to participate.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported through an Australian Government Research Training Program Scholarship and the Australian Army and Defense Science Partnerships agreement of the Defense Science and Technology Group (ID 7120), as part of the Human Performance Research network (HPRnet).

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Data for this study are not publicly available.

Supplemental Material

Supplemental material for this article is available online.

Author Biographies

Jayden N. Greenwell-Barnden is a lecturer at Edith Cowan University. He received his PhD in psychology in 2023 from The University of Western Australia.

Troy A. W. Visser is an associate professor at the University of Western Australia. He received his PhD in cognitive systems in 2001 from the University of British Columbia.

Shayne Loft is a professor in the School of Psychological Science at The University of Western Australia. He received his PhD in psychology in 2004 from the University of Queensland.

Susannah J. Whitney is a Senior Human Scientist in the Defence Science and Technology Group, in the Australian Department of Defence. She received her PhD in psychology in 2005 from the University of Queensland.

Vanessa K. Bowden is a lecturer in the School of Psychological Science at The University of Western Australia. She received her PhD in psychology in 2012 from the University of Western Australia.

References

1. Balfe, Sharples, & Wilson, J. R. (2015). Impact of automation: Measurement of performance, workload and behaviour in a complex control environment. Applied Ergonomics, 47(1), 52–64. https://doi.org/10.1016/j.apergo.2014.08.002

2. Bartholomew, D. J., Deary, I. J., & Lawn (2009). The origin of factor scores: Spearman, Thomson and Bartlett. British Journal of Mathematical and Statistical Psychology, 62(3), 569–582. https://doi.org/10.1348/000711008x365676

3. Bates, Mächler, Bolker, & Walker (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01

4. Bender, Loft, Lipp, & Visser, T. A. W. (2018). Advancing our understanding of warfighter cognition: Development of a “cognitive profiling” tool to enhance situation awareness. Defence Science and Technology (DST) Group Human Performance Research Network (HPRnet).

5. Bliss, J. P., & Dunn, M. C. (2000). Behavioural implications of alarm mistrust as a function of task workload. Ergonomics, 43(9), 1283–1300. https://doi.org/10.1080/001401300421743

6. Bowden, Long, & Loft (2024). Reducing the costs of automation failure by providing voluntary automation checking tools. Human Factors, 66(7), 1817–1829. https://doi.org/10.1177/00187208231190980

7. Bowden, V. K., Gegoff, Kilpatrick, P. J., & Loft (2025). The impact of lower-degree automation reliability on higher-degree automation failure detection in simulated air traffic control. Human Factors, 67(11), 1121–1135. https://doi.org/10.1177/00187208251335536

8. Brysbaert & Stevens (2018). Power analysis and effect size in mixed effects models: A tutorial. Journal of Cognition, 1(1), 9. https://doi.org/10.5334/joc.10

9. Cak, Say, & Misirlisoy (2020). Effects of working memory, attention, and expertise on pilots’ situation awareness. Cognition, Technology & Work, 22(1), 85–94. https://doi.org/10.1007/s10111-019-00551-w

10. Carrier, L. M., & Pashler (1995). Attentional limits in memory retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21(5), 1339–1348. https://doi.org/10.1037//0278-7393.21.5.1339

11. Chen, J. Y. C., & Barnes, M. J. (2012). Supervisory control of multiple robots: Effects of imperfect automation and individual differences. Human Factors, 54(2), 157–174. https://doi.org/10.21236/ada552060

12. Chen, J. Y. C., & Terrence, P. I. (2009). Effects of imperfect automation and individual differences on concurrent performance of military and robotics tasks in a simulated multi-tasking environment. Ergonomics, 52(8), 907–920. https://doi.org/10.1080/00140130802680773

13. Chiou, E. K., & Lee, J. D. (2023). Trusting automation: Designing for responsivity and resilience. Human Factors, 65(1), 137–165. https://doi.org/10.1177/00187208211009995

14. Chun, M. M., & Potter, M. C. (1995). A two-stage model for multiple target detection in rapid serial visual presentation. Journal of Experimental Psychology: Human Perception and Performance, 21(1), 109–127. https://doi.org/10.1037/0096-1523.21.1.109

15. Chun, M. M., & Potter, M. C. (2001). The attentional blink and task-switching within and across modalities. In Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 20–35). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780198505150.003.0002

16. Clarke (2012). The effect of challenge and hindrance stressors on safety behavior and safety outcomes: A meta-analysis. Journal of Occupational Health Psychology, 17(4), 387–397. https://doi.org/10.1037/a0029817

17. Cullen, R. H., Dan, C. S., Rogers, W. A., & Fisk, A. D. (2014). The effects of experience and strategy on visual attention allocation in an automated multiple-task environment. International Journal of Human-Computer Interaction, 30(7), 533–546. https://doi.org/10.1080/10447318.2014.906158

18. Di Nocera, Terenzi, & Camilli (2006). Another look at scanpath: Distance to nearest neighbour as a measure of mental workload. In Developments in Human Factors in Transportation, Design, and Evaluation (pp. 295–303). Shaker Publishing.

19. DiStefano, Zhu, & Mindrila (2009). Understanding and using factor scores: Considerations for the applied researcher. Practical Assessment, Research and Evaluation, 14(1), 20. https://doi.org/10.7275/da8t-4g52

20. Durso, F. T., & Alexander, A. L. (2010). Managing workload, performance, and situation awareness in aviation systems. In Human factors in aviation (pp. 217–247). Academic Press. https://doi.org/10.1016/b978-0-12-374518-7.00008-0

21. Durso, F. T., & Dattel, A. R. (2004). SPAM: The real-time assessment of SA. In Banbury & Tremblay (Eds.), A cognitive approach to situation awareness: Theory and application (pp. 137–154). Ashgate.

22. Dux, P. E., Ivanoff, Asplund, C. L., & Marois (2006). Isolation of a central bottleneck of information processing with time-resolved fMRI. Neuron, 52(6), 1109–1120. https://doi.org/10.1016/j.neuron.2006.11.009

23. Dux, P. E., Tombu, M. N., Harrison, Rogers, B. P., Tong, & Marois (2009). Training improves multitasking performance by increasing the speed of information processing in human prefrontal cortex. Neuron, 63(1), 127–138. https://doi.org/10.1016/j.neuron.2009.06.005

24. Endsley, M. R. (1988). Design and evaluation for situation awareness enhancement. In Proceedings of the human factors society annual meeting (Vol. 32, pp. 97–101). Sage Publications. https://doi.org/10.1177/154193128803200221

25. Endsley, M. R., Dixon, Endsley, Jamrog, Smith-Velazquez, & Pfeffer (2024). Divergence in situation awareness and workload. Ergonomics, 68(10), 1618–1634. https://doi.org/10.1080/00140139.2024.2427859

26. Fothergill, Loft, & Neal (2009). ATC-labAdvanced: An air traffic control simulator with realism and control. Behavior Research Methods, 41(1), 118–127. https://doi.org/10.3758/brm.41.1.118

27. Greenwell-Barnden, J. N., Loft, Bowden, V. K., Bender, A. D., Whitney, S. J., Lipp, O. V., & Visser, T. A. (2025). Individual differences in multi-tasking ability moderate the benefits of using low-degree automation. Theoretical Issues in Ergonomics Science, 26(4), 478–507. https://doi.org/10.1080/1463922x.2024.2446848

28. Gutzwiller, R. S., Wickens, C. D., & Clegg, B. A. (2019). The role of reward and effort over time in task-switching. Theoretical Issues in Ergonomics Science, 20(2), 196–214. https://doi.org/10.1080/1463922x.2018.1522556

29. Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Advances in psychology (Vol. 52, pp. 139–183). https://doi.org/10.1016/s0166-4115(08)62386-9

30. Helton, W. S., & Näswall (2015). Short Stress State Questionnaire: Factor structure and state change assessment. European Journal of Psychological Assessment, 31(1), 20–30. https://doi.org/10.1027/1015-5759/a000200

31. Jamieson, G. A., & Skraaning Jr, G. (2018). Levels of automation in human factors models for automation design: Why we might consider throwing the baby out with the bathwater. Journal of Cognitive Engineering and Decision Making, 12(1), 42–49. https://doi.org/10.1177/1555343417732856

32. Jamieson, G. A., & Skraaning Jr, G. (2024). Stumbling towards a shared apprehension of automation failure. Journal of Cognitive Engineering and Decision Making, 18(4), 402–423. https://doi.org/10.1177/15553434241292400

33. Jipp & Ackerman, P. L. (2016). The impact of higher levels of automation on performance and situation awareness. Journal of Cognitive Engineering and Decision Making, 10(2), 138–166. https://doi.org/10.1177/1555343416637517

34. Kaber, D. B., & Endsley, M. R. (2004). The effects of level of automation and adaptive automation on human performance, situation awareness and workload in a dynamic control task. Theoretical Issues in Ergonomics Science, 5(2), 113–153. https://doi.org/10.1080/1463922021000054335

35. Kaber, D. B., Onal, & Endsley, M. R. (2000). Design of automation for telerobots and the effect on performance, operator situation awareness, and subjective workload. Human Factors and Ergonomics in Manufacturing & Service Industries, 10(4), 409–430. https://doi.org/10.1002/1520-6564(200023)10:4%3C409::aid-hfm4%3E3.0.co;2-v

36. Kahneman (1973). Attention and effort. Prentice Hall, Inc.

37. Klapp, S. T., Maslovat, & Jagacinski, R. J. (2019). The bottleneck of the psychological refractory period effect involves timing of response initiation rather than response selection. Psychonomic Bulletin & Review, 26(1), 29–47. https://doi.org/10.3758/s13423-018-1498-6

38. Körber, Weißgerber, Kalb, Blaschke, & Farid (2015). Prediction of take-over time in highly automated driving by two psychometric tests. Dyna, 82(193), 195–201. https://doi.org/10.15446/dyna.v82n193.53496

39. Lee, J. D., & See (2004). Trust in automation and technology: Designing for appropriate reliance. Human Factors, 46(1), 50–80. https://doi.org/10.1518/hfes.46.1.50.30392

40. Liesefeld, H. R., & Janczyk (2019). Combining speed and accuracy to control for speed-accuracy trade-offs (?). Behavior Research Methods, 51(1), 40–60. https://doi.org/10.3758/s13428-018-1076-x

41. Loft, Tatasciore, & Visser, T. A. W. (2023). Managing workload, performance, and situation awareness in aviation systems. In Keebler, Lazzara, Wilson, & Blickensderfer (Eds.), Human factors in aviation and aerospace (3rd ed., pp. 171–197). Elsevier: Academic Press. https://doi.org/10.1016/b978-0-12-420139-2.00018-6

42. Manzey, Reichenbach, & Onnasch (2012). Human performance consequences of automated decision aids: The impact of degree of automation and system experience. Journal of Cognitive Engineering and Decision Making, 6(1), 57–87. https://doi.org/10.1177/1555343411433844

43. Matthews & Desmond, P. A. (2002). Task-induced fatigue states and simulated driving performance. The Quarterly Journal of Experimental Psychology A: Human Experimental Psychology, 55A(2), 659–686. https://doi.org/10.1080/02724980143000505

44. Matthews, Sparkes, T. J., & Bygrave, H. M. (1996). Attentional overload, stress, and simulated driving performance. Human Performance, 9(1), 77–101. https://doi.org/10.1207/s15327043hup0901_5

45. McGarry, Rovira, & Parasuraman (2003). Effects of task duration and type of automation support on human performance and stress in a simulated battlefield engagement task. Proceedings of the Human Factors and Ergonomics Society - Annual Meeting, 47(3), 548–552. https://doi.org/10.1037/e577042012-062

46. Merritt, S. M., Heimbaugh, LaChapell, & Lee (2013). I trust it, but I don’t know why: Effects of implicit attitudes toward automation on trust in an automated system. Human Factors, 55(3), 520–534. https://doi.org/10.1177/0018720812465081

47. Metzger & Parasuraman (2005). Automation in future air traffic management: Effects of decision aid reliability on controller performance and mental workload. Human Factors, 47(1), 35–49. https://doi.org/10.4324/9781315095080-22

48. Monsell (2003). Task switching. Trends in Cognitive Sciences, 7(3), 134–140. https://doi.org/10.1016/s1364-6613(03)00028-7

49. Moray, Inagaki, & Itoh (2000). Adaptive automation, trust, and self-confidence in fault management of time-critical tasks. Journal of Experimental Psychology: Applied, 6(1), 44–58. https://doi.org/10.1037//1076-898x.6.1.44

50. Niu, Geng, & Zhang (2018). Relationship between automation trust and operator performance for the novice and expert in spacecraft rendezvous and docking (RVD). Applied Ergonomics, 71(August 2017), 1–8. https://doi.org/10.1016/j.apergo.2018.03.014

51. Onnasch, Ruff, & Manzey (2014). Operators’ adaptation to imperfect automation – Impact of miss-prone alarm systems on attention allocation and performance. International Journal of Human-Computer Studies, 72(10–11), 772–782. https://doi.org/10.1016/j.ijhcs.2014.05.001

52. Onnasch, Wickens, C. D., & Manzey (2014). Human performance consequences of stages and levels of automation: An integrated meta-analysis. Human Factors, 56(3), 476–488. https://doi.org/10.1177/0018720813501549

53. Oswald, F. L., Hambrick, D. Z., & Jones, L. A. (2007). Keeping all the plates spinning: Understanding and predicting multitasking performance. In Jonassen, D. H. (Ed.), Learning to solve complex scientific problems (pp. 77–96). Routledge. https://doi.org/10.4324/9781315091938-4

54. Parasuraman, Sheridan, T. B., & Wickens, C. D. (2008). Situation awareness, mental workload, and trust in automation: Viable, empirically supported cognitive engineering constructs. Journal of Cognitive Engineering and Decision Making, 2(2), 140–160. https://doi.org/10.1518/155534308x284417

55. Pashler (1984). Processing stages in overlapping tasks: Evidence for a central bottleneck. Journal of Experimental Psychology: Human Perception and Performance, 10(3), 358–377. https://doi.org/10.1037//0096-1523.10.3.358

56. Pop, V. L., Shrewsbury, & Durso, F. T. (2015). Individual differences in the calibration of trust in automation. Human Factors, 57(4), 545–556. https://doi.org/10.1177/0018720814564422

57. Rann, J. C., & Almor (2025). An examination of sustained attention during complex multitasking scenarios. Cognitive Research: Principles and Implications, 10(1), 67. https://doi.org/10.1186/s41235-025-00674-x

58. R Development Core Team. (2015). R: A Language and environment for statistical computing. R Foundation for Statistical Computing. https://doi.org/10.32614/r.manuals

59. Redick, B. T. S., Shipstead, Meier, M. E., Montroy, J. J., Hicks, K. L., Unsworth, Kane, M. J., Engle, R. W., & Hambrick, D. Z. (2016). Cognitive predictors of a common multi-tasking ability: Contributions from working memory, attention control, and fluid intelligence. Journal of Experimental Psychology: General, 145(11), 1473–1492. https://doi.org/10.1037/xge0000219.supp

60. Redick, T. S. (2016). On the relation of working memory and multitasking: Memory span and synthetic work performance. Journal of Applied Research in Memory and Cognition, 5(4), 401–409. https://doi.org/10.1016/j.jarmac.2016.05.003

61. Roth, E. M., Sushereba, Militello, L. G., Diiulio, & Ernst (2019). Function allocation considerations in the era of human autonomy teaming. Journal of Cognitive Engineering and Decision Making, 13(4), 199–220. https://doi.org/10.1177/1555343419878038

62. Rovira, McGany, & Parasuraman (2002). Effects of unreliable-automation on decision making in command and control. In Proceedings of the annual meeting of the human factors and ergonomics society (pp. 428–432). Human Factors Society. https://doi.org/10.1177/154193120204600345

63. Rovira & Parasuraman (2010). Transitioning to future air traffic management: Effects of imperfect automation on controller attention and performance. Human Factors, 52(3), 411–425. https://doi.org/10.1177/0018720810375692

64. Rovira, Pak, & McLaughlin (2017). Effects of individual differences in working memory on performance and trust with various degrees of automation. Theoretical Issues in Ergonomics Science, 18(6), 573–591. https://doi.org/10.1080/1463922x.2016.1252806

65. Salvucci, D. D., & Taatgen, N. A. (2011). The multitasking mind. Oxford University Press.

66. Saqer & Parasuraman (2014). Individual performance markers and working memory predict supervisory control proficiency and effective use of adaptive automation. International Journal of Human Factors and Ergonomics, 55(1), 15–31. https://doi.org/10.1504/ijhfe.2014.062548

67. Sauer, Kao, C.-S., & Wastell (2012). A comparison of adaptive and adaptable automation under different levels of environmental stress. Ergonomics, 55(8), 840–853. https://doi.org/10.1080/00140139.2012.676673

68. Schumacher, E. H., Seymour, T. L., Glass, J. M., Fencsik, D. E., Lauber, E. J., Kieras, D. E., & Meyer, D. E. (2001). Virtually perfect time sharing in dual-task performance: Uncorking the central cognitive bottleneck. Psychological Science, 12(2), 101–108. https://doi.org/10.1111/1467-9280.00318

69. Sebok & Wickens, C. D. (2017). Implementing lumberjacks and black swans into model-based tools to support human–automation interaction. Human Factors, 59(2), 189–203. https://doi.org/10.1177/0018720816665201

70. Smith, P. J. (2017). Making brittle technologies useful. In Cognitive systems engineering (pp. 181–208). CRC Press. https://doi.org/10.1201/9781315572529-10

71. Strand, Nilsson, Karlsson, I. C. M., & Nilsson (2014). Semi-automated versus highly automated driving in critical situations caused by automation failures. Transportation Research Part F: Traffic Psychology and Behaviour, 27(2014), 218–228. https://doi.org/10.1016/j.trf.2014.04.005

72.

Strickland

Boag

R. J.

Heathcote

Bowden

Loft

(2023). Automated decision aids: When are they advisors and when do they take control of human decision making? Journal of Experimental Psychology: Applied, 29(4), 849–868. https://doi.org/10.1037/xap0000463.supp

73.

Strybel

T. Z.

K. P. L.

Chiappe

D. L.

Morgan

C. A.

Morales

Battiste

(2016). Effects of NextGen concepts of operation for separation assurance and interval management on air traffic controller situation awareness, workload, and performance. The International Journal of Aviation Psychology, 26(1–2), 1–14. https://doi.org/10.1080/10508414.2016.1235363

74.

Tombu

Jolicœur

(2003). A central capacity sharing model of dual-task performance. Journal of Experimental Psychology: Human Perception and Performance, 29(1), 3–18. https://doi.org/10.1037//0096-1523.29.1.3

75.

Trapsilawati

Wickens

Chen

C. H.

(2017). Transparency and conflict resolution automation reliability in air traffic control. In 19th international symposium on aviation psychology (pp. 419–424). https://corescholar.libraries.wright.edu/isap_2017/8

76.

Ulrich

Miller

(2008). Response grouping in the psychological refractory period (PRP) paradigm: Models and contamination effects. Cognitive Psychology, 57(2), 75–121. https://doi.org/10.1016/j.cogpsych.2007.06.004

77.

Van Acker

B. B.

Parmentier

D. D.

Vlerick

Saldien

(2018). Understanding mental workload: From a clarifying concept analysis toward an implementable framework. Cognition, Technology & Work, 20(3), 351–365. https://doi.org/10.1007/s10111-018-0481-3

78.

Van Selst

Ruthruff

Johnston

J. C.

(1999). Can practice eliminate the psychological refractory period effect? Journal of Experimental Psychology: Human Perception and Performance, 25(5), 1268–1283. https://doi.org/10.1037/0096-1523.25.5.1268

79.

Visser

T. A.

Bischof

W. F.

Di Lollo

(1999). Attentional switching in spatial and nonspatial domains: Evidence from the attentional blink. Psychological Bulletin, 125(4), 458–469. https://doi.org/10.1037//0033-2909.125.4.458

80.

Visser

T. A. W.

Ohan

J. L.

Enns

J. T.

(2015). Temporal cues derived from statistical patterns can overcome resource limitations in the attentional blink. Attention, Perception, & Psychophysics, 77(5), 1585–1595. https://doi.org/10.3758/s13414-015-0880-y

81.

K. P. L.

Chiappe

(2015). Situation awareness in human systems integration. In Boehm-Davis

D. A.

Durso

F. T.

Lee

J. D.

(Eds.), APA handbook of human systems integration (pp. 293–308). American Psychological Association. https://doi.org/10.1037/14528-019

82.

Wickens

C. D.

Boles

D. B.

(1983). The limits of multiple resource theory: The role of task correlation/integration in optimal display formatting (No. EPL835ONR835). https://doi.org/10.21236/adp003321

83.

Wickens

C. D.

Clegg

B. A.

Vieane

A. Z.

Sebok

A. L.

(2015). Complacency and automation bias in the use of imperfect automation. Human Factors: The Journal of Human Factors and Ergonomics Society, 57(5), 728–739. https://doi.org/10.1177/0018720815581940

84.

Wickens

C. D.

Dixon

S. R.

(2007). The benefits of imperfect diagnostic automation: A synthesis of the literature. Theoretical Issues in Ergonomics Science, 8(3), 201–212. https://doi.org/10.1080/14639220500370105

85.

Wickens

C. D.

Santamaria

Sebok

Sarter

N. B.

(2010). Stages and levels of automation: An integrated meta-analysis. Proceedings of the Human Factors and Ergonomics Society - Annual Meeting, 54(4), 389–393. https://doi.org/10.1037/e578652012-025

86.

Wickens

C. D.

Mccarley

J. S.

Alexander

A. L.

Thomas

L. C.

Ambinder

Zheng

Field

(2005). Attention-Situation Awareness (ASA) Model of pilot error. Contract, 213(1), 213–239. https://doi.org/10.1201/9781420062984.ch9

87.

Wright

J. L.

Chen

J. Y. C.

Barnes

M. J.

(2018). Human–automation interaction for multiple robot control: The effect of varying automation assistance and individual differences on operator performance. Ergonomics, 0139(8), 1–13. https://doi.org/10.1080/00140139.2018.1441449

88.

Young

M. S.

Brookhuis

K. A.

Wickens

C. D.

Hancock

P. A.

(2015). State of science: Mental workload in ergonomics. Ergonomics, 58(1), 1–17. https://doi.org/10.1080/00140139.2014.956151

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material, or any part of it, shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
