Abstract
Objective
The goal of this meta-analysis is to investigate the effect of partial automation on mental workload, visual behavior, and engagement in nondriving-related tasks.
Background
The literature on the human factors of operating partially automated driving offers mixed findings. While some studies show partial driving automation to result in suboptimal mental workload, others found it to impose similar levels of workload to the ones observed during manual driving. Likewise, while some studies evidence a marked increase in off-road glances when the automated system was engaged, other work has failed to replicate this pattern.
Method
Forty-one studies involving 1,482 participants were analyzed using the PRISMA approach.
Results
No significant differences in mental workload were found between manual and partially automated driving. A higher likelihood of glancing away from the forward roadway and engaging in nondriving-related tasks was found when the partially automated system was engaged.
Conclusion
Although the adoption of partial driving automation comes with some intended safety benefits, its use is also associated with an increased engagement in nondriving-related activities.
Application
These findings add to our understanding of the safety of partial automation and provide valuable information to Human Factors practitioners and regulators about the use and potential safety risks of using these systems in the real world.
Keywords
Introduction
Approximately 1.19 million people die each year as a result of road crashes worldwide, with activities like speeding and using cellphones being among the most dangerous contributors to road fatalities (WHO, 2023). In response to this, a growing number of automobile manufacturers have introduced driving automation that can control the vehicle’s behavior to limit the detrimental effect that these activities have on safety. The Society of Automotive Engineers (SAE, 2021) identifies six levels of driving automation, ranging from 0 (manual driving) to 5 (fully automated driving). Currently, SAE level 2 (L2) systems, also known as partially automated driving, are becoming more commonplace on our roads, with a projected market share of 60% by 2025 in the United States (Statista, 2023). These systems can assist the driver by controlling both the vehicle’s steering and acceleration in selected conditions, provided the driver remains vigilant and ready to take control whenever necessary (SAE, 2021). Yet, preliminary crash data show that using these systems may lead to drivers disengaging from the driving task more often compared to manual driving, thus posing a risk to safety (e.g., NHTSA, 2022; NTSB, 2020b).
Operating partially automated systems is expected to switch the role of the human driver from
Research on the effect of automated driving on mental workload has produced mixed findings. Using self-reported metrics, Stapel et al. (2019) found a reduction in workload when the L2 system was engaged. A similar pattern was observed by Radhakrishnan et al. (2022) who also found a reduction in physiological activation during partially automated driving, a pattern that the authors interpreted as a reduction in drivers’ mental workload. Likewise, the decline in detection task performance observed by Biondi et al. (2018) during L2 driving was also interpreted as lower mental workload. These patterns are in conflict with the work by Lohani and McDonnell (Lohani et al., 2021; McDonnell et al., 2021; Mcdonnell et al., 2023) who, instead, found partial automation not to produce any changes in mental workload when compared to manual driving. Additional data by Kim et al. (2023) revealed an opposite pattern, showing that operating an L2 system increased drivers’ workload. The authors attributed this increase to the additional demands resulting from supervising the functioning of the L2 system over an extended period of time.
Conflicting findings can also be observed in the literature investigating behavioral changes resulting from partially automated driving. Works by Solis-Marcos et al. (2018) and Biondi and Jajo (2024) have evidenced a reduction in forward glances accompanied by an increase in glances directed toward the vehicle’s touchscreen when the L2 system was engaged. However, conflicting results were found by Gaspar and Carney (2019) who, despite observing a slight increase in glance duration toward the vehicle’s touchscreen when the L2 system was engaged, failed to see an increase in the total time spent looking away from the road in this condition. Similarly, Goncalves et al. (2020) found that gaze concentration on the forward roadway did not differ between manual and L2 driving.
This seemingly fractured literature, combined with a greater presence of partially automated vehicles on our roads, justifies the need for a deeper investigation of the differences in mental workload, visual behavior and NDRT engagement between manual and partially automated driving. With this said, the current study has three main objectives. (1) (2) (3)
Similar meta-analyses have investigated the human factors of automated driving. For example, earlier work by de Winter et al. (2014) explored changes in mental workload resulting from operating driving automation. However, in doing so the authors largely examined drivers’ self-reported mental workload resulting from operating either driver assistance (or SAE level 1) or highly automated systems (or SAE level 4 and 5), with limited attention being paid to partially automated driving. Weaver and DeLucia (2018) investigated the human factors of adopting driving automation requiring shared vehicle control, but their study largely centered on the transition of control between the human driver and the automated system during conditionally automated driving or SAE level-3 automation; a topic that was also investigated by the meta-analytical work by Zhang et al. (2019). Similar work was conducted by Shahini and Zahabi (2022) who, in addition to focusing on transitions of control between manual and automated driving, also explored the mental workload resulting from operating partially and highly automated systems. In our study we investigate changes in mental workload between manual and partially automated driving while taking into consideration potential differences resulting from adopting diverse self-reported, physiological, and behavioral metrics. Additionally, in all our analyses, we also examine the moderating effects of age and sex. Altogether, although previous meta-analytical works have addressed related topics, we believe the distinct focus of the current research makes a unique contribution to the Human Factors literature on driving automation.
Method
In conducting this meta-analysis, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework was adopted (Page et al., 2021). The entire procedure is detailed in Figure 1. All the studies selected for review achieved a sufficient level of quality by reporting detailed descriptions of experimental designs and protocols. Therefore, each study was considered of equal importance and was not coded for quality (cf. Zhang et al., 2019). The ROB 2 tool (Sterne et al., 2019) was employed to assess the risk of bias in the selected papers. A detailed description of this assessment is provided in the Supplementary Material (see Supplemental Figures 1 and 2). The protocol was not preregistered. Data and scripts for the analyses are available upon request to the authors.
Figure 1. PRISMA flowchart, showing in detail identification, screening, and inclusion steps (Page et al., 2021).
Search Strategy
Keywords Included in the Review.
Inclusion and Exclusion Criteria
Studies were selected according to our inclusion and exclusion criteria (see Figure 1). Only empirical studies published in peer-reviewed journals or conferences and involving human participants were included in our review: review papers, methodological papers, studies based on simulations of data without actual participants, studies that did not involve human participants or a driving task, and studies that did not undergo a peer-review process were excluded. Studies must include both a manual driving condition and a partially automated driving condition as per the SAE taxonomy, and they must compare these conditions by testing differences in either mental workload, visual behavior, or NDRT engagement. Studies investigating only partially automated driving, only manual driving, only take-over transitions or testing individual differences between participants were excluded. Moreover, we chose to exclude studies involving only SAE level 3 automation from the review and meta-analysis. This was done considering the meaningful differences between L2 and L3 systems in terms of operational design domains (e.g., L2 requires the driver to supervise the functioning of the automated system at all times, whereas this is not required when the L3 system is engaged and operational), and also given the extremely limited literature on the comparison between manual and L3 driving in on-road experiments (only one study was identified). Studies testing novel software, vehicles, or interventions (e.g., training programs) were excluded as they can largely modify drivers’ behavior, therefore confounding potential differences between manual and partially automated driving (Casner & Hutchins, 2019; Ebnali et al., 2019). Finally, studies with asymmetries between driving mode conditions (i.e., nondriving-related task presented only during partially automated driving) or reporting nonquantitative data or incomplete data were excluded.
Study Selection
The screening was performed independently by the two authors, who reached agreement on the final selection. This was done to decrease the risk of bias and the risk of error in the selection process (Page et al., 2021). First, we reviewed the titles and abstracts to ensure that the papers met our inclusion criteria. Once this first screening was completed, selected papers were read in their entirety by both authors and then screened once again (see Figure 1). Next, a citation search was performed on the selected papers to identify articles that had not emerged from the keyword analysis (as suggested by Page et al., 2021). Furthermore, the review of de Winter et al. (2014) was screened to find studies that could match our inclusion criteria. Five additional articles were identified through the citation search, while two more were identified from the review of de Winter et al. (2014). When the same data were used by two different studies, the most complete study was retained, while the less complete study was included only if it presented relevant additional findings. After the study selection process was completed,
Dependent Variables Extracted
The variables relevant to our objectives were extracted and grouped according to the constructs they were intended to measure (i.e., mental workload, visual behavior or NDRT engagement). See below for a complete overview of the variables extracted and the rationale behind their inclusion.
Mental Workload
Mental workload assessments can vary depending on the metrics being used (Longo et al., 2022; De Waard, 1993; Butmee et al., 2019). Furthermore, there is not a single elective measure that can reliably discriminate between high and low mental workload (Charles & Nixon, 2019). Therefore, to obtain a comprehensive overview of the differences in mental workload between manual and partially automated driving, physiological, subjective, and behavioral metrics were included in the analyses. Descriptions of these metrics are presented below.
Physiological Measures
Spectral electroencephalogram (EEG) is sensitive to changes in mental workload. In particular, increased alpha and theta power are usually associated with greater workload (Mcdonnell et al., 2023) and fatigue (Zhang et al., 2021). Similarly, differences in event-related potentials (ERPs) amplitude during an oddball paradigm (i.e., a secondary task involving the presentation of a series of identical sounds randomly interspersed with rare deviant stimuli) are thought to reflect variations in mental workload, with larger ERPs’ amplitudes measured after each oddball sound corresponding to more mental resources available due to lower levels of mental workload (Figalová et al., 2024; Luck, 2014). Variations in blood flow as recorded by functional near-infrared spectroscopy (fNIRS) can indicate variations in the amount of mental workload experienced by the driver, with increased blood flow indicating higher workload (Saikia et al., 2021; Sibi et al., 2017). Pupil diameter and blink rate reflect variations in mental workload, with higher blink frequencies and increased pupil size indicating higher mental workload (Radhakrishnan et al., 2023; Tsai et al., 2007). Skin conductance level (SCL) and skin conductance response (SCR) have been shown to positively correlate with levels of mental workload during driving. Higher levels of drivers’ mental workload are usually associated with both an increase in heart rate and a decrease in heart rate variability, as measured by the root mean square of successive differences (RMSSD) (Lohani et al., 2021; Radhakrishnan et al., 2022). It is important to note, however, that while we attribute these physiological changes to differences in mental workload, many of these measures are also sensitive to variations due to other mental or physical states or to measurement artifacts (cf. Luck, 2014).
Behavioral Measures
The ISO Detection Response Task (DRT) is a standardized metric of mental workload. It requires responding to the presentation of a visual, vibrotactile, or auditory stimulus presented every 3–5 s. Differences in DRT RT and accuracy are expected to reflect different levels of mental workload (ISO, 2015), with slower RT and lower accuracy indicating increased mental workload (for a review, see Biondi, 2024).
Subjective Measures
Self-reported metrics of mental workload can offer useful information regarding the subjective mental workload experienced by drivers. The NASA Task Load Index (NASA-TLX) is the most common instrument requiring participants to self-rate their level of workload, often on a 21-point Likert scale (Hart & Staveland, 1988). Most studies in this meta-analysis that used self-reported measures employed NASA-TLX scores or single scales of this questionnaire (see Figure 2). However, it is important to note that recent studies have questioned the validity of NASA-TLX, suggesting that subjective measures often diverge from other indicators, such as physiological and behavioral measures (e.g., de Winter, 2014; Rubio et al., 2004; Matthews et al., 2020).
Figure 2. Forest plot of the Standardized Mean Changes (SMC) for mental workload. Negative SMCs indicate higher workload estimates during partially automated driving, while positive SMCs indicate higher workload during manual driving. Black squares represent the SMCs for each study, while the diamonds represent the aggregated SMCs estimated with random-effects models.
Visual Behavior
Visual behavior can provide useful insights into drivers’ visual attention (cf. Hungund & Kumar Pradhan, 2023). Therefore, we investigated visual behavior considering gaze and glances toward nondriving relevant areas (i.e., off-road glances, vertical and horizontal gaze dispersion, and gazes toward instrument panels or billboards) and away from driving relevant areas (i.e., gazes toward front road, hazards, or vehicle’s mirrors).
NDRT Engagement
Similarly to visual behavior, engaging in NDRT (e.g., texting, holding the phone, talking to a passenger, using the radio, navigating with the GPS, and performing an easy ad-hoc experimental task) can provide information about attention allocation, with greater engagement in NDRT suggesting decreased attention toward driving relevant areas (cf. Dogan et al., 2019; Hungund & Kumar Pradhan, 2023).
Meta-Analytic Approach
Selected Studies Investigating Mental Workload, Visual Behavior and NDRT Engagement During On-Road Driving.
Selected Studies Investigating Mental Workload, Visual Behavior and NDRT Engagement During Simulated Driving.
Next, we addressed potential issues arising from using different types of designs in the same meta-analysis following Morris et al. (2002)’s guidelines. First, we transformed effect sizes from different designs into a single metric, that is, the SMC. Second, we included the study design (between- vs. within-subjects) as a moderator variable in all meta-analyses. Additionally, since sampling variance depends on both sample size and study design (Morris et al., 2002), we conducted two different sampling variance estimations: within-subjects studies’ sampling variance was estimated using Gibbons’s formula, which accounts for repeated measures designs (Gibbons, 1993); between-subjects studies' sampling variance was estimated using Hedges’ formula (Hedges, 1983, 1982), which accounts for independent groups designs (for reference, see equations A1 and A3 in Morris et al., 2002).
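To make the two variance estimations above concrete, the sketch below computes a standardized mean change and its sampling variance for each design. It is a minimal illustration, not the authors' script: it uses the common large-sample variance approximations rather than the exact equations A1 and A3 in Morris et al. (2002), it omits any small-sample bias correction, and the function and argument names are hypothetical. The sign convention (manual minus partially automated) matches the one used for mental workload in this paper.

```python
import math


def smc_within(mean_manual, mean_auto, sd_diff, n):
    """Standardized mean change for a within-subjects study.

    Assumption: the change is standardized by the SD of the difference
    scores, and the variance is the large-sample approximation
    1/n + d^2 / (2n).
    """
    d = (mean_manual - mean_auto) / sd_diff
    var = 1.0 / n + d ** 2 / (2.0 * n)
    return d, var


def smd_between(mean_manual, mean_auto, sd_manual, sd_auto, n_manual, n_auto):
    """Standardized mean difference for a between-subjects study.

    Assumption: standardization by the pooled SD, with the large-sample
    variance (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2)).
    """
    pooled_sd = math.sqrt(
        ((n_manual - 1) * sd_manual ** 2 + (n_auto - 1) * sd_auto ** 2)
        / (n_manual + n_auto - 2)
    )
    d = (mean_manual - mean_auto) / pooled_sd
    n_total = n_manual + n_auto
    var = n_total / (n_manual * n_auto) + d ** 2 / (2.0 * n_total)
    return d, var


# Hypothetical numbers, for illustration only.
d_w, v_w = smc_within(10.0, 8.0, 4.0, 20)
d_b, v_b = smd_between(10.0, 8.0, 4.0, 4.0, 20, 20)
print(d_w, v_w, d_b, v_b)
```

With the same means and spread, the between-subjects estimate carries a larger sampling variance than the within-subjects one, which is why study design matters when weighting effects.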
Finally, since multiple effect size estimates were included for each study, a multivariate Random Effect Model was preferred over a univariate model to account for the dependency among effect sizes originating from the same study (Berkey et al., 1996; Konstantopoulos, 2011; Olkin & Gleser, 2009). The estimates for the Random Effect Models were computed using the restricted maximum likelihood estimator (Viechtbauer, 2005; Raudenbush, 2009). Moderator analysis was conducted to explore the influence of the Type of drive (on-road driving vs. simulated driving) on the differences between partially automated and manual driving in each meta-analysis, as drivers might behave differently in real road conditions compared to simulations. Moderator analyses were also conducted to examine how participants’ sex (i.e., percentage of females) and the average age of drivers in each study influenced differences in mental workload, visual behavior, and NDRT engagement. This was done given that there is evidence that age and sex might affect some of these variables (cf. Cantin et al., 2009). Additionally, a moderator analysis was performed to investigate the influence of the Type of measure used (physiological vs. behavioral vs. subjective) when assessing mental workload.
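As a rough illustration of random-effects pooling, the sketch below aggregates effect sizes with a univariate DerSimonian-Laird estimator. This is a deliberately simplified stand-in: the analyses described above used a multivariate model fitted by restricted maximum likelihood, which also accounts for the dependency among effects from the same study, whereas this sketch treats every effect as independent. The function name and the input values are hypothetical.

```python
import math


def random_effects_dl(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Returns the pooled estimate, its standard error, and the estimated
    between-study variance (tau^2). Effects are assumed independent.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)  # truncated at zero
    # Random-effects weights add tau^2 to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical effect sizes and sampling variances, for illustration only.
pooled, se, tau2 = random_effects_dl([0.0, 1.0], [0.1, 0.1])
ci = (pooled - 1.96 * se, pooled + 1.96 * se)
print(pooled, se, tau2, ci)
```

When the effects disagree more than their sampling variances can explain, tau^2 grows and the confidence interval around the pooled estimate widens accordingly.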
Publication Bias
Publication bias refers to the phenomenon by which nonsignificant results are less likely to be published in peer-reviewed journals and conferences than significant results. To assess publication bias, we performed rank correlation tests for funnel plot asymmetry using the Kendall’s tau statistic included in the R package “metafor” (Begg & Mazumdar, 1994; Viechtbauer, 2010). This test examines the correlation between the absolute values of effect sizes and their corresponding sampling variances. A significant correlation indicates that larger effect sizes come from studies having high sampling variance (i.e., studies with small sample sizes and/or between-subject designs), which suggests the presence of publication bias. A nonsignificant correlation indicates that the effect sizes are not dependent on sample size or design, thus suggesting the absence of publication bias.
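The logic of the rank correlation test can be sketched as follows. This is a minimal illustration under stated assumptions, not the routine in "metafor": it computes Kendall's tau (the tau-a variant, which assumes no ties) between absolute effect sizes and sampling variances as described above, it skips the standardization of effect sizes that the published Begg and Mazumdar procedure applies before ranking, and it does not compute a p-value. The function name and data are hypothetical.

```python
def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical effect sizes and sampling variances, for illustration only.
effects = [-0.1, 0.3, -0.5, 0.9]
variances = [0.01, 0.02, 0.05, 0.10]
tau = kendall_tau_a([abs(e) for e in effects], variances)
print(tau)
```

A tau near 1 means the largest absolute effects come from the least precise studies, the asymmetric funnel pattern that suggests publication bias; a tau near 0 suggests no such dependence.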
Results
Characteristics of on-road and simulated driving studies are described in Tables 2 and 3, respectively. Additional study characteristics can be found in Supplemental Tables 1 and 2 in the Supplementary Material. In this section, we present the aggregated characteristics of the studies and the results of the meta-analytic method.
Study Characteristics
Meta-Analysis Results
The results of each Random Effect Model are presented in detail below: the estimates of the Standardized Mean Change (SMC) are presented within the text and in Figures 2, 3, and 4, while the between-study variance component (σ²) is presented in Figures 2, 3, and 4. The assessment of heterogeneity across studies (Q) was also conducted and reported within the text. Significant Q-values suggest that the true effect sizes are heterogeneous, while nonsignificant Q-values indicate that the variability in the observed effect sizes is no larger than would be expected based on sampling variability alone, and that the true effect sizes are relatively consistent across studies (Cochran, 1954). The size of the SMC estimates can be interpreted in the same way as Cohen’s
Figure 3. Forest plot of the Standardized Mean Changes (SMC) for visual behavior. Negative SMCs indicate reduced gazes or glances toward driving relevant areas and increased gaze or glances toward nondriving relevant areas during partially automated driving. Black squares represent the SMCs for each study, while the diamond represents the aggregated SMC estimated with a random-effects model. Measures marked with * are reverse-coded.
Figure 4. Forest plot of the Standardized Mean Changes (SMC) for NDRT engagement. Negative SMCs indicate increased NDRT engagement during partially automated driving. Black squares represent the SMCs for each study, while the diamonds represent the aggregated SMCs estimated with random-effects models.
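The heterogeneity statistic Q described above can be illustrated with a short sketch. It assumes the classic univariate form of Cochran's Q with independent effects, which is simpler than the multivariate models reported in this paper, and it evaluates the chi-square tail probability with a plain series expansion that is adequate only for moderate Q values. All function names and inputs are hypothetical.

```python
import math


def cochran_q(effects, variances):
    """Cochran's Q and its degrees of freedom (k - 1)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable.

    Uses the series for the regularized lower incomplete gamma function;
    a sketch for moderate x, not a numerically robust implementation.
    """
    a = df / 2.0
    z = x / 2.0
    if z <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    log_lower = a * math.log(z) - z - math.lgamma(a) + math.log(total)
    return max(0.0, 1.0 - math.exp(log_lower))


# Hypothetical effect sizes and sampling variances, for illustration only.
q, df = cochran_q([0.0, 1.0], [0.1, 0.1])
print(q, df, chi2_sf(q, df))
```

A small tail probability means the observed effects vary more than sampling error alone would produce, which is the case labeled "significantly heterogeneous" in the results that follow.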

Mental Workload
For the purpose of the present meta-analysis, negative SMCs indicate higher workload during partially automated mode, while positive SMCs indicate higher workload during manual mode. In total, 47 different effects were collected from 26 independent samples analyzing mental workload, with 13 employing on-road driving experiments and 13 employing simulated driving experiments. A two-level multivariate Random Effect Model was conducted, including effect sizes and sampling variances in the first level of analysis and the studies in the second level as random intercepts. This analysis did not result in a significant difference (SMC = 0.039,
Finally, the heterogeneity assessment revealed that subjective workload was the only group with significantly heterogeneous effect sizes (Q (16) = 76.93,
Visual Behavior
In this meta-analysis, we attempted to quantify the visual behavior toward both driving and nondriving-related areas. Therefore, variables measuring visual behavior toward drive-relevant areas were reverse coded, so that positive SMCs indicate reduced gaze or glances toward driving relevant areas and increased gaze or glances toward nondriving relevant areas during manual driving, while negative SMCs indicate reduced gaze or glances toward driving relevant areas and increased gaze or glances toward nondriving relevant areas during partially automated driving (in Figure 3, measures marked with an asterisk “*” are reverse coded).
In total, 37 different effects were collected from 17 independent samples, with 6 involving on-road driving experiments and 11 involving simulated driving experiments. A two-level multivariate Random Effect Model was conducted, including effect sizes and sampling variances at the first level of analysis and the studies as random intercepts at the second level. This analysis resulted in a significant negative mean change (SMC = −0.513,
Notably, the heterogeneity assessment indicated that the effect sizes were significantly heterogeneous (Q (36) = 114.12,
NDRT Engagement
In this meta-analysis, visual and manual NDRT engagement was assessed: positive SMCs indicate increased NDRT engagement during manual driving, while negative SMCs indicate increased NDRT engagement during partially automated driving.
In total, 13 different effects were collected from 10 independent samples, with 7 involving on-road driving experiments and 3 involving simulated driving experiments. A two-level multivariate Random Effect Model was conducted, incorporating effect sizes and sampling variances at the first level of analysis and the studies as random intercepts at the second level. This analysis revealed a small but significant negative mean change (SMC = −0.281,
Finally, heterogeneity assessment indicated that only the three simulated driving studies showed significantly heterogeneous effect sizes (Q (4) = 32.70,
Publication Bias Assessment
The rank correlation tests revealed a significant positive relationship between SMCs and their sampling variances (Kendall’s tau = 0.341,
Discussion
The present work investigated the differences in mental workload, visual behavior, and NDRT engagement between manual driving and partially automated driving. Here, findings are presented by research objective. Each section begins with a discussion of the results of the meta-analysis, followed by a subsection titled “Additional considerations” which includes relevant findings not included in the meta-analysis.
Explore Differences in Mental Workload Between Manual and Partially Automated Driving
Meta-Analysis Results for Mental Workload
Overall, our meta-analysis did not reveal a general difference in mental workload between partially automated and manual driving. Contrary to our hypothesis, we did not find increased workload during manual driving, suggesting that both partially automated and manual driving result in similar mental workload. The publication bias assessment indicated that some nonsignificant results were not published/available. Indeed, even in this review, we were unable to include the nonsignificant results of three studies due to insufficient information (i.e., Kraft et al., 2018; Stapel et al., 2019; Zhang et al., 2021). Given the results of the meta-analysis and the presence of publication bias, it is unlikely that differences in mental workload exist between partially automated and manual driving. This finding holds particular significance for Human Factors researchers and automobile manufacturers, as it suggests that both manual and L2 modes yield similar levels of mental workload. Notably, this finding is not entirely inconsistent with prior meta-analyses on the same topic. For instance, de Winter et al. (2014) conducted a comprehensive meta-analysis that indicated higher mental workload during manual driving compared to automated driving. However, their analysis included only self-reported measures and encompassed both partially and highly automated systems. In our analysis, when considering subjective measures alone, mental workload appears higher (though only as a nonsignificant trend) during manual driving, consistent with de Winter’s findings. Additionally, while de Winter’s review focused primarily on simulated studies and included highly automated systems, our study is limited to L2 systems and incorporates a substantial number of on-road studies published after de Winter’s review.
Based on our results, the shift in the driver’s role from vehicle operator to system supervisor does not consistently lead to a reduction in mental workload. Previous research has already noted that monitoring automated systems can impose a higher workload due to the novelty of the system and the increased number of tasks drivers may perform during partial automation (cf. the EAST framework, Banks & Stanton, 2016, 2019). In this meta-analysis, subjective mental workload was found to be higher during manual driving compared to partially automated driving when drivers were required to perform secondary tasks (SMC = 0.578), but no difference emerged when no secondary tasks were present. This suggests that, in the absence of secondary tasks, drivers perceive similar levels of workload in both driving modes. In other words, supervising the automated system may involve a different set of tasks than manually operating the vehicle, potentially reducing perceived mental workload when secondary tasks are present, but not when they are absent.
Moderator analyses suggested that mental workload also differed depending on the type of measure used. When behavioral measures were employed, a slightly higher mental workload was found during partially automated driving (SMC = −0.132,
Overall, the meta-analytic findings seem to indicate that partially automated and manual driving likely impose similar levels of mental workload on the driver. However, they also indicate that assessing mental workload with different measures can result in different outcomes. Researchers and automobile manufacturers planning to use behavioral measures to assess mental workload are encouraged not to rely on these measures alone, but to compare them with (at least) self-reported or physiological measures (cf. Biondi, 2024).
Additional Considerations on Mental Workload
Stapel et al. (2019) examined the effect that greater experience using L2 systems has on mental workload, finding that only experienced drivers reported lower perceived workload during partially automated compared to manual driving. Similarly, two studies including only participants inexperienced with partially automated systems did not find any significant differences in self-reported workload (Biondi et al., 2023; Biondi & Jajo, 2024), while one study testing drivers inexperienced with automation found slower DRT reaction times during partially automated driving (Mcdonnell et al., 2023). This pattern seems to align with the work by Dunn et al. (2021) positing that, as drivers become more experienced using vehicle automation, this may lead to greater system complacency and a higher risk of engaging in distracting activities. Within the context of our study, we argue that this pattern could be the result of the lower mental workload experienced by expert L2 users who may seek engagement in NDRT to counter the declining workload.
Some studies tested differences in mental workload over time, reaching mixed results. Three studies found that DRT reaction times increased at a greater rate in L2 mode, arguably indicating greater workload over time (Biondi et al., 2023; Zhang et al., 2021; Zhao, Liu, et al., 2022). Similarly, Saxby et al. (2013) found higher NASA-TLX scores during manual driving after a 50-min drive but observed no differences between partially automated and manual driving during both 10- and 30-min drives, suggesting that differences in perceived mental workload may only emerge after a certain amount of time. In contrast, Mcdonnell et al. (2023) and Zhao et al. (2022) found a seemingly opposite pattern with faster DRT RT and a smaller pupil size in the latter section of the L2 automated drive, patterns that would indicate a temporal reduction in mental workload. Overall, while mental workload seems to vary over time, no firm conclusions can be drawn from these findings. Future studies should consider time when assessing the impact of automated systems on psychological factors.
Among the studies under consideration, only one (Cooper et al., 2023) adopted a naturalistic approach wherein it was up to the driver to decide whether and when to engage the L2 system. The authors compared the driver’s workload experienced during this naturalistic portion of the study with that recorded during the experimental phase of the research, that is, when drivers were instructed to operate the vehicle in either manual or L2 mode. Results showed that, while a reduction in workload was found in the L2 mode during the experimental phase, no differences between the two modes were found in the naturalistic phase. This pattern is particularly relevant as it suggests that the workload associated with partial automation might stem more from drivers being forced to use automation, rather than being a direct consequence of automation itself (Cooper et al., 2023). It is also important to note that in all remaining studies, the experimenter was present inside the vehicle (see Supplementary Table 1 for a complete overview), which might skew the generalizability of the meta-analysis results to naturalistic driving conditions (Safi et al., 2014).
Explore Differences in Visual Behavior Between Manual and Partially Automated Driving
Meta-Analysis on Visual Behavior
The meta-analytic approach revealed a moderate aggregated effect size, indicating an increase in eye glances and gazes toward nondriving-related areas and a decrease toward driving-related areas during partially automated driving compared to manual driving (SMC = −0.513,
These findings show that drivers are more inclined to direct their gaze away from the road during partially automated driving. This is consistent with the hypothesis that, during L2 driving, relinquishing control of the vehicle to the automated system may lead some drivers to boredom. In an attempt to counter the impending state of underload, drivers may then start to direct their attention away from driving and toward the surrounding nondriving environment as a way to self-regulate (cf. Biondi, 2024; Engström et al., 2013). The self-regulation hypothesis posits that, as driving demands shift away from desired levels, drivers start to regulate their behavior to shift workload back to optimal levels (cf. Dunn et al., 2021). While this occurs in conditions of increasing driving demands (for example, a driver who silences the radio or hangs up a call when negotiating a challenging maneuver), it is also frequent in situations of lower workload (for example, a driver who starts fidgeting or picks up their smartphone when waiting at a red light).
These behaviors are particularly perilous as they may detract from the driver’s ability to promptly respond to road hazards. It is plausible that, as visual attention is directed away from the road, it becomes more challenging for drivers to properly maintain awareness of the surrounding traffic conditions and react to emerging threats (He et al., 2022; Gaspar & Carney, 2019; see also Merat et al., 2019 for a similar conclusion based on the “Out of the loop” algorithm). With this said, we recognize that characteristics such as the duration and frequency of off-road glances should also be accounted for when informing on their distraction potential. In the
Interestingly, our analyses revealed a higher tendency to glance toward nondriving-relevant areas in more male-dominated cohorts, that is, in samples with lower percentages of female drivers. This is notable because this is among the first studies evidencing sex-related differences during partially automated driving. While it is established that males, especially younger ones, are more inclined to risk taking when driving (Fillmore et al., 2008; Hasanat-E-Rabbi et al., 2021), little evidence was available on whether this would translate to operating partially automated driving.
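To illustrate how a moderator such as the percentage of female participants can be related to study-level effect sizes, the following minimal Python sketch fits an inverse-variance weighted regression of effect size on a single moderator. It is not the analysis code used in this work: the function name `weighted_meta_regression`, the fixed-effect weighting, and all numbers in the example are assumptions introduced purely for illustration. The sketch also assumes the sign convention that more negative effect sizes indicate more off-road glancing during partially automated driving.

```python
from typing import List, Tuple


def weighted_meta_regression(
    effects: List[float],
    variances: List[float],
    moderator: List[float],
) -> Tuple[float, float]:
    """Inverse-variance weighted least squares of effect size on one moderator.

    Returns (intercept, slope). Illustrative fixed-effect sketch only.
    """
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / v for v in variances]
    w_sum = sum(weights)

    # Weighted means of the moderator and of the effect sizes.
    x_bar = sum(w * x for w, x in zip(weights, moderator)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, effects)) / w_sum

    # Weighted sums of squares and cross-products.
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, moderator))
    sxy = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, moderator, effects)
    )

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data (NOT taken from the included studies):
# four studies with 20%, 40%, 60%, and 80% female participants.
hypothetical_effects = [-0.8, -0.6, -0.4, -0.2]
hypothetical_variances = [0.04, 0.04, 0.04, 0.04]
hypothetical_pct_female = [20.0, 40.0, 60.0, 80.0]

intercept, slope = weighted_meta_regression(
    hypothetical_effects, hypothetical_variances, hypothetical_pct_female
)
print(f"intercept = {intercept:.3f}, slope = {slope:.4f}")
```

In this made-up example, a positive slope means that the effect moves toward zero as the share of female participants increases, which is the direction of the pattern described above. A full analysis would typically also estimate between-study heterogeneity and the uncertainty of the slope.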
Additional Considerations on Visual Behavior
Only one study presented findings that could not be included in the meta-analysis. Miller and Boyle (2019) found that off-road glances tend to increase over a 40-min period during partially automated driving, but not during manual driving. This is an interesting result that suggests that drivers might become distracted more quickly during partial automation. However, no claims can be made based on a single study.
Explore Differences in NDRT Engagement Between Manual and Partially Automated Driving
Meta-Analysis on NDRT Engagement
The meta-analytic approach revealed a small effect size, indicating increased NDRT engagement during partially automated driving compared to manual driving (SMC = −0.281,
Additional Considerations on NDRT
Based on the studies reviewed, drivers with more experience with partially automated systems appear to engage in NDRT more often during partial automation. Naujoks et al. (2016) and Solís-Marcos et al. (2018) explored the influence of experience with partially automated systems on secondary task engagement, finding that only experienced drivers engaged more in NDRT during partially automated driving compared to manual driving. Although only two studies among those we selected tested this, it is noteworthy that their findings align with Dunn and colleagues’ framework (Dunn et al., 2021), suggesting that the increased engagement in nondriving tasks can be due to experienced drivers over-trusting automated systems. This could be concerning, as increased NDRT engagement may hinder the drivers’ ability to anticipate hazards (cf. Hungund & Kumar Pradhan, 2023). Finally, only one study (Cooper et al., 2023) compared a naturalistic condition (in which drivers could choose whether or not to activate partial automation) with conditions where drivers were instructed to operate the vehicle in either manual or L2 mode, finding lower engagement in NDRT during naturalistic driving. While these findings come from a single study, they suggest that more realistic driving conditions should be considered when investigating partial automation.
Conclusions
This work reports the findings of meta-analyses evaluating differences in mental workload, visual behavior, and NDRT engagement between partially automated and manual driving. Below, we summarize the main findings and outline some final conclusions along with potential limitations.
Our data show partially automated driving to increase the likelihood of drivers looking away from the forward roadway and engaging in NDRT. In contrast, no significant differences were observed in mental workload between the two driving modes. Combined, these findings align with the literature on self-regulation of driving behavior, which suggests that drivers usually modulate their behavior in an attempt to avoid conditions of either high or low workload (Moore & Brown, 2019; Oviedo-Trespalacios et al., 2017; Oviedo-Trespalacios et al., 2018; Strayer et al., 2017). We argue that the tendency to execute more off-road glances and engage in potentially distracting activities during L2 driving may be the result of drivers trying to counter the onset of boredom resulting from supervising the L2 system, a hypothesis that would find alignment in the literature on vigilance decrement (Molloy & Parasuraman, 1996).
We believe that interventions aimed at correcting drivers’ visual behavior and reducing NDRT engagement should be prioritized when developing partially automated systems. Following Hungund and Kumar Pradhan (2023), providing drivers with training could be beneficial and could reduce the risk of distractions, as untrained drivers might not fully understand the limitations of automated systems. Developing guidelines on the importance of maintaining attention to driving-relevant elements when using automated systems might also reduce disengagement from driving and, consequently, improve safety. Similarly, we believe consideration should also be given to the design of partially automated systems. In a comprehensive review, Wang et al. (2024) provide guidelines for interface design in driving automation systems, suggesting that auditory alerts are preferable to visual ones. This aligns with the concerns outlined in our meta-analysis, as visual alerts could divert gaze from the road, potentially encouraging risky visual behavior (cf. NHTSA, 2013). Another interesting finding that emerges from this work is the importance of previous experience with automation. On the one hand, drivers inexperienced with automation tend to report higher workload, likely due to a lack of trust in these systems. On the other hand, drivers experienced with automation are more likely to engage in secondary tasks while driving, possibly due to over-trust in these systems (cf. Dunn et al., 2021). We believe that developing training programs that calibrate trust in automation might reduce potential risks associated with workload and distractions. Future research should leverage these findings to create appropriate training programs for drivers interested in partially automated systems.
Our review also presents some limitations. First, the study selection process only included studies published in peer-reviewed journals or conferences, thus excluding works from non-peer-reviewed outlets. This decision aimed to improve the quality of the analyzed studies, as non-peer-reviewed papers may contain unreliable findings. Second, some findings were omitted due to unclear reporting of results and/or statistics (see Figure 1). This exclusion may have impacted the meta-analysis results, as some data were excluded solely based on reporting issues. To address this, a publication bias assessment was conducted to estimate the likelihood of omitted nonsignificant findings. Third, although a meta-analysis on NDRT engagement was conducted, only a few studies examined this in naturalistic on-road settings, making the conclusions of this analysis less robust compared to those on mental workload and visual behavior. Fourth, only half of the studies included in this work used realistic L2 systems (i.e., on-road studies), while the rest employed driving simulators. Although moderator analysis on mental workload and visual behavior revealed no differences between on-road and simulated studies, it is important to note that our conclusions are also based on studies using systems that may not fully resemble real-world L2 systems. As a result, our findings may not generalize to all types of L2 systems. Moreover, our work does not focus on differences in L2 system designs, as this review is focused specifically on the differences between manual and partially automated systems. However, when available, information on system design is provided in the Supplementary Material. Finally, only five databases were used during the study selection. However, it is important to note that, in line with the guidelines of Paré et al. (2015), this was done to improve replicability and rigor.
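As a concrete illustration of what a publication bias assessment can involve, the minimal Python sketch below implements Egger's regression test, one widely used approach to detecting funnel-plot asymmetry. The text above does not specify which method was used, so this choice is an assumption, as are the function name `egger_regression` and the example values, which are hypothetical and not drawn from the included studies.

```python
from typing import List, Tuple


def egger_regression(
    effects: List[float],
    std_errors: List[float],
) -> Tuple[float, float]:
    """Egger's regression: standardized effect regressed on precision.

    Returns (intercept, slope). An intercept far from zero suggests
    funnel-plot asymmetry, which may reflect small-study effects or
    publication bias. Illustrative sketch only (no significance test).
    """
    # Precision is the inverse of the standard error.
    precision = [1.0 / se for se in std_errors]
    # Standardized effect is the effect divided by its standard error.
    standardized = [y / se for y, se in zip(effects, std_errors)]

    n = len(effects)
    x_bar = sum(precision) / n
    y_bar = sum(standardized) / n

    # Ordinary least squares for a single predictor.
    sxx = sum((x - x_bar) ** 2 for x in precision)
    sxy = sum(
        (x - x_bar) * (y - y_bar) for x, y in zip(precision, standardized)
    )

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical example: every study reports the same effect regardless of
# its precision, so the funnel is symmetric by construction.
hypothetical_effects = [-0.5, -0.5, -0.5, -0.5]
hypothetical_std_errors = [0.1, 0.2, 0.25, 0.5]

intercept, slope = egger_regression(hypothetical_effects, hypothetical_std_errors)
print(f"Egger intercept = {intercept:.3f}, slope = {slope:.3f}")
```

In this constructed example the intercept is approximately zero, as expected when effect sizes do not depend on study precision, and the slope recovers the common effect. In practice, the intercept would be tested against zero with its standard error, and the test has limited power when few studies are available.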
Our findings suggest that while partial automation seems to impose a similar mental workload as manual driving, it may lead to increased visual behaviors toward nondriving relevant elements and slightly greater engagement in secondary tasks. This behavior is concerning as it may impair drivers’ ability to react quickly to sudden hazards. Future research on partially automated systems should focus on interventions aimed at improving visual behavior and reducing engagement in secondary tasks.
Key Points
• No significant differences in mental workload between manual and partially automated driving.
• Increased visual behavior toward nondriving-related areas during partial automation.
• Greater engagement in nondriving-related tasks during partial automation.
• Previous experience with automation can affect the safety of partially automated systems.
Supplemental Material
Supplemental Material for Effect of Partially Automated Driving on Mental Workload, Visual Behavior and Engagement in Nondriving Related Tasks: A Meta-Analysis by Nicola Vasta and Francesco Biondi in Human Factors: The Journal of the Human Factors and Ergonomics Society
Footnotes
Acknowledgments
The authors acknowledge the generous contribution from the University of Windsor Research Chair program. They also thank the Natural Science and Engineering Research Council and the Social Science and Humanities Research Council of Canada for their support.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
Author Biographies
References
