Abstract
Objective
We investigated a driver monitoring system (DMS) designed to adaptively back up distracted drivers with automated driving.
Background
Humans are likely inadequate for supervising today’s on-road driving automation. Conversely, backup concepts can use eye-tracker DMS to retain the human as the primary driver and use computerized control only if needed. A distraction DMS where perceived false alarms are minimized and the status of the backup is unannounced might reduce problems of distrust and overreliance, respectively. Experimental research is needed to assess the viability of such designs.
Methods
In a driving simulator, 91 participants either supervised driving automation (auto-hand-on-wheel vs. auto-hands-off-wheel), drove with different forms of DMS-induced backup control (eyes-only-backup vs. eyes-plus-context-backup; visible-backup vs. invisible-backup), or drove without any automation. All participants performed a visual N-back task throughout.
Results
Supervised driving automation increased visual distraction and hazard non-responses compared to backup and conventional driving. Auto-hand-on-wheel improved response generation compared to auto-hands-off-wheel. Across entire driving trials, the backup improved lateral performance compared to conventional driving. Without negatively impacting safety, the eyes-plus-context-backup DMS reduced unnecessary automated control compared to the eyes-only-backup DMS conditions. Eyes-only-backup produced low satisfaction ratings, whereas eyes-plus-context-backup satisfaction was on par with automated driving. There were no appreciable negative consequences attributable to the invisible-backup driving automation.
Conclusions
We have demonstrated preliminary feasibility of DMS designs that incorporate driving context information for distraction assessment and suppress their status indication.
Application
An appropriately designed DMS can enable benefits for automated driving as a backup.
Introduction
Monitoring Problems
Requirements to monitor the simultaneous lateral and longitudinal control of SAE Level 2 (SAE, 2018) automated vehicles (AVs) will likely result in inadequate supervision. Problems include mismatched driver expectations as well as human vigilance performance limitations. Capabilities of today’s on-road AVs are commonly overestimated (Euro NCAP, 2018; Rader et al., 2019). Furthermore, instructions to monitor the technology of AVs are inconsistent with observed driver behaviors (Carsten et al., 2012; Jamson et al., 2013; Large et al., 2017) and preference (Bertoncello & Wee, 2015; Cyganski et al., 2014). Even if people wanted to supervise driving automation, many decades of human factors research, from Mackworth (1950) to Hancock (2017), suggest risks when humans are tasked to monitor automated processes over extended periods of mostly successful operation. Such risks have been substantiated by recent reviews specific to the driving domain (e.g., Cabrall, Happee, et al., 2016; Gonçalves et al., 2017) and demonstrated in a recent driving simulator experiment (Greenlee et al., 2018).
Monitoring Solutions
In conventional vehicles, driver monitoring systems (DMSs) typically trigger warnings after detecting steering and pedal inputs indicative of degraded driver states. Beyond warnings, DMS could also turn on/off automated driving support. Presently in Level 2 AVs, DMSs are used to warn and/or lock out drivers from automated driving modes if deemed aberrant based on steering wheel sensors and/or driver-facing cameras.
Hand placement. Many Level 2 AVs use hand-on-wheel as their de facto DMS (Audi, BMW, Mercedes, Tesla, and Volvo) to reduce human backup movement times. It is possible that physical-cognitive “motor–memory” couplings (Butefisch et al., 2000; Classen et al., 1998, 1999; Stefan et al., 2008) could underlie response generation and so offer a benefit beyond faster reaction times alone. For example, Walton and Thomas (2005) have associated hand-on-wheel with risk perception rather than only fatigue or personal style preferences. Increased evidence for perceptual and memory benefits in driver hand placement could lead to increased standardization and compliance in supervising automated driving.
Adaptive backup. Because attentive driving is probably more common than distracted driving overall, a more rational driving safety technology step would be to use a paradigm of targeted human backup rather than gross human replacement. Petermeijer et al. (2015) found benefits of event-driven adaptive steering support (bandwidth feedback) to avoid negative complacency aftereffects with continuous steering support. Furthermore, a driving simulator experiment of Cabrall et al. (2018) showed benefits of adaptive backup but also discussed challenges of distrust and misuse of automated driving via overreliance (see also Martens & Jenssen, 2012). DMS-triggered driving automation where perceived false alarms are minimized and the status of the backup is left unannounced may mitigate such problems, but experimental research is required to assess the viability and efficacy of these design approaches.
Context-based assessments of distraction. Although false alarms can be minimized from a technological perspective (i.e., more accurate eye-tracking), perceived false alarms can still degrade trust (i.e., “cry-wolf” effect). Instead of establishing only fixed timing requirements for glance durations, it should also be valuable to pursue situation-dependent assessments of distraction such as involving the functional aspects of a driving context, like traffic density, road curvature, and so on. It has been previously suggested that distraction risks could reflect a joint function of neglected roadway and demand of roadway (Liang et al., 2014; Regan et al., 2009). This approach is not yet common in the automotive AV market but is not without precedent as Toyota has reportedly been working on a “Guardian Angel” concept (see Hof, 2016; Simonite, 2017). Thus, beyond interrogating if a driver is looking away from the road, a DMS might ask if the driver is looking away too much given the present circumstances (cf. minimum required situation awareness, Kircher & Ahlström, 2017). Lateral and longitudinal controls form the inner core of hierarchical driving models (e.g., Merat et al., 2019; Michon, 1978, 1985) and so seem a reasonable level to implement context-based assessments of distraction.
Invisible-status backup. If people believe the system will back them up, they may allow themselves to become distracted more often (i.e., misuse through overreliance). While appropriate feedback has been a mainstay constituent of good human factors design (e.g., Norman, 1990) and for advanced driver assistance systems (Seppelt & Lee, 2007), it does not necessarily imply that feedback is needed for all things at all times. An avenue for reducing operator overreliance on Level 2 driving automation might be to avoid indication of its existence/activation. Jaguar Land Rover’s head of safety, Phil Glyn-Davies, has proposed “the best active safety system is one where you’re not even aware of its presence” (Bird, 2018). Reasonably, it is harder to misuse something (e.g., Parasuraman & Riley, 1997) that you do not know is there. Furthermore, additional DMS information during a period of distraction may increase workload and unwanted visual behaviors, especially if the HMI is confusing or unwanted.
Aim
The present paper investigates the initial viability of various DMS designs to improve upon dangerous distraction problems that may occur when humans supervise an automated driving system. We assume dangerous distraction levels when visual control is no longer present enough to protect against lane excursions or obstacle collisions.
Methods
Research Question Operationalization: Concepts, Conditions, Comparisons
We aligned multiple dimensions of related concepts into simplified parameters (Table 1) and constructed seven experimental conditions (Table 2) to address five different research questions (RQ) through comparisons of different data sets (Table 3).
Overview of Investigated Concepts
Experimental Conditions and Participant Demographics
Note. Driving frequency response scale: 1 = every day, 2 = 4–6 days a week, 3 = 1–3 days a week, 4 = once a week to once a month, 5 = less than once a month, 6 = never.
aNumber of participants who did not provide a response in parentheses.
Research Questions and Planned Comparisons
Note. See Table 2 for condition descriptions.
Participants
This research complied with the American Psychological Association Code of Ethics and was approved by the Human Research Ethics Committee of the TU Delft. Informed consent was obtained from each participant. Ninety-one university students participated in our experiment. The majority (73%) reported driving frequencies between a weekly and monthly basis. Participants on average had a driving license for 4.48 years (SD = 2.70). Participants were randomly assigned to one of the seven conditions (Table 2) with two analysis exclusions: one E&C-vis-BU participant because of faulty instructions and one Conv participant because of excessive difficulty driving in the simulator. Because no effect of gender was directly apparent on any measure when collapsing across experimental conditions, the randomly occurring gender imbalances within each condition were not considered further as a source of bias.
Apparatus
A low-fidelity desktop driving simulator with DMS (Figure 1) was configured using TASS International PreScan software and a Logitech G27 USB gaming wheel and pedal set. A NEC MultiSync EA 243WM monitor with a viewable image of 52 cm × 33 cm (1,920 × 1,200 pixels) was placed at an eye distance of approximately 65 cm. SmartEye DR120 eye-tracking cameras were concealed beneath the simulation display. A nondriving-related task (NDRT) (described in Section “NDRT—visual N-back”) was presented on a separate laptop.

Arrangement of driving simulation and nondriving-related task (NDRT).
Simulated Technology—Automated Driving, DMS, Backup Control
We simulated SAE Level 2 driving automation with lane-centering and a set speed of 70 km/hr while adjusting as needed for lead-vehicle spacing. Participants in the two automated driving conditions were instructed to monitor and correct the automated driving for any dangers/errors. Eyes were tracked across the full driving trial in all conditions, but only in adaptive backup control conditions was it used as a DMS for assessing real-time distraction and attention to turn the driving automation on/off.
The DMS logic was adopted from Cabrall, Janssen, et al. (2016) and Cabrall et al. (2018) with glance thresholds based on pilot studies and in approximate agreement with the literature (Glaser et al., 2016; Liang et al., 2014; NHTSA, 2013; Ryu et al., 2013; Samuel & Fisher, 2015). Cabrall et al. (2018) used thresholds of 1.5 s for classifying off-road “distracted” and 4.5 s for on-road “attentive” states. In pilot tests of the present study, a shorter on-road threshold of 2.0 s was deemed more practical for visually commanding auto-to-manual transitions of control. A technical glitch transpired, however, such that the DMS actually used off-road/on-road thresholds of 3.0 and 4.0 s, respectively. The glitch, and the reasons the results remain useful in spite of it, are discussed further in the Section “Limitations.”
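The two-threshold classification described above can be illustrated with a minimal sketch. It assumes that a state change requires an uninterrupted glance run of at least the threshold duration and that the driver starts out "attentive"; neither detail is specified here, and the function name, parameters, and sampling scheme are hypothetical rather than the implementation used in the experiment.

```python
from typing import List


def classify_distraction(on_road: List[bool], dt_s: float,
                         off_threshold_s: float = 1.5,
                         on_threshold_s: float = 4.5) -> List[bool]:
    """Return a per-sample 'distracted' flag from per-sample gaze-on-road flags.

    Assumption: a state flips only after an unbroken run of samples that
    contradicts the current state lasts at least the relevant threshold.
    """
    # Convert the time thresholds into whole numbers of samples.
    off_n = max(1, round(off_threshold_s / dt_s))
    on_n = max(1, round(on_threshold_s / dt_s))

    distracted = False   # assumed initial state: attentive
    run = 0              # length of the current contradicting run, in samples
    states = []
    for looking_at_road in on_road:
        # Off-road gaze contradicts "attentive"; on-road gaze contradicts "distracted".
        contradicts = looking_at_road if distracted else not looking_at_road
        run = run + 1 if contradicts else 0
        if run >= (on_n if distracted else off_n):
            distracted = not distracted
            run = 0
        states.append(distracted)
    return states
```

Calling the function with `off_threshold_s=3.0` and `on_threshold_s=4.0` would correspond to the thresholds that were in effect because of the glitch.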
Four types of adaptive driving automation backup were evaluated as a cross of two different dimensions (Table 1).
For eyes-only-backup, detections of driver visual distraction directly activated the driving automation (i.e., lateral control via steering the vehicle to the center of the right lane together with longitudinal control by gradually slowing down). For eyes-plus-context-backup, the DMS integration required both the detection of distraction and the simultaneous presence of a course/collision conflict to activate the backup control. The conventional driving control mode (human operation of steering wheel, throttle, and brake) was reactivated once distraction or driving conflicts were no longer detected. For visible-backup, an automated driving status appeared on the right side of a virtual dashboard in a white font that said either “Normal Driving” (green background) or “Auto Backup Control” (red background). For invisible-backup, the status was not shown, and participants were led to believe that they were driving in a conventional mode.
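The activation rules for the four backup variants can be summarized in a short sketch. The function names and the string labels for the modes are hypothetical; only the two on-screen status texts are taken from the description above, and the sketch ignores any transition dynamics of the real simulation.

```python
from typing import Optional


def backup_active(distracted: bool, conflict: bool, mode: str) -> bool:
    """Decide whether backup automation is engaged for the current sample."""
    if mode == "eyes-only":
        # Visual distraction alone activates the automation.
        return distracted
    if mode == "eyes-plus-context":
        # Distraction must coincide with a course/collision conflict.
        return distracted and conflict
    raise ValueError(f"unknown mode: {mode!r}")


def status_text(active: bool, visible: bool) -> Optional[str]:
    """Dashboard text for visible-backup; None when the status is hidden."""
    if not visible:
        return None
    return "Auto Backup Control" if active else "Normal Driving"
```

Under these rules, a distracted driver on a conflict-free stretch of road keeps conventional control with eyes-plus-context-backup but not with eyes-only-backup.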
Driving course/collision conflicts. During eyes-plus-context-backup, course/collision conflicts were defined in terms of look-ahead predictions of impending course excursions (i.e., road departure) and/or collisions (i.e., with another road user). The simulated lateral and longitudinal radars each interrogated a fixed distance ahead of the vehicle (approximately 20 and 100 m, respectively) to determine a binary state of course/collision conflict. With a traveling speed of 70 km/hr, the look-ahead positioning of these radars represented time budgets of approximately 1 and 5 s for course and collision conflicts, respectively. The present conflict predictions were not yet capable of dynamically adjusting their ranges based on actual driven speed fluctuations. With only a fixed look-ahead distance, actual speeds slower/faster than 70 km/hr, respectively, increased/decreased the time budgets and diminished/inflated the frequency of alerting and thus also the potential for backup automated driving.
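The relation between the fixed look-ahead distances and the resulting time budgets is simply distance divided by speed. The following sketch (with a hypothetical helper name) reproduces the approximate 1 s and 5 s budgets at 70 km/hr and shows how the budget shifts at other speeds.

```python
def time_budget_s(look_ahead_m: float, speed_kmh: float) -> float:
    """Time until the vehicle reaches a point look_ahead_m ahead at constant speed."""
    if speed_kmh <= 0:
        raise ValueError("speed must be positive")
    speed_ms = speed_kmh / 3.6  # km/hr to m/s
    return look_ahead_m / speed_ms


for speed in (50, 70, 90):
    lateral = time_budget_s(20, speed)        # course-conflict look-ahead
    longitudinal = time_budget_s(100, speed)  # collision-conflict look-ahead
    print(f"{speed} km/hr: course {lateral:.2f} s, collision {longitudinal:.2f} s")
```

At 70 km/hr the budgets come out at roughly 1.03 s and 5.14 s; slower speeds lengthen them and faster speeds shorten them, as noted above.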
NDRT—Visual N-Back
To induce cognitive loads competing with the driving task, participants performed a visual N-back NDRT inspired by Mehler et al. (2011) on a laptop placed peripherally (Figure 1). A visual format (Figure 2) was designed and is available online (Cabrall, 2017). Participants responded using a computer mouse. Pilot tests suggested a level explained as “1 back” (i.e., repeat the last seen target) was sufficient for the desired competing loads on attentional management between our simulated driving and the NDRT (i.e., easy to do either separately, difficult to do both well simultaneously). A prerandomized schedule presented target values from 1 to 9 with interstimulus intervals between 2 and 5 s (at a half-second resolution) where the target was presented during the first half of the interval and the subsequent answer probe (i.e., “?????”) in the second half. During the allotted interval, correct responses produced a positive chime and incorrect responses produced a negative beep. By the end of the allotted interval, if a correct response had been made, participants’ scores increased by +1 point; if a correct response was not recorded, participants’ scores decreased by −1 point. Participants were not told whether to prioritize safe driving or the NDRT, but were asked to do both together as well as they could for the entire driving trial.
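The schedule and scoring rules can be sketched as follows. This is an illustration only and not the published task code (Cabrall, 2017): the function names, the seed handling, and the representation of responses as a list of attempts per interval are assumptions, as is the reading that any correct attempt within the interval earns the point.

```python
import random
from typing import Dict, List

# Interstimulus intervals from 2 to 5 s at half-second resolution.
ISI_CHOICES_S = [2.0 + 0.5 * k for k in range(7)]


def make_schedule(n_trials: int, seed: int = 0) -> List[Dict[str, float]]:
    """Prerandomized schedule: target digit 1-9 plus an interval split in two halves."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_trials):
        isi = rng.choice(ISI_CHOICES_S)
        schedule.append({
            "target": rng.randint(1, 9),
            "isi_s": isi,
            "target_shown_s": isi / 2,  # first half: target visible
            "probe_shown_s": isi / 2,   # second half: answer probe visible
        })
    return schedule


def score_trials(targets: List[int], attempts: List[List[int]]) -> int:
    """+1 if the last seen target was entered during the interval, otherwise -1."""
    score = 0
    for target, tried in zip(targets, attempts):
        score += 1 if target in tried else -1
    return score
```

For example, three intervals with targets 3, 7, and 1 and attempts `[[3], [], [2, 1]]` would yield a score of +1 under these assumptions.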

A modified N-back task was used as an NDRT presented via a graphical user interface (GUI). NDRT = nondriving-related task.
Driving Trials—Route, Timing, Hazards
All experimental drives lasted about 2 min 45 s. The route featured straight and curved road segments. In each drive, two surprise stationary hazards were presented. The automation was programmed to drive through these as simulated unannounced detection errors. The objects (Figure 3) appeared after about 1 and 2 min, with response time budgets of approximately 5 and 2 s, respectively. All of the aforementioned time descriptions are drawn from the automated driving conditions, in which the speed was computer controlled; otherwise, speed variations affected the timing of route progress. The simulation had to be manually terminated because the driving automation implementation would not function beyond its scripted nominal trajectory (i.e., positions were tied to timestamps). A data measurement cut-off point was established at 147.75 s (8,865th frame at 60 Hz) for all seven conditions as this was the earliest point the simulation was manually terminated by the experimenter (i.e., participant 61 in E&C-inv-BU).

Stationary obstacles in the driving simulation, appearing first as a fallen tree (a) after around 1 min of driving and second as a stalled motorcycle (b) after around 2 min of driving.
Driving Trials—Before, After
Participants were given around 3 min to separately practice driving in the simulator and the NDRT ahead of the experiment. Afterward, participants provided subjective ratings for success, effort, and acceptance across aspects of safety, efficiency, the N-back task, and the automation (Figure 4).

On-screen post-trial subjective questionnaire.
Measures
Measures taken at the discrete hazard events. In the automated driving conditions, plots of steering and brake inputs were manually inspected for conventional driving activity (e.g., nonconstant values) within the period between obstacle appearance and contact. During the backup conditions, the experimenter subjectively noted participant awareness of the obstacle. After the experiment, the objective status of automation (on/off) and participant eye position (on/off screen) were referenced from the computer-generated data logs at the first point of any contact.
Measures taken across the full trial. Visual distraction was measured as the percentage of time the DMS classified a binary state of visual distraction. NDRT performance was taken as the final score divided by the number of shown targets, expressed as a percentage. Automated driving status was measured as the percentage of time the vehicle was under automated control. Lateral performance was assessed as off-road time, defined as the percentage of time where the front left or right corner of the car was positioned above the grass area alongside the roadway. Longitudinal progress was calculated in meters traveled along the driving route. Perceptions of success and effort were each probed for aspects of safety, travel efficiency (time/speed), and the N-back task at the end of each driving trial along with satisfaction with the automation if applicable (Figure 4).
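As an illustration of how the percentage-based measures could be derived from per-frame logs, consider the sketch below. The log layout (one Boolean per 60 Hz frame), the helper names, and the reading of NDRT performance as final score over shown targets are assumptions; only the 8,865-frame cut-off comes from the trial description.

```python
from typing import List, Sequence

FRAME_RATE_HZ = 60
CUTOFF_FRAMES = 8865  # common measurement cut-off, 147.75 s at 60 Hz


def percent_true(flags: Sequence[bool], cutoff: int = CUTOFF_FRAMES) -> float:
    """Percentage of frames, up to the cut-off, in which the flag is set."""
    window = flags[:cutoff]
    if not window:
        raise ValueError("no frames to analyze")
    return 100.0 * sum(window) / len(window)


def off_road_flags(left_on_grass: Sequence[bool],
                   right_on_grass: Sequence[bool]) -> List[bool]:
    """A frame counts as off-road if either front corner is above the grass."""
    return [l or r for l, r in zip(left_on_grass, right_on_grass)]


def ndrt_performance(final_score: int, n_targets: int) -> float:
    """Final N-back score relative to the number of shown targets, in percent."""
    return 100.0 * final_score / n_targets
```

The same `percent_true` helper would apply to the distraction classification and to the automation on/off status, since each is a binary per-frame state.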
Results
Our results and discussion are organized per research question (Table 3). For compactness and ease of reference, all measurement data and analyses are grouped at the end of this section in tables and figures. Data summaries of all hazard responses for each condition can be found in Tables 4 and 5, and for all other measures in Figure 5 (objective) and Figure 6 (subjective). Outcomes of the statistical analyses across each research question are given in:
Table 6 for one-way ANOVA comparisons between auto-hands-off-wheel, auto-hand-on-wheel, and Conv
Table 7 for Welch’s tests to compare all backup driving (E&C-vis-BU, EO-vis-BU, E&C-inv-BU, EO-inv-BU) against both automated driving conditions combined (Auto-hnd-off, Auto-hnd-on)
Table 8 for two-way ANOVA comparisons between the different kinds of backup design: assessment criteria (eyes-only-backup vs. eyes-plus-context-backup) and interface display (visible-backup vs. invisible-backup).
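For readers unfamiliar with Welch's test, the statistic and its approximate degrees of freedom can be computed as in the following sketch. It is not the analysis script used for this study, the sample values in the usage note are arbitrary illustrative numbers rather than experimental data, and the p-value lookup (which requires a t-distribution) is deliberately left out.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Welch's unequal-variance t statistic and Welch-Satterthwaite df."""
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Bonferroni-corrected threshold for three pairwise post hoc comparisons.
BONFERRONI_ALPHA = 0.05 / 3
```

With the arbitrary inputs `[1, 2, 3, 4, 5]` and `[2, 4, 6, 8, 10]`, the function returns a t of about −1.90 with about 5.88 degrees of freedom; the resulting p-value would then be compared against the applicable alpha level.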
Overview of Responses Made to Hazards During Supervised Driving Automation Conditions
Note. Nonresponse events were presently ambiguous in all experimental conditions containing some level of conventional control inputs (i.e., backup and conventional control) due to inability to isolate steering and/or pedal inputs specifically intended for hazard avoidance. See Table 2 for condition descriptions.
Overview of Collisions and Circumstances with Hazards in Backup and Conventional Driving Conditions
Note. “Not trying to avoid” was determined via experimenter notes from subjective observation. See Table 2 for condition descriptions.
One-Way ANOVA Statistics and M (SD) for Comparing Conditions of Auto-Hnd-Off, Auto-Hnd-On, and Conv
Note. NDRT = nondriving-related task. *p < .05 for the ANOVAs or p < .05/3 (Bonferroni correction) for the post hoc Welch’s t-tests. NDRT scores were lost for one Auto-hnd-off and one Auto-hnd-on participant. See Table 2 for condition descriptions.
Welch’s Test Statistics and M (SD) for Comparing Backup Automation (E&C-Vis-BU, EO-Vis-BU, E&C-Inv-BU, EO-Inv-BU) and Automated Driving (Auto-Hnd-Off, Auto-Hnd-On)
Note. NDRT = nondriving-related task. See Table 2 for condition descriptions. *p < .05.
aFor off-road time, Backup was compared to Conventional driving instead of Automated driving.
bFor automation satisfaction, comparisons are also included for Auto-hnd-off vs. Auto-hnd-on, and E&C-vis-BU vs. EO-vis-BU.
Two-Way ANOVA Statistics and M (SD) for Comparing Conditions of Eyes-Plus-Context-Backup with Eyes-Only-Backup and for Comparing Invisible-Backup with Visible-Backup
Note. NDRT = nondriving-related task. *p < .05. Main effects are reported for each factor; no significant interaction effects were observed. An NDRT score was lost for one EO-vis-BU participant.

Objective results with means (“x”), medians (“—”), quartiles, and individual data points (“○”) per condition for the measures of (a) classified visual distraction, (b) N-back NDRT performance, (c) off-road time, (d) route progress, and (e) amount of automated driving. The numbers next to the boxplot represent the mean values. NDRT = nondriving-related task.

Subjective results with means (“x”), medians (“—”), quartiles, and individual data points (“○”) per condition for the measures of (a) safety success, (b) safety effort, (c) travel time/speed success, (d) travel time/speed effort, (e) NDRT success, (f) NDRT effort, and (g) automation satisfaction. The numbers next to the boxplot indicate the mean values. Positive or negative interpretations per higher or lower values differ per subfigure. NDRT = nondriving-related task.
RQ1—“Are Drivers Susceptible to Dangerous Levels of Distraction with SAE Level 2?”
In auto-hands-off-wheel, 10 out of 13 participants (77%) did not make any response to the first obstacle, and 2 out of 13 participants (15%) made no response to the second obstacle (Table 4). No collisions occurred in Conv for either the first or the second obstacle (Table 5). Objectively, participants exhibited significantly higher levels of visual distraction and improved NDRT scores in auto-hands-off-wheel than in Conv (Figure 5 and Table 6). Perceived success on the NDRT was not significantly higher in auto-hands-off-wheel than in Conv (Figure 6 and Table 6) and perceived effort spent on travel time/speed was not significantly lower in auto-hands-off-wheel than in Conv (Figure 6 and Table 6).
RQ2—“Does Having a Hand Placed on the Wheel Improve Driver Supervision of Automation?”
For the first hazard, there were 10 nonresponses in auto-hands-off-wheel compared to 2 nonresponses in auto-hand-on-wheel (Table 4). Nonresponses to the second hazard were equally frequent (two nonresponses each) between these conditions (Table 4). After Bonferroni correction, no significant differences were obtained between auto-hand-on-wheel and auto-hands-off-wheel for the objective measures of visual distraction and NDRT performance (Figure 5 and Table 6) nor for the subjective measures of effort spent on travel time/speed, success with the NDRT (Figure 6 and Table 6), and satisfaction with the automation (Figure 6 and Table 7).
RQ3—“Is Backup Control a Safe and Acceptable Alternative to Supervised Automated Driving?”
Visual distraction and NDRT performance scores were significantly lower in the combined set of adaptive backup driving control conditions (E&C-vis-BU, EO-vis-BU, E&C-inv-BU, EO-inv-BU) in comparison to the two continuous supervised automated driving conditions taken together (Auto-hnd-off, Auto-hnd-on) (Figure 5 and Table 7). Off-road time was also significantly lower in backup control compared to Conv (Figure 5 and Table 7). Participants in the backup conditions reported significantly lower NDRT effort compared to participants in the automated driving conditions (Figure 6 and Table 7). Backup automation reduced both the actual and perceived NDRT performance as compared to automated driving (Table 7). Significant differences were not found between backup and automated driving in regard to perceived safety effort and perceived safety success (Figure 6 and Table 7) or satisfaction with the automation (Figure 6 and Table 7). For perceived travel time/speed, participants in the backup conditions reported significantly higher effort and significantly lower success than participants in supervised automated driving (Figure 6 and Table 6).
Compared to the rate of nonresponse errors to hazards in the automated driving conditions (16 of 52 possible, 31%) (Table 4), a lower rate was observed of participants not noticing or not trying to respond to the hazards in backup conditions (3 of 102 possible, 3%) (Table 5). Five hazard collisions occurred in backup control with unobserved participant awareness; 37 other hazard collisions occurred in the backup conditions but with explainable artifacts rather than being attributable to issues of complacency: 19 when the participant was observed to be actively trying to avoid the hazard (i.e., unsuccessful in regaining control from the automation) and 18 due to unintended system integration malfunctions (i.e., conventional driving allowed while being classified as distracted, or automated control retention while being classified as nondistracted).
RQ4—“Can Context-Based Criteria Safely Reduce Driver State Monitoring from Over-Triggering?”
In the eyes and context conditions (E&C-vis-BU, E&C-inv-BU), backup control activated significantly less than in the eyes-only conditions (EO-vis-BU, EO-inv-BU) (Figure 5 and Table 8). Consequently, longitudinal progress was significantly higher in eyes-plus-context-backup than in eyes-only-backup (Figure 5 and Table 8). No significant increase was observed for off-road time between eyes-plus-context-backup and eyes-only-backup (Figure 5 and Table 8). Perceived success for travel time/speed was significantly higher with eyes-plus-context-backup versus eyes-only-backup without significant difference in subjective effort for this aspect (Figure 6 and Table 8). Participants in E&C-vis-BU reported significantly higher automation satisfaction than those in EO-vis-BU (Figure 6 and Table 8). Perceived effort and success of safety did not significantly differ between eyes-plus-context-backup and eyes-only-backup (Figure 6 and Table 8). Additionally, no significant differences were observed between eyes-plus-context-backup versus eyes-only-backup in terms of the amount of visual distraction and NDRT performance scores (Figure 5 and Table 8). Participants showed significantly higher perceived NDRT success in the eyes and context condition than in the eyes-only condition, but did not indicate significantly different levels of effort (Figure 6 and Table 8). For hazard collisions where the participant was observed as not trying to avoid the obstacle, all events transpired within the eyes-plus-context-backup rather than the eyes-only-backup conditions but were overall generally rare occurrences (i.e., 3 collisions out of a total of 102 exposures in the backup conditions) (Table 5).
RQ5—“Is the Status of Backup Driving Automation Necessary to Display to Drivers?”
In regard to objective indicators of expected overreliance, visual distraction was not found to be significantly higher in the visible-backup conditions than in the invisible-backup conditions (Figure 5 and Table 8). NDRT performance scores, proportion of automated control, and longitudinal progress also were not found to be significantly higher with visible-backup versus invisible-backup (Figure 5 and Table 8). Between visible and invisible backup, no significant difference was found for the safety measure of off-road time (Figure 5 and Table 8) and no discernible differences in evasion attempts were observed during hazard collisions (i.e., two in visible-backup condition: E&C-vis-BU, and one in an invisible-backup condition: E&C-inv-BU; Table 5). No significant differences were observed to evidence trade-offs between perceptions of success/effort for safety or travel time/speed efficiency (Figure 6 and Table 8), or the NDRT performance (Figure 5 and Table 8) between invisible-backup and visible-backup.
Discussion
RQ1—“Are Drivers Susceptible to Dangerous Levels of Distraction with SAE Level 2?”
The auto-hnd-off condition produced significantly higher visual distraction (76%) and NDRT performance (79%) when compared to Conv (53% and 44%). This increase in NDRT involvement most likely explains the inadequate supervision we observed, where 77% of the auto-hnd-off participants made no corrections to the first hazard. The observed drop in nonresponse rates to 15% for the second hazard is probably due to learning. This learning effect was found in short trials with multiple hazards; learning of this kind (i.e., heightened anticipation after a recent exposure) is not expected in real-world driving, where hazards are rarer. It should be noted that a nonzero number of nonresponses remained even though participants had just experienced a preceding collision. The subjective results for auto-hnd-off compared to Conv suggest that participants viewed the driving automation more as a convenience commodity (a decreasing trend in effort and a significant increase in success in perceptions of travel time/speed; an increasing trend in perceived NDRT success) rather than a safety aid (nonsignificant results regarding perceived safety success/effort).
RQ2—“Does Having a Hand Placed on the Wheel Improve Driver Supervision of Automation?”
Participants of auto-hnd-on made fewer nonresponse errors to first and second hazards (15%, 4 of 26) than participants of auto-hnd-off (46%, 12 of 26). Notably, auto-hnd-on did not produce significant differences from auto-hnd-off in terms of visual distraction, NDRT scores, or perceptions of success/effort, which suggests improved hazard awareness from hand-on requirements to be produced by mechanisms other than NDRT involvement or subjective value (as seen between auto-hnd-on and Conv). Physical hand-wheel contact might represent linked mind–body benefits that remind/prime a human operator toward conventional driving responsibility and steering activity. This interpretation is consistent with our observation of steering to be the majority response (i.e., compared to braking) when responses were made.
RQ3—“Is Backup Control a Safe and Acceptable Alternative to Supervised Automated Driving?”
The combined set of backup conditions (E&C-vis-BU, EO-vis-BU, E&C-inv-BU, EO-inv-BU) showed significantly lower visual distraction and NDRT performance compared to the supervised automation (auto-hnd-off, auto-hnd-on), and with significantly less off-road time compared to Conv. Compared to supervised automated driving, the subjective results suggest backup drew participants back into the driving task (significantly lower perceptions of success with higher levels of effort in terms of travel time/speed efficiency) and away from the NDRT (significantly lower perceptions of success with lower levels of effort in NDRT performance). Additionally, satisfaction ratings with the simulated short exposure sessions of driving automation were not found to be significantly lower (between-subjects) with the set of backup conditions compared to the set of supervised automated driving.
RQ4—“Can Context-Based Criteria Safely Reduce Driver State Monitoring from Over-Triggering?”
Like on-market systems that use alarms or feature lockout, our DMS was designed with an intended negative consequence for end-user inattention; ours included an impedance to forward driving progress (i.e., slowing down). The eyes-only-backup conditions (EO-vis-BU, EO-inv-BU) had significantly greater proportions of automated control and consequently more longitudinal impedance compared to the eyes-plus-context-backup conditions (E&C-vis-BU, E&C-inv-BU). Correspondingly, participants expressed negative subjective experiences with significantly lower ratings on perceived travel time/speed success (EO-vis-BU, EO-inv-BU) and automation satisfaction (EO-vis-BU). Importantly, the conservative shift toward fewer DMS triggers did not detract from safety: the perceived success of safety did not significantly decrease, and lateral performance errors (i.e., off-road time) did not significantly increase. In other words, the context-based criteria functioned as hypothesized to reduce “cry-wolf” while also not (dangerously) increasing misses with an overly strict criterion level.
RQ5—“Is the Status of Backup Driving Automation Necessary to Display to Drivers?”
The lack of end-user awareness of automation existence/status in the invisible-backup conditions (E&C-inv-BU, EO-inv-BU) was not seen here to carry additional consequences (i.e., neither significant detraction from positive measures nor significant addition to negative measures). Even though our short-duration simulated trials did not obtain direct positive evidence (e.g., significantly decreased visual distraction in invisible-backup), it is reasonable to expect (as motivated in Section “Introduction”) that people with visible-backup might allow themselves to become distracted more often and for longer periods of time, expecting that the vehicle can always successfully back them up. Promisingly, our results do suggest that the notification of backup driving automation and detected distraction events might not be necessary from a DMS and so can practically remain in the background.
Limitations
In terms of external validity, it should be noted that our driving trials were designed as short distraction stress periods solely to evaluate different consequences of automation and DMS design concepts. While it is interesting and troubling that we found inattention issues of supervising driving automation even within our short-duration trials of only a few minutes, the generalizability to real-world on-road driving carries several caveats. First, our simulated driving automation performed its longitudinal and lateral spacing duties perfectly up until its sudden failure, which occurred at a much higher rate than is conceivable in most people’s experiences with present-day AVs. In the real world, drivers may witness smaller or partial failures over an extended period of exposure that may help them better calibrate an appropriate level of trust. From such additional experience, people may have more opportunities to learn how to respond (i.e., via the wheel and/or the pedals, or an emergency button), whereas our participants might have been more limited by confusion and hesitation regarding what responses were allowed/expected in the simulation. Additionally, the low-fidelity desktop driving simulator had multiple limitations (i.e., limited field of view, lack of realistic force feedback in steering, lack of vestibular motion feedback). The simulated vehicle handling was anecdotally characterized as “slippery.” Moreover, perceptions of risk (and hence risk-taking behaviors) are rarely commensurate between driving simulators and real-life roads. Further studies of longer duration and increased fidelity will be necessary to anticipate real-world inadequacies of humans supervising AVs.
The on/off-screen counting and classification of eye-tracker data frames were mistakenly not updated when the frame rate of the eye-tracker was halved to cope with system lag. Thus, our thresholds for attention/distraction were unknowingly doubled. However, the resultant 3 s off-road glance threshold still approximated the widely used 2 s criterion (Klauer et al., 2006). Additional research has suggested that studies should be open to investigating more elaborate measures such as frequencies of repeated glances off-road (Liang et al., 2014), as well as in relation to patterns of on-road glances (Kircher & Ahlström, 2009; Seppelt et al., 2017). In any case, for the present paper, our timing thresholds were only conceived to serve as a conventional presupposition (i.e., given constant) on which to build the extension topics of interest: eyes-only-backup versus eyes-plus-context-backup and visible-backup versus invisible-backup. If our thresholds had been half of what was mistakenly implemented, then distraction triggers and auto-to-manual transitions of control would have been earlier/easier and more frequent. Consequently, our hypothesized differences (for greater benefits of implicit and context-based DMS) would have been more likely to obtain, that is, up until a yet unknown limit of failing to prevent giving control over to drivers with too-short durations of on-road glances. For all of the above reasons, our presently reported results should only be interpreted in relative terms (ordinal comparisons between conditions) rather than absolute numeric values.
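The threshold doubling described above follows from simple frame-count arithmetic, which the following minimal sketch illustrates. It is not the study's software: the function names are hypothetical, the 60 Hz original frame rate is an assumed value chosen only for illustration, and the 1.5 s intended threshold is inferred from the statement that the resultant 3 s threshold was a doubling.

```python
# Illustrative sketch only; not the study's actual implementation.
# ASSUMPTION: the original eye-tracker frame rate (60 Hz) is hypothetical.
# The 1.5 s intended threshold is inferred from "3 s after doubling".

def frames_for_duration(seconds: float, frame_rate_hz: float) -> int:
    """Number of consecutive frames that spans `seconds` at the given rate."""
    return round(seconds * frame_rate_hz)


def effective_duration(frame_count: int, frame_rate_hz: float) -> float:
    """Time in seconds that `frame_count` frames represent at the given rate."""
    return frame_count / frame_rate_hz


INTENDED_THRESHOLD_S = 1.5                # inferred: 3 s / 2
ORIGINAL_RATE_HZ = 60.0                   # assumed value for illustration
HALVED_RATE_HZ = ORIGINAL_RATE_HZ / 2.0   # rate after halving to cope with lag

# Frame count computed for the original rate and never updated afterwards.
stale_frame_count = frames_for_duration(INTENDED_THRESHOLD_S, ORIGINAL_RATE_HZ)

# The same frame count now spans twice as much time at the halved rate.
effective_threshold_s = effective_duration(stale_frame_count, HALVED_RATE_HZ)

# The frame count that would have preserved the intended threshold.
corrected_frame_count = frames_for_duration(INTENDED_THRESHOLD_S, HALVED_RATE_HZ)

print(stale_frame_count, effective_threshold_s, corrected_frame_count)
```

Under these assumptions, a threshold stored as a frame count silently scales with the inverse of the frame rate, whereas a threshold stored in seconds and converted to frames at run time would not.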
Conclusions
The present investigation demonstrated attentional susceptibilities in drivers tasked to supervise full-time driving automation in the presence of a compelling NDRT. A requirement to maintain one hand on the wheel provided some benefit but still yielded problematic rates of visual distraction and nonresponses to hazards. Although the NDRT was tasked rather than voluntary, the depth of involvement was left free to each participant’s behavioral discretion. Consequently, we demonstrated dangerous levels of distraction rather than uncompromised multi-tasking. Our results showed that such automation overreliance problems occurred in a period of only a couple of minutes.
Overall, the adaptive backup conditions yielded improvements in terms of distraction (with the same NDRT) and driving safety compared to the automated driving conditions. Context-based DMS criteria reduced unnecessary interventions, and invisible-backup removed unnecessary risks for human misuse of automation (e.g., overreliance).
Under controlled between-subject comparisons, we have shown preliminary feasibility of eyes-plus-context backup automation compared to status quo counterparts of continuous automated driving and eyes-only backup automation. Our participants were randomly assigned between conditions. Further studies of a within-subject design, however, would strengthen a claim of achieved levels of acceptance of our concepts, and more targeted survey studies might best assess acceptance at a broader level (e.g., intent to purchase/use).
Application
A backup paradigm for automated driving control would address momentary human errors rather than attempt to replace human driving authority. The present results may stimulate further design considerations within that paradigm.
Supplementary Material
Supplementary material is accessible at https://doi.org/10.4121/uuid:295df9d1-73fb-4808-a8aa-c3f66de95b8d
Key Points
Complacency effects can occur with automated driving systems in only a few minutes. This effect occurred in spite of both instructions to monitor and correct the automated driving for any dangers/errors and a recently experienced automated driving error.
The provision to keep one hand on the wheel had a positive impact on generating a response to the first obstacle. However, nonresponses to the second follow-on obstacle were equally present in both the auto-hnd-off and the auto-hnd-on conditions.
All presently investigated adaptive automated driving conditions (whether with trigger criteria of eyes-only-backup or eyes-plus-context-backup; and whether with invisible or visible transitions of control) were successful in reducing the amount of time spent off the road in comparison to a conventional driving condition.
An invisible-backup automated driving system is expectedly harder to misuse than one with a visible interface, and context-based alerts have the potential to reduce the negative impacts of false alarms and enhance satisfaction.
Author Biographies
Christopher D. D. Cabrall is a guest researcher in the Intelligent Vehicles and Cognitive Robotics Department of the Delft University of Technology. He received his PhD from the Delft University of Technology in 2019 within the EU project “HFAuto—Human Factors of Automated Driving.”
Jork Stapel is a PhD student in the Intelligent Vehicles and Cognitive Robotics Department of the Delft University of Technology. He received his MSc in Aerospace Control and Simulation in 2015 from the Delft University of Technology.
Riender Happee is an associate professor with the Faculty of Mechanical, Maritime and Materials Engineering and the Faculty of Civil Engineering and Geosciences, Delft University of Technology. He obtained his PhD in 1992 from the Delft University of Technology.
Joost C. F. de Winter is an associate professor with the Faculty of Mechanical, Maritime and Materials Engineering, Delft University of Technology. He received his PhD degree in 2009 from the Delft University of Technology.
