Abstract
Objective
To explore the scope of available research and to identify research gaps on in-vehicle interventions for drowsiness that utilize driver monitoring systems (DMS).
Background
DMS are gaining popularity as a countermeasure against drowsiness. However, how these systems can be best utilized to guide driver attention is unclear.
Methods
A scoping review was conducted in adherence to PRISMA guidelines. Five electronic databases (ACM Digital Library, Scopus, IEEE Xplore, TRID, and SAE Mobilus) were systematically searched in April 2022. Original studies examining in-vehicle drowsiness interventions that use DMS in a driving context (e.g., driving simulator and driver interviews) passed the screening. Data on study details, state detection methods, and interventions were extracted.
Results
Twenty studies qualified for inclusion. The majority of interventions involved warnings (n = 16), often with an auditory component (n = 14). Feedback displays (n = 4) and automation takeover (n = 4) were also investigated. Multistage interventions (n = 12) first cautioned the driver, then urged them to take an action, or initiated an automation takeover. Overall, interventions had a positive impact on sleepiness levels, driving performance, and user evaluations. Whether interventions effective for one type of sleepiness (e.g., passive vs. active fatigue) will perform well for another type is unclear.
Conclusion
The literature mainly focused on developing sensors and improving the accuracy of DMS, but not on driver interactions with these technologies. More intervention studies are needed, both in general and to investigate long-term effects.
Application
We list gaps and limitations in the DMS literature to guide researchers and practitioners in designing and evaluating effective safety systems for drowsy driving.
Introduction
Driver drowsiness is a significant traffic safety issue. Drowsiness, or sleepiness, refers to reduced alertness associated with reduced executive functioning, mental effort, and involuntary muscle inhibitions (APA Dictionary of Psychology, n.d.-a). Fatigue, often used interchangeably with drowsiness, is a broader term relating to tiredness and declined functioning due to physical exertion, stress, boredom, lack of sleep, or potential disorders (APA Dictionary of Psychology, n.d.-b). Drowsiness can lead to increased speed variability, increased standard deviation of lane position, fewer micro corrections of the steering wheel (see Liu et al., 2009; Sahayadhas et al., 2012 for a review on vehicle measures), increased percentage of eyelid closures (Sikander & Anwar, 2019), lower driver awareness and alertness towards hazards, and slower cognitive processing and reaction times (Smith et al., 2009). Annually, 328,000 crashes in the US, including 109,000 crashes with injuries and 6400 fatal crashes, are estimated to involve a drowsy driver (Tefft, 2012). An eyelid closure analysis of naturalistic driving data (SHRP2; the Strategic Highway Research Program 2) found drowsy driving in 8.8–9.5% of crashes recorded and in 10.6–10.8% of crashes reported to the police (Owens et al., 2018).
Drowsiness can be caused by a variety of factors, including low arousal, high workload, and sleep-related factors. Periods of low arousal can lead to sleepiness due to boredom, also known as passive fatigue (Chong & Baldwin, 2021; Desmond & Hancock, 2000). For example, long, monotonous rural roads with little traffic can pave the way for this type of sleepiness. Passive fatigue is a particular concern with the current and upcoming advanced driver assistance systems, since monitoring the road for prolonged periods without manually controlling the vehicle can make it challenging to stay awake and attentive to the road (Körber et al., 2015; Schömig et al., 2015). On the other hand, extended periods of high workload can exert prolonged activation of and load on the synapses in the brain leading to high oxygenation of the neurons, DNA, proteins, and lipids (Chong & Baldwin, 2021). To prevent any permanent damage, the body initiates sleep, and this state is often referred to as active fatigue. Lastly, sleep-related factors such as sleep deprivation and the circadian rhythm, which is regulated by time of day, can also influence sleepiness levels (Chong & Baldwin, 2021; May & Baldwin, 2009). Consequently, the majority of sleep-related fatalities are reported to occur in the early morning (4–6 a.m. and 6–9 a.m.) with a second peak in the late afternoon (3–6 p.m.), when drivers are lacking sleep or when there is a large dip in alertness levels (Brown et al., 2020; Valdez, 2019).
Popular strategies to fight drowsiness while driving include consuming caffeine, opening the windows, stretching, and resting (Gershon et al., 2011). One study showed that having a chewing gum with 100 mg of caffeine improved driving performance within 10 min when drivers experienced drowsiness due to low arousal (Gastaldi et al., 2016). Maintaining a constant blood oxygen level provided through an oxygen mask (Takahashi et al., 2014) and stretching while driving (Jang et al., 2017) have also been shown to have potential benefits for reducing sleepiness. However, these benefits depend on whether the driver is aware of their impaired state and whether they are able or willing to use these strategies. Further, drivers can under- or overestimate their sleepiness levels (e.g., Gaspar & Carney, 2023). Thus, monitoring the driver state in real time can be valuable for initiating timely interventions like informing the driver or even engaging driving automation.
Driver drowsiness monitoring systems that use driver physiological data (e.g., heart rate, electroencephalography; EEG), behavioral data (e.g., eye-tracking), vehicle kinematics (e.g., speed), subjective measures (e.g., sleepiness scales), or a combination of these sources have attracted considerable interest in the last two decades. Numerous studies have been published, including many scoping and systematic reviews (Arakawa, 2021; Chowdhury et al., 2018; Lohani et al., 2019; Lu et al., 2022; Ngxande et al., 2017; Sahayadhas et al., 2012; Sikander & Anwar, 2019; Watling et al., 2021; Yusoff et al., 2017). The reviews highlight the ongoing need for research in driver state detection methods before their widespread deployment in vehicles (Watling et al., 2021). Yet, even if accurate state detection systems become available, questions on whether and how the detected state should be communicated to the driver and whether and how the vehicle should intervene in response to this high-risk state remain unanswered. For example, when the system detects that the driver is drowsy, would alerting the driver through auditory warnings be stimulating or startling? Instead, should the vehicle turn on the lane keeping system to avoid lane deviations? As the next step of driver monitoring, in-vehicle systems can initiate countermeasures such as feedback displays, warnings, or automation aids for safe driving once drowsiness is detected. These interventions could make drivers aware of their impaired state, alert them or encourage them to take breaks, and support them in safely operating the vehicle.
In this paper, we report a scoping review that aimed to explore the extent of available research and research gaps on in-vehicle countermeasures that utilize driver drowsiness monitoring and that have been evaluated in a driving context (e.g., driving simulator, on-road studies, or driver interviews). The scoping review also aimed to consolidate available information on different factors that affect intervention outcomes (e.g., driver monitoring system performance, driving scenarios, and participant population) to guide future intervention designs.
Methods
Protocol and Registration
The scoping review was conducted and reported in adherence to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Extension for Scoping Reviews (Tricco et al., 2018). A search protocol was prepared before conducting the search (Supplementary Material). The search was conducted to cover both driver drowsiness and driver distraction as part of a larger research effort; the current paper presents our findings on driver drowsiness exclusively.
Search and Information Sources
Search Concepts, Keywords, and Strategy
Eligibility Criteria
Eligibility Criteria for Title/Abstract and Full Text Screening
Due to the large number of identified studies based on the relaxed eligibility criteria in the title/abstract screening stage, three additional researchers (MA, CS, and CZ) contributed to the full text screening. All three received an hour-long training on the general review protocol, and all six reviewers received a 2-hour-long training on full text screening. In the following week, all reviewers screened a set of five preselected papers and discussed their decision-making process in a calibration session. Afterwards, all studies were screened independently by two reviewers (SA or YW vs. XT, MA, CS, or CZ) in strict adherence to the criteria listed in Table 2. Conflicts were then discussed between the two reviewers who screened the study, with a third reviewer as a tie breaker when necessary.
Data Charting and Synthesis of Results
Data Extraction Themes and Fields
Results
Selection of Sources of Evidence
A total of 3252 studies were identified with the systematic search, of which 2520 were screened out in the title/abstract screening phase. The average agreement between reviewers was 93% for the title/abstract screening (SA-YW: 96%, Cohen’s Kappa: .89; SA-XT: 89%, Cohen’s Kappa: .72), and 85% between the six reviewers for the full-text screening (average Cohen’s Kappa = .75). Many studies were excluded in the full-text stage; the majority of the studies, despite mentioning an intervention in their titles or abstracts, focused only on the advancement of state monitoring technologies (n = 303, e.g., testing new features/sensors/models to improve detection accuracy), or did not use a state detection system to initiate an intervention (n = 152). In the end, a total of 11 studies passed to the data extraction phase. Later, a relevant study was identified that was not found in the first search due to an excluded subject area. A secondary search was conducted, and a little over 1300 additional papers were screened for title and abstract. We identified eight more eligible studies (not indexed in the databases searched) in the references of these 12 papers, leading to a total of 20 studies included in the review. The screening flowchart is shown in Figure 1. Overall, twelve journal articles, six conference proceeding papers, and two technical reports were identified.
Figure 1. PRISMA flow diagram.
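The agreement statistics reported above (percent agreement and Cohen's Kappa) can be illustrated with a minimal sketch. The reviewer decisions below are hypothetical placeholders, not the review's screening data, and the function names are our own; the sketch only shows how the two statistics relate for a pair of reviewers.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(a: Sequence[str], b: Sequence[str]) -> float:
    """Share of items on which two reviewers made the same decision."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("decision lists must be non-empty and equally long")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of each reviewer's marginal proportions,
    # summed over all decision categories.
    p_expected = sum(
        counts_a[k] * counts_b[k] for k in set(counts_a) | set(counts_b)
    ) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both reviewers used a single identical category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical decisions for ten abstracts (illustration only).
reviewer_1 = ["include"] * 4 + ["exclude"] * 6
reviewer_2 = ["include"] * 3 + ["exclude"] * 7
agreement = percent_agreement(reviewer_1, reviewer_2)
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

Because screening decisions are typically dominated by exclusions, chance agreement is high, which is why Kappa values sit below the raw agreement percentages.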
Study Details and Sample Characteristics
Study Details. Studies are Sorted in Alphabetic Order of First-Author Last Names
a Note that the total sample size does not match what is reported for sample characteristics as the study did not detail how data loss corresponded to sample characteristics.
Drowsiness Detection Methods and Performance Metrics
Overall, the majority of the studies focused on detecting only drowsiness (n = 17), while two studies also had distraction detection as part of the system evaluated (Horberry et al., 2022; Kircher et al., 2009) and one study had health monitoring (Hayashi et al., 2021). Studies varied in terms of the algorithms used for detecting drowsiness and the features utilized (i.e., numeric measures used in algorithm development, e.g., blink rates). Drowsiness was mostly detected using eye-tracking (n = 11) and vehicle kinematics (n = 11) data. State detection was performed via observer ratings (Fitzharris et al., 2017; Heitmann et al., 2001; Saito et al., 2016, 2020; Vincent et al., 1998), threshold-based rules (Baldwin et al., 2014; Fairclough & Van Winsum, 2000; Heitmann et al., 2001; Liu & Uang, 2010; Nishigaki & Shirakata, 2019; Saito et al., 2016, 2020; Vincent et al., 1998; Wolkow et al., 2020), traditional statistical models (Aidman et al., 2015; Berka et al., 2005; Hayashi et al., 2021), or machine learning (Chen et al., 2014; Gaspar et al., 2017; Kundinger et al., 2021; Niu & Ma, 2022). Horberry et al. (2022) did not implement a detection system but instead built a human machine interface (HMI) prototype for a warning system. The prototype was designed based on interviews and workshops with users (e.g., truck drivers) and stakeholders (e.g., managers of trucking companies) and was assessed by users in a driving simulator. Table 4 lists the features and performance metrics used, which might impact driver acceptance or use of the systems.
Although accuracy and false alarm rates can significantly impact the use of state detection systems, classification performance metrics were generally not reported (n = 10). Among the studies that reported them (n = 6), accuracy rates ranged widely, from 27% to 100%. Due to this large range and the small number of studies, no association between algorithm performance and type or number of features was observed. For example, Hayashi et al. (2021) used heart rate, eye-tracking, body-tracking, and vehicle kinematics, but reported low accuracy (55%). On the other hand, Nishigaki and Shirakata (2019) used only vehicle kinematics data and reported 93% accuracy. This large variability might be due to the small sample sizes (n = 5 participants) used in both studies. Relaxing thresholds helped achieve 100% accuracy in Saito et al. (2016); however, it may have also increased false alarm rates, which may lead to disuse. Further, no connections could be made between detection measures (e.g., EEG and heart rate) and user evaluations (e.g., driver acceptance).
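To make the reporting gap concrete, the following is a minimal sketch of how accuracy, sensitivity, and false alarm rate could be computed when detected states are compared to a ground truth. The labels are hypothetical, and the function name and the per-epoch binary labeling are assumptions for illustration, not the procedure of any included study.

```python
from typing import Dict, Optional, Sequence


def detection_metrics(
    truth: Sequence[int], detected: Sequence[int]
) -> Dict[str, Optional[float]]:
    """Compare binary drowsiness detections (1 = drowsy) to ground truth."""
    if len(truth) != len(detected) or len(truth) == 0:
        raise ValueError("label lists must be non-empty and equally long")
    tp = sum(1 for t, d in zip(truth, detected) if t == 1 and d == 1)
    tn = sum(1 for t, d in zip(truth, detected) if t == 0 and d == 0)
    fp = sum(1 for t, d in zip(truth, detected) if t == 0 and d == 1)
    fn = sum(1 for t, d in zip(truth, detected) if t == 1 and d == 0)
    return {
        "accuracy": (tp + tn) / len(truth),
        # Share of truly drowsy epochs that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Share of alert epochs that wrongly triggered a detection.
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else None,
    }


# Hypothetical per-epoch labels (illustration only).
ground_truth = [1, 1, 1, 0, 0, 0, 0, 0]
strict_rule = [1, 1, 0, 1, 0, 0, 0, 0]
relaxed_rule = [1, 1, 1, 1, 1, 1, 1, 1]  # flags every epoch as drowsy

strict = detection_metrics(ground_truth, strict_rule)
relaxed = detection_metrics(ground_truth, relaxed_rule)
```

In this toy example, the relaxed rule detects every drowsy epoch but also raises a false alarm on every alert epoch, which is the trade-off noted above for relaxed thresholds.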
Performance measures were not applicable in four of the studies as detected states were not compared to a ground truth. Performance measure calculations were not relevant in Horberry et al. (2022), as they conducted interviews and workshops for developing design principles for a warning system. Heitmann et al. (2001) and Vincent et al. (1998) used observer ratings only to monitor drowsiness; these papers were included in the review as we considered them to simulate a driver monitoring system. An accuracy measure was also not applicable in Chen et al. (2014): although a probabilistic neural network-based drowsiness detection system was described, the experiment was conducted only with alert drivers, which did not require detecting states.
Drowsiness Types and Induction Methods
Seven studies induced drowsiness through low arousal (i.e., passive fatigue) from continuously driving on monotonous roads in the simulator for long durations (Fairclough & Van Winsum, 2000; Hayashi et al., 2021; Kundinger et al., 2021; Liu & Uang, 2010; Nishigaki & Shirakata, 2019; Saito et al., 2016, 2020). Durations varied from 30 to 120 min, although 30-min drives were not sufficient to induce drowsiness in some participants (Nishigaki & Shirakata, 2019). Saito et al. (2016) observed extreme drowsiness in 13 participants, four of whom fell completely asleep within three 30-min drives. Only one study created active fatigue conditions with a secondary task for extended durations (90 min) while driving in the simulator (Baldwin et al., 2014).
Type of sleepiness was unclear in six studies (30%). In particular, with naturalistic studies, researchers did not have access to or control over driving times, road conditions, or time of drives (Aidman et al., 2015; Fitzharris et al., 2017; Kircher et al., 2009; Wolkow et al., 2020). Further, Chen et al. (2014) and Horberry et al. (2022) did not test their systems with sleepy drivers.
Lack of sleep was investigated in five driving simulator studies (Berka et al., 2005; Gaspar et al., 2017; Heitmann et al., 2001; Kozak et al., 2006; Niu & Ma, 2022) and one on-road study (Vincent et al., 1998). Three of these studies scheduled long overnight drives (4–5-hour drives starting at 10 p.m. onwards) with participants awake for more than 14 hr (Berka et al., 2005; Gaspar et al., 2017; Vincent et al., 1998). In Heitmann et al. (2001), the experimental drives were scheduled as multiple trials (each around 30–50 min) overnight either from 1 to 8 a.m. or from 10 p.m. to 9 a.m. Participants in Kozak et al. (2006) were awake for 23 hr before a three-hour experimental drive at 6 a.m. (without caffeine after 6 p.m. the day before). Niu and Ma (2022) restricted participants’ sleep the night before to at most 5 hr before a 30-min experimental drive. Berka et al. (2005) and Vincent et al. (1998) also reported that some drivers did not show drowsiness levels high enough to trigger the warning systems.
Interventions and Outcomes
A variety of interventions were tested across studies as visualized in Figure 2 and described in Table 5. Figure 2 aims to guide researchers and designers to address the gaps in intervention design, while considering whether an intervention is appropriate for different causes of drowsiness (Chong & Baldwin, 2021).
Figure 2. Bubble chart showing the distribution of studies based on their drowsiness and intervention types (highest level is shown). The size of the bubbles indicates the number of studies. Lighter colored bubbles show single stage interventions while darker colored bubbles indicate multistage interventions. The numbers inside the bubbles correspond to study IDs.
Table 5. Descriptions of Interventions and Their Outcomes
Note. Arrows indicate increase (↑) and decrease (↓) in given metrics, while ns indicates no significant change.
The most common interventions were auditory warning systems (n = 14). Some studies integrated a combination of strategies, such as the displays and warnings in Nishigaki and Shirakata (2019). Some interventions were presented to the driver in a multistage approach (n = 12): the first stage informed or cautioned the driver, and then the second stage either urged the driver to take an action (e.g., warnings) or initiated a complete automation takeover.
Feedback systems
Interventions that displayed alertness states fell under this category (Aidman et al., 2015; Fairclough & Van Winsum, 2000; Kundinger et al., 2021; Nishigaki & Shirakata, 2019). Three studies continuously displayed driver states (Fairclough & Van Winsum, 2000; Kundinger et al., 2021; Nishigaki & Shirakata, 2019), while in Aidman et al. (2015), the display was designed to disappear if the risk levels were low for 5 min. Risk levels were calculated based on the Johns Drowsiness Scale (JDS; a regression model of blink rates and velocity), and an auditory alarm rang once they surpassed a certain threshold. Nishigaki and Shirakata (2019) and Fairclough and Van Winsum (2000) implemented similar multistage auditory warning systems on top of the state display. Aidman et al. (2015) and Nishigaki and Shirakata (2019) reported reduced sleepiness, while Fairclough and Van Winsum (2000) found significant improvements in driving performance (steering wheel movement velocity and lateral position). Three studies did not examine the impact of feedback displays alone (Aidman et al., 2015; Fairclough & Van Winsum, 2000; Nishigaki & Shirakata, 2019). Participants in Kundinger et al. (2021) found the display useful and easy to understand. In general, the impact of state feedback displays on driving performance or drowsiness needs further investigation.
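The display logic described for Aidman et al. (2015), a display that disappears after 5 min of low risk plus an alarm above a threshold, can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the class name, the two threshold parameters, and their example values are hypothetical, since the source does not report them.

```python
from typing import Optional, Tuple


class FeedbackDisplay:
    """Sketch of a risk display that hides itself after sustained low risk."""

    def __init__(
        self,
        low_risk_below: float,  # hypothetical: risk under this counts as low
        alarm_above: float,     # hypothetical: risk over this sounds the alarm
        hide_after_s: float = 300.0,  # 5 min of low risk, as described above
    ) -> None:
        self.low_risk_below = low_risk_below
        self.alarm_above = alarm_above
        self.hide_after_s = hide_after_s
        self._low_since: Optional[float] = None

    def update(self, t_s: float, risk: float) -> Tuple[bool, bool]:
        """Return (display_visible, alarm_on) for a risk sample at time t_s."""
        if risk < self.low_risk_below:
            if self._low_since is None:
                self._low_since = t_s  # start of the current low-risk period
            visible = (t_s - self._low_since) < self.hide_after_s
        else:
            self._low_since = None  # any elevated sample restarts the timer
            visible = True
        alarm = risk > self.alarm_above
        return visible, alarm


# Hypothetical thresholds chosen only for illustration.
display = FeedbackDisplay(low_risk_below=2.0, alarm_above=4.5)
state_at_start = display.update(0.0, 1.0)
state_after_5_min = display.update(300.0, 1.0)
```

Hiding the display only after a sustained low-risk period, rather than on the first low sample, avoids the display flickering when the risk estimate hovers near the threshold.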
Warning systems
Most of the studies (n = 16) incorporated at least one type of warning with visual, auditory, tactile modalities, or a combination of these. Generally, warning systems reduced sleepiness (Aidman et al., 2015; Berka et al., 2005; Fitzharris et al., 2017; Heitmann et al., 2001; Liu & Uang, 2010; Nishigaki & Shirakata, 2019; Wolkow et al., 2020) and improved driving performance (Baldwin et al., 2014; Berka et al., 2005; Fairclough & Van Winsum, 2000; Fitzharris et al., 2017; Gaspar et al., 2017; Heitmann et al., 2001; Kozak et al., 2006; Niu & Ma, 2022; Wolkow et al., 2020). Participants perceived drowsiness warning systems as helpful with high user acceptance (Berka et al., 2005; Fairclough & Van Winsum, 2000; Kozak et al., 2006), as positive, not disturbing, but unnecessary at times (Kircher et al., 2009), and as effective without causing interference with other in-cabin warning systems or overloading or startling (Horberry et al., 2022). On the other hand, participants in Vincent et al. (1998) disregarded the alarms by the warning system.
Only three studies compared different modalities of warnings (Gaspar et al., 2017; Kozak et al., 2006; Nishigaki & Shirakata, 2019). Kozak et al. (2006) and Nishigaki and Shirakata (2019) found that adding vibrations to steering wheel torque and to auditory warnings improved driving performance and time until sleepiness, respectively. Moreover, Kozak et al. (2006) showed that steering wheel torque combined with vibratory warnings was better than torque alone, and torque with a head up display. Gaspar et al. (2017), on the other hand, did not find a significant difference on the outcomes when auditory-visual, haptic, and their combination were compared.
Overall, the variation in measurements and the limited number of studies restrict drawing larger conclusions on drowsiness warning systems, but the existing findings are promising.
Automation systems
Four studies triggered automation when the monitoring system detected drowsiness. Automation responses to detected sleepiness were operationalized in two ways: (1) automation changed its behavior (Chen et al., 2014; Saito et al., 2016, 2020) or (2) automation took over complete control (Hayashi et al., 2021; Saito et al., 2016, 2020). In three of these studies, the systems improved driving performance (Chen et al., 2014; Saito et al., 2016, 2020), while Hayashi et al. (2021) reported trends of increased, but statistically nonsignificant, mental workload. Further, how or when to inform the driver about a control transition and when to perform the transition were not investigated in any of the four studies, which could impact driver acceptance and use of the system. Although Chen et al. (2014) reported good acceptance of the driver-oriented adaptive cruise control that they tested, these opinions belonged only to alert drivers and not to drowsy drivers or those transitioning to drowsiness. More research is needed on the design of the control transitions with sleepy drivers.
Despite the benefits, sudden changes in automation can be startling, confusing, and annoying for the driver, especially in case of false detection of sleepiness. They might also be dangerous, especially in an impaired state. Three of the studies attempted to address the false alarm issue by gradually giving automation control (Saito et al., 2016, 2020), or by getting verbal confirmation from the driver (Hayashi et al., 2021). In Saito et al. (2016, 2020), a partial control transition was performed to keep the vehicle within the lane when drowsiness was first detected. If the driver failed to intervene within 10 s, automation took over lateral control completely to bring the vehicle to the lane center. Finally, if this second stage was activated twice within 30 s, the vehicle assumed that the driver was sleepy and stopped itself. In Hayashi et al. (2021), the driver was asked to verbally confirm whether the detected drowsiness was correct within 3.7 s before the system activated control transition. Due to the small sample size (5 drivers), detection rates were low (55%, Table 4), but false alarms were rectified by the driver’s verbal corrections of the detected states. Although not statistically significant, compared to a “No” response, “Yes” was associated with greater mental demand (evaluated through NASA-TLX), which might reflect drivers’ struggle to keep awake. This verbal confirmation might also momentarily improve alertness in passive fatigue states. Moreover, whether the time allowances (e.g., 10 s and 3.7 s) for the driver to intervene before control transitions were adequate needs further testing.
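The gradual control transition described for Saito et al. (2016, 2020) can be summarized as a small state machine. The sketch below follows the three stages and the 10 s and 30 s windows given above; everything else is an assumption made for illustration, including the class and stage names, the sampled `update` interface, and the rule that control returns to the driver once they act during full lateral control.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    MANUAL = "manual driving"
    PARTIAL_ASSIST = "partial lane-keeping assist"
    FULL_LATERAL = "full lateral control"
    STOPPED = "vehicle stopped"


@dataclass
class EscalationController:
    driver_timeout_s: float = 10.0  # time allowed for the driver to intervene
    repeat_window_s: float = 30.0   # window for counting repeated takeovers
    stage: Stage = Stage.MANUAL
    assist_started_at: Optional[float] = None
    full_activations: List[float] = field(default_factory=list)

    def update(self, t_s: float, drowsy_detected: bool, driver_acted: bool) -> Stage:
        """Advance the intervention stage for one sample at time t_s."""
        if self.stage is Stage.STOPPED:
            return self.stage  # terminal state in this sketch
        if self.stage is Stage.MANUAL:
            if drowsy_detected:
                self.stage = Stage.PARTIAL_ASSIST
                self.assist_started_at = t_s
        elif self.stage is Stage.PARTIAL_ASSIST:
            if driver_acted:
                self.stage = Stage.MANUAL
            elif t_s - self.assist_started_at >= self.driver_timeout_s:
                self._enter_full_lateral(t_s)
        elif self.stage is Stage.FULL_LATERAL:
            # Assumption: control returns to the driver once they act.
            if driver_acted:
                self.stage = Stage.MANUAL
        return self.stage

    def _enter_full_lateral(self, t_s: float) -> None:
        # Keep only takeovers inside the repeat window, then add this one.
        recent = [s for s in self.full_activations if t_s - s <= self.repeat_window_s]
        self.full_activations = recent + [t_s]
        if len(self.full_activations) >= 2:
            self.stage = Stage.STOPPED
        else:
            self.stage = Stage.FULL_LATERAL


controller = EscalationController()
first_stage = controller.update(0.0, drowsy_detected=True, driver_acted=False)
```

Exposing the two time allowances as parameters reflects the open question raised above: whether 10 s (or 3.7 s in Hayashi et al., 2021) is adequate for a drowsy driver remains to be tested.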
Compatibility of design choices with sleepiness types and demographics
Overall, many interventions (n = 7) targeted passive fatigue (Figure 2), whereas active and sleep-related fatigue were not investigated for feedback and automation systems. For passive fatigue, the feedback and warning systems were associated with reduced sleepiness (Fairclough & Van Winsum, 2000; Nishigaki & Shirakata, 2019), improved driving performance (Fairclough & Van Winsum, 2000), good subjective ratings (Fairclough & Van Winsum, 2000; Kundinger et al., 2021; Liu & Uang, 2010), and improved signal detection sensitivity (Liu & Uang, 2010). Similarly, automation systems improved vehicle control in three studies (Hayashi et al., 2021; Saito et al., 2016, 2020). However, it is unclear whether the use of automation would exacerbate passive fatigue as the driver is expected to monitor automation instead of driving actively. None of the latter three studies assessed changes in sleepiness levels after control transition. Further research is needed to understand the impacts of automation interventions on drowsiness levels and their further consequences for the use of and reliance on automation.
All six studies that induced drowsiness through lack of sleep tested a warning system, and five of them reported that driving performance improved (Gaspar et al., 2017; Heitmann et al., 2001; Kozak et al., 2006; Niu & Ma, 2022) or showed marginal improvement (Berka et al., 2005). Two studies reported positive subjective ratings (Kozak et al., 2006; Niu & Ma, 2022). With the warning systems, sleepiness levels marginally decreased (Berka et al., 2005), or remained the same (Gaspar et al., 2017; Vincent et al., 1998). In Vincent et al. (1998), participants were allowed to take breaks (napping was not allowed) when they wished, and the results show that the sleepiness levels first reduced after a break but increased back to the prebreak levels after around 12 min of driving. In the same study, the intervention group had significantly higher drowsiness levels before taking a break compared to their baseline levels and compared to the control group before they took a break. This finding might indicate behavioral adaptations to warnings in which drivers, even if they are sleepier, might think they are fit to drive for longer until they receive a warning.
More than half of the studies incorporated a multistage approach. Interview participants in Horberry et al. (2022) described single stage warnings as startling and potentially dangerous and preferred multistage warnings. Gaspar et al. (2017) found a significant decrease in lane departures due to drowsiness when a multistage warning system was used as opposed to a single warning. Although other included studies did not investigate the differences in effectiveness between single and multistage warnings, both types of warnings were found to be effective through multiple measures, including improved sleepiness and driving performance.
Lastly, even though sample demographics across studies were limited, age differences were observed in the effectiveness of warning and feedback systems (Baldwin et al., 2014; Kundinger et al., 2021), with older drivers (65+) benefiting from drowsiness interventions more compared to younger drivers (18–29 or 20–25 year olds, respectively). Warnings helped older drivers reduce crashes (Baldwin et al., 2014). Older drivers found the feedback system in Kundinger et al. (2021) exciting, useful, and easy to understand, and had higher intentions to use the system, especially during long drives at night, compared to younger drivers. The impact of drowsiness interventions on different demographic groups needs to be examined further. How these factors would impact the efficacy of automation interventions is also unknown.
Discussion
This scoping review identified studies which evaluated in-vehicle interventions featuring driver state detection systems for mitigating drowsiness. The popularity of drowsiness detection systems has increased in the last decade, likely due to technological advancements and increased availability of sensors and computational resources. Although overall findings showed a positive impact of interventions on sleepiness levels, driving performance, and user evaluations, more research is still needed to understand the effectiveness of such systems and how best to design them. Many studies were excluded from this review because they did not implement an intervention after detecting drowsiness (n = 303) or did not evaluate the interventions (n = 94). Much of the relevant research focused on developing and implementing new technology or sensors and improving the accuracy of detection models, but not on how drivers would interact with these technologies.
The drowsiness detection systems utilized in these studies used mainly five categories of features: EEG, heart rate, eye-tracking, body-tracking, and vehicle kinematics. Although a variety of physiological measures (e.g., breathing or sweat response) are commonly utilized in the driver drowsiness detection literature (see Chowdhury et al., 2018; Lohani et al., 2019; Sahayadhas et al., 2012; Sikander & Anwar, 2019; Watling et al., 2021), interventions identified in this review utilized only heart rate and EEG. The use of other physiological measures is yet to be explored for interventions.
In addition, detection performance was mostly not reported, and the reported metrics varied widely. Although other driver state detection studies have shown that adding different types of features (i.e., data fusion) can improve detection accuracy and reduce false alarm rates (e.g., He et al., 2022; Koo et al., 2015), no relationship between the type of measures used for detecting drowsiness (e.g., eye closure) and the resulting algorithm performance (e.g., accuracy) could be ascertained in this review. These metrics are typically reported in operator state detection research, yet they seem to be omitted when the focus is on intervention design and testing rather than algorithm development. Detection accuracy levels have a direct influence on intervention outcomes like effectiveness and acceptance. For example, false alarms can lead to confusion, annoyance, and system disuse, and users might ignore the alarms and rely on their expected probability of events (Bliss et al., 1995; Parasuraman & Riley, 1997). One possible reason for the omission of accuracy rates is the challenge of establishing a drowsiness ground truth. However, eye closure metrics, EEG data, and observer ratings (Kundinger et al., 2020; Wierwille & Ellsworth, 1994) have been used widely in the literature to identify the ground truth for drowsiness. Similarly, studies have utilized tools like the Karolinska Sleepiness Scale (Åkerstedt & Gillberg, 1990) to collect subjective ratings from participants directly.
Overall, design and testing of interventions for different sleepiness types need more research. It is not clear whether an intervention that is effective for one type of sleepiness will perform well for another type. For example, the driver might need extra stimulation with passive fatigue, and decreased stimulation for active fatigue. In the case of lack of sleep, for example, drowsy drivers might need to be supported with advanced driver assistance systems for safely keeping within lane, while this countermeasure might worsen passive fatigue conditions. Further, benefits established in the general alarm literature may not apply directly to the drowsy driving context, as drowsiness occurs due to physiological changes that might impact vigilance and the recovery process (see Chong & Baldwin, 2021 for a detailed review).
Feedback displays that present the detected driver alertness levels have been implemented alone (Kundinger et al., 2021) or in conjunction with warnings (Aidman et al., 2015; Nishigaki & Shirakata, 2019). Such feedback systems can let drivers take appropriate measures and prepare for any further actions from the vehicle (e.g., control transitions), but may also lead to visual clutter. An example of a good strategy to prevent clutter is to dim the display when the driver is not sleepy (Aidman et al., 2015). The trade-off between providing information and display clutter is yet to be explored for drowsiness interventions.
Warnings were the most common intervention type tested, communicating when the driver needed to take a break (Fairclough & Van Winsum, 2000; Fitzharris et al., 2017; Kircher et al., 2009; Vincent et al., 1998; Wolkow et al., 2020) in addition to guiding drivers’ attention to the roadway (e.g., Niu & Ma, 2022). The impact of different modalities is not clear; however, combining modalities might be the best way to help drivers notice the warning. Two studies showed that the best combination was auditory and tactile modalities (Kozak et al., 2006; Nishigaki & Shirakata, 2019), whereas Gaspar et al. (2017) found no differences among the auditory modality, the haptic modality, and their combination. In the general drowsiness intervention literature (without a state detection system), sounds combined with vibrations were associated with reduced sleepiness compared to no alarms (e.g., Zhao et al., 2012). Potential startling, annoyance, alarm fatigue, or system deactivation with the long-term use of these alarms must also be considered (Marshall et al., 2007; Wilken et al., 2017). As observed in Vincent et al. (1998), drivers might adapt to and over-trust warnings; drivers were observed to let their drowsiness progress further before taking breaks, and the authors reported that the alarms were disregarded by the drivers. Such behavioral adaptations to interventions need further research.
Although control transition strategies can be beneficial in reducing crashes, stopping the vehicle can be abrupt and confusing, especially in cases of false drowsiness detection. To alleviate the negative impact of false alarms, validation steps can be incorporated, such as asking the driver whether they are sleepy as was done in Hayashi et al. (2021), or giving the control to the vehicle gradually as was done in Saito et al. (2016, 2020). However, the relevant design choices were not sufficiently studied. For example, in Hayashi et al. (2021), if drivers did not respond within 3.7 s, the vehicle navigated itself to the nearest parking area, while in Saito et al. (2020), this time was 10 s. The time for control transition might be insufficient in cases when the driver is drowsy or distracted or when the noise levels are high, which may lead to unauthorized control transition.
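To illustrate how sensitive such a validation step is to the chosen response window, the sketch below models a single confirmation prompt with a timeout. It is a simplified, hypothetical decision rule and not the logic implemented in Hayashi et al. (2021) or Saito et al. (2020); only the two timeout values (3.7 s and 10 s) come from the text above. The names `Outcome` and `resolve_alert`, and the assumption that a driver who confirms sleepiness also triggers a takeover, are illustrative.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    DISMISS = "dismiss"    # driver denies being sleepy; alert is cancelled
    TAKEOVER = "takeover"  # vehicle initiates a control transition


def resolve_alert(
    response_time_s: Optional[float],
    says_sleepy: bool,
    timeout_s: float,
) -> Outcome:
    """Decide what happens after the system asks the driver whether they are sleepy.

    response_time_s is None when the driver never answers.
    Assumption (not from the reviewed studies): confirming sleepiness
    leads to the same takeover as not answering in time.
    """
    if timeout_s <= 0:
        raise ValueError("timeout_s must be positive")

    answered_in_time = response_time_s is not None and response_time_s <= timeout_s
    if not answered_in_time:
        return Outcome.TAKEOVER
    return Outcome.TAKEOVER if says_sleepy else Outcome.DISMISS


# An alert driver who needs 6 s to answer (e.g., because of cabin noise):
slow_denial_short = resolve_alert(6.0, says_sleepy=False, timeout_s=3.7)
slow_denial_long = resolve_alert(6.0, says_sleepy=False, timeout_s=10.0)
print(slow_denial_short, slow_denial_long)
```

With the shorter window the slow but alert driver loses control of the vehicle, while the longer window lets the same response cancel the alert. The trade-off is that a longer window also delays the takeover for a driver who is genuinely unresponsive.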
The majority of the warning interventions utilized multistage warnings to communicate the urgency of the alarms. Both single-stage and multistage interventions showed a positive impact on sleepiness, driving performance, and other relevant measures, but only one study compared the two (Gaspar et al., 2017). In this study, the frequency of lane departures due to drowsiness was lower with the multistage warnings than with single-stage warnings. It must also be noted that two-stage warning systems were preferred by drivers over single-stage systems; participants in Horberry et al. (2022) found single-stage warnings to be startling and potentially dangerous. Both Aidman et al. (2015) and Nishigaki and Shirakata (2019) showed that integrating different modalities in the second stage of their intervention helped with reducing sleepiness. Despite the limited empirical evidence, benefits such as reducing confusion or surprise can be expected with a multistage approach. In addition, incorporating multiple modalities at different stages might help communicate urgency, but more research is needed.
Although distractions generally have negative effects on driving, they can be effective strategies for mitigating low arousal states leading to drowsiness. Atchley et al. (2014) showed that strategically timed verbal tasks were effective in improving driving performance and attention to the road. Verbal interactions with the vehicle can also lessen passive fatigue symptoms by imposing mental stimulation (Hayashi et al., 2021). For example, a similar cognitive task was found to be helpful in alleviating the effects of sleep inertia (i.e., impaired cognitive performance after waking up; Hilditch & McHill, 2019) in automated driving tested in a driving simulator study (Wörle et al., 2020).
Future Research Directions
Given the limited but emerging nature of the field, this review helped identify important gaps for future research. First, a large gap exists in understanding intervention efficacy with respect to different types of drowsiness. For example, automation strategies might worsen passive fatigue symptoms by further reducing physical and cognitive stimulation. On the other hand, automation that allows drivers to sleep for a certain amount of time can address lack of sleep, especially when the driver loses their ability to control the vehicle, in which case sleep inertia needs to be addressed.
In terms of assessing interventions, studies have mostly used sleepiness or driving performance as metrics, while user evaluations of technology and its acceptance were not utilized as much. One critical element that is missing in the literature is determining the best thresholds and parameters for triggering interventions (e.g., time to control transition) for higher user acceptance. In addition to this approach, qualitative research methods (e.g., interviews) can provide useful insights to guide user-centric design. The impact of differences in demographics (e.g., age, professional vs. nonprofessional drivers) on user acceptance has also not been adequately studied.
False alarms can drastically impact the use of drowsiness mitigation systems. Although false alarms are mostly irritating, they can also be dangerous, especially with startling warnings or control transitions. Sudden changes in vehicle control can also create discomfort or motion sickness for passengers. Further studies are needed to explore the performance of these systems under false alarm conditions and to develop potential mitigation strategies similar to Saito et al. (2016, 2020) and Hayashi et al. (2021).
Another critical gap is whether driver overreliance on an automation system can alter their decision to sleep on the road. Even without such interventions in current vehicles, drivers have been reported to intentionally sleep on the road with existing advanced driver assistance systems (Casaletto, 2022; Fitzsimons, 2021; Little & Armstrong, 2021). These systems need to be designed to build driver mental models that support appropriate reliance. Lastly, the long-term impacts of these interventions (feedback, warning, and automation) have not been investigated.
Limitations
Given the limited number of studies, their generally small sample sizes, and the large variability across studies in objectives, designs, collected measures (e.g., heart rate vs. lane deviation), and outcome variables (e.g., sleepiness vs. driving performance), we were limited in our ability to generalize or aggregate findings. The studies screened in this review are limited to those that were published until April 7, 2022, when the search was conducted, to those that included the search keywords in their title, abstract, or keywords, and finally to those that were indexed in the databases searched and the reference lists of the included articles.
Conclusions
This paper explored in-vehicle driver drowsiness interventions that utilize driver state detection. It identified the critical research gaps in the emerging field of drowsiness monitoring and mitigation. Considerable effort is being devoted to improving detection accuracy. However, how we can best utilize these systems to guide driver attention for improved road safety is unclear, and how intervention outcomes might vary under different drowsiness types remains unexplored. More studies are needed to evaluate interventions, not only for their technical performance, but also for human-vehicle interactions such as potential for confusion, annoyance, and overreliance. Further, the variety in drowsiness measures and outcomes, limitations in sample size and demographic representation, and insufficient study details limit aggregation of existing findings.
Key Points
Much of the research on driver drowsiness monitoring focused on developing and implementing new technology or sensors and improving the accuracy of detection models, but not on how drivers would interact with these technologies. Drowsiness interventions that utilize state monitoring systems have been found effective in reducing sleepiness and in improving driving performance and acceptance. However, only a limited number of studies performed evaluations and further research is needed. Warnings were tested the most, followed by automation control transition and feedback displays. The impact of intervention design on different sleepiness types (e.g., passive fatigue) has not been investigated. More studies are needed to investigate the long-term effects of these interventions as well as their unintended effects like overreliance on automation, worsened drowsiness, or annoyance and dangers due to false alarms.
Supplemental Material
Supplemental Material - Drowsiness Mitigation Through Driver State Monitoring Systems: A Scoping Review
Supplemental Material for Drowsiness Mitigation Through Driver State Monitoring Systems: A Scoping Review by Suzan Ayas, Birsen Donmez, and Tang Xing in Human Factors
Acknowledgments
This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through the Discovery [RGPIN-2016–05580] and the Canada Research Chair Programs. We would like to thank Teruko Kishibe for their feedback on the study protocol and Yihan Wang, Mohamed Abdelwahab, Cole Stotland, and Claire Zhang for their help with the screening phases.
Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: S. Ayas, B. Donmez; data collection: S. Ayas, X. Tang; analysis and interpretation of results: S. Ayas, X. Tang, B. Donmez; draft manuscript preparation: S. Ayas, B. Donmez. All authors reviewed the results and approved the final version of the manuscript.
Supplemental Material
Supplemental material for this article is available online.
Author Biographies
References
