Abstract
Artificial Intelligence (AI) has been integrated into emerging technologies, such as automated vehicles (AVs), to help the vehicle sense the driving environment and make maneuvers. However, AI is imperfect and susceptible to errors. In partially automated vehicles, the human driver is still needed to oversee the vehicle's actions and take over when the AI makes an incorrect identification. Explainable AI is needed for drivers to better calibrate their understanding of the AI's capabilities. The goal of the current study was to determine which AI explanation best supports drivers' mental models and understanding of a partially automated vehicle when encountering various driving actions and road signs. Thirty-six participants completed this 6 (driving scenario) × 3 (timing of the explanation) within-subjects study. Drivers' understanding and subjective evaluations of the explanations were collected over 18 trials. This study has implications for emerging technology in transportation, human understanding of AI, and cybersecurity.
Introduction
Artificial Intelligence (AI) has been integrated into emerging technologies, such as automated vehicles (AVs), to help the vehicle sense the driving environment and make maneuvers. However, AI is imperfect and susceptible to cyber-attacks (Bonifacic, 2022; Yu et al., 2019). In partially automated vehicles, the human driver is still needed to oversee the vehicle's actions and take over when the AI makes an incorrect identification (SAE, 2021). Previous studies have investigated how humans perceive the capabilities and limitations of AI in AVs, specifically computer vision techniques, and found that people did not have an accurate understanding of how AI processes images (Garcia et al., 2022). These findings indicate that explainable AI (XAI) is needed for drivers to better calibrate their understanding of the AI's capabilities. The goal of the current study was to determine which AI explanation best supports drivers' mental models and understanding of a partially automated vehicle when encountering various driving actions and road signs.
Method
A total of 36 participants were recruited from Rice University's community through emails and flyers. Participants were required to have a valid U.S. driver's license and to be between 18 and 65 years of age. Each participant received a $20 Amazon gift card for their participation in this one-hour study, which was approved by the Rice University Institutional Review Board.
This study used a 6 (scenario type: lane changing, entering a construction zone, approaching a stop sign, stopping at a multi-way stop sign, yielding at a roundabout, signaling with the horn) × 3 (timing of explanation: before, during, or after the vehicle's action) within-subjects design to measure drivers' understanding of the explanation, their response time to the explanation, and their subjective evaluations of the explanations (appropriateness of length, appropriateness of timing, usefulness, and reasonableness).
The experiment was presented through E-Prime 3.0 software (Psychology Software Tools, 2016) and Qualtrics on a 24-inch Dell computer monitor in a quiet room. The experimental trials assessed drivers' ratings of different explanations provided by the partially automated vehicle. Participants were told that they would be driving a partially automated vehicle and were informed of the automated features and support it would provide.
Eligible participants experienced a total of 20 trials in E-Prime: two practice trials followed by 18 experimental trials. For each experimental trial, participants were presented with one of the six scenarios paired with one of the three explanation timings. For all 20 trials, a description of the scenario was presented visually on the screen. Participants were asked to read the scenario aloud, to ensure they had read it, before pressing the spacebar to continue to the explanation. The explanation was presented both visually and audibly. Participants were instructed to press the spacebar as soon as they understood the explanation; the time it took them to press the spacebar served as the measure of response time. Participants then restated the explanation in their own words. Afterward, participants rated their agreement or disagreement with four statements measuring their subjective evaluations of the explanation on a 7-point Likert scale and explained the reasoning for each rating. At the end of each trial, participants could leave any other comments about the explanation before continuing.
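For concreteness, the full crossing of the two within-subjects factors can be illustrated with a short script. The sketch below (in Python, with hypothetical variable names and an assumed per-participant randomization of trial order; the study itself administered trials through E-Prime 3.0) simply enumerates the 6 × 3 = 18 scenario-timing combinations.

# Illustrative sketch of generating the 18 experimental trials by fully
# crossing the two within-subjects factors. Variable names, the use of Python,
# and the per-participant shuffling are assumptions for illustration only.
import itertools
import random

scenarios = [
    "lane changing",
    "entering a construction zone",
    "approaching a stop sign",
    "stopping at a multi-way stop sign",
    "yielding at a roundabout",
    "signaling with the horn",
]
timings = ["before", "during", "after"]

# 6 scenarios x 3 explanation timings = 18 experimental trials
trials = list(itertools.product(scenarios, timings))

def trial_order(participant_seed):
    """Return a randomized trial order for one participant (assumed)."""
    rng = random.Random(participant_seed)
    order = trials.copy()
    rng.shuffle(order)
    return order

for scenario, timing in trial_order(participant_seed=1)[:3]:
    print(f"Scenario: {scenario} | Explanation timing: {timing}")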
Outcome
A multivariate analysis of variance (MANOVA) was conducted on the reasonableness, usefulness, appropriateness of length, and appropriateness of timing ratings for the explanations. A significant main effect of explanation timing on appropriateness of length showed that participants rated the explanation as a more appropriate length when they received it during the action than before the action. Additionally, response times were significantly faster when the explanations were presented before the action than after it. Together, these findings suggest that drivers may not need a countdown in the explanation.
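As an illustration of how such an analysis could be set up, the sketch below uses Python's statsmodels package with hypothetical column names and a long-format data frame. The paper does not state the analysis software or model specification, and this simple formulation does not model the repeated-measures structure of the within-subjects design; it is a sketch under those stated assumptions, not the study's actual analysis.

# Minimal sketch of a MANOVA over the four subjective ratings, assuming a
# long-format data frame with one row per trial and hypothetical column names.
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("ratings.csv")  # hypothetical file: one row per trial

model = MANOVA.from_formula(
    "reasonableness + usefulness + length_appropriateness + timing_appropriateness"
    " ~ C(timing) * C(scenario)",
    data=df,
)
print(model.mv_test())  # multivariate test statistics (e.g., Wilks' lambda) per effect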
A significant main effect of explanation timing on usefulness showed that participants rated the explanations as more useful when they received the explanation during the action than after the action. Another significant main effect of explanation timing on appropriateness of timing showed that participants rated the timing of the explanations as more appropriate when they were presented before or during the action than after the action. These findings may be related to drivers being accustomed to navigation systems, which provide directions before an action is required.
Similarly, a significant interaction between scenario type and explanation timing on appropriateness of timing showed that participants rated the explanations presented before and during the action as better timed than those presented after the action for the construction-zone, lane-changing, roundabout, and stop-sign scenarios. Participants also rated the timing as more appropriate before the action than during the action for the horn scenario. These trends may not have appeared for the multi-way stop sign because judging which vehicle arrives at the intersection first is subjective, and drivers wanted to learn how the vehicle processes this type of intersection (Ouellette & Wood, 1998).
A significant interaction between scenario type and explanation timing on reasonableness showed that participants rated the explanation as more reasonable when it was provided before the action than after the action. Additionally, when the explanation was provided after the action, participants rated the construction-zone scenario as more reasonable than the lane-changing scenario. This may be due to the complexity of the lane-changing scenario and to drivers not wanting to be overloaded with information after the action has already happened (Charlton, 2023).
Conclusion
Overall, people need a basic understanding of the system's abilities to be ready to take over when the system fails. This knowledge will allow human users to stay safe and vigilant when using AI systems, especially while driving. Our findings suggest that drivers prefer explanations that occur before or during an action rather than after it happens. Drivers also displayed behaviors aimed at learning how the AV processes information. Future studies can include driving scenarios in which the partially automated vehicle does not perform as expected, to determine whether drivers' preferences regarding the explanations change. Results may also depend on drivers' familiarity with certain intersections, such as roundabouts; future studies can measure drivers' familiarity with and knowledge of different road signs and intersections. This study has implications for emerging technology in transportation, human understanding of AI, and cybersecurity. Vehicle manufacturers can incorporate explanations into their current and future partially automated vehicles for the benefit of drivers.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by the National Science Foundation under grants #2241704 and #2245055, and by the Ken Kennedy Institute 2023 Shell Graduate Fellowship at Rice University. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
