Abstract
Voice assistants are increasingly popular in autonomous vehicles, with over one-third of United States drivers using them. However, more knowledge is needed for future technology advancements. A scoping review was conducted to understand factors that may impact the use of in-vehicle voice assistants. The PRISMA method identified relevant articles across three databases (Engineering Village, OneSearch, and IEEE Xplore), resulting in 35 articles meeting the criteria. The paper explores three areas of intersection between autonomous vehicles and voice assistants: user interactions, user experience, and accessibility. These areas cover subthemes of tone, gender, anthropomorphism, trust, privacy and security, situation awareness, and visual inclusiveness. Results show differences in individual preferences and an understanding of how voice assistants can integrate seamlessly to meet drivers’ needs and expectations. The findings of this study provide insights into the user experiences of voice assistants (VAs) in today’s vehicles and the factors that impact their usage, which will help improve future in-vehicle systems to become more inclusive.
Introduction
Over the past decades, voice assistants (VAs) have become widespread in vehicles. For example, in 2023, the number of VA users surpassed 4.2 billion and is projected to reach 8.4 billion by 2024 (He & Burns, 2022). In-vehicle VAs are AI-powered software agents that let drivers interact conversationally and carry out activities such as initiating phone calls hands-free (Braun et al., 2019). These systems promote drivers’ safety and convenience, as drivers can keep their hands on the wheel while performing tasks such as playing music or checking directions. Autonomous vehicles (AVs), in turn, are self-driving vehicles that allow drivers to engage in other tasks, such as using their phones, while the vehicle is driving. Currently, most AVs on the road are semi-autonomous, meaning the vehicle can control most tasks but there are still times when the driver needs to intervene (SAE International, 2024).
Although VAs provide many benefits to drivers, seamlessly integrating VAs into AVs remains challenging and requires further research into driver-technology interactions. One example is the user experience of VAs falling short of users’ expectations when carrying out commands or keeping the driver entertained. According to Kadam (2023), if a user has to repeat their request multiple times or if the VA responds inappropriately, the likelihood of continued use decreases. Negative interactions with VAs have been found to lower the perceived human-likeness of the interaction. Additionally, many VAs are perceived to have a robotic tone, with ambiguity about their emotional state. However, subjective human-likeness ratings significantly improve with changes in the tone, gender, and language used by the VA (e.g., Lee & Han, 2019; Wong et al., 2019). Despite these insights, there is a need to systematically understand the challenges that may influence the usage of VAs. Therefore, this study aims to conduct a scoping review and synthesize the literature on factors influencing voice assistant integration and user experiences of voice assistants in autonomous vehicles.
Method
This scoping review used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2009) to identify relevant research on VAs in AVs. The search was conducted in October 2023 across three databases (Engineering Village, OneSearch, and IEEE Xplore), using combined keywords related to VAs and AVs. One example of the search syntax was “(voice assistant) AND (autonomous vehicles).” The initial search yielded 253 results, and after removing duplicates, 198 articles remained. An additional three records were found through external sources. Only peer-reviewed academic journals and conference proceedings published in English within the last 5 years (January 2018 to October 2023) were included in the search. Based on the title and abstract review, 71 records were excluded, and after an in-depth evaluation of 127 articles, 92 additional records were excluded. Thirty-five articles were selected for the final data compilation (see Figure 1).
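The screening counts reported above can be tallied as a quick consistency check. The following is a minimal sketch in Python and is not part of the review’s method; the variable names are illustrative assumptions, and the three records found through external sources are left out because the text does not state at which stage they entered the flow.

```python
# Record counts as reported in the Method section.
initial_hits = 253                 # results of the initial database search
after_deduplication = 198          # records remaining once duplicates were removed
excluded_on_title_abstract = 71    # records excluded at title and abstract review
excluded_on_full_text = 92         # records excluded after in-depth evaluation

# Derived quantities at each screening stage.
duplicates_removed = initial_hits - after_deduplication
full_text_assessed = after_deduplication - excluded_on_title_abstract
included = full_text_assessed - excluded_on_full_text

print(duplicates_removed, full_text_assessed, included)
```

Under these assumptions, the derived values match the 127 articles evaluated in depth and the 35 articles included in the final compilation.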

Figure 1. PRISMA diagram detailing the study selection process.
Results
Across the 35 articles, six themes were identified (Figure 2), and the results are organized into one subsection per theme. The Tone section explores the impact of a VA’s tone on user perception and interaction (eight articles). Gender Characteristics examines user perceptions of VAs based on gender (six articles). In Linguistic Choice and Anthropomorphism, ten articles highlight the importance of using human-like natural language. Situation Awareness provides in-depth analyses of how VA interaction influences driver situation awareness (SA) (nine articles). In the Trust, Privacy, and Security subsection, nine articles focus on the critical role of trust in the VA-user relationship. Finally, Visual Impairment examines how VAs can enhance independence and accessibility for visually impaired individuals (nine articles). Some articles are counted under more than one theme because their topics overlap.

Figure 2. Distribution of selected studies across six key thematic areas.
Tone
The tone of a VA’s voice, together with how its prompts are formulated, shapes user perception and preference and is essential in gaining drivers’ acceptance of VAs in AVs (Lotz et al., 2022). Studies have shown that drivers tend to respond more favorably to assertive VA tones (lower-pitched and dominant-sounding) than to submissive ones (high-pitched and less powerful; e.g., He & Burns, 2022; Lee & Han, 2019; Wong et al., 2019; Yoo et al., 2022). For example, Wong et al. (2019) investigated the impact of VA assertiveness on driver attention in a simulated driving environment. Participants executed actions on a steering wheel when they heard a voice command while playing a mobile game. The results showed that the assertive voice was perceived as more urgent; thus, participants responded faster to it than to the submissive tone. Additionally, Jestin et al. (2022) examined people’s ratings of various available navigation voices. The findings showed that the assertiveness of a navigation voice also affected people’s willingness to use it daily: participants rated a more assertive voice as one they would be more likely to use as an everyday navigation tool.
Gender Characteristics
The gender of in-vehicle VAs may significantly influence users’ perceptions and acceptance of VAs (e.g., Jestin et al., 2022; Lee et al., 2019; Park et al., 2023). Studies have produced conflicting findings regarding user preferences for VAs with different gender characteristics. For instance, Jestin et al. (2022) reported that users prefer male VAs. In the study, participants rated Joanna, a female-voiced assistant, and Jordan, a gender-ambiguous VA, as less desirable and more artificial than Matthew, a male-voiced assistant. The study by Lee et al. (2019), however, reported the opposite. The authors tested four types of VAs: formal male, informal male, formal female, and informal female. A formal tone was defined as using proper grammar with a serious tone and avoiding contractions. In contrast, an informal tone used casual conversational speech, such as a friend giving directions. The study found that female VAs, whether formal or informal, with more relaxed tones were perceived as more likable and comfortable than male VAs with more formal tones. Particularly in sensitive situations or when providing emotional support, female-voiced VAs tend to be perceived as gentler and more compassionate (Dong et al., 2020). Male voices, in contrast, are usually associated with stereotypical traits such as authority and assertiveness. Park et al. (2023) suggested that preferences for VA gender may differ based on the user’s perception and needs. Dominant-toned male VAs are perceived as better suited to tasks requiring clarity, direction, or urgency, such as providing directions or warnings. In contrast, submissive-toned female VAs are perceived as more appropriate for engaging in conversation or assisting with non-driving duties. This pattern suggests that preferences follow gender stereotypes and depend on the task the driver is performing in the vehicle.
However, some studies have found that VA gender does not influence user preferences. For example, in He and Burns’s (2022) study of four types of VAs (aggressive male, passive male, aggressive female, and passive female), there was no statistically significant difference in how the VAs affected drivers. In addition, Tolmeijer et al. (2021) found no clear correlation between gender and voice pitch. The gender of VAs may have less effect on users’ preferences for tasks that do not require much interaction.
Linguistic Choice and Anthropomorphism
Compared to typically robotic-sounding counterparts, VAs designed with anthropomorphic qualities, such as human-like emotions and voices, have been seen to play a critical part in user acceptance and interaction (Large et al., 2019; Mahajan et al., 2021). Researchers have investigated drivers’ perceptions of anthropomorphism concerning intimacy, trust, and intentions to use in-vehicle VAs. For instance, Liu et al. (2023) surveyed 429 respondents about various VAs and found that anthropomorphism is perceived as one of the core competencies that inspire trust in users. Drivers tend to have more positive attitudes toward VAs with human-like psychological features (emotions, motivations, intentions) because such VAs are more relatable. Ruijten et al. (2018) reported similar findings when examining how anthropomorphism affects voice-based conversations in AVs. Drivers perceive conversational VAs as more intelligent than graphical user interfaces and are likelier to use them during automated driving scenarios. Moreover, Mahajan et al.’s (2021) study also supported the preference for conversational VAs over a traditional touchscreen interface. When a Wizard of Oz in-vehicle VA (offering functions such as traffic feedback and route navigation) was tested against a traditional touchscreen on a driving simulator, the VA with a voice command interface was preferred due to its “human-like” qualities. Participants gave the in-vehicle VA the highest trust ratings and expressed a sense of pleasure and control over the AV. Anthropomorphized VAs also received significantly higher ratings regarding competence, warmth, and social presence compared to other types of VAs (Lee et al., 2019).
However, Park et al. (2023) noticed a gap between conversational human-like VAs and traditional VAs by stressing the importance of changing in-vehicle VAs’ human characteristics while taking the context of the conversation into account. As people all have individual characteristics, changing the in-vehicle VA characteristics allows for a more personalized interaction with the driver (Wang et al., 2023).
Additionally, the type of questions the VAs ask can impact the user’s interaction with them. Baburajan et al. (2022) compared VA responses to users answering open-ended and closed-ended questions. With open-ended questions, users can express their opinions in more detail than with closed-ended questions, which allows the VA to better understand user preferences. However, closed-ended questions may provide more straightforward options. Depending on the situation, drivers should use their preferred language methods to enhance the overall driving experience in AVs (Large et al., 2019).
Situation Awareness
Drivers need to maintain situation awareness when driving. However, external and internal distractions often cause drivers to lose focus and fall out of the loop. By engaging drivers in stimulating and informative tasks, such as playing podcasts or audiobooks or engaging in conversations, VAs can create an environment in which drivers stay focused. Mahajan et al. (2021) found that participants who interacted with VAs demonstrated a greater ability to counter fatigue and responded more quickly to takeover processes than participants who did not. At the same time, semi-autonomous vehicles give drivers more free time for activities like reading or working, which can draw attention away from the road. A supplementary study by Mahajan (2021) investigated the impact of in-vehicle assistants during takeover situations in semi-autonomous vehicles. The study involved participants driving with and without a VA during a takeover request (TOR). Participants who interacted with VA features such as event reminders, entertainment, or road information exhibited more frequent glances toward the environment, including roadside objects, indicating higher situation awareness. Additionally, these participants were 31% more likely to make a timely takeover.
Trust, Privacy and Security
Perceived trust is a critical factor in people’s willingness to use VAs. Researchers found through a survey of 522 drivers that perceived usefulness and trust are the most critical factors in using autonomous vehicles (Choi & Ji, 2015). VAs can provide clear and concise information about the vehicle’s surroundings, giving users a sense of security and control. For example, trusted VAs can act as mediators between external and internal conditions, such as communicating weather conditions outside the vehicle on a foggy day (Wang et al., 2023). Hester et al. (2017) investigated drivers’ responses to different types of takeover requests in an automated vehicle. Four alert conditions were tested: no alert, sound alert, task-relevant alert, and task-irrelevant alert. The findings show that participants receiving task-relevant alerts were able to avoid colliding with the lead vehicle, whereas most participants failed to prevent a collision when the VA’s voice was irrelevant to the task. This highlights the importance of the information conveyed to the driver for a smooth transition between takeovers and autonomous driving. Interestingly, the study found no statistically significant difference in trust among the four alert conditions. This indicates that the type of alert did not significantly impact the drivers’ trust in the VA system. However, trust remains a critical factor. As Hester et al. (2017) point out, trust must be properly calibrated for the human operator to have the appropriate level of reliance on the vehicle. Participants’ ability to avoid collisions when provided with task-relevant alerts suggests that effective voice alerts can improve situation awareness and aid drivers during takeovers. This improvement in situation awareness and driver safety indirectly enhances the driver’s confidence and comfort in using the automated vehicle, which are key components of trust.
Therefore, while the alert type might not directly impact trust levels, the overall improvement in safety and situation awareness facilitated by relevant alerts can foster greater trust in the VA system (Hester et al., 2017; Large et al., 2019).
Privacy and security concerns are among the core factors hindering people’s trust and acceptance of VAs in AVs. People tend to be concerned about how their data is being handled. Two of the leading privacy fears are unauthorized access to data (through hacking or weak security) and improper use of data without the user’s permission (Hataba et al., 2022). A study by Maier et al. (2023) explored the privacy concerns of digital natives when using VAs. The experiment involved participants conversing with a conversational VA and found that even people aware of potential privacy threats still feel a sense of limited control over their data. This suggests a complex relationship with trust in technology: while people might be cautious about sharing personal information online, cumulative data can still reveal much about an individual (Kadam, 2023). Building trust requires more than just focusing on the technical aspects of data security (Liu et al., 2020); however, little research exists on the privacy of in-vehicle VAs in autonomous vehicles.
Visual Impairment
AVs have become an emerging solution for people with visual impairment, offering convenient transportation options (Brewer & Ellison, 2020). However, challenges still arise, including a lack of awareness and understanding of AVs, hindering decisions about using VAs (Brewer & Kameswaran, 2018). Previous studies reveal that people with no driving experience are less inclined to consider AVs due to their lack of familiarity with such technology. Although individuals with visual impairments have limited sight, they feel the need to be in control of the vehicle, reporting that control would provide them with a sense of safety (Ataya et al., 2021; Brewer & Kameswaran, 2018; Bynum et al., 2023). Therefore, VAs should prioritize user control by providing additional voice prompts while driving to increase situation awareness or offer more detailed information about the driving environment (Bynum et al., 2023). Other studies have also shown that people with visual impairment feel more trust when they can customize their preferences, such as recalling their trip settings. Furthermore, Brinkley et al. (2020) found a strong desire for independence and mobility among people with visual impairments. This suggests that future designs should investigate how to help people with visual impairment find the right balance between control and awareness to assist them in reaching their desired destination. In addition, Fink et al. (2021) discuss potential policies to meet the needs of people with visual impairment. The focus areas include AV accessibility and safety features such as audio and tactile feedback, accessible emergency override systems, and customizable speech-to-text features (Ataya et al., 2021; Fink et al., 2021; Lee & Kang, 2023). Additionally, people with visual impairment should have input control over their destination, route, and speeds (Brewer & Ellison, 2020). These efforts promote independence in AVs and prioritize safety and individual needs.
Discussion and Conclusion
This study conducted a scoping review of the factors that impact VAs’ usage and addressed the user experience of VAs in AVs, with the goal of improving human interaction with technology in a driving environment. VAs have the potential to significantly influence AVs, particularly in how their designs can impact user perception and interaction in a driving environment. Further research is needed to fully understand this relationship and develop voice assistants tailored to different users by offering a variety of features such as voice options, communication styles, and information details. For instance, research suggests that a dominant male voice may be more effective for navigation instructions, while a gentler female voice might provide better emotional support during unexpected events. Similarly, an assertive tone may be particularly beneficial in urgent situations or when giving clear instructions. However, overly assertive tones might hinder a driver’s ability to regulate emotions during manual driving. This highlights the need for VAs to adapt their communication style based on the situation.
Anthropomorphizing VAs, that is, giving them human-like qualities, can build trust and relatability between humans and in-vehicle VAs. Conversational VAs are perceived as more intelligent and more suitable for autonomous driving scenarios than traditional interfaces. However, the level of anthropomorphism should be carefully adjusted based on the specific task and context of the interaction. Excessive anthropomorphism might unsettle drivers, potentially undermining trust in the system. Drivers may be able to choose these settings and select their desired characteristics in their VAs. For example, visually impaired individuals feel the need to experience control of their AVs. Although they cannot physically see, they tend to feel safer and more aware of the driving environment when the VA gives prompts about their surroundings. This principle also applies to AV users with other disabilities, emphasizing the importance of control over the vehicle.
Given that individuals have different preferences, challenges, and needs, VAs may include customizable settings, including speaking styles and information details, to address these varied requirements. Future research may explore multimodal interactions involving voice, touch, and visual icons to help drivers in the driving environment and develop VAs in AVs that can cater to user needs. The findings of this study provide insights into the user experiences of voice assistants (VAs) in today’s vehicles and the factors that impact their usage, which will help improve future in-vehicle systems to become more inclusive.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
