Abstract
Autonomous vehicles (AVs) have major potential to save lives in traffic, and when implemented in ride-hailing, they can potentially reduce traffic congestion and save labor costs. However, the majority of research in human–vehicle interactions (HVI) focuses on safe pedestrian crossing. In a ride-hailing scenario, it is important to ensure that the AV and the rider can correctly identify one another and negotiate mutual terms for the pickup. This article explores how human interaction with an AV in a ride-hailing scenario can be improved, concentrating on external interaction. A Wizard-of-Oz study was conducted in which 29 participants were asked to negotiate a pickup with an AV in a real urban traffic scenario. The study investigated how aspects such as the approach speed of the vehicle, the pickup location for the rider, and light panels on the doors indicating where to enter the vehicle may affect the HVI and trust in the AV in general. Participants did not react differently when the vehicle approached at different speeds, but the stopping location could create confusion if the vehicle stopped after passing the rider. The findings also revealed that the majority of participants failed to understand the purpose of the light panels.
Introduction
Annually, road traffic accidents claim approximately 1.34 million lives globally, with vulnerable road users, such as pedestrians, cyclists, and motorcyclists, accounting for over half of these fatalities ( 1 ). In 2019, pedestrian deaths reached 6,205, underscoring a critical safety challenge in urban mobility ( 2 ). Recognizing that human error contributes to 94% of serious traffic incidents, the advancement in autonomous vehicle (AV) technology emerges as a potential solution to significantly reduce these numbers ( 2 , 3 ).
The estimated combined investment in AV technology exceeded $200 billion in 2021 ( 4 ). Concurrently, the ride-hailing market has expanded significantly, with Uber providing 6.8 billion rides in 2019 ( 5 ). There is a positive sentiment toward the application of AVs in ride-hailing ( 6 ), and it could reduce labor costs as well as improve the inefficient use of privately owned vehicles ( 7 ). Projections indicate that by 2030, autonomous ride-hailing could generate revenues as high as $20 billion in Los Angeles, CA ( 8 ).
Despite the high level of investments in AV technology, there is limited research on human–vehicle interaction (HVI) ( 9 ), although it is arguably a topic of high relevance, in particular, for designing a driverless ride-hailing service (also known as robotaxis and autonomous ride-hailing). The integration of AVs into daily traffic alters traditional road user interactions, removing nonverbal communication methods, such as eye contact and gestures, crucial for pedestrian safety and navigation ( 10 ). However, existing studies predominantly examine pedestrian responses to AVs, leaving a broad area of HVI within the ride-hailing context underexplored ( 9 ).
Insights from studies on pedestrian–AV interactions could offer a foundational framework for designing driverless ride-hailing systems; however, the specifics of the interaction differ significantly. For pedestrian–AV interactions, the primary aim is to safeguard human lives and ensure the smooth flow of traffic ( 3 ). In contrast, ride-hailing interactions focus on facilitating a clear identification and agreement process between the AV and the rider for a secure pickup and accurate delivery to the destination ( 11 , 12 ). In addition, debate remains on whether AV technology in general needs to adopt explicit modes for interacting with pedestrians because there is no generally agreed-on framework for existing pedestrian–driver interactions ( 13 , 14 ). This is different for ride-hailing services, where smooth and rapid pickups often rely on the driver and rider explicitly engaging in verbal (over the phone) and/or gestural (waving for identification) interactions ( 15 ). Given the fundamental differences in interaction requirements between pedestrian–AV interactions and those within ride-hailing scenarios, the applicability of insights from pedestrian–AV studies to the ride-hailing context is inherently limited. This distinction underscores the necessity for the exploration and design of approaches that specifically address the unique challenges and opportunities of enhancing AV–rider interactions in ride-hailing services.
The aim of this article is to explore human interaction with AVs in ride-hailing settings, emphasizing external interactions. Drawing on previous studies of pedestrian–AV interactions, three research questions were formulated to guide the investigation. The first two research questions (RQs) are designed to explore how individuals perceive implicit interactions, such as the vehicle’s movement, and explicit interactions, such as the use of external displays.
RQ1: How do riders perceive and react to implicit interactions in the form of an AV’s behavior in a ride-hailing scenario?
RQ2: How do riders perceive and react to explicit interactions with an AV in a ride-hailing scenario?
Based on insights gained from observing participants interacting with an AV and discussing their perception, the aim is to understand how the interaction design can be modified to make it easier for the participant to understand. Therefore, the following third research question was introduced.
RQ3: How can we design the interaction between the rider and AV in a ride-hailing scenario to make it easier to understand for the rider?
To answer these RQs, an exploratory Wizard-of-Oz study was designed and conducted in which the participants were asked to negotiate a pickup by an AV in real urban traffic. The focus was on aspects that could affect rider pickup by a robotaxi, including the approach of the vehicle, its stopping location, and light panels on doors that indicate where and when riders could enter the vehicle. In addition to observing the physical behavior and the gaze of the participants during the vehicle’s approach and pickup, interviews were conducted to understand their perception of the interaction and draw insights into how it can be designed.
In the cases studied, the speed at which the vehicle approached had minimal effect on the rider’s interaction with the car. However, inaccuracies in the vehicle’s stopping position undermined the rider’s perceived trust in the autonomous system. In addition, basic light signals were effective in directing riders to the appropriate entry; more advanced lighting solutions might be necessary for clearly communicating the status of the vehicle’s doors. The diversity in perceptions of the robotaxi’s utility and feasibility highlights a complex landscape of expectations, from concerns over its flexibility to its potential socioeconomic benefits. From these insights, this article contributes to the current understanding of how individuals interact with AVs in a ride-hailing context.
Related Work
Interaction design involves crafting a communication exchange between a user and a product, system, or service, aiming to minimize negative experiences while amplifying positive ones ( 16 , 17 ). To design the interaction modalities for AVs, we need to understand how humans generally interact with vehicles in traffic. This section discusses work related to pedestrian–vehicle interaction, pedestrian interaction with AVs, and initial findings on interaction design for robotaxis.
Pedestrian–Vehicle Interaction
Understanding the nuances of pedestrian–vehicle interaction is essential for designing AVs that can safely and effectively integrate into urban traffic environments. Pedestrians traditionally rely on a range of nonverbal cues, such as eye contact, head movements, and hand gestures, to convey their intentions to vehicle drivers ( 10 , 18 ). This interaction is inherently complex and culturally variable, indicating the need for AV designs that can adapt to diverse communication contexts ( 18 ). Vissers et al. highlight five critical factors influencing traffic interactions: (1) traffic rules; (2) expectations; (3) individual differences; (4) behavioral adaptation; and (5) informal rules coupled with nonverbal cues ( 19 ). Given the fixed nature of traffic rules, individual differences, and behavioral adaptations, the design focus for AVs primarily targets managing expectations and enhancing nonverbal communication strategies ( 20 ).
For pedestrians’ expectations, Schieben et al. identify four crucial aspects that pedestrians consider when predicting a vehicle’s future actions: (1) the vehicle’s driving mode; (2) its upcoming maneuvers; (3) its awareness of the surrounding environment; and (4) its capability for cooperation ( 20 ). These aspects form the basis on which pedestrians assess their safety and the vehicle’s intentions. Translating these considerations into the design of autonomous ride-hailing services unveils four corresponding user needs: (1) clearly identifying the vehicle; (2) understanding the current status of the vehicle; (3) knowing that the car is aware of the passenger; and (4) knowing the intent of the vehicle ( 15 ). This translation from pedestrian expectations to specific user needs underscores the importance of designing AV interfaces that are intuitive and informative, facilitating a seamless and safe interaction between the pedestrian (or rider) and the vehicle.
Pedestrian–AV Interaction
The design of interactions between pedestrians and AVs spans a spectrum from implicit to explicit mechanisms ( 13 , 21 ). Implicit interaction relies on nonverbal cues, such as a vehicle’s deceleration, signaling an invitation for pedestrians to proceed ( 13 , 22 ). In contrast, explicit interaction employs direct communication methods, including lights, displays, and projections, to convey the vehicle’s intentions to pedestrians and other road users ( 13 , 22 ). Interactions can also be categorized as external ( 14 , 23 ) or internal (in-vehicle) ( 24 ). Although pedestrians typically engage in external human–vehicle interactions (eHVI), the introduction of robotaxi services necessitates the consideration of internal human–vehicle interactions (iHVI) as well. This differentiation underscores the complexity of interaction design in AV systems, where ensuring safety and clarity in communication becomes paramount.
Implicit External Human–Vehicle Interaction
For pedestrians making a crossing decision in front of an AV, multiple studies have shown that pedestrians deduce the vehicle’s intentions primarily through its movement (25–28). For instance, Palmeiro et al. conducted a Wizard-of-Oz study and found no significant difference in the gap distance between a pedestrian and an AV compared with that of a conventional vehicle, suggesting that pedestrians rely on the vehicle’s movement rather than explicit signals from a driver ( 29 ). Similarly, Clamann et al. identified gap distance as the primary factor influencing pedestrians’ decisions to cross, followed by the vehicle’s speed and the surrounding traffic density ( 25 ). In addition, Rothenbücher et al. observed that individuals tended to follow traditional crossing patterns unless there was a “breakdown of expectations” ( 27 ).
These findings reinforce the reliance on vehicle movement for interpreting AV intentions during road-crossing scenarios; they also illuminate a research gap in understanding these dynamics within ride-hailing. Unlike the straightforward goal of avoiding vehicles for safety during road crossings, ride-hailing interactions involve a nuanced goal where riders seek to safely approach the vehicle. This distinction underscores the need for further investigation into how implicit vehicle movements influence human–AV interactions in ride-hailing scenarios.
Explicit External Human–Vehicle Interaction
Evidence from numerous studies suggests that vehicle movement alone may not suffice for clear and successful interaction between pedestrians and AVs, highlighting the need for explicit eHVI (30–32). Investigations into external human–machine interfaces (eHMIs) for AVs reveal that interfaces explicitly designed for communication markedly enhance the efficiency and safety of pedestrian–AV interactions compared with systems lacking these interfaces ( 9 ). In addition, Habibovic et al. demonstrated that even brief training sessions on these external interfaces significantly improve people’s comprehension of the signals conveyed ( 33 ).
When exploring the design of these interfaces, Dey et al. categorized 70 eHMI concepts into five communication modalities: (1) visual; (2) auditory; (3) haptic; (4) body language; and (5) others that defy traditional classification ( 34 ). Among these, abstract visual communication emerged as the predominant modality in pedestrian–AV interaction studies ( 34 ). Despite the general consensus on the role of eHMIs in clarifying AV intentions, the research community has yet to converge on the optimal type and modality for eHMI communication ( 34 ). In addition, pedestrians tend to favor interfaces that resemble familiar signals, such as traffic lights and signs, indicating a preference for conventional cues (35–38).
Of note, leading companies in autonomous ride-hailing—Google, Uber, and Lyft—have innovated in this space, filing patents for car-to-pedestrian communication using external displays. These patents illustrate a range of approaches from signaling a vehicle’s intent to pedestrians, as seen in Google’s patent ( 39 ), to more interactive solutions like Uber’s incorporation of a virtual driver and directional arrows ( 40 ), and Lyft’s display of the customer’s name on the windshield for easy identification ( 41 ).
Recent research has extensively explored explicit eHMIs, focusing on enhancing safety and user experience; however, the efficiency of these interfaces remains less examined ( 34 ). In addition, the specific application of external interfaces in facilitating interactions between riders and AVs in a ride-hailing context remains an area with limited available research ( 6 ).
Studying Implicit and Explicit eHVI
To study the nuances of eHVI, researchers employ a spectrum of methodologies, from monitor-based studies, which offer basic interaction scenarios, to immersive virtual reality (VR) environments that provide a more lifelike experience ( 9 ). Although a select few investigations incorporate real physical prototypes to assess eHVI, the deployment of high-fidelity driving simulators is often advocated to ensure optimal control and safety during the studies ( 42 ). However, studying human interactions with AVs in actual traffic conditions is deemed ideal for achieving the broadest environmental applicability of the research outcomes ( 9 ).
Given the current technological constraints in deploying AVs, the Wizard-of-Oz methodology emerges as a pivotal tool for investigating HVIs in realistic settings while ensuring participant safety. This approach involves concealing a human driver from the participants’ view, for example, using a car seat costume, to simulate a driverless vehicle experience, often referred to as the GhostDriver protocol ( 27 , 43 ). Introduced by Rothenbücher et al., this technique effectively convinced 87% of study participants of the vehicle’s autonomous operation, with 80% noting the absence of a visible driver ( 27 ). These findings underscore the method’s potential in exploring human interactions with presumed AVs. Hensch et al. found that 79% of their participants did not discern any driver in a Wizard-of-Oz AV; half genuinely believed in the car’s autonomous capabilities, suggesting that perceptions of autonomy are influenced by multiple factors beyond the absence of a driver ( 44 ).
Design and Validation of Driverless Ride-Hailing
There is a small but growing body of literature directly addressing the design and validation of robotaxi services ( 45 ). Kim et al. identified potential pain points for riders in South Korea and conducted a Wizard-of-Oz study to evaluate their effect on the robotaxi user experience ( 11 ). To address the challenges associated with safe and unambiguous pickup, they proposed the concept of a virtual taxi stand, “a designated place where a passenger may board or disembark from an autonomous taxi” ( 11 ). The exact shape and location of the virtual taxi stand is explicitly communicated to the rider using augmented reality overlays within the ride-hailing app. However, their study primarily focused on explicit eHVIs through the digital interfaces of a smartphone app, leaving implicit interactions unexplored.
To evaluate the role of both explicit and implicit eHMIs for ride-sharing, Hoggenmueller et al. conducted a study using a 360-degree video of a complex urban scenario presented to participants in a VR environment ( 12 ). Their findings emphasize the necessity for holistic design approaches to enhance AV–pedestrian interactions, noting participants’ preference for a combination of implicit and explicit cues ( 12 ). However, because their study was conducted in VR, questions remain about the real-world applicability of the results and the potential influence of participants’ varying levels of previous VR experience on their perceptions during the study.
Lee et al. highlighted that making robotaxis clearly stand out in traffic could lead to higher user acceptance because of the positive sentiment of cutting-edge technology ( 46 ). A generally positive attitude toward the deployment of AVs as driverless taxis was also registered among taxi service consumers in the UK ( 6 ). Both studies focus on customer acceptance and trust in robotaxi service, but do not delve into the dynamics of rider interactions with the vehicle.
Exploring the in-vehicle interaction, Meurer et al. observed rider behaviors concerning luggage management, noting a hesitance to use the luggage compartment without clear communication methods with the vehicle—a concern absent in interactions with human-driven taxis ( 47 ). In addition to interacting with the robotaxi, Yoo et al. described the need for riders to communicate with other road users (e.g., by honking) to alleviate any anxieties related to traffic situations that were perceived as dangerous (e.g., cut-ins, reckless driving, protruding vehicles, and external horn sounds) ( 48 ). However, these studies did not address anxieties linked to the eHVI between the passenger and the robotaxi.
Methodology
To answer the three RQs, an exploratory study was conducted. This approach is reasonable because the aim was to explore “how” participants interact and perceive their interaction with driverless vehicles ( 49 ) in a ride-hailing context when presented with implicit (RQ1) and explicit (RQ2) interaction cues and to identify potential issues as well as means that can make interaction easier to understand (RQ3). Because there are no commercial driverless ride-hailing services available in the region, a Wizard-of-Oz setup was utilized. This approach is common in the human–computer interaction field and has been utilized previously in research on driverless vehicles ( 50 ). The following sections outline the setting of the study before discussing the study procedure, data collection, and data analysis, followed by a description of the study population.
Setting
To conduct the study in a realistic setting, the vehicle (https://adl.cs.ut.ee/lab/vehicle) from the Autonomous Driving Lab at the University of Tartu (a Lexus RX 450h modified by AutonomouStuff [autonomoustuff.com]) was used (Figure 1). The vehicle has a visible lidar system and signs indicating “Autonomous Driving Lab” on both sides of the car (Figure 1a). To ensure that the vehicle would operate within the confines of the study and to ensure the safety of the study participants, other road users, and pedestrians, the car was not used in autonomous mode. Instead, it was driven by a driver obscured by a seat costume (Figure 1b), creating the illusion that the car operated without a driver.

Figure 1. (a) University of Tartu’s Autonomous Driving Lab’s car; and (b) a driver sitting inside a seat costume.
As the location for the study, a straight side street within a residential area of Tartu, Estonia, was chosen (Figure 2). This choice was made because: (1) ordering a ride-hailing service from home (i.e., the residential area) is a common scenario; (2) the location had a parking spot on the street that would allow the vehicle to stop safely and the study participant to interact with the vehicle without obstructing traffic; and (3) the participant could see the car from a distance, which would allow their gaze to be studied when the car was approaching.

Figure 2. (a) Participant’s field of view when waiting for the vehicle to arrive; and (b) map of the study location with the direction from which the vehicle will approach the participant.
To recruit participants for the study, an email invitation was created. The invitation contained information about the nature of the study, its goal, and procedure. In addition, it contained a link where interested individuals could book one of the 40 time slots on a first-come, first-served basis. Bolt, a well-established ride-hailing company in Estonia, sent emails to 2,000 active customers of their service. A total of 34 individuals were invited to book a time slot, and 29 participated in the study.
To answer the three main RQs outlined in the introduction, different implicit and explicit interaction cues were selected, and participants were distributed across the resulting conditions (Table 1). First, to explore how the vehicle behavior can influence the external interaction of a rider as well as their perception of the interaction with a driverless vehicle in a ride-hailing scenario (RQ1), two conditions related to the behavior of the vehicle were selected: (1) the speed at which it approached the study participant; and (2) the stopping position relative to the waiting location of the participant. This choice was made because the speed and stopping location are often considered common modes of implicit interaction (as described in the Pedestrian–AV Interaction section). In addition, in a ride-hailing context: (1) riders have to be able to spot the vehicle (for which the speed of an approaching vehicle could be a cue); and (2) because some form of explicit signaling between the driver and rider may be used to reduce the ambiguity over the exact pickup location, the implicit act of choosing a stopping location may influence how the rider perceives the vehicle’s intent. For the speed of the vehicle, two states were differentiated: regular speed (i.e., the common speed that most drivers would use at the chosen location) and low speed (i.e., an approach that would resemble a novice or cautious driver). For the stopping position, three states were differentiated: the vehicle stopping before reaching the waiting location of the participant, stopping at the position where the participant was waiting, and stopping after having passed the waiting participant.
Table 1. Study Conditions for the Research Questions (RQs)
Note: Two different states were designed for conditions ‘Speed’ and ‘Doors’, whereas three states were developed for conditions ‘Stopping position’ and ‘Door indicator’. na = not applicable.
To additionally explore how explicit interaction can affect the external interaction of a rider as well as their perception of the interaction with a driverless vehicle in a ride-hailing scenario (RQ2), two conditions were selected: (1) the doors being locked or open; and (2) a light indicator next to the rear passenger door. Because riders may choose to take the front or back seat, the study explored whether the most commonly suggested explicit modality, an abstract light panel ( 34 ), could be used to guide the participant toward the back seat while indicating whether the vehicle doors are locked. A small light panel was selected (Figure 3) and utilized in three separate states: (1) the light being off all the time; (2) the light being red and turning green after the car had stopped; and (3) the light being green all the time.

Figure 3. Light signal (currently red) used in the study.
Study Procedure
Participants were distributed equally between the study conditions (Table in Appendix A). At the beginning of each study session (Figure 4), participants were asked to fill out a questionnaire (see section Data Collection for details on the content of the data collection instruments). Then, the study setup was introduced, and the participants were asked to imagine that they had just ordered a ride-hailing vehicle to pick them up. They were instructed to wait for the car to arrive and then enter it. The door they should use was not specified. Next, the participants were helped to put on and calibrate the eye-tracking device, and the ghostdriver was signaled to come to the location. The driver then approached the waiting location of the participant at a regular or slow speed and stopped before, at, or after the waiting location, depending on the study condition. The light signal at the rear passenger door was operated by the study investigator via a remote control according to the study condition.

Figure 4. Study procedure.
After the car had stopped, the participant was observed approaching and entering the vehicle. After they had entered, a short semi-structured post-study interview was conducted. The participants were randomly assigned to the different study conditions to ensure that multiple participants experienced all conditions (Figure 5).

Figure 5. Distribution of participants per study conditions.
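The text states only that participants were randomly assigned and distributed equally between the condition states; it does not describe the assignment mechanism. As a minimal sketch of one way such a balanced random assignment could be produced, the following Python example shuffles an evenly filled pool of states for each condition. The function name `assign_conditions`, the state labels, and the seed are hypothetical, and the sketch ignores any "not applicable" combinations noted for Table 1.

```python
import random
from collections import Counter

# Condition states as named in the text; the assignment scheme is an assumption.
CONDITIONS = {
    "speed": ["regular", "low"],
    "stopping_position": ["before", "at", "after"],
    "doors": ["locked", "open"],
    "door_indicator": ["off", "red_then_green", "green"],
}

def assign_conditions(n_participants, conditions, seed=0):
    """Assign one state per condition to each participant, balanced within each condition."""
    rng = random.Random(seed)
    assignments = [{} for _ in range(n_participants)]
    for name, states in conditions.items():
        # Repeat the states often enough to cover everyone, trim, then shuffle.
        repeats = -(-n_participants // len(states))  # ceiling division
        pool = (states * repeats)[:n_participants]
        rng.shuffle(pool)
        for participant, state in zip(assignments, pool):
            participant[name] = state
    return assignments

plan = assign_conditions(29, CONDITIONS)
for name in CONDITIONS:
    # State counts within one condition differ by at most one participant.
    print(name, dict(Counter(p[name] for p in plan)))
```

With 29 participants, a two-state condition splits 15/14 and a three-state condition splits 10/10/9, so every state is experienced by multiple participants even though the full cross of all states (2 × 3 × 2 × 3 = 36 combinations) cannot be covered.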
The study design was submitted to and approved by the Research Ethics Committee of the University of Tartu (application code 332/T-26).
Data Collection
To answer the three RQs stated in the Introduction section, data were gathered from multiple sources, including a pre-study questionnaire, eye-tracking glasses, a video recording of the study procedure, and a semi-structured post-study interview.
The aim of the pre-study questionnaire was to gather information about the participants related to their demographics and their frequency of using a ride-hailing service. For the latter, a scale was utilized that included six options: (1) I don’t use it; (2) once a year; (3) a few times in a year; (4) 1–3 times a month; (5) once a week; and (6) several times a week. The questionnaire results served as a qualitative datapoint that provided additional context for the analysis.
For eye-tracking, Tobii Pro Glasses 2 were used, which recorded the participants’ field of view as a video and the location they were looking at. The eye-tracking video served three primary purposes: (1) to assess whether participants noticed the ghostdriver; (2) to assess whether and how they followed the trajectory of the approaching vehicle; and (3) to assess whether and when the participant looked at the light panel (RQ2). The gaze of 24 participants was successfully recorded. For five participants, these data were missing because the participants were not able to wear the device or because of technical issues.
In addition, the study was video-recorded utilizing a tripod, which was set up behind the participant. The aim was to register any visible reactions to the vehicle’s movement (RQ1) and choice of door (RQ2).
In addition, a semi-structured interview was conducted at the end of the study. At the beginning of the interview, participants were informed that the car was not really driving autonomously and were introduced to the Wizard-of-Oz concept to ensure that all participants had equal amounts of information. Then, they were asked if they noticed a driver or thought that the vehicle was driving autonomously (e.g., “Did you think the car was driving autonomously? Did you notice the vehicle was missing a driver?”) to assess the feasibility of the Wizard-of-Oz setup. Then, participants were asked about their perception of how the vehicle approached them (e.g., “What was the car movement like? How did this movement make you feel?”) and about the stopping location of the vehicle (e.g., “Describe what happened when the vehicle stopped.”), addressing RQ1. In addition, they were asked how they decided which door to approach and how the light signal might have influenced their decision (e.g., “How did you decide which door to open?” “Did you notice a light signal?” “What do you think it meant?”), addressing RQ2. To identify potential means for improving the understandability of the interaction (RQ3), they were asked about their general perception of the study (e.g., “Was there a moment during the study when you would have liked the car to behave differently?”). Finally, they were asked to rate their perceived safety during the study on a Likert scale (adopted from [ 44 ]) from one (completely agree) to seven (completely disagree) and to explain their choice. The aim was to identify whether the vehicle behavior (RQ1), external interaction (RQ2), or any other aspect (RQ3) influenced their perception of the safety of the driverless vehicle. Therefore, the survey was utilized as an additional qualitative rather than quantitative data point that aided the analysis of the interviews and observations.
All interviews were conducted in the participants’ native language, Estonian, by the study’s main investigator, an Estonian native speaker. A full list of interview questions can be found in Appendix B.
Data Analysis
The recordings of the interviews were utilized as the main data source for answering the three RQs. The interviews were analyzed in the language in which they were conducted, and only quotes were translated from Estonian to English for reporting. The translation was carried out by the main investigator and verified by another author of this article, who is also an Estonian native speaker.
To analyze the interviews, a procedure based on inductive thematic analysis ( 51 ) was followed. In particular, the researchers familiarized themselves with the data, and initial codes were created based on the three main RQs (vehicle movement, stopping location, light signal, and noticing the driver). Then, the codes were applied to the interviews and, over multiple iterations, the coding scheme was expanded until the saturation point was reached. Next, the video recordings were used to confirm statements made by participants during the interviews, focusing in particular on whether participants were actually looking at the vehicle when it was approaching, whether there were any signs of them recognizing the driver in the seat costume, and whether they had noticed the indicator light at the rear passenger door. Similarly, the eye-tracking data was utilized to confirm participant statements. However, because the gaze tracking was often unreliable, the field of view of a participant, which was also recorded by the eye-tracking device, was utilized instead. Finally, the questionnaire results served as additional context and were utilized to aid the understanding of the qualitative findings.
Study Population
In total, 29 individuals participated in the experiment, 15 male and 14 female. Their ages ranged from 19 to 57 years, with an average of 35 years (standard deviation [SD] = 10.88). Regarding their frequency of using ride-hailing services (Figure 6), most participants reported using the service “1–3 times a month” (52%). About a quarter of the participants noted that they used it more often than that, with five participants (17%) stating that they used it “Once a week” and two (7%) stating that they used ride-hailing services “Several times a week.” The remaining seven participants (24%) answered that they used ride-hailing services “A few times a year” (n = 6 or 21%) or “Once a year” (n = 1 or 3%), respectively. All study participants had used ride-hailing services at least once, which can be expected as they were recruited through a ride-hailing service provider. Of note, there was no apparent effect of the frequency of using a ride-hailing service on the user experience.

Figure 6. Participants’ frequency of using ride-hailing services.
Results
The following sections discuss the findings from the study, starting with the feasibility of the Wizard-of-Oz setup and then the findings related to the three RQs gained from the post-study interviews.
Feasibility of Wizard-of-Oz Study
In total, 22 participants (76%) believed the car was driving in autonomous mode. Out of these 22, three participants (14%) mentioned that they did not look explicitly at the driver’s seat, and the remaining 19 (86%) said they did not see a driver. Out of the seven participants (24%) who did not believe the car was autonomous, three (43%) claimed to have noticed the driver. Another three participants (43%) mentioned that, despite not seeing any driver, they were certain that someone controlled the car. The remaining participant did not look at the driver’s seat at all. Because only three participants out of 29 (10%) reported noticing the driver in the seat costume, the proposed Wizard-of-Oz setup was considered feasible for convincing participants that the approaching vehicle was driverless and operating autonomously.
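The percentages in this subsection use different denominators (all 29 participants, the 22 who believed the car was autonomous, or the seven who did not). The following minimal Python sketch recomputes them from the counts reported above. The variable and function names are illustrative assumptions and are not part of the study materials.

```python
# Counts taken from the text above; all names are illustrative only.
TOTAL = 29
believed = {"did_not_look_at_driver_seat": 3, "saw_no_driver": 19}
did_not_believe = {
    "noticed_driver": 3,
    "assumed_someone_in_control": 3,
    "did_not_look_at_driver_seat": 1,
}


def pct(count: int, denominator: int) -> int:
    """Return count/denominator as a percentage rounded to the nearest integer."""
    return round(100 * count / denominator)


n_believed = sum(believed.values())
n_not_believed = sum(did_not_believe.values())

summary = {
    # Shares of the full sample.
    "believed_autonomous": pct(n_believed, TOTAL),
    "did_not_believe": pct(n_not_believed, TOTAL),
    "noticed_driver_of_total": pct(did_not_believe["noticed_driver"], TOTAL),
    # Shares within the group that believed the car was autonomous.
    "saw_no_driver_of_believers": pct(believed["saw_no_driver"], n_believed),
    "did_not_look_of_believers": pct(believed["did_not_look_at_driver_seat"], n_believed),
    # Share within the group that did not believe the car was autonomous.
    "noticed_driver_of_nonbelievers": pct(did_not_believe["noticed_driver"], n_not_believed),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

Keeping the denominator explicit in each call makes it clear that the 43% who noticed the driver refers to the seven skeptical participants, whereas the 10% refers to the full sample.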
Two participants (7%) mentioned being afraid when they noticed no driver present. First, such reactions further confirm that the Wizard-of-Oz setup was convincing. Second, they may indicate that some individuals experience anxiety when visibly driverless vehicles are introduced into general traffic, as evidenced by the following statement: “Even if I knew beforehand that something like that [vehicle with no driver] would come I still got scared when I did not see anyone there [in the driver seat]” (P25).
To confirm that the participants were paying attention to the approaching vehicle, the video recordings available for 24 participants were analyzed. In total, 23 of the 24 participants (96%) kept the approaching vehicle in the center of their field of view most of the time. The center was defined by splitting the visual field into three equal parts. If the vehicle mostly occupied the center portion, participants were considered to have kept the vehicle in the center. Because individuals generally keep objects of interest in the center of their view, this suggests that these participants were concentrating on the vehicle. The one participant who was not looking at the approaching vehicle appeared distracted by a street sweeper machine.
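The center-of-view criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field of view is split horizontally into thirds, the vehicle’s horizontal position is available per video frame, and “mostly” is read as more than half of the frames. The study does not specify these details, and the function names and example values are hypothetical.

```python
from typing import Sequence


def in_center_third(x_center: float, frame_width: float) -> bool:
    """True if a horizontal position lies in the middle third of the frame."""
    return frame_width / 3 <= x_center <= 2 * frame_width / 3


def kept_in_center(
    x_centers: Sequence[float], frame_width: float, threshold: float = 0.5
) -> bool:
    """True if the vehicle is in the center third in more than `threshold` of frames.

    The 0.5 default is an assumed reading of "mostly"; it is not from the study.
    """
    if not x_centers:
        return False
    hits = sum(in_center_third(x, frame_width) for x in x_centers)
    return hits / len(x_centers) > threshold


# Hypothetical vehicle positions (pixels) in a 1920-pixel-wide field-of-view video.
example_positions = [900, 950, 1000, 1300, 700, 980]
print(kept_in_center(example_positions, frame_width=1920))  # True (5 of 6 frames)
```

In this hypothetical example, the center third spans 640 to 1280 pixels, so five of the six positions count as centered.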
How Do Riders Perceive and React to Implicit Interactions in an AV’s Behavior in a Ride-Hailing Scenario? (RQ1)
Findings related to two study conditions, the speed of the vehicle and the stopping location, will be discussed before discussing how participants observed the behavior of the vehicle.
Speed of the Vehicle
When describing the movement of the vehicle, the participants who experienced the approach at a slow speed (Figure 7a) mostly used words such as “slow,” “regular,” “careful,” and “safe.” However, one participant also mentioned that the explicitly slow movement made them feel the vehicle was “helpless,” further explaining that because of this, they felt the desire to take control of the car. At the same time, the slow movement ensured the feeling of safety.
It approached so slowly that it seemed helpless. Maybe it doesn’t understand where I am and what it has to do. And because there was no driver, I felt as if I needed to take control. Because it came so slowly, it made me feel it was careful, so I had a safe feeling. (P4)
In addition, it is worth mentioning that two participants reported that the slow and careful movement gave a feeling of autonomy: “That it came so slowly. A human driver would never stop a car like this. It stopped so slowly, that it kept moving and moving and moving” (P1).
For the regular speed, participants used words such as “regular,” “smooth,” and “peaceful,” but also “safe,” “slow,” and “calm” (Figure 7b). They mentioned that the approach made them feel calm and that the smooth driving style gave them a feeling of safety. In general, none of the participants used negative words to describe the vehicle’s speed. Analyzing the perceived safety for both groups (slow: mean = 6.29, SD = 0.74; regular: mean = 6.42, SD = 1.10) provides further indication that both speeds were considered acceptable by the participants. The larger standard deviation in the group that was subjected to the regular speed indicates that there might be more heterogeneous perspectives.

Figure 7. Word clouds depicting adjectives participants used to describe the movement of the vehicle when it was approaching: (a) slowly; and (b) at a regular speed.
Stopping Location
Out of 10 participants for whom the car stopped before reaching their waiting location, six (60%) said they expected the car to come closer, three (30%) said the location was fine for them, and one person was looking in the other direction and did not notice the car’s stopping location. Out of the six who expected the car to come closer, three (50%) mentioned that they were fine with the car stopping that far as the “[r]eal taxis do not come exactly to you either, so it really doesn’t matter” (P3) and two (33%) mentioned that the stopping location did not make them feel any different. One was even convinced the car stopped at this location because of a technical error.
Of the 10 participants for whom the car stopped right at them, eight (80%) said the location suited them, and two (20%) expected the car to come even closer.
All nine participants for whom the vehicle stopped after passing their waiting location mentioned that they expected the car to stop at their location and not after it. Two out of these nine (22%) said it was still acceptable as they were used to taxis missing them, and another two (22%) trusted the vehicle to pick a suitable location. Participant P25, however, was unsure whether it was the correct car because it drove past them. They also added that because the vehicle “passed me so fast” (P25), they were scared and explained that they only started moving toward the vehicle after it had fully stopped. Another person mentioned a loss of perceived safety because of the vehicle missing them. This was evident from their rating of perceived safety as three out of seven and their explanation that the car missing their location made them doubt that the vehicle could handle similar situations in real traffic.
and then it drove past me, I wondered if it would do the same thing at intersections. That it knows it is an intersection, but still stops in the middle of it. A kind of flicker came into me just due to the fact that it drove past me. (P27)
Observing the Vehicle Behavior
Six out of the 29 participants (21%) mentioned that they closely observed the vehicle’s behavior when it was approaching. They specifically mentioned that they observed if the car followed the rules of priority at an intersection in the line of sight of their waiting location. One of these participants specifically mentioned that the car exhibiting what they perceived to be the correct behavior at that intersection increased their feeling of safety toward the car.
I observed the situation where it was supposed to cross the intersection, but a car came from the right. I looked at how it handled the situation. The vehicle did not get confused and was able to handle the situation correctly; therefore, a sense of safety arose. (P28)
How do Riders Perceive and React to Explicit Interactions of an AV in a Ride-Hailing Scenario? (RQ2)
In total, 13 of the 14 participants (93%) who were exposed to the light panel on the rear door noticed it. This was evident through the eye-tracking for 10 participants, and all 13 mentioned it during the interview. However, only two participants (14%) explicitly mentioned choosing the rear door because of the light panel, indicating that it guided them to the rear door. Of the 14 participants, two (14%) chose to open the front door instead, indicating that the panel might have failed to influence their behavior. On closer examination, one of them opened the front door only out of curiosity (“I chose the door to make it more exciting to drive, to see what is going on there in the front, so [I chose the door] out of pure curiosity.” P20). This participant noticed the light panel but did not think it was meant for them, but rather “related to the experiment” (P20). The other participant explained choosing the front door as “showing that I am not afraid of the nondriver. Generally, I would sit in the back, but this was my first thought that I will show courage and open the front door” (P4). Because this was the only person who did not notice the light panel, the findings suggest that the light panel supported participants in choosing the intended door to enter the vehicle.
Out of the aforementioned 14 participants, eight (57%) had the light panel indicating the door being locked (by displaying red), and the doors were also locked. None of them waited for the panel to go green before trying to open the door. Only two participants (14%) realized after trying to open the locked doors that they needed to wait for the panel to go green before trying again.
However, participants also had trouble understanding whether the panel was even meant for them and what it tried to communicate. Seven out of the 14 participants who were exposed to the light panel in the study (50%) mentioned that they did not think the panel was intended for them. For instance, they thought the light panel was perhaps “part of a car that was measuring something” (P1). Two participants also reported that they simply did not understand the meaning behind the light, as evidenced by the following quote: “I noticed the light, but I didn’t understand the meaning right away. I didn’t think it was for me, but rather part of the system” (P21).
Reactions to Locked Door
In total, 15 participants encountered a scenario in which the vehicle’s door was locked when they first tried to open it. For six of these participants (40%), the initial reaction was to turn to the experiment organizer, which indicates seeking human assistance when the technology appears to fail. Participant P11 also explained that they were surprised that the door did not open and that “[i]n case the car stopped, this is an indicator for me to open the door” (P11). For four participants (27%), their first instinctive reaction was to try to open another door.
How can we Design the Interaction Between Rider and AVs in a Ride-Hailing Scenario to Make it Easier to Understand for the Rider? (RQ3)
When asked about their general opinion of AVs, 15 out of 29 participants (52%) mentioned safety as a positive aspect. These participants perceived AVs to be less prone to accidents caused by human error, because AVs “do not get distracted” (P29) and “cannot violate traffic rules” (P6). In addition, one participant also mentioned that “people are irrational” (P27). Five participants also described the AV sector as an exciting field, mentioning that they perceived it to be more environmentally friendly because of the more efficient use of resources (cars) and logistical effectiveness.
One of the more common concerns voiced by eight participants (28%) was the lack of trust in the AV handling unexpected traffic situations. For example, “if there is road work, which is not included in the system, is the system able to learn and insert it fast enough, so that they are aware of fast changes on the road” (P17). Five participants also mentioned the fear of a technical failure: “with technology there is always a risk that an error might occur, but the same possibility is with a human, but it is still the unknowingness whether it will stop or might hit something in case of the error” (P3). Three participants mentioned being afraid of issues because of the behavior of human drivers around AVs as AVs “cannot foresee other foolish drivers who make stupid maneuvers” (P21).
When discussing the positive aspects of autonomous ride-hailing, four participants (14%) perceived AVs as a good fit for certain routes. One of them mentioned that they often ride from Tartu city center to the train station and regarded this route as easily automated (P25). Three participants mentioned the potential for taxi services to become more available and cheaper, and one other participant was concerned that the new technology might make the service more expensive. One person also mentioned that autonomous ride-hailing could make more efficient use of cars.
Nine participants (31%) discussed the disappearance of social interaction between the rider and the driver. However, there were differing opinions about whether this was a positive or negative aspect. Some argued they liked talking to taxi drivers, and others preferred the opposite. However, most participants mentioned that their preference to engage in social interaction with the driver depends on their mood and/or the driver (“There are days, when I want to talk to a person and others when I don’t” [P9]). One participant even proposed a potential solution for this apparent dilemma: “In the future I think you can choose in the app which one I will order [autonomous or with a human driver]” (P9).
As a negative aspect of automation, seven participants (24%) mentioned taxi drivers losing their jobs. Five participants (17%) were also concerned that the service flexibility is much more complicated in the case of autonomous ride-hailing. They, for example, perceived making a stop in the middle of the ride to be much more complicated when dealing with an AV than with a human:
[…] humans are much more flexible than machines no matter how intelligent the machine is. I haven’t until today ever seen a very, very flexible machine. Through the (ride-hailing) app I can tell my needs, but it is much more complicated than telling the taxi driver, hey, stop here for a minute, I will go there for a moment and come back. I don’t even know how I would do this with a self-driving car. (P2)
Vehicle hygiene was also noted by one participant as a potential issue because AVs “cannot choose” (P27) their customers. Therefore, for instance, intoxicated customers may stain the vehicle, and there would be no driver to notice it and no one to clean it up.
Two participants (7%) stated that if they were in a hurry, they would choose a human driver instead of an AV. One of them (P13) reasoned that in the case of an AV, there is no driver whom they can ask to drive faster. The other (P20) mentioned that they would urge the driver to potentially violate traffic rules, which they did not perceive to be feasible with an AV: “If I was in a hurry, I don’t know if I would dare to take it [autonomous vehicle], if I needed to break the traffic rules to speed up” (P20).
Finally, five participants (17%) mentioned that for them as customers, nothing would really change. They explained that the process of app-based ride-hailing is already highly automated, and not many adjustments can be made, so they would not perceive the transition to a driverless vehicle to be too abrupt. “Actually in case of ride-hailing service with apps nothing changes, because maybe you would say hi to the driver, but otherwise the destination is known, price is known, paying is automatic, so it is a waste of human resource” (P5).
Discussion and Conclusion
There is a continued need for research on human–AV interaction in the context of ride-hailing. Although billions of dollars are invested in developing technical solutions, there are limited insights into aspects related to HVIs, which are a very important part of designing autonomous ride-hailing services. This article is one of the few studies to explore the external interaction of driverless ride-hailing (see also section Design and Validation of Driverless Ride-Hailing). It focuses on the interaction taking place during the pickup stage of the ride. The contribution of this article is the development of the methodology as well as the insights gained from the study.
Implicit Interaction
To examine how vehicle movement influences the interaction, the study looked at vehicle movement from two perspectives: (1) vehicle speed; and (2) stopping location. The results indicate that speed did not appear to strongly influence the way individuals perceived their interaction with the vehicle. Slow speed gave a certain group of participants a feeling of machine autonomy, which could indicate that these participants expect the driving style of AVs to be slower and more careful than that of a car with a human driver. Related to the stopping location, the findings revealed that overshooting might be an issue for some participants who reported that it reduced their trust in the AV.
Of note, all participants were able to understand the vehicle’s intention of stopping using implicit interaction, which corresponds to similar findings in pedestrian–AV interactions (25–28). This perception may have been bolstered by the distinctive look of the vehicle (e.g., noticeable lidar system on the roof) and because the study took place in a location with relatively light traffic. When autonomous ride-hailing services become more accessible, there might be a need for an explicit mode of interaction (e.g., rider’s name on the windshield [ 41 ]) to fill the customer’s need to clearly identify the vehicle, as suggested by Owensby et al. ( 15 ).
Explicit Interaction
The study used a binary red–green light panel as an explicit display to aid the participants in interacting with the doors of the vehicle. The light panel served two purposes. The color (red/green) indicated whether the doors were locked, and its placement on the rear door guided the participants to enter the vehicle through this door. For the displayed color, none of the participants waited for the panel to turn green before opening the door, showing that the light panel could not communicate its intended meaning. In general, the participants were confused about whether the light panel was meant for them and did not understand its meaning. This finding agrees with Hensch et al., who found that individuals have trouble understanding new arbitrary signaling concepts ( 30 ). However, Habibovic et al. reported that individuals will understand the abstract signals of external displays after some introduction ( 33 ).
Almost all participants noticed the light, and most used it as a signal to enter the car through the intended door. Therefore, it primarily served its guiding purpose. However, some participants deliberately chose the front door because they knew the vehicle was driverless. In addition, it is important to note here that choosing where to sit might be connected to cultural differences. In Estonia, sitting in the front passenger seat is quite common; however, in other countries, sitting in front is prohibited when using ride-hailing or taxi services. It may be that the habit of sitting in the front was too strong for the participants despite the light panel. However, this issue may become nonexistent when fully autonomous ride-hailing gets deployed, because some research (e.g., [ 52 ]) envisions a complete redesign of the interior of AVs after the need for a driver seat is eliminated. Therefore, the rider can freely choose their seat.
Other Findings
The findings indicate that some participants wanted to take control of the car when they saw it had no driver. This observation is in line with findings reported by Smith and Anderson, who revealed that some individuals do not want to ride in a driverless vehicle because they have the feeling of wanting to take control ( 53 ). Similarly, Yoo et al. described that riders may need a way (e.g., honking) to interact with outside road users during potentially high-risk traffic situations (e.g., cut-ins, reckless driving, protruding vehicles, and external horn sounds) ( 48 ).
The participants’ opinions about autonomous ride-hailing were diverse. A group of them regarded the ride-hailing service as very automated and requiring little contact with the driver. It is reasonable to assume that this group would find adapting to autonomous ride-hailing smooth. However, the findings also revealed multiple potential issues for autonomous ride-hailing, including reduced route flexibility (e.g., stopping in the middle of the ride, guiding the driver to stop at a specific location), vehicle hygiene, and the loss of social interaction. For social interaction, participants voiced different views; some mentioned that they would miss interacting with the driver, some were happy about the potential lack of social interaction, and others noted that it would depend on their mood. The need to frame the interaction with robotaxis using social conversation (e.g., “Hello car!”, “Goodbye car!”) was also identified by Meurer et al. ( 47 ). Therefore, the transition to autonomous ride-hailing needs to be gradual to give people the choice of an AV or a driver. It would allow all groups to be satisfied and help people gradually become used to vehicles without drivers.
Of interest, some participants reported preferring a human driver instead of an autonomous service when they were in a hurry because AVs will not break the traffic rules. The question of whether AVs should be allowed to break the rules is an aspect that is widely discussed in the ethics of AVs. The Law Commission of England and Wales and the Scottish Law Commission even went as far as to launch a joint public consultation to see how to adapt the laws for self-driving vehicles ( 54 ). For example, they analyzed whether there are any circumstances in which AVs should be programmed to exceed speed limits, drive on footpaths, or “edge through” pedestrians ( 54 ).
In addition, the high average score for perceived safety could be a sign of overtrust in technology, especially given the current technological level of AVs. The problem is that people may have an unrealistic assessment of the capabilities of the AV and therefore tend to overtrust the vehicle in a given situation ( 55 ). Overtrusting AVs can lead to accidents; for example, Tesla’s Autopilot was involved in at least three fatal accidents during 2017–2019 because people underestimated the consequences and over-relied on the product ( 55 ).
Despite the high average score for perceived safety, many participants still hesitated to trust the AV to handle unexpected situations. This finding aligns with a survey of 4,000 Americans on automation in everyday life ( 53 ). The respondents had several reasons for not wanting to ride in a driverless vehicle, such as a lack of trust, worry about giving up control, and safety concerns ( 53 ). From a positive perspective, they thought that riding in such a vehicle would be cool and safer, that they would be free to do other things during the ride, and that it would be less stressful than driving themselves ( 53 ).
Implications for Research and Recommendations for Practice
This article makes a valuable contribution to the limited amount of research in the driverless ride-hailing field. First, it offers a distinct demographic perspective because the study conducted by Kim et al. was conducted in South Korea ( 11 ), and this article originates from Estonia. Second, Kim et al. primarily investigated explicit eHVIs via smartphone app interfaces ( 11 ), which left implicit interactions unexplored; however, this article sheds light on how implicit interactions, such as stopping too far away, can have an effect on the trust in AVs. Hoggenmueller et al. investigated implicit interactions ( 12 ); however, they conducted their study in a VR environment where the participant had a relatively passive role.
In addition, this article looks specifically into rider and driver interactions within the robotaxi use case, capturing participants’ perceptions after interacting with a robotaxi. In contrast, previous studies by Lee et al. ( 46 ) and Hasan et al. ( 6 ) centered on people’s perceptions of robotaxis regardless of their experience with AVs. By focusing on users who have experienced interactions with robotaxis, this article provides deeper insights into their perceptions, particularly for the potential integration of robotaxis into urban environments. In addition, it diverges from Meurer et al.’s research ( 47 ), which concentrated on internal interactions.
For practical implications, this article shows how future robotaxis should be designed with vehicle movement in mind. In particular, robotaxis should avoid stopping after passing the waiting rider, because this might significantly affect the rider’s perception of the safety of the vehicle. Introducing new signaling concepts, such as the red–green light panel, may require familiarization efforts because of user confusion. Cultural norms on seating preferences also affect passenger behavior and should be considered in AV design.
Diverse user opinions highlight the importance of flexibility in robotaxi service design. In driverless ride-hailing, it is especially important to win user trust because this trust ultimately determines how well the business will succeed. This article suggests that gradual transition strategies, accommodating both AVs and human-driven vehicles, can facilitate user acceptance. Despite a high perception of safety, users exhibit hesitation when trusting AVs in unexpected situations. Transparent communication about AV capabilities and limitations is crucial to managing overreliance on technology and preventing accidents. Future research could further examine the relationship between passenger behavior and psychological perception when interpreting implicit interactions. For example, it would be valuable to explore how different driving styles, such as slower or faster approaches, shape passengers’ sense of safety and trust.
Limitations
Because of limited previous research, an exploratory study was conducted to gain first insights for further research. However, this study has certain limitations.
First, the study was conducted in a specific location, with a specific group of people invited through a single ride-hailing app over a limited time, meaning that conducting the same study in another place and another time with other participants might yield different results. This recruitment method may also introduce selection bias because participants will probably reflect the user base of that specific platform rather than a more diverse population. However, this app is the primary and most widely used ride-hailing service in Estonia, with a large and varied user base, which offers valuable insights. The findings remain meaningful because they provide initial observations on how individuals interact with a robotaxi under varying conditions.
Second, the size of the study population poses a threat to the generalizability of the findings. This is accepted because the aim was to explore how participants would react to and perceive implicit and explicit interaction cues. Therefore, this article provides a rich description of participant behavior and perceptions. These insights are valuable for informing theory and practice, and they can serve as a foundation for future, larger-scale quantitative investigations. In addition, the limited sample size prevented drawing causal relationships that were related to gender, age, and frequency of use.
Third, an exploratory approach with an in-depth qualitative analysis was chosen rather than a hypothesis-driven quantitative approach. Both approaches are suitable for studying the interaction between individuals and an AV in a ride-hailing context. However, a more open-ended approach was chosen to allow novel insights to emerge that can then be confirmed in follow-up quantitative studies.
Fourth, when recruiting participants, it was stated that the study would include an AV. Therefore, people who were more interested in AVs and open to novel technologies could have been attracted. In addition, expectations could have been created in people for the interaction. For future studies, it would be good to conduct the study with people without any prior knowledge of the study to see if there is any significant difference in the results.
Fifth, the study used Tobii Pro Glasses 2 to monitor where the participants were looking. Unfortunately, these glasses are sensitive to sunlight, and because the second day of the user study was very sunny, some potential data for analysis was lost. However, this issue was compensated for by analyzing each participant’s field-of-view video recordings. Because precise eye-tracking data was unavailable, the well-established principle that individuals tend to keep objects of interest in the center of their vision ( 56 ) was relied on. Using this assumption, along with qualitative insights from the post-study interviews, it could be reliably inferred where participants were directing their attention.
Supplemental Material
Supplemental material, sj-pdf-1-trr-10.1177_03611981251391717 for External Human–Vehicle Interaction: A User Study in the Context of a Driverless Ride-Hailing Service by Kristina Meister, Andres Kuusik, Alexander Nolte and Karl Kruusamäe in Transportation Research Record
Footnotes
Acknowledgements
This work is based on a master’s thesis submitted to the University of Tartu by Kristina Meister in 2021 ( 57 ).
Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: K.M., A.K., A.N., K.K.; data collection: K.M.; analysis and interpretation of results: K.M., A.K., A.N., K.K.; draft manuscript preparation: K.M., A.K., A.N., K.K. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was in part supported by the European Social Fund via the Smart Specialization project with Bolt Technology OÜ and the Estonian Centre of Excellence in IT (EXCITE), funded by the European Regional Development Fund.
Supplemental Material
Supplemental material for this article is available online.
References
