Operational safety hazard identification methodology for automated driving systems fleets

Abstract

The safety of Automated Driving Systems (ADS) operating as Mobility as a Service (MaaS) depends on multiple factors in addition to the vehicle’s functionality, reliability, and performance. Currently, no comprehensive approach has been formally developed to identify operational safety hazards and define the operational safety responsibilities of the key agents involved in Level 4 (L4) ADS MaaS operations. This work develops and applies a structured hazard identification methodology for this operation. The methodology leverages and complements the strengths of various hazard identification and modeling methods, including Event Sequence Diagrams (ESD), Concurrent Task Analysis (CoTA), System-Theoretic Process Analysis (STPA), and Fault Tree Analysis (FTA). The methodology is applied to analyze the operation of a fleet of L4 ADS vehicles without a safety driver, monitored and supervised by remote operators. The results highlight the fleet operator’s role in ensuring correct vehicle operation and in preventing and mitigating incidents. The analysis demonstrates the developed methodology’s strengths and suitability for operational safety analysis of complex systems’ operations, considering the inherent complexity of the interactions between multiple human and machine agents.

Keywords

Automated driving systems, safety assessment, mobility as a service, hazard identification, system-theoretic process analysis, concurrent task analysis

Introduction

Autonomy can be defined as a system’s ability to make independent decisions and to adapt to new circumstances to achieve an overall goal.¹ Multiple research and industry efforts have focused on developing Automated Driving Systems (ADS), resulting in a rapidly increasing number of commercial enterprises testing and deploying ADS-equipped vehicles worldwide for passenger and goods transport. Reducing human error in traffic incidents is often cited as one of the most important motivations for developing ADS, together with enabling more accessible and environmentally friendly transport systems.^2–4 Currently, vehicle automation capabilities are categorized by the Society of Automotive Engineers (SAE) under six levels, from Level 0 to Level 5, depending on the extent of the driving support and automated driving features provided.⁵ As the Level of Autonomy or Automation (LoA) increases, so does the complexity of the functions the system should perform, for example, situational assessment and risk-informed decision-making.

At the current stage of development, shorter-term goals include deploying Level 4 (L4) vehicles at a commercial scale for Mobility as a Service (MaaS) applications.^6,7 By SAE definition, L4 ADS are capable of performing all Dynamic Driving Tasks (DDT) – defined as the real-time operational and tactical functions required to operate a vehicle in on-road traffic⁵ – under certain conditions and locations specified in their Operational Design Domain (ODD)^8,9 and implementing fallback strategies in the event of emergencies. The safety of ADS operations depends on many factors, including appropriate road infrastructure, wireless connectivity, and interaction with other road users.^10,11 The main safety-related tasks L4 ADS vehicles must perform are: (a) operate within and enforce the ODD, (b) perform safety-adequate DDTs relying on real-time conditions, and (c) implement DDT-fallback strategies to achieve Minimal Risk Conditions (MRC) when required. However, the implications of enforcing SAE level definitions, ODD requirements, fallback strategies, and MRC mechanisms on operational safety are still unclear, particularly in the case of driverless operations.^12–14 Recent incidents involving L4 ADS vehicles blocking fire trucks and buses, causing traffic disruptions, unfortunate interactions with law enforcement and first responders, and recent severe incidents with pedestrians continue to highlight the need to approach safety from an operational perspective.^15–17

The safety of autonomous systems has been discussed from several perspectives other than the technological and engineering challenges their deployment carries, including ethical, social, and legal aspects.^18–21 Yet, ADS’s system risk and safety research has mostly focused on demonstrating functional safety, system reliability, or potential increases in traffic safety.^22,23 In an evolving technological and regulatory environment, scenario-based simulation and drive testing have also become essential tools to provide system-specific empirical safety evidence.^24–26 Related research has focused on modeling ADS vehicle interactions with other road users and risk-related aspects through game-theoretic approaches or data-based methods.^27,28 While those aspects are undoubtedly important for a safe operation, organizational safety and human-related issues will also play a significant role in ensuring the operational safety of L4 ADS vehicles.^29–32 Yet, operational aspects of the fleet concerning how it should be designed to reduce risks, prevent hazards from happening, and mitigate possible consequences have not yet been fully addressed.

Risk management frameworks play a significant role in guiding and informing the design of safety-critical systems, operating procedures, and safety policies. Multiple traditional hazard identification and scenario modeling approaches serve as the foundations of probabilistic risk assessments (PRA) of complex systems.³³ The importance of comprehensive hazard identification and scenario-based analysis in risk management cannot be overstated, particularly in the case of new and innovative technologies with limited operational experience and field data. The emergence of autonomous and highly automated systems with limited explainability and transparency introduces additional challenges to analyzing their functions, behavior, and failures, and their implications for risk and safety. Hence, it is highly relevant to explore methodological improvements that allow system developers and operators to identify potential hazards and improve their systems’ safety at different levels: fleet, system, or functional. Traditional approaches employed in research and industry to identify, analyze, and model potential hazardous scenarios include tools such as conventional hazard analysis and risk assessment (HARA) and hazard identification (HAZID), fault tree analysis (FTA), event tree analysis (ETA), Event Sequence Diagrams (ESD), failure mode and effect analysis (FMEA), hazard and operability studies (HAZOP), and, more recently, Bayesian networks (BNs).^34,35 These methods are confronted with an increasingly complex landscape with the growing adoption of autonomous functions, capacities, and technologies, which can exhibit intricate subsystem interactions, evolving feedback loops, and unexpected emerging properties. Conventional methods usually focus on multiple aspects of complex systems from isolated perspectives, for example, human reliability, software failures, and hardware reliability. Therefore, these methods may not be capable of adequately modeling autonomous systems and their interactions with open-world environments.³⁶

Techniques such as Concurrent Task Analysis (CoTA) and System-Theoretic Process Analysis (STPA) have been recently proposed to identify and model interactions and emerging behaviors between subsystems and complex operational environments.^36,37 STPA is based on system theory and is increasingly employed in complex systems, including automated driving, software engineering, maritime and aviation industry, and nuclear power plants (NPPs).^34,38–44 When compared to other traditional methods, the advantages of STPA lie in its potential to identify unsafe or unintended interactions even when no explicit failures have occurred in the system.^45,46 The CoTA method was developed to specifically model human-system interactions from a task analysis perspective.^36,47 This process involves a functional decomposition of tasks each agent must perform to achieve common goals, following an extension of the cognitive model IDA (Information, Decision, and Action).³² These methods can be employed to conduct a detailed analysis of system interactions, dependencies, and potential undesired behaviors, additionally introducing elements of causal relationships and providing a pathway to design safety countermeasures.

It has long been recognized that to achieve an adequate level of completeness, hazard identification and scenario modeling processes should employ various techniques, leveraging and complementing the strengths and limitations of each method.^48–50 This is particularly relevant for high-risk industries incorporating autonomous functionalities, where interactions between hardware, software, and human elements require profound analysis. For instance, hazard identification approaches integrating STPA with FMEA have been developed to elicit safety requirements in modern software-intensive systems.⁵¹ Similarly, a combination of STPA with Goal Structuring Notation (GSN) was used to derive safety arguments for assurance cases.⁵² As STPA provides limited information about consequences, it can benefit from combinations that allow risk quantification. For instance, combinations of STPA and BN have been explored to develop supervisory risk control frameworks for autonomous systems^53,54 and remote pilotage operations in maritime applications,⁵⁵ as well as for reliability assessment applications in autonomous vehicles.⁵⁶ Likewise, STPA and FTA have been combined to identify common cause failures of digital safety systems⁵⁷ and for digitalization efforts in NPP contexts.⁵⁸ A system-oriented approach through STPA can also analyze causal factors for scenario-based analysis⁵⁹; however, other methods focus on hazard scenario modeling as a base for risk quantification efforts. For example, frameworks integrating FTA and BN have been used for collision risk analysis in maritime operations⁶⁰ and dynamic risk analysis in NPPs.⁶¹ Alternatively, ESDs, FTA, and BNs have been integrated into the Hybrid Causal Logic methodology⁶² and applied to autonomous vehicles⁶³ or oriented toward human reliability analysis in NPPs.⁶⁴ Similar approaches have also been applied to dynamic risk assessment of third-party damages in natural gas pipelines.⁶⁵ Additionally, ESDs have been combined with FTA, BN, and CoTA to represent interactions between hardware and software failures,^63,66,67 as well as human errors.^47,68 Indeed, ESDs and their natural integration with success- and failure-oriented analysis through CoTA and FTA can help set the overall context and elicit root causes other than those related to the correctness of control actions in the system provided by STPA.

Hazard identification provides a critical foundation upon which risk assessment tools are employed to estimate the risk a certain technology, system, or operation poses for users, operators, and society. Effective hazard identification methodologies are key in defining safety expectations and providing a robust basis for future risk quantification efforts. The importance of this increases when these systems are expected to operate in open-world environments, subject to multiple interactions with other human and machine agents. Currently, there is no clear or comprehensive approach to define the risk reduction measures or operational safety responsibilities of the key organizations involved in the deployment of L4 ADS-equipped vehicles. Fleet operators play a central role in MaaS operations, and it is expected that their operational safety responsibilities will be determined by their relationship with ADS developers, vehicle manufacturers, and regulatory entities. The first step in defining these safety responsibilities is, thus, identifying the hazards associated with the system’s operation.

This work presents a structured hazard identification methodology developed to study the operation of complex systems. This approach combines traditional and modern risk assessment techniques to model system functions and interactions, including ESDs, FTs, CoTA, and STPA. The methodology leverages and complements the strengths of these methods to address the inherent complexity of multiple human-machine interactions present in complex socio-technical system operations. Combining scenario- and control theory-based methodologies allows tracking the evolution of hardware, software, and human interactions, the development of latent failures, and a link toward safety countermeasure design and task allocation strategies. The main goals of the methodology are to identify hazards within high-level operational scenarios, associate them with risk contributors to provide a link between key safety-related tasks and agents, and provide a foundation for the development of risk assessments and layered-safety operational policies. This approach provides practical insight for organizations operating, managing, and designing complex systems aiming to conduct risk assessments and identify key risk mitigation measures. In this work, the methodology is applied to derive operational safety responsibilities of fleet operators involved in L4 ADS MaaS operations. This work focuses on the safety implications of human-machine agent interactions and the role of the fleet operator in ensuring operational safety. The implementation of the methodology is exemplified through the analysis of the ADS vehicle operation during a trip with passengers.

The structure of the paper is as follows. Section 2 presents the proposed hazard identification methodology, divided into system modeling, scenario modeling, and hazard identification phases. Section 3 demonstrates the use of this methodology for L4 ADS fleets for MaaS providers. The case study focuses on the operational phase when the vehicle drives on-route to its destination with onboard passengers. Section 4 discusses the proposed methodology’s main results, advantages, and limitations. Finally, Section 5 presents the main conclusions of this work.

Hazard identification methodology

The use of risk assessment frameworks to guide and inform safety and policy requirements has been thoroughly explored in the past decades.^69–71 Risk management is an iterative process, where the identification of hazards, hazard scenario modeling, quantification of risk, and assessment of risk criteria compliance can be continuously updated based on new information, operational experience, and developing technologies. A general risk management framework is presented in Figure 1, highlighting the hazard identification and scenario modeling stages.⁷⁰ These are the focus of this work and are crucial to developing comprehensive qualitative and quantitative risk assessments.

Figure 1.

General risk management framework outline.

Identifying hazards and the risk scenarios that develop from them during operation is critical to determine if the system’s safety barriers are sufficient to prevent or mitigate potential undesired consequences. Using Kaplan and Garrick’s definition, risk scenarios are uniquely characterized by a sequence or chain of hazardous events that occurs within a specific context and leads to an undesired consequence.⁷²

In this work, hazards are defined at a sub-system level, that is, the underlying source of risk stems from a function or task that was not performed as expected (e.g. an undetected sensor failure), which in a certain context can lead to an undesired consequence (e.g. collision with an object on the road or other road user). Therefore, a risk scenario is constructed from a sequential combination of hazard scenarios that represent the failure path of a key event (e.g. detecting a sensor failure and implementing adequate failure mitigation strategies).⁷³ For instance, the hazard scenario “the ADS fails to detect that a fallback is required” includes a breadth of specific situations with different aggravating elements with distinct likelihoods (weather conditions, other road users, conditioning system failures, etc.) that lead to consequences with varying severity, such as an interrupted trip or a collision with other road users. These hazard scenarios can play a role in multiple risk scenarios and may develop either from explicit system failures or from unsafe subsystem interactions, which in other contexts may not have led to undesired consequences.

As the system’s complexity or the environment in which it operates increases, so does the number of potential hazard and risk scenarios. In this regard, achieving scenario completeness is a long-desired goal, subject to the sophistication of the available modeling tools.⁷¹ However, conventional hazard identification and modeling methods frequently focus on specific aspects of the system, isolating each element for in-depth analysis and, therefore, neglecting the importance of hardware, software, and human interactions.^71,74 The methodology proposed in this work aims to leverage the benefits of different methods used for hazard identification and scenario modeling. The following sections describe the main characteristics of these methods and present the strategy to combine these for a structured analysis of a complex system.

Methods employed

A frequent approach to model risk scenarios is to employ ESD. This traditional hazard analysis method can be defined as generalized event trees (ETs), allowing better representation of dynamic systems.³⁵ ESDs depict a sequence of pivotal events stemming from a common initiating event and leading to different end-states through success or failure paths. The level of detail ESDs provide depends on system breakdown and how hazardous scenarios propagate local effects into system-level events. The quantification of ESDs estimates each outcome’s frequency based on the initiating event’s frequency and the intermediary events’ probabilities.⁴⁷ Figure 2(a) presents a simple ESD, depicting how the success or failure of events A and B may lead to different end-states, representing either a successful or failed scenario.
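The quantification described above can be sketched for a two-event ESD similar to the one in Figure 2(a). This is a minimal illustration under stated assumptions: the branching structure (event B is only queried when event A fails), the initiating-event frequency, and the event probabilities are hypothetical and are not taken from the case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndState:
    name: str
    frequency: float  # occurrences per unit of exposure, e.g. per trip


def quantify_two_event_esd(ie_frequency, p_a_success, p_b_success):
    """Propagate the initiating-event frequency through pivotal events A and B.

    Assumed (hypothetical) structure: if A succeeds the scenario ends safely;
    if A fails, B acts as a mitigation that either succeeds or fails.
    """
    for p in (p_a_success, p_b_success):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    p_a_fail = 1.0 - p_a_success
    p_b_fail = 1.0 - p_b_success
    return [
        EndState("success (A succeeds)", ie_frequency * p_a_success),
        EndState("mitigated (A fails, B succeeds)",
                 ie_frequency * p_a_fail * p_b_success),
        EndState("failure (A and B fail)",
                 ie_frequency * p_a_fail * p_b_fail),
    ]


# Hypothetical numbers chosen only to exercise the function.
end_states = quantify_two_event_esd(
    ie_frequency=1e-2, p_a_success=0.99, p_b_success=0.9
)
for state in end_states:
    print(f"{state.name}: {state.frequency:.2e}")
```

Because the branches at each pivotal event are mutually exclusive and exhaustive, the end-state frequencies sum to the initiating-event frequency, which is a useful consistency check when building larger diagrams.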

Figure 2.

Traditional scenario modeling tools: (a) example ESD and (b) example FT.

The intermediary event probabilities are often expressed through FTA. Fault trees (FTs) represent the logical hierarchy of a system-level failure based on the functional decomposition of failure events. The assessments may be qualitative or quantitative, depending on the availability of data to characterize the probability of occurrence of each sub-event. The primary outcomes of an FTA are systematically identifying causes (i.e. a combination of events) that lead to an undesirable event (denominated as the Top Event), obtaining the probability of occurrence for the events based on Boolean algebra, and identifying critical combinations of events that lead to the Top Event.⁷⁵ As discussed in,³⁴ modeling the functionality of an ADS relying only on classical FTs is untenable, as faults may be triggered or propagated by external environmental conditions as opposed to internal faults at lower levels of description. However, the strengths of the hierarchical structure may still be employed when addressing subsystem or component-level faults. In Figure 2(b), the “Top Event: Failure in Event A” is expanded so that lower-level failure events (of the same agent or not) can be identified through logic gates.
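As an illustration of the Boolean-algebra step, the sketch below evaluates the probability of a Top Event from AND and OR gates. It assumes statistically independent basic events that each appear only once in the tree; the event names and probabilities are hypothetical, and the class and function names are introduced here for illustration only. Repeated events or dependencies would require a cut-set based evaluation instead.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class BasicEvent:
    name: str
    probability: float


@dataclass(frozen=True)
class Gate:
    name: str
    kind: str      # "AND" or "OR"
    inputs: tuple  # child BasicEvent or Gate objects


def probability(node):
    """Probability of a fault tree node, assuming independent basic events."""
    if isinstance(node, BasicEvent):
        return node.probability
    child_probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must occur.
        return prod(child_probs)
    if node.kind == "OR":
        # At least one input occurs: complement of "none occurs".
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree for "Top Event: Failure in Event A".
top_event = Gate(
    "Failure in Event A",
    "OR",
    (
        BasicEvent("hardware malfunction", 0.01),
        Gate(
            "undetected software fault",
            "AND",
            (
                BasicEvent("software malfunction", 0.02),
                BasicEvent("monitoring failure", 0.1),
            ),
        ),
    ),
)

print(f"P(top event) = {probability(top_event):.5f}")
```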

The success of each ESD event may also be expressed as a function of the series of tasks that need to be completed in sequence or concurrently. CoTA is a novel method developed to model these tasks in the context of the Human-System Interaction in Autonomy (H-SIA) framework.^36,47 It is a flexible method for complex system analysis based on Task Analysis (TA) that models the interactions between subsystems or agents working toward a common goal. The decomposition of system-level goals into sub-goals analyzes each agent’s expected behavior and performance. These sub-goals are hierarchically organized through plans, indicating the order in which the tasks must be performed to achieve the system-level goals. The underlying methodology extends the cognitive model IDA (I-Information, D-Decision, A-Action) to human and autonomous systems.^32,36 This division of tasks into phases is fundamental to identifying different failure modes for each agent, as well as emergent failures and failures arising from unsafe interactions. The tasks can be categorized as sequential, parallel, interface, and the newly introduced trigger tasks, as follows:

Sequential tasks: Tasks that must be performed sequentially to achieve the specified goals.

Parallel tasks: Groups of tasks that must be performed concurrently to achieve the specified goals. These frequently refer to supporting tasks, that is, they are not directly related to the events in the ESD but are necessary for executing the other tasks and the interaction between the agents.

Trigger tasks: Specific tasks at the lowest level of redescription that, when completed, provide the necessary input for other tasks to be performed during the same operational phase. The triggered tasks can be located at higher redescription levels and belong to different agents. Note that the triggered tasks only take place depending on the outcome of the trigger task.

Interface tasks: Specific tasks at the lowest level of redescription that, when completed, provide the necessary input for other tasks from different agents during the same operational phase to be performed adequately.

The CoTA analyzes the tasks involved in all the ESD events associated with the participating agents to: (a) identify the specific subsystem/component required for the task’s success, (b) identify interface and trigger tasks, and (c) analyze the propagation of failure. The granularity of task descriptions depends on system knowledge as well as a series of stop rules related to IDA.⁴⁷ This task decomposition approach is key to identifying each agent’s role in ensuring system safety, as well as potential sources of errors or failures arising from the interactions between multiple agents. Figure 3(a) shows an example of CoTA models built for the agents participating in the example ESD (Figure 2(a)). Interface and trigger tasks are depicted by distinct figures, while sequential (B.1 –> B.2) and parallel (A.1 // A.2) tasks are shown in the agents’ plans.
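One possible way to record a CoTA model so that interface and trigger tasks can be listed automatically is sketched below. The data layout, field names, and the example links between agents A and B are hypothetical; the CoTA method itself does not prescribe a data format.

```python
from dataclasses import dataclass, field
from enum import Enum


class IDAPhase(Enum):
    INFORMATION = "I"
    DECISION = "D"
    ACTION = "A"


@dataclass
class Task:
    task_id: str
    agent: str
    phase: IDAPhase
    # Ids of tasks that only take place depending on this task's outcome.
    triggers: list = field(default_factory=list)
    # Ids of tasks of other agents that need this task's output.
    interfaces_to: list = field(default_factory=list)


def cross_agent_links(tasks):
    """List (kind, source id, target id) for links that cross agents."""
    by_id = {task.task_id: task for task in tasks}
    links = []
    for task in tasks:
        for kind, targets in (("trigger", task.triggers),
                              ("interface", task.interfaces_to)):
            for target_id in targets:
                if by_id[target_id].agent != task.agent:
                    links.append((kind, task.task_id, target_id))
    return links


# Hypothetical tasks for the two agents; plans follow the notation in the
# text: A.1 // A.2 (parallel) and B.1 -> B.2 (sequential).
tasks = [
    Task("A.1", "Agent A", IDAPhase.INFORMATION),
    Task("A.2", "Agent A", IDAPhase.ACTION, interfaces_to=["B.1"]),
    Task("B.1", "Agent B", IDAPhase.INFORMATION),
    Task("B.2", "Agent B", IDAPhase.DECISION, triggers=["A.1"]),
]

print(cross_agent_links(tasks))
```

Each listed link marks a point where a failure in one agent's task can propagate to another agent, which is where the analysis of failure propagation would start.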

Figure 3.

Novel hazard identification methods: (a) example CoTA diagram and (b) example STPA diagram.

Another approach that has gained popularity to identify hazards and corresponding mitigation measures in complex systems is STPA. STPA is a method based on the STAMP (System Theoretic Accident Model and Processes) model,⁷⁶ as well as systems and control theory. It considers that hazards arise from uncontrolled or unsafe interactions between components, subsystems, and the environment.³⁷ In this context, hazards arise from failure mechanisms describing (i) a required control action not provided; (ii) an incorrect control action provided; (iii) a control action provided at an incorrect time or in an incorrect sequence (too early or too late); or (iv) a control action applied for an incorrect duration (too short or too long). The flexibility of the STPA framework allows for defining the system at varying levels of complexity and detail depending on the study’s objective.

A central element of the STPA model is the development of a control structure diagram that illustrates the functional interactions between subcomponents and feedback control loops. This diagram is key to identifying unsafe control actions (UCA) that may breach the system-level constraints and lead to loss scenarios. As a hazard identification methodology, STPA is founded on the concept that an effective control structure can enforce the system-level constraints to avoid system-level hazards. STPA provides insight into causal factors leading to the UCAs (e.g. a control action provided in the wrong context). Determining how hazards may originate and propagate through the system, challenging the safety barriers that constrain unwanted behaviors, is used to identify preventive or mitigative countermeasures. Thus, identifying and documenting the UCA may lead to safer designs and system modifications.^49,53 An example of a hierarchical control structure is shown in Figure 3(b), showing how two agents interact with the environment (world).
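The step from a control structure to a list of candidate UCAs can be sketched as a simple enumeration over the four failure mechanisms listed above. The control structure below (two agents and the world, with hypothetical control actions) only loosely mirrors Figure 3(b), and every generated row is merely a prompt for the analyst: whether a candidate is actually unsafe depends on the context in which it occurs.

```python
from enum import Enum
from itertools import product


class UCAType(Enum):
    NOT_PROVIDED = "required control action not provided"
    INCORRECT = "incorrect control action provided"
    WRONG_TIMING = "provided at an incorrect time or in an incorrect sequence"
    WRONG_DURATION = "applied for an incorrect duration"


# Hypothetical control structure: (controller, controlled process) -> actions.
control_actions = {
    ("Agent 1", "Agent 2"): ["command X"],
    ("Agent 2", "World"): ["action Y", "action Z"],
}


def uca_candidates(control_actions):
    """Pair every control action with every UCA type for analyst review."""
    rows = []
    for (controller, controlled), actions in control_actions.items():
        for action, uca_type in product(actions, UCAType):
            rows.append((controller, controlled, action, uca_type))
    return rows


for controller, controlled, action, uca_type in uca_candidates(control_actions):
    print(f"{controller} -> {controlled} | {action} | {uca_type.value}")
```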

Table 1 summarizes the methods used, their main advantages, and their characteristics. These methods can be applied to analyze different qualitative and quantitative aspects of failures arising from software, hardware, procedures, and human elements. While integrations between ESDs and FTs have played a significant role in risk modeling and quantification, these may struggle to represent function interactions or dependencies efficiently. Thus, CoTA and STPA approaches provide an alternative to elicit the nature of functional interactions even if quantification (e.g. event probability or frequency) is not directly embedded in them.

Table 1.

Comparison of scenario modeling tools employed.

Method	Advantages	Logic	Is context embedded?	Are interactions considered?	Are consequences explicit?	Is it quantifiable?
ESD	Model dynamic causal relationships between initiating event, intermediate stages, and outcomes.	Inductive	Yes	No	Yes	Yes
CoTA	Model interactions between tasks performed by different agents for a common goal and subgoals.	Deductive	Possible	Yes	No	No
STPA	Model interactions between components leading to system-level hazards.	Deductive	Possible	Yes	Possible	No
FTs	Identify causes and critical combinations of events leading to undesirable events.	Deductive/Inductive	No	Possible	Yes	Yes

Proposed methodology

The proposed methodology for complex systems’ operations hazard identification consists of 10 steps, organized into three overall stages:

I. System modeling: Conduct a functional breakdown and role description of the main agents participating in the system’s operation. Identify and describe the distinct operational phases of the system’s operation, if any.

II. Scenario modeling: Build a representation of the system’s operational phases based on the success/failure of the participating agents’ functions, employing ESD, FTA, STPA, and CoTA to model from system down to component level. Table 2 summarizes the integration of these models into the overall methodology, and details are provided in the following sections.

III. Hazard identification: Conduct a systematic hazard identification and characterization through the cross-comparison of the system’s representation through the multiple techniques used. Each modeling technique employed aims to answer the following questions: (a) What hazards could occur in each operational phase? (b) How do these hazards develop? (c) Who is the main risk contributor? (d) Why do they develop? (e) What are the potential consequences of these hazards? (f) What measures can prevent or mitigate the hazard?

Table 2.

Hazard modeling techniques and framework integration.

Method	Integration strategy
ESD	Developed to represent operational phases. Binary event outcomes (yes/no) lead to success or failure end-states. Used to identify the (a) specific hazard scenarios, (b) risk contributors, and potential consequences through end-states (e).
CoTA	Developed to describe the tasks involved in the success path events of ESDs. Used to identify additional hazards and identify (c) failure modes and mechanisms.
STPA	Developed to graphically represent the interactions and feedback loops between different subsystems. Used to identify additional hazards and identify (c) failure modes and mechanisms.
FT	Developed for pivotal events in ESDs. Basic events categorized as human errors, hardware or software malfunctions, process design errors. Used to identify and categorize basic failure events by their possible (d) root causes.

Figure 4 presents an overview of the hazard identification methodology and the main outputs of each stage. Each stage consists of interdependent steps that are designed to be implemented in order (as read vertically from top to bottom). These steps and the implementation approach are described in the following sections. Note that some steps may be performed concurrently (shown at the same horizontal level). For example, Steps 3 and 6 may be developed in parallel, but Steps 4 and 5 must be implemented after Step 3.
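The ordering constraints mentioned above can be checked mechanically by treating the steps as a dependency graph. The sketch below encodes only what this section states (Steps 4 and 5 follow Step 3, and Steps 3 and 6 may run in parallel) plus two labeled assumptions: that Steps 1 and 2 of Stage I are completed in order first, and that Step 6 depends on Step 2. Steps 7 to 10 are omitted because their dependencies are not described here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# step -> set of steps that must be completed beforehand
dependencies = {
    2: {1},  # assumption: Stage I steps are performed in order
    3: {2},  # assumption: scenario modeling starts after Stage I
    4: {3},  # stated in the text
    5: {3},  # stated in the text
    6: {2},  # assumption: Step 6 only needs Stage I, so it can run with Step 3
}


def parallel_batches(dependencies):
    """Group steps into batches whose members can be worked on concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the constraints conflict
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


print(parallel_batches(dependencies))
```

Under these assumptions the batches come out as Step 1, then Step 2, then Steps 3 and 6 together, then Steps 4 and 5 together, which matches the reading of Figure 4 given in the text.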

Figure 4.

Overview of hazard identification methodology.

Stage I: System modeling

The first stage consists of a system familiarization process. This stage depends on the practitioners’ knowledge and the complexity of the system studied (i.e. the presence of interdependent hardware, software, and human elements). The level of system decomposition depends on the purpose of the risk assessment and the stage of the system or technology development, that is, design, testing, and operation. The level of analysis is also constrained by the availability of needed resources, including data sources.⁷⁷ The system’s modeling is divided into two steps, as described below.

Step 1: Perform a functional system breakdown for each participating agent.

Analysts identify the relevant agents participating in the system’s operation and the functions they must perform to ensure its adequate operation. An agent is a human, software, or machine subsystem with agency, that is, it can act or exert power – in this case, decision-making power over its own state and that of the system’s entire operation. Agents can be composed of multiple elements, each expected to perform specific functions.

The output of this step is a list of the most relevant elements and functions of each agent and any critical interactions between them. A scheme of this functional breakdown is depicted in Figure 5(a), in which each agent’s main functions and sub-functions are developed. The level of function decomposition at this point may be high-level. For instance, for an L4 ADS vehicle fleet one of its main functions is to perform passenger transport, while relevant subtasks include passenger verification and customer support.

Step 2: Define the most relevant operational phases.

Figure 5.

Stage 1 – system modeling: (a) step 1 – agent functional breakdown and (b) step 2 – operational phase definition.

Analysts determine if the functions the agents must perform differ during different operational stages of the system. For instance, an ADS vehicle is expected to perform different functions when picking up a passenger and when driving the passenger toward their destination. This step is system-dependent, and the scope of further analysis may cover the whole operation or be limited to selected phases or to a specific agent. Each operational phase is defined by boundary conditions, which may refer to a change in an agent’s state or the entry of an agent not active during another phase (e.g. the entrance of a passenger in a vehicle).

The output of this step is a list of operational phases of interest, the functions each agent performs during each phase, and the boundary conditions or events that trigger the transition from one phase to another. A scheme of this operational phase definition is depicted in Figure 5(b), exemplifying an operational sequence where certain interface events or conditions must be met for the system to evolve its state.
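The outputs of Steps 1 and 2 can be kept in a simple machine-readable form, which makes it easy to check that every function assigned to a phase also appears in the functional breakdown. The sketch below is one possible layout; the agent names, functions, and entry conditions are illustrative, loosely based on the examples in the text, and are not the case-study model.

```python
from dataclasses import dataclass, field


@dataclass
class OperationalPhase:
    name: str
    entry_condition: str  # boundary condition that starts the phase
    functions: dict = field(default_factory=dict)  # agent -> list of functions


# Step 1 output (illustrative): functional breakdown per agent.
breakdown = {
    "ADS vehicle": {"drive to pick-up point", "drive passenger to destination"},
    "Fleet operator": {"passenger verification", "customer support"},
}

# Step 2 output (illustrative): phases with boundary conditions.
phases = [
    OperationalPhase(
        "pick-up",
        "trip request accepted",
        {"ADS vehicle": ["drive to pick-up point"],
         "Fleet operator": ["passenger verification"]},
    ),
    OperationalPhase(
        "on-route with passenger",
        "passenger enters the vehicle",
        {"ADS vehicle": ["drive passenger to destination"],
         "Fleet operator": ["customer support", "remote monitoring"]},
    ),
]


def undefined_functions(breakdown, phases):
    """Return (phase, agent, function) entries missing from the breakdown."""
    missing = []
    for phase in phases:
        for agent, functions in phase.functions.items():
            known = breakdown.get(agent, set())
            for function in functions:
                if function not in known:
                    missing.append((phase.name, agent, function))
    return missing


print(undefined_functions(breakdown, phases))
```

Here the check flags "remote monitoring", which was deliberately left out of the breakdown to show how an inconsistency between the two steps would surface.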

Stage II: Scenario modeling

The second stage involves modeling the system’s operational phases and critical events that may lead to hazard scenarios. The following sections briefly describe each technique and its application in the proposed hazard identification methodology.

Step 3: Model each operational phase through an Event Sequence Diagram (ESD).

The third step involves developing an ESD representing each operational phase. The ESD models the scenario representing an operational phase at a high level. The initiating event, thus, is related to the boundary conditions delimiting each phase. As the ESDs are employed to model the whole sequence of events that may occur in the operational phase, this will result in multiple end-states with varying levels of severity. When developing the ESDs, the analyst must ensure that each event is associated with one agent only (defined in Step 1), unless referring to an external event affecting the system’s operation.⁴⁷ In this regard, contextual aggravators (e.g. conditioning system failures) are assessed from the perspective of the system not reacting as expected (e.g. failure is not detected by the system and/or does not trigger an emergency response).

The output of this step is one or more ESDs depicting the chosen operational phases, events associated with each agent (with varying degrees of detail as determined by the analyst), and a list of end-states ranked by a qualitative assessment of their severity.
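As an illustration of this output, an ESD can be represented as a binary tree of pivotal events whose leaves are end-states. The sketch below is an assumption-laden example: the event labels, the agent names, and the integer severity scale are hypothetical and only show how paths and ranked end-states could be enumerated.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class EndState:
    name: str
    severity: int  # hypothetical qualitative rank, higher = more severe

@dataclass
class Event:
    label: str
    agent: str                           # exactly one agent, or "external"
    success: Union["Event", EndState]
    failure: Union["Event", EndState]

def paths(node, trail=()):
    """Yield (sequence of (event label, outcome), end-state) for every path in the ESD."""
    if isinstance(node, EndState):
        yield trail, node
        return
    yield from paths(node.success, trail + ((node.label, "success"),))
    yield from paths(node.failure, trail + ((node.label, "failure"),))

# Hypothetical two-event ESD.
esd = Event(
    "ADS detects fallback is required", "ADS vehicle",
    success=EndState("Fallback executed", 1),
    failure=Event(
        "Remote operator intervenes", "FOC operators",
        success=EndState("Fallback executed", 1),
        failure=EndState("Passenger at risk", 3),
    ),
)

# Unique end-states, most severe first.
ranked = sorted({(end.severity, end.name) for _, end in paths(esd)}, reverse=True)
```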

Step 4: Model agents’ tasks and interactions through Concurrent Task Analysis (CoTA).

The fourth step consists of developing a CoTA for each agent involved in each operational phase, building on the ESDs.⁴⁷ According to the extended IDA model,⁷⁸ human operators or autonomous systems can fail during:

Information gathering and pre-processing (I phase), for example: receiving information from sensors (ADS) or responding to an alarm (remote fleet supervisors).

Situation assessment and decision making (D phase), for example: deciding a new course for collision avoidance (ADS) or maintaining adequate situational awareness (remote fleet supervisors).

Action taking (A phase), for example: implementing context-specific DDTs (ADS); transmitting dispatch commands (remote fleet supervisors).

Therefore, all tasks an agent performs (i.e. humans or autonomous system) are decomposed into steps focusing on receiving information, selecting an appropriate action plan, and performing the chosen action plan. The CoTA method can identify the tasks needed for the success of key ESD events, for example, whether the information is correctly transmitted to and received by an agent, whether previous knowledge or parallel inputs were available to support the agent in selecting an adequate action plan, and whether the mechanisms needed to perform the actions adequately were successful. The output of this step is a CoTA for each agent participating in the chosen operational phases modeled in Step 3. The CoTA outputs a hierarchical list of tasks for each agent, the dependency between tasks (e.g. sequential, parallel, trigger), and the IDA stage each task represents (e.g. Information). This analysis provides insight into which task failures or errors are most relevant to system-level effects. The tasks must address all the events associated with the agent in the ESDs.
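The requirement that the tasks address all ESD events associated with an agent lends itself to a simple coverage check. The sketch below is illustrative only: the `Task` record, the task identifiers, and the event labels `E1` to `E3` are hypothetical and not part of the CoTA method's formal definition.

```python
from dataclasses import dataclass
from enum import Enum

class IDA(Enum):
    INFORMATION = "I"
    DECISION = "D"
    ACTION = "A"

@dataclass(frozen=True)
class Task:
    task_id: str
    agent: str
    description: str
    ida: IDA
    dependency: str          # "sequential", "parallel", or "trigger"
    esd_events: tuple = ()   # ESD events this task supports

# Hypothetical tasks for one agent.
tasks = [
    Task("1.1", "ADS", "Receive sensor data", IDA.INFORMATION, "parallel", ("E1",)),
    Task("1.2", "ADS", "Select avoidance course", IDA.DECISION, "sequential", ("E1",)),
    Task("1.3", "ADS", "Execute maneuver", IDA.ACTION, "sequential", ("E2",)),
]

def uncovered_events(agent_events: dict, task_list: list) -> dict:
    """Return, per agent, the ESD events not addressed by any of that agent's tasks."""
    gaps = {}
    for agent, events in agent_events.items():
        covered = {e for t in task_list if t.agent == agent for e in t.esd_events}
        missing = set(events) - covered
        if missing:
            gaps[agent] = missing
    return gaps
```

An empty result indicates that every ESD event assigned to the agent is supported by at least one task.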

Step 5: Model the failure of key pivotal ESD events through Fault Trees (FTs).

The fifth step consists of developing qualitative FTs whose “Top Events” are related to pivotal events defined in the ESDs. The events selected to be expanded through FTs correspond to those representing explicit safety-related functions of the system’s agents, for example, an L4 ADS vehicle detecting the need for a DDT fallback. The level of event decomposition is determined according to the goal of the analysis and data availability (e.g. hardware, software, human, or procedural errors or failures). Basic events are categorized as human-related errors, software and/or hardware malfunctions, process design errors, or external events. These categories can be adapted according to the level of analysis and the objectives. This decomposition aims to identify the type of failure, linking these back to the agent responsible for each event. Further details and decomposition levels may be achieved by selecting adequate tools based on the failure type (e.g. expanding hardware failures through FMEA, employing Human Reliability Analysis techniques for human errors or software reliability-specific tools, etc.).

The output of this step is one or more FTs for the pivotal ESD events selected from Step 3. Each FT yields a list of basic events, which, in turn, are categorized by their potential source.
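A qualitative FT of this kind can be expanded into minimal cut sets and its basic events grouped by source category. The following is a minimal sketch using a standard top-down expansion; the tree, the basic-event names, and their category assignments are hypothetical examples, not results from the analysis.

```python
from itertools import product

# A node is either a basic-event name (str) or a gate: ("AND" | "OR", [children]).
def cut_sets(node):
    """Expand a fault tree into its (not yet minimal) cut sets."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        return [cs for sets in child_sets for cs in sets]
    if gate == "AND":
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unknown gate: {gate}")

def minimal(sets):
    """Drop every cut set that strictly contains another cut set."""
    unique = set(sets)
    return {s for s in unique if not any(other < s for other in unique)}

# Hypothetical top event: "ADS vehicle fails to detect that a DDT fallback is required".
top = ("OR", [
    "perception software fault",
    ("AND", ["sensor degraded", "inspection misses degradation"]),
])

# Hypothetical categorization of basic events by potential source.
CATEGORY = {
    "perception software fault": "software",
    "sensor degraded": "hardware",
    "inspection misses degradation": "human",
}

mcs = minimal(cut_sets(top))
by_category = {}
for cut_set in mcs:
    for basic_event in cut_set:
        by_category.setdefault(CATEGORY[basic_event], set()).add(basic_event)
```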

Step 6: Model agent’s functions and interactions through System Theoretic Process Analysis (STPA).

The sixth step employs the STPA method to develop a system-level description of agents’ functions and interactions. The system boundaries, high-level hazards, and constraints are developed based on the functional breakdown in Step 1. System-level hazards are identified as a system state or a set of system conditions that will lead to a system loss. System-level constraints are defined as system conditions or behaviors expected to be satisfied to prevent system-level hazards and associated losses. The scope of the STPA model is defined by the analyst as a function of these hazards and constraints. In this work, the main focus of developing the hierarchical control diagram is identifying how inadequate inputs can alter downstream functions. In addition to identifying hazards not explicitly modeled by the ESD, the control diagram allows for identifying the key communication channels whose interruption or failure (on the transmitting or receiving end) can lead to Unsafe Control Actions (UCAs) within the scenarios developed in Step 3.

The output of this step is a list of UCAs, each associated with the affected controller-actuator pair. Insight on potential causal factors is explored in Stage III of the methodology.
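One way to seed such a list is to cross every control action in the control diagram with the UCA types used later in the hazard catalog (not provided, incorrect time/sequence, incorrect duration, incorrect content), leaving the analyst to discard combinations that are not hazardous. The sketch below assumes hypothetical control actions and is not a substitute for the analyst's judgment.

```python
from itertools import product

UCA_TYPES = (
    "not provided",
    "incorrect time/sequence",
    "incorrect duration",
    "incorrect content",
)

# Hypothetical control actions: (control action, controller, receiver).
control_actions = [
    ("dispatch command", "FOC operators", "ADS vehicle"),
    ("fallback command", "FOC operators", "ADS vehicle"),
]

def candidate_ucas(actions, uca_types=UCA_TYPES):
    """Enumerate candidate UCAs for analyst review, one per (control action, UCA type)."""
    return [
        {"control_action": action, "controller": controller,
         "receiver": receiver, "uca_type": uca_type}
        for (action, controller, receiver), uca_type in product(actions, uca_types)
    ]

candidates = candidate_ucas(control_actions)
```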

Step 7: Determine model connections and redundancies.

The purpose of this step is to establish a connection between the scenarios represented by ESD events and the STPA, CoTA, and FT models. The success and failure paths of an ESD event are linked to the success of tasks modeled in the CoTA or sub-events detailed in the FT. In turn, the sequence or dependencies of tasks in the CoTA plans are linked to issues in the transmission, content, or reception of specific control actions modeled through STPA. This may lead to a redundant description of a system function from two or three sources. The outputs of this step are equivalence tables detailing these relationships (Table 3), which serve as input for Stage III.

Table 3.

Step 7 – model connection example.

Scenario		Tasks involved		Sub-event failures		Control Actions (CA)
Event	Agent	Task	Type	Event	Source	CA	Type	Controller	Receiver
A	II	A.1.1	Sequential	B.1	Software	F.1	Feedback	1	2
		A.1.2	Sequential	–	–	C.1	Action	2	3
		A.2	Trigger	B.2	Software	–	–	–	–
		–	–	C	Hardware	C.2	Action	2	4

Stage III: Hazard identification

The third stage leverages the models developed in Stage II. The stage consists of three steps, leading toward creating a hazard catalog. These are:

Step 8: Identify hazard scenarios through ESD models.

The identification of hazards is initially based on pivotal ESD events, stemming from the failure paths of events associated with each agent (see Table 3). In this approach, hazards are defined at a sub-system level, that is, the underlying source of risk stems from a function or task that was not performed correctly or resulted in an undesired consequence. Therefore, a hazard scenario represents the failure path of an event, regardless of whether the event is related to a system failure or to an unsafe interaction resulting from an action that in other contexts may not have led to undesired consequences. Only ESD events associated with an agent are considered. This process allows the identification of (a) high-level hazard scenarios per operational phase. For example, the hazard associated with the ESD event “The ADS vehicle detects a DDT-fallback is required” is “The ADS vehicle fails to detect a DDT-fallback is required.” The output of this step is a list of hazards by operational phase.

Step 9: Elaborate the list of failure modes for each hazard scenario.

Each safety hazard listed in Step 8 is associated with multiple failure modes. This step consists of leveraging the equivalence tables developed in Step 7 to identify specific failure modes through the CoTA, STPA and FT models (Step 9a). This allows describing the hazard scenario in terms of (b) uncontrolled or failed feedback loops, key system-level interactions, and dependencies between agent tasks. Unsafe interactions leading to the identified hazards are determined by finding the explicit relations between tasks (CoTA) and errors in controlled feedback loops (STPA). Although UCAs do not necessarily stem from system failures or errors, the methodology focuses on the potential consequence of a UCA; for example, incorrect sensor data (not provided or provided out of sequence) can lead to detection failures (at a functional level) that lead to a hazard scenario. Given the detailed procedure to develop the CoTA and STPA diagrams, new hazards may also be identified from functional relationships not explicitly defined through the ESDs.

Each failure mode is associated with (c) a main risk contributor and/or the main agent responsible for preventing or mitigating the specific failure mode (Step 9b). For some failure modes, the risk contributor and the responsible agent may represent different functional modules of the same agent (e.g. ADS software affecting vehicle hardware), or from different agents (e.g. inspection crew procedural errors affecting vehicle operation). The FTs allow an alternative characterization of the failure mechanisms developed through the STPA (e.g. identifying the communication channel failures causing uncontrolled or failed feedback loops). Thus, this process allows the (d) identification of basic failure events by categories representing possible root causes (Step 9c).

Step 10: Construct hazard catalog.

The last step of the methodology is the consolidation of the results of Step 9. The output of this step is presented as a table, as shown in Table 4. Here, the hazard scenario is characterized by the operational phase where it may occur and the specific failure modes that lead to it, risk contributors, and responsible agents. Information such as the IDA error type (I – Information, D – Decision, A – Action), the UCA type (not provided, incorrect time/sequence, incorrect duration, or incorrect content), and the root cause category (e.g. human, software, hardware, procedural) provides additional details to enrich the hazard scenario’s description. Finally, information on the affected agent and function allows analysts to trace the effect of the contributing factors across multiple hazard scenarios.

Table 4.

Step 10 – example hazard scenario “event A failure.”

Hazard scenario	Operational phase	Failure mode	Risk contributor	Agent responsible	IDA error type	UCA type	Root cause category	Affected agent	Affected function
Event “A” failure	Phase B	FM 1.1	Agent I	Agent II	I	Not Provided, Incorrect Time/Sequence	Software	Agent I	Function 1.1
Event “A” failure	Phase A	FM 1.2	Agent II	Agent II	A	Not Provided, Incorrect Content	Human	Agent II	Function 2.3
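The catalog rows can be held as plain records, which makes the tracing described above (e.g. by responsible agent or root cause category) a simple grouping operation. The sketch below encodes the two example rows of Table 4; the `CatalogEntry` record and the `group_by` helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class CatalogEntry:
    hazard_scenario: str
    operational_phase: str
    failure_mode: str
    risk_contributor: str
    agent_responsible: str
    ida_error_type: str        # "I", "D", or "A"
    uca_types: tuple
    root_cause_category: str
    affected_agent: str
    affected_function: str

# The two example rows of Table 4.
catalog = [
    CatalogEntry('Event "A" failure', "Phase B", "FM 1.1", "Agent I", "Agent II", "I",
                 ("Not Provided", "Incorrect Time/Sequence"), "Software",
                 "Agent I", "Function 1.1"),
    CatalogEntry('Event "A" failure', "Phase A", "FM 1.2", "Agent II", "Agent II", "A",
                 ("Not Provided", "Incorrect Content"), "Human",
                 "Agent II", "Function 2.3"),
]

def group_by(entries, attribute):
    """Group failure modes by any catalog column, e.g. 'agent_responsible'."""
    groups = defaultdict(list)
    for entry in entries:
        groups[getattr(entry, attribute)].append(entry.failure_mode)
    return dict(groups)
```

For example, `group_by(catalog, "agent_responsible")` collects every failure mode that a given agent is expected to prevent or mitigate.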

Methodology outcomes

The constructed hazard catalog can be used for multiple purposes depending on the system analyzed and the scope of the analyst’s investigation. On one hand, it can provide the foundation for risk quantification efforts, which would require (e) an analysis of event consequences and likelihoods. Each hazard scenario can lead to various consequences, which can be expressed at a higher or lower level of analysis depending on the scope and objectives. Through the CoTA and STPA models, specific failure modes are linked to effects at a local level (i.e. the affected function or agent) and a system level (i.e. the end-states from the ESD). For instance, a scenario end-state can be characterized as a “potential collision with a passenger onboard,” leading to a severe consequence of “possible harm to life.” A more detailed analysis characterizing the vehicle speed, location, and external conditions can, in turn, lead to higher granularity when assessing the consequences: a collision at very low speed may lead to “possible injuries to passengers” rather than harm to life. These consequences can be evaluated through a semi-qualitative risk scale, for instance, based on the Automotive Safety Integrity Levels (ASIL) in ISO 26262.^33,79,80

On the other hand, it can provide insight into (f) the design of risk reduction activities to prevent the hazard scenario’s failure modes or mitigate its potential consequences. For instance, preventive actions would be related to inspection, maintenance, and procedural elements, as these impact the system’s state before operation. Mitigation actions, in contrast, are mainly associated with operational teams or the system reacting to multiple hazard scenarios during the system’s operation. A further examination of the agent identified as responsible for a failure mode and the root cause category can lead to an analysis of the resources required by the system and operating crew for a safe operation. The nature of these resources depends on the high-level responsibilities of the organization operating the system and its relation to the system’s designers and regulators. At a high level, the resources to support risk reduction activities can refer to procedures, training, tools, and working conditions (Table 5).

Table 5.

Examples of risk mitigation activity categories.

Activity type	Description
Operational Procedures	Operational guidelines developed to support the activities of the agents involved in the system’s operation, e.g., human operators and crew, software and hardware systems. These procedures include regulating the content, frequency, and requirements for agent interactions.
Operator & Crew Training	Specific training activities focused on the tasks of the human operators and crew that support the system’s operation. This includes familiarization with the operational procedures, required HSI functions, emergency procedures, and workplace safety guidelines, if applicable.
Hardware & Software Tools	Hardware and software tools necessary for the agents to perform expected tasks. These include necessary interaction and communication mechanisms, reliable connectivity conditions, and diagnostic tools.
Work Conditions	General policies and equipment that are designed to improve multiple aspects of workplace adequacy as well as human operator and crew performance.

Case study: Operational safety hazards of L4 ADS fleets in MaaS

This section presents a case study, applying the methodology to the operation of L4 ADS for MaaS to determine potential risk reduction activities. This section highlights the steps and key results for system modeling, scenario modeling and hazard characterization for the case of an ADS vehicle driving from origin to destination with passengers on board. In particular, results from modeling Steps 1 to 6, along with the hazard identification steps 9–10, are presented.

The case study’s scope consists of elements that are within the responsibilities of the fleet operator, and it is assumed that the ADS vehicles have undergone the necessary testing to obtain a license to operate at L4. Thus, events concerning hardware failures are developed from the perspective of inspection and maintenance activities (responsibility of the fleet operator), as opposed to system reliability (responsibility of the ADS developer and/or vehicle manufacturer). Finally, passengers and other road users are outside the scope of analysis and are considered to behave as expected.

Stage I: System modeling

The system analyzed (“reference fleet”) consists of light-duty passenger vehicles with L4 ADS capabilities operating without a safety driver. This reference fleet was validated with stakeholders representing different industry sectors and is considered a more complex and likely scenario for the future of L4 ADS. The fleet is managed by a fleet operator who has procured the vehicles from an ADS developer. The main role of the fleet operator is to ensure the correct and safe operation of the fleet, including the safety of passengers and other road users (cyclists, pedestrians, drivers).

The mobility services offered are ride-hailing services for passengers in limited geographic areas and under specific conditions. This ODD is limited to urban and suburban areas with adequate connectivity conditions for localization and communication purposes and relies on built-in HD maps available to the ADS developer. The fleet operator may also establish or operate within a more restrictive ODD to comply with connectivity and passenger communication needs.^9,81

Step 1: ADS Fleet Operations Functional Breakdown

The operation of the L4 ADS fleet is broken down into three agents with distinct functions, as described below (Figure 6, Table 6). The ADS-equipped vehicles are supported by two distinct entities: the Fleet Operations Center (FOC) monitors and supervises the ADS vehicle’s operation, and the Maintenance Operations Center (MOC) inspects, maintains, and stores the vehicles.

Figure 6.

Fleet operator subsystems and functions.

Table 6.

System breakdown for L4 ADS operations.

ADS Vehicle	FOC operators	MOC crew
Employ real-time monitoring data to perform DDT in established ODD.	Supervise vehicle operation and intervene when required (safety operator).	Follow ADS developers’ maintenance requirements to prevent vehicle failures.
Handle passenger interaction and communication requests.	Dispatch vehicles or provide waypoints to and from maintenance centers to initiate operational shifts or as requested by the MOC.	Report abnormal vehicle behavior, diagnostic logs, and other elements discovered through inspection and maintenance activities.
Transmit information and receive commands from FOC operators.	Handle passengers’ requests and contact first responders (service operator).	Perform pre-shift inspection before clearing vehicles for operation.
Detect DDT-fallbacks are required for ODD breaches, self-diagnosed failures, incidents.	Initiate post-incident procedures after the vehicle achieves MRC.	Perform low-complexity corrective and preventive maintenance procedures.
Determine and implement DDT-fallback outcome (MRC, etc.)	Detect DDT-fallbacks are required for ODD breaches, self-diagnosed failures, incidents.	Manage and recover vehicles stranded in MRC during operation.
Determine and implement DDT-fallback outcome (MRC, etc.)	Determine and implement DDT-fallback outcome (MRC, etc.)	Satisfy local regulations and reporting duties for post-incident procedures.

○ Level 4 Automated Driving System-Equipped Vehicle (ADS):

Each ADS vehicle is expected to perform autonomous driving tasks that are consistent with the definition of L4 throughout its operation. This includes implementing context-specific DDT fallback strategies in response to DDT performance-relevant system failure(s) or upon an ODD exit. Identifying the need for and implementing a DDT fallback is the responsibility of the ADS vehicle, relying on onboard diagnostic capabilities and built-in ODD enforcement mechanisms. If the ADS fails to do so, the FOC safety operators can provide support and send the command for a fallback.

Mobility service specific functionalities include key aspects of passenger-vehicle interaction, such as picking-up and dropping-off passengers assigned to it, enabling communication between the passenger and the FOC operators, and receiving commands from the FOC. Note that direct vehicle control is managed exclusively by the ADS software, which requires real-time monitoring data of the vehicle’s surroundings.

○ Fleet Operations Center (FOC):

The primary role of the FOC operators is to support the operations of the ADS vehicle. These functions may be further divided into a control center (focused on functional safety aspects) and a service operator (focused on mobility service aspects). Without onboard safety drivers, remote fleet supervisors may play an active role in ensuring passenger and vehicle safety, including monitoring tasks and providing indirect assistance (e.g. waypoints or commands for achieving MRC).^29,82

The FOC’s tasks include monitoring the vehicles’ overall operation, addressing the passengers’ requests, sending commands to the vehicle, and managing communication with the passenger. The FOC operators may dispatch the vehicle to specific locations or directly assign waypoints to the vehicle’s trajectory. In the event of an incident leading to a vehicle achieving MRC, the FOC operators are responsible for initiating post-incident procedures. Depending on the event’s severity, this may include contacting first responders and law enforcement, dispatching vehicle recovery teams or a secondary vehicle for the passenger, and reporting the incident for investigation, as expected to be required by local regulations or internal operational procedures. Depending on the trigger (an ODD breach, a safety-critical failure, an incident, or an emergency stop request), the vehicle is expected to perform the DDT fallback actions necessary to return to the ODD, achieve MRC, or implement other mitigation strategies.

Two concepts have been introduced to describe potential DDT-fallback scenarios: Stopped stable condition (SSC) and Minimal Risk DDT (MR-DDT).^83–85 The SSC corresponds to “A stable, stopped condition to which a user or an ADS may bring a vehicle after performing the DDT fallback triggered by a passenger or third-party request and when given trip may be continued.” In case the passenger’s request is due to their suspicion that the vehicle is acting unsafely, the FOC can command the vehicle to achieve MRC. If a third party triggers the SSC, the FOC operator communicates with the traffic agents, and depending on why the vehicle was stopped, they may command it to achieve MRC or to return to DDT. The MR-DDT is defined as a “limited” DDT and can be performed if a vehicle with no onboard passengers presents a non-safety-critical failure. It corresponds to “A mitigation strategy that addresses non-safety critical failures that do not require the vehicle to enter MRC but drive under restricted conditions.” The MR-DDT may refer to, for instance, driving at a lower speed or avoiding highways. The vehicle should perform MR-DDT to direct itself to the MOC and be scheduled for maintenance and repairs.
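The fallback concepts described above can be summarized as a simple selection rule. The sketch below is a simplified reading of the text, not the fleet's actual decision logic: the trigger names are hypothetical, and defaulting to MRC for every case the text does not cover (including a non-safety-critical failure with passengers on board) is a conservative assumption made here for illustration.

```python
from enum import Enum

class Fallback(Enum):
    SSC = "stopped stable condition"
    MR_DDT = "minimal risk DDT"
    MRC = "minimal risk condition"

# Hypothetical trigger names.
STOP_REQUESTS = {"passenger_stop_request", "third_party_stop_request"}

def select_fallback(trigger: str, passengers_onboard: bool) -> Fallback:
    """Map a fallback trigger to a mitigation strategy."""
    if trigger in STOP_REQUESTS:
        # The trip may continue afterwards, so the vehicle stops in SSC first.
        return Fallback.SSC
    if trigger == "non_safety_critical_failure" and not passengers_onboard:
        # Restricted driving (e.g. lower speed) toward the MOC.
        return Fallback.MR_DDT
    # Assumption: every other trigger is handled conservatively with MRC.
    return Fallback.MRC
```

After an SSC, the FOC operator may still command MRC or a return to DDT, which this sketch does not model.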

○ Maintenance Operations Center (MOC):

The ADS vehicle may arrive at the MOC under different scenarios: either recovered by a team, in MR-DDT, or under normal operational conditions. The ADS vehicle manufacturer is expected to establish minimum inspection and maintenance requirements that the MOC must enforce to minimize risks. In this case, the MOC’s responsibility is limited to performing low-complexity system inspections, maintenance, and general upkeep before vehicle operation. These procedures will likely be based on safety checklists developed by the ADS developer and updated based on operational experience.

Activities regarding system updates, sensor calibration, and more complex maintenance procedures are performed in coordination with or directly by the ADS developer. The MOC is expected to report to the FOC and external partners (ADS developer, vehicle manufacturer, authorities) for multiple purposes. For instance, the MOC is expected to communicate system or procedure upgrades to the FOC. The ADS developer may also require reports indicating high-level failures detected by the FOC operators or MOC crew. Further, the MOC is also expected to satisfy local regulations and reporting duties regarding post-incident procedures and investigations.

Step 2: ADS Operational Phase Breakdown

The approach selected for this case study is to define the operational phases focused on one of the agents and its relationship with the other agents. Figure 7 presents the operational profile based on the operational phases of the ADS-equipped vehicle. Note that an operational shift denotes the continuous operation of an ADS vehicle within a delimited time period. The distinct operational phases are briefly described below; further details can be found elsewhere.⁸⁴

(a) Pre-shift inspection and corrective maintenance: this process may include the MOC crew performing a checklist of all safety-critical subsystems and functions of the ADS vehicle, including reviewing the ADS onboard diagnostic logs and the inspection of the vehicle’s general state (e.g. verifying tire pressure, windshield integrity, battery levels). If needed, low-complexity maintenance actions can be performed (e.g. adjusting tire pressure, charging the battery).

(b) On-route to destination without passengers: when the ADS receives a dispatch command, the vehicle must perform all the required DDTs to reach the target destination. If any event triggers the need for a DDT-fallback, the ADS is expected to implement the adequate mitigation strategy (MRC, SSC, MR-DDT). If the vehicle has been stranded at MRC, the MOC must physically recover the vehicle.

(c) On-route to destination with passengers: this phase is defined between the passenger pickup and drop-off phases. This phase also considers the interactions between the passenger, the ADS vehicle, and the remote FOC operator. Passengers may request to communicate with the FOC operator or request the vehicle to perform an emergency stop. This request must be detected by the ADS, triggering fallback DDTs to achieve an SSC, and initiating communication with the FOC operator.

(d) Passenger pick-up & drop-off: when the ADS vehicle approaches the designated pickup or drop-off location, it is expected to achieve an SSC. Once the SSC is achieved, the passengers can safely board or exit the vehicle. In addition, the passenger is expected to follow the required safety instructions (closing doors, putting on seat belt, etc.) and confirm the trip details (i.e. drop-off location) before the ADS can initiate or end the trip.

(e) Post-incident management: the FOC must initiate the post-incident procedures if the vehicle has been prompted to achieve an MRC. At a minimum, it is expected that these procedures include (1) automatically disabling the ADS, turning on hazard lights (if not already on), unlocking doors, and disconnecting the main battery; (2) maintaining continuous passenger-FOC operator communication, if possible; (3) contacting first responders and/or law enforcement to aid the passengers or other road users affected.

(f) Preventive maintenance and system updates: including periodic vehicle inspections, service maintenance, software updates and instrument calibration processes. These activities are expected to occur at a lower frequency than the pre-shift inspections, and it is the responsibility of the fleet operator to maintain the schedule specified by the system’s manufacturers. System updates and instrumentation calibration activities are performed in coordination with the ADS developer. These may include low-level software and driving tests to verify the correct functionality of the vehicle.

Figure 7.

Operational phases of ADS vehicle for MaaS.

Stage II: Scenario modeling

This section presents selected results of the methodology’s scenario modeling stage, focusing on the role of the ADS vehicle and the FOC in ensuring safety during the operational phase “on-route with passengers.”

Step 3: Model the operational phase through an ESD

The ESDs and relevant end-states and intermediate events were developed for each operational phase. This section describes the modeling of the operational phase “on-route with passengers,” illustrated in Figure 8. The initiating event denominated “The vehicle is on-route to destination” refers to passenger trips after the pick-up phase has been completed (see Figure 7). Figure 8 shows the diverging paths based on the occurrence of key events (described in Appendix Table 1). The events related to the ADS vehicle and the remote operators at the FOC have been defined based on the IDA cognitive model (Information, Decision, Action). External events are also included for modeling purposes, providing the necessary context to assess the agents’ expected behavior outcome. The potential end-states of this operational phase are described in Table 7.

Figure 8.

ESD for on-route with passenger operational phase.

Table 7.

On-route with passengers ESD end states.

End state #	End state name	Outcome
1	Trip completed successfully	The ADS vehicle safely arrives at the destination.
2	ADS Vehicle is on-route to destination with passengers	The vehicle is in condition to start a passenger trip. This event initiates the operational phase: (2) on-route with passengers.
3	Post-incident procedures are initiated	The FOC initiates post-incident procedures once the vehicle engages in MRC.
4	Vehicle and passenger are stranded	The vehicle engages into MRC when passengers are on board, and the FOC fails to initiate post-incident procedures.
5	Passenger at risk	The vehicle fails to perform a DDT fallback while a passenger is on board. The vehicle cannot successfully achieve SSC or MRC automatically or with the assistance of the FOC operator. This also accounts for when the vehicle is stranded in SSC.

A successful ADS vehicle’s trip may be interrupted by the vehicle breaching the ODD, a vehicle failure, wrongly executed DDTs, or an unavoidable external event. Additionally, passengers on board may request an emergency stop. Based on real-time perception and localization data, the ADS is expected to detect that fallback DDTs are required, then plan and execute the DDT fallback. The fallback may lead to returning to nominal DDT, entering MR-DDT, or achieving MRC (Events 2-D, 2-E). The second safety barrier is the remote FOC operator. The role of the FOC is two-fold in this operational phase. The FOC safety operator may intervene if the ADS has either not detected that DDT fallbacks were necessary or has wrongly executed the selected fallback strategy, or when alerted by the service operator. In this case, if there is sufficient time for the operator to intervene, they are expected to plan and transmit the correct DDT fallback strategy (Events 2-D2, 2-D3). The service operator is responsible for contacting the passengers if an emergency stop has been requested or the vehicle has achieved MRC (2-N). The vehicle can only resume driving if the passenger confirms that the trip can continue (2-O). If communication with the passenger cannot be established or the passenger confirms the trip cannot continue, the FOC operator should command the vehicle to achieve MRC (2-D3) and initiate post-incident procedures (2-J).
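The two safety barriers and the end-states of Table 7 can be condensed into a short decision function. This is a deliberately simplified sketch: the boolean flags are hypothetical stand-ins for the pivotal ESD events, and it assumes that a fallback which does not end in MRC lets the trip resume and complete, which omits the SSC and passenger-confirmation branches of the full ESD.

```python
def end_state(*, fallback_needed: bool, ads_fallback_ok: bool,
              foc_intervention_ok: bool, mrc_reached: bool,
              post_incident_started: bool) -> str:
    """Return the Table 7 end-state name for one combination of pivotal outcomes."""
    if not fallback_needed:
        return "Trip completed successfully"
    # First barrier: the ADS; second barrier: the remote FOC safety operator.
    if not (ads_fallback_ok or foc_intervention_ok):
        return "Passenger at risk"
    if not mrc_reached:
        # Assumption: the vehicle returned to nominal DDT and finished the trip.
        return "Trip completed successfully"
    if post_incident_started:
        return "Post-incident procedures are initiated"
    return "Vehicle and passenger are stranded"
```

For instance, a missed detection by the ADS that the FOC operator corrects by commanding MRC, followed by post-incident procedures, maps to end-state 3.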

Step 4: Model agents’ tasks and interactions through CoTA

The events from the ESD are translated into the tasks each agent – ADS vehicle and FOC operators – must perform to ensure its success through the CoTA. The main goal of the ADS vehicle is to “Drive passengers to their destination.” To achieve this goal, the ADS vehicle is expected to perform the High-Level Tasks specified in Table 8. Note that all these tasks (except Task 4) are performed in parallel. Based on their definitions, Tasks 2, 3, and 6 depend on the success of Task 1, while Task 4, “Execute DDT fallback plan,” is only performed if triggered by the corresponding sub-levels of Task 3. As an example, a detailed redescription of Task 4 is provided in Appendix Table 2.

Table 8.

ADS vehicle high-level tasks for the “on-route with passengers” operational phase.

Subtask number	Subtask	Type	Description
1	Perform DDT OEDR supporting functions	Parallel	Continuous collection and analysis of information about the vehicle’s state and surroundings from its sensor suite (including cameras, environmental sensors, and component sensors). A prescribed set of processed information is transmitted to the FOC.
2	Perform DDT planning and execution.	Parallel	Implementation of Dynamic Driving Tasks (DDT) by employing the processed information about the surrounding objects and events, local rules, and the vehicle’s current trajectory to determine the optimal trajectory of the vehicle and, if necessary, implement tactical maneuvers.
3	Determine if a DDT fallback is required.	Parallel/Trigger	Continuous assessment of DDT fallback plans. A DDT fallback plan can be triggered by breaching the ODD, a vehicle failure, a collision, a passenger requesting an emergency stop, or an external party requesting the vehicle to stop.
4	Execute DDT fallback plan.	Sequential/Triggered	This is triggered by the outcome of Task 3. Task 4 includes planning, implementing, and evaluating the outcome of a DDT fallback plan. The ADS may request a fallback plan from the FOC if it cannot develop a fallback strategy.
5	Communication with FOC and passengers	Parallel	Continuous vehicle and trip data transmission to the FOC, communication with passengers, and reception of FOC commands.
6	Perform self-diagnostic tasks.	Parallel	Continuous monitoring and diagnosis of the vehicle’s health state. The ADS then transmits the diagnostic outcomes to the FOC.
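The task relations described above (Tasks 2, 3, and 6 depending on Task 1, and Task 4 being triggered by Task 3) can be sketched as a small data structure. This is only an illustrative encoding of Table 8, not part of the CoTA method itself: the `Task` class, the helper names, and the choice to propagate a failure transitively through dependency and trigger links are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, List


@dataclass(frozen=True)
class Task:
    number: int
    name: str
    kind: str                          # "parallel", "parallel/trigger", or "triggered"
    depends_on: Tuple[int, ...] = ()   # tasks whose success this task relies on
    triggered_by: Optional[int] = None # task whose outcome starts this task


# High-level ADS tasks transcribed from Table 8.
ADS_TASKS = {
    1: Task(1, "Perform DDT OEDR supporting functions", "parallel"),
    2: Task(2, "Perform DDT planning and execution", "parallel", depends_on=(1,)),
    3: Task(3, "Determine if a DDT fallback is required", "parallel/trigger", depends_on=(1,)),
    4: Task(4, "Execute DDT fallback plan", "triggered", triggered_by=3),
    5: Task(5, "Communication with FOC and passengers", "parallel"),
    6: Task(6, "Perform self-diagnostic tasks", "parallel", depends_on=(1,)),
}


def active_tasks(fallback_triggered: bool) -> List[int]:
    """Parallel tasks always run; the triggered task runs only after its trigger."""
    return sorted(
        t.number for t in ADS_TASKS.values()
        if t.triggered_by is None or fallback_triggered
    )


def affected_by_failure(failed: int) -> List[int]:
    """Tasks reachable from a failed task via dependency or trigger links.

    Transitive propagation is a modeling assumption of this sketch.
    """
    affected, frontier = set(), [failed]
    while frontier:
        current = frontier.pop()
        for t in ADS_TASKS.values():
            linked = current in t.depends_on or t.triggered_by == current
            if linked and t.number not in affected:
                affected.add(t.number)
                frontier.append(t.number)
    return sorted(affected)
```

Under these assumptions, a failure of Task 1 reaches Tasks 2, 3, and 6 directly and Task 4 through the Task 3 trigger, which is one way to see why the OEDR supporting functions underpin most of the vehicle's other tasks.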

The main goal of the FOC operators is to “Support the vehicle to arrive at its destination.” To achieve this goal, the FOC operators must successfully perform the High-Level Tasks specified in Table 9. In this case, the four tasks are expected to be performed in parallel and may be further divided into safety and service operator tasks. Tasks 1, 2, and 3 may be performed by “safety operators,” while “service operators” may focus on performing Task 4. Figure 9 provides a simplified version of the CoTA diagram developed for the FOC. Note that Task 3 may trigger “post-incident procedures” if the ADS vehicle alerts that the MRC has been achieved. As an example, a detailed redescription of the FOC operator’s Task 3 is provided in Appendix Table 3.

Table 9.

FOC high-level tasks for the “on-route with passengers” operational phase.

Subtask number	Subtask	Type	Description
1	Monitor ADS vehicle operation and safety.	Parallel	Continuous monitoring of fleet operations. Attention is focused on specific vehicles depending on alerts received, trip status, weather, and road conditions to evaluate their safety.
2	Communicate with the vehicle.	Parallel	Transmitting commands or requests for specific information to the ADS and responding to ADS requests.
3	Intervene in vehicle operation when required.	Parallel/Trigger “Post-incident procedures”	Evaluating the need for a DDT fallback plan based on the received information as a secondary safety barrier. Note that identifying that a fallback plan is needed and planning it are the main responsibilities of the ADS vehicle. If MRC is achieved, post-incident procedures are triggered.
4	Provide support to passengers.	Parallel	Monitoring passenger communication channels or emergency stop requests and establishing communication with passengers if required.

Figure 9.

FOC subtasks while the vehicle is on route without passengers.

Developing the CoTA diagrams not only explores agent-specific tasks in depth but also identifies the key interface tasks. Dedicated trigger tasks are included to represent specific scenarios; for instance, the sub-levels of ADS Task 3 each trigger specific subtask responses within Task 4, “Execute DDT fallback plan.” The FOC tasks operate in response to the ADS vehicle’s trigger tasks, and their success is inherently limited by the time available for the FOC operator to achieve sufficient situational awareness. This may depend on the human-system interface (HSI) available to the operator, whose characteristics and requirements should be specified and/or provided to the fleet operator by the ADS developer. The time-sensitive nature of these interface tasks emphasizes the need for adequate communication channels and may lead to stricter ODD requirements for passenger trips. This aspect is further discussed in Section 3.3.

Step 5: Model the failure of key pivotal ESD events through FTs

The failure paths of some pivotal ESD events are expanded through FTs focused on each agent, functionally decomposed down to the selected “Basic Events” described in Table 10. The agents directly participating in the “On-route with passengers” phase are the ADS vehicle and the FOC operators. However, the passive role of the MOC is shown by expanding pivotal events from this ESD.

Table 10.

Selected basic events for developed FTs.

Basic Event	Name	Description	Factors
BE1	FOC-related error	FOC operators fail to follow the required procedure to act upon the available information when intervention is needed to ensure the safety of the vehicle, passengers, or other road users.	Inadequate FOC procedures; failure to follow procedures; inadequate training; inadequate tools.
BE2	MOC-related error	MOC crew fail to follow the required procedure provided by the ADS developer to ensure the vehicle’s safety through inspection and/or maintenance actions.	Inadequate MOC procedures; failure to follow procedures; inadequate training; inadequate tools.
BE3	Hardware-related malfunction	This refers to hardware malfunctions occurring during the vehicle’s operation. The MOC is responsible for informing the ADS developer of these failures.	Inadequate MOC procedures; limited hardware reliability.
BE4	Software-related malfunction	This basic event is divided into software reliability failures and software design limitations. Software reliability failures are malfunctions occurring during the vehicle’s operation. Software design limitations are limitations that negatively affect the vehicles’ expected performance. The fleet operator is responsible for informing the ADS developer of these failures and adopting system updates if required.	This basic event is divided into (a) inadequate MOC procedures, limited hardware reliability, and (b) software design limitations from the ADS developer.
BE5	External event	An external and unavoidable event has occurred, negatively affecting the vehicles’ expected performance. The fleet operator is responsible for informing the ADS developer of these failures and adopting system updates if required.	Other road users; wireless communication provider failure, etc.
BE6	Procedure design is inadequate	The procedure provided by the ADS developer is inadequate to address the event. The fleet operator is responsible for informing the ADS developer of these limitations and adopting operational updates if required.	ADS developer error or procedure limitation.

An example FT associated with the operational phase “on-route with passengers” is presented in this section. The top event, “II-1: FOC operator fails to detect DDT-fallback is required,” is related to the ESD event 2-D2 (Figure 10). This event is key to recovering from the ADS vehicle’s failure to detect or correctly implement a DDT fallback. In this case, the top event may be caused by communication errors between the FOC operator and the ADS vehicle or by the FOC operator failing to follow established emergency procedures. The intermediate events are described in Appendix Table 4 and Appendix Figure 1.
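A fault tree of this kind can be evaluated mechanically once its gates are written down. The sketch below is a minimal illustration, not the tree in Figure 10: the two intermediate events mirror the causes named in the text, but their labels (`comm_error`, `procedure_not_followed`) and the assignment of basic events from Table 10 to each branch are assumptions made for the example.

```python
from itertools import product

# Gate nodes map to (gate_type, children); any name not listed is a basic event.
# Branch structure and basic-event assignment are illustrative assumptions.
FAULT_TREE = {
    "II-1": ("OR", ["comm_error", "procedure_not_followed"]),
    "comm_error": ("OR", ["BE3", "BE4", "BE5"]),
    "procedure_not_followed": ("OR", ["BE1", "BE6"]),
}


def cut_sets(node):
    """Return the cut sets of `node` as a set of frozensets of basic events."""
    if node not in FAULT_TREE:               # basic event
        return {frozenset([node])}
    gate, children = FAULT_TREE[node]
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":                         # any child's cut set is enough
        return set().union(*child_sets)
    # AND gate: one cut set from every child must occur together
    return {frozenset().union(*combo) for combo in product(*child_sets)}


def minimal(sets):
    """Drop any cut set that strictly contains another cut set."""
    return {s for s in sets if not any(other < s for other in sets)}


def occurs(node, active):
    """Evaluate whether `node` occurs given the set of active basic events."""
    if node not in FAULT_TREE:
        return node in active
    gate, children = FAULT_TREE[node]
    results = [occurs(child, active) for child in children]
    return any(results) if gate == "OR" else all(results)
```

With only OR gates, every basic event is a single-event minimal cut set, so any one of them is sufficient for the top event in this simplified tree. A tree with AND gates would yield multi-event cut sets through the same functions.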

Step 6: Model agent’s functions and interactions through STPA

Figure 10.

Top Event II-1 FOC operator fails to detect DDT-fallback is required.

Following the procedure to apply STPA, the operation of the reference fleet of L4 ADS vehicles is considered within the broader MaaS context.^86,87 The following stakeholders are considered: S-1 Passengers, S-2 Other road users, S-3 Regulatory bodies, S-4 Law enforcement, S-5 First responders, S-6 Urban planning, S-7 Fleet operator, and S-8 ADS developers. Each of the stakeholders involved in the operation of the ADS vehicle may be associated with high-level loss scenarios. Of the loss scenarios presented in Table 11, L-1 to L-3 directly relate to safety. While L-4 and L-5 may have safety connotations depending on the scenario, L-6 would have other consequences not related to safety, which would nonetheless affect the ADS’s operation.

Table 11.

Identified loss scenarios.

No.	Loss description	Relevant stakeholder
L-1	Injury or loss of life of passengers or other road users.	S-1, S-2, S-3, S-4, S-5, S-7, S-8
L-2	Vehicle damage due to collisions with objects.	S-1, S-2, S-4, S-5, S-7, S-8
L-3	Loss of vehicle connection.	S-1, S-7, S-8
L-4	Loss of mission (trip could not be completed).	S-1, S-7, S-8
L-5	Vehicles are stranded.	S-2, S-4, S-6
L-6	Loss of reputation.	S-3, S-6, S-7, S-8

The system components and boundaries are defined based on Section 3.1 to identify the system-level hazards leading to these loss scenarios. The system-level hazards, constraints, and associated loss scenarios are presented in Table 12.

Table 12.

Identified system-level hazards, system-level constraints, and related loss scenarios.

Hazard No.	System-level hazard	Constraint no.	System-level constraint.	Relevant loss scenario
H-1	Vehicle does not execute DDT correctly.	SC-1	Vehicle must execute DDT correctly under the conditions specified in the ODD.	L-1, L-2, L-6
H-2	Vehicle suffers a safety-critical failure.	SC-2	Fleet operators must minimize the risk of safety-critical failures through adequate inspection and maintenance procedures.	L-1, L-2, L-3, L-6
H-3	Vehicle breaches established ODD.	SC-3	Vehicle must implement adequate DDT procedures to remain within the specified ODD and perform DDT fallback strategies if necessary to return to the ODD.	L-3, L-4, L-6
H-4	FOC does not intervene when required.	SC-4	Fleet operators must monitor and supervise the vehicle at all times and intervene to support the vehicle’s operation.	L-1, L-2, L-3, L-5, L-6
H-5	MOC clears the vehicle for operation incorrectly.	SC-5	Fleet operators must establish maintenance and inspection procedures in coordination with the ADS vehicle manufacturer and ensure these are followed.	L-1, L-3, L-4, L-6
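The links in Table 12 can be combined with the grouping of losses given in the text (L-1 to L-3 directly safety-related, L-4 and L-5 possibly safety-related depending on the scenario, L-6 not safety-related). The sketch below is one possible encoding; the category labels and function names are hypothetical and chosen only for this illustration.

```python
# Hazard-to-loss links transcribed from Table 12.
HAZARD_LOSSES = {
    "H-1": ["L-1", "L-2", "L-6"],
    "H-2": ["L-1", "L-2", "L-3", "L-6"],
    "H-3": ["L-3", "L-4", "L-6"],
    "H-4": ["L-1", "L-2", "L-3", "L-5", "L-6"],
    "H-5": ["L-1", "L-3", "L-4", "L-6"],
}

# Grouping taken from the text; the label names are assumptions of this sketch.
LOSS_CATEGORY = {
    "L-1": "safety", "L-2": "safety", "L-3": "safety",
    "L-4": "conditional", "L-5": "conditional",
    "L-6": "non-safety",
}


def losses_by_category(hazard):
    """Group the losses linked to one hazard by their safety category."""
    grouped = {"safety": [], "conditional": [], "non-safety": []}
    for loss in HAZARD_LOSSES[hazard]:
        grouped[LOSS_CATEGORY[loss]].append(loss)
    return grouped


def hazards_linked_to(loss):
    """List the hazards that can lead to a given loss scenario."""
    return [h for h, losses in HAZARD_LOSSES.items() if loss in losses]
```

For example, `losses_by_category("H-3")` separates the ODD-breach hazard into one directly safety-related loss (L-3), one conditional loss (L-4), and the reputational loss (L-6).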

The hierarchical control diagram based on the previous definitions is presented in Figure 11. The control structure considers the functional breakdown, associated control actions, and feedback loops. The functions of the three subsystems (i.e. agents) are organized in a vertical hierarchy, where the FOC and MOC controllers are at the highest level while the interaction of the ADS vehicle and its surroundings, “World,” is at the lowest level. Note that controllers receiving and emitting commands are placed at the same horizontal level. This modeling decision represents that the ADS vehicle’s operation is contingent on the fleet operator’s preventive and mitigative actions (FOC, MOC). The interactions with external agents can be summarized into five distinct types. The system’s outputs (O) and entry information (E) are exchanged with the ADS vehicle’s surroundings (O1, E1), passengers (O2, E2), and third parties during vehicle operation (O3, E3), while interchanges with the ADS developer (O4, E4) and regulatory entities (O5, E5) focus on the management role of the fleet operator. Within the fleet operator, interactions between functional components are denoted with a “C” for control commands and an “F” for feedback; they are numbered and carry a tag letter (a-d) to represent complementary actions (either emitted or received by the same controller). A brief overview of the diagram is given in this section. Details about the control actions are described in Appendix Table 5 and Appendix Table 6.

Figure 11.

Fleet operations control diagram.

The first interface of the vehicle to its surroundings is the collection and processing of sensor data (F1) to support diagnostic (F2) and DDT planning (F3) functions. Likewise, sensor data from the vehicle’s physical systems is collected and processed for diagnostic purposes (F4). The vehicle’s physical behavior (C4) may be altered by evolving local path generation plans (C1), navigation goals, ODD constraints (C2), DDT-fallback triggers (C5), or commands received from the FOC or passengers (C6). The ADS transmits vehicle health data (F5) and receives commands from the FOC to communicate with passengers (C7) and intervene in the event of emergencies (C8). Communications between the FOC and the MOC consist of coordinating inspection and maintenance activities, post-incident procedures, and vehicle recovery (C11, C12). The MOC coordinators act as the main managers of MOC crew activities, relaying information to internal maintenance operations (C13, C14) and the vehicles’ recovery team (C10). Abnormal vehicle conditions discovered through inspection and maintenance operations are reported to the MOC (F13, F14), while system updates are managed in cooperation with the ADS developer (C15). Other interactions between the MOC and the ADS developer consist of management of change, coordination of external maintenance operations (C16), and fleet operations review (C17).

Step 7: Determine model connections and redundancies

Stage II was conducted for all the operational phases defined in Stage I. Eight ESDs were developed to represent the operational phases, containing over 100 events related to the agents’ performance and 41 distinct end-states. Following this, 16 CoTA models were developed based on the ESDs, identifying over 200 tasks for the ADS, FOC, and MOC agents. Then, 13 ESD events were selected for further exploration through FTs, decomposing the top failures into over 120 intermediate events characterized by the six basic failure events detailed in Table 10. Finally, the STPA model developed is summarized in 38 control actions and 35 distinct feedback responses. The information generated at this stage was organized as equivalency tables, such as the example presented in Table 3.

Stage III: Hazard identification

Implementing Steps 8–9 resulted in a total of 43 high-level hazards identified associated with 912 failure modes, providing traceability of multiple failure modes and agent interactions. Of these failure modes, 211 (23%) were unique, meaning these were connected to only one of the 43 hazard scenarios. Table 13 summarizes the failure modes identified by agent and method.

Table 13.

Failure modes identified per agent and method, reported as total (unique).

Agent/method	CoTA	STPA	FT	CoTA/FT	STPA/CoTA	FT/STPA	All	Total	Percentage (%)
ADS	60 (18)	62 (14)	43 (7)	57 (11)	37 (9)	22 (7)	140 (19)	421 (85)	40
FOC	123 (35)	48 (9)	22 (6)	10 (5)	44 (8)	0	12 (4)	259 (67)	32
MOC	49 (17)	128 (19)	23 (10)	14 (6)	12 (3)	0	6 (4)	232 (59)	28
Total	232 (70)	238 (42)	88 (23)	81 (22)	93 (20)	22 (7)	158 (27)	912 (211)
Unique, %	30%	18%	26%	27%	22%	32%	17%	23%
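The row totals, column totals, and percentages in Table 13 can be cross-checked with a few lines of arithmetic. The sketch below transcribes the agent rows as (total, unique) pairs and recomputes the derived cells; the variable names are illustrative only.

```python
METHODS = ["CoTA", "STPA", "FT", "CoTA/FT", "STPA/CoTA", "FT/STPA", "All"]

# (total, unique) pairs per agent, in the column order of METHODS (Table 13).
TABLE_13 = {
    "ADS": [(60, 18), (62, 14), (43, 7), (57, 11), (37, 9), (22, 7), (140, 19)],
    "FOC": [(123, 35), (48, 9), (22, 6), (10, 5), (44, 8), (0, 0), (12, 4)],
    "MOC": [(49, 17), (128, 19), (23, 10), (14, 6), (12, 3), (0, 0), (6, 4)],
}


def sum_pairs(cells):
    """Sum a list of (total, unique) pairs component-wise."""
    return sum(t for t, _ in cells), sum(u for _, u in cells)


# Row totals per agent and overall totals.
agent_totals = {agent: sum_pairs(cells) for agent, cells in TABLE_13.items()}
grand_total = sum(t for t, _ in agent_totals.values())
grand_unique = sum(u for _, u in agent_totals.values())

# Column totals per method and the share of unique failure modes in each column.
column_totals = {
    method: sum_pairs([TABLE_13[agent][i] for agent in TABLE_13])
    for i, method in enumerate(METHODS)
}
unique_share_by_method = {
    method: round(100 * unique / total)
    for method, (total, unique) in column_totals.items()
}

# Share of the unique failure modes attributed to each agent.
unique_share_by_agent = {
    agent: round(100 * unique / grand_unique)
    for agent, (_, unique) in agent_totals.items()
}
```

The recomputed values agree with the table: 912 failure modes in total, of which 211 are unique. The "Percentage (%)" column (40, 32, 28) is reproduced when each agent's unique count is divided by the 211 unique failure modes, which suggests that the column reports shares of unique rather than total failure modes.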

The distribution of failure modes characterizing each hazard scenario strongly depends on the focus with which Stage II was developed. In this application, in addition to failure modes directly related to an ESD event’s failure path (i.e. the hazard scenario), we have chosen to incorporate indirect failure modes explicitly. These originate from parallel and trigger tasks (CoTA) and dependent information pathways (STPA) leading to specific failure modes. Note that only the lowest levels of CoTA tasks and FT events were considered in the final list of failure modes. Likewise, FTs were only developed for selected ESD events, hence covering a limited number of contributing failure modes. In relation to the system’s agents, the ADS has the highest share of contributing failure modes (40%), followed by the FOC operators (32%) and the MOC crew (28%). The CoTA method provided more contributing failure modes related to the FOC operators, and FTs provided more detail on issues with the ADS vehicle. This relation may be expected, given that CoTA was developed specifically to address human-system interactions in complex systems, while FTs have been widely used to describe hardware and software failure events. STPA, on the other hand, contributed significantly to characterizing procedural issues of the MOC crew, particularly procedures not fully captured by the ESD-based scenario development (e.g. management of change issues). The highest overlap between the methods is observed for the ADS vehicle, which accounts for 140 (19 unique) of the 158 (27 unique) failure modes identified by all three methods.

Figure 12 provides an overview of the distribution of failure modes by method, considering (a) the total number and (b) those uniquely connected to a hazard scenario. Focusing on the 211 unique failure modes, CoTA generated 66%, STPA 45%, and FT 37% (Figure 12(b)). Considering the failure modes identified by a single method, CoTA is responsible for 33%, STPA for 20%, and FT for 11%. However, there were also several redundant failure modes – 23% were identified by two methods (10% CoTA/FT, 9% CoTA/STPA, and 3% STPA/FT) and 13% by all three.
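These shares follow from the unique counts in the “Total” row of Table 13 when each column is treated as one region of a three-set Venn diagram. The sketch below shows the arithmetic; the function names are illustrative.

```python
# Unique failure modes per Venn region, from the "Total" row of Table 13.
UNIQUE_BY_REGION = {
    frozenset({"CoTA"}): 70,
    frozenset({"STPA"}): 42,
    frozenset({"FT"}): 23,
    frozenset({"CoTA", "FT"}): 22,
    frozenset({"CoTA", "STPA"}): 20,
    frozenset({"STPA", "FT"}): 7,
    frozenset({"CoTA", "STPA", "FT"}): 27,
}
TOTAL_UNIQUE = sum(UNIQUE_BY_REGION.values())


def coverage(method):
    """Percentage of unique failure modes that `method` found, alone or jointly."""
    hits = sum(n for region, n in UNIQUE_BY_REGION.items() if method in region)
    return 100 * hits / TOTAL_UNIQUE


def share_found_by(k):
    """Percentage of unique failure modes found by exactly k methods."""
    hits = sum(n for region, n in UNIQUE_BY_REGION.items() if len(region) == k)
    return 100 * hits / TOTAL_UNIQUE
```

Because a failure mode found by two or three methods counts toward each of them, the per-method coverages (66%, 45%, 37%) sum to more than 100%, whereas the shares found by exactly one, two, or three methods partition the 211 unique failure modes.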

Figure 12.

Distribution of identified failure modes by method (a) Total, (b) Unique.

Each failure mode is associated with a single risk contributor described in Table 14. These are selected based on the functional breakdown of each agent performed during the system and scenario modeling stages. This division of agent functions is valuable in determining hazard prevention and mitigation responsibilities (see Section 3.4 for further details).

Table 14.

Risk contributor breakdown and description.

Agent	Risk contributor		Description
ADS	A1	ADS vehicle	Refers to specific hardware of the vehicle, e.g., instrumentation and motion control.
	A2	ADS software	Refers to the ADS and other software-controlled processes of the vehicle.
	A3	ADS communication	Refers to the communication channels’ functionality, including hardware and software.
FOC	F1	FOC safety operator	Refers to remote operators located at the FOC’s control center, focused on functional safety aspects. Monitoring the vehicle’s safety and intervening to ensure the vehicle’s safety are the responsibilities of the safety operator.
	F2	FOC service operator	Refers to remote operators located at the FOC’s control center, focused on mobility service aspects. Communications with passengers, first responders, and law enforcement are the responsibility of the service operator.
	F3	FOC communication	Refers to the functionality of the communication channels, including both hardware and software.
MOC	M1	MOC maintenance crew	Refers to the crew stationed at the MOC’s maintenance center, responsible for performing corrective and preventive maintenance to the vehicle and reporting to the MOC coordination center.
	M2	MOC inspection crew	Refers to the crew stationed at the MOC’s inspection center, responsible for performing pre-shift vehicle inspections and reporting to the MOC coordination center.
	M3	MOC coordinators	Refers to staff stationed at the MOC, responsible for coordinating the actions between the inspection and maintenance team, as well as dispatching the vehicle recovery team, and communicating with external parties.
	M4	MOC communication	Refers to the functionality of the communication channels between the MOC units and with the FOC or external parties involved.

As an example, Table 15 presents the hazard scenarios identified during the “on-route with passengers” operational phase. This table provides the explicit connections between events developed to describe the ESD and the functions and interactions that support the success of events. The latter are derived from the corresponding CoTA, STPA, and FT diagrams. This aims to demonstrate how contextual information given by the operational phases determines which tasks and interactions are required to perform each subsystem’s functions.

Table 15.

Selected hazard scenarios mapped to modeling methods.

ID#	Agent	Hazard Scenario	Event (ESD)	Task (CoTA)	Action (STPA)	Tree (FT)
2.1.1	ADS	Fails to detect DDT-fallback is required.	2-A	A1, A2	F1a, F2a, F3a, C1, C3, C4	I-1
2.1.2	ADS	Fails to perform DDT-fallback correctly.	2-D	A3, A5	F3a, F4a, F3b, C5a	I-1
2.1.3	ADS	Fails to request post-incident management procedures.	2-E	A4.1, A4.2, A4.3	C1, C2b, C3, C4	I-2
2.2.1	FOC	Fails to detect DDT fallback is required.	2-D2	F3.1	F2b, F5b, C5b, F8b	II-1
2.2.2	FOC	Fails to send correct DDT fallback command.	2-D3	F3.2	C8b, C6c	II-2
2.2.3	FOC	Fails to initiate post-incident procedures.	2-J	F3.3	C8b, C11b	II-3
2.2.4	FOC	Fails to communicate with passenger.	2-N	F3.1.7.2	C7a, F7a, F8a, C7b	II-1
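A mapping like Table 15 lends itself to simple traceability queries, for instance finding control actions or feedback signals that contribute to more than one hazard scenario. The sketch below transcribes only the STPA and FT columns; the dictionary layout and function names are assumptions made for this illustration.

```python
from collections import defaultdict

# STPA actions and FT identifiers per hazard scenario, transcribed from Table 15.
SCENARIOS = {
    "2.1.1": {"stpa": ["F1a", "F2a", "F3a", "C1", "C3", "C4"], "ft": "I-1"},
    "2.1.2": {"stpa": ["F3a", "F4a", "F3b", "C5a"], "ft": "I-1"},
    "2.1.3": {"stpa": ["C1", "C2b", "C3", "C4"], "ft": "I-2"},
    "2.2.1": {"stpa": ["F2b", "F5b", "C5b", "F8b"], "ft": "II-1"},
    "2.2.2": {"stpa": ["C8b", "C6c"], "ft": "II-2"},
    "2.2.3": {"stpa": ["C8b", "C11b"], "ft": "II-3"},
    "2.2.4": {"stpa": ["C7a", "F7a", "F8a", "C7b"], "ft": "II-1"},
}


def shared_actions(scenarios):
    """Map each STPA action listed in more than one scenario to those scenarios."""
    index = defaultdict(list)
    for scenario_id, row in scenarios.items():
        for action in row["stpa"]:
            index[action].append(scenario_id)
    return {action: ids for action, ids in index.items() if len(ids) > 1}


def scenarios_by_tree(scenarios):
    """Group hazard scenarios by the fault tree that expands them."""
    index = defaultdict(list)
    for scenario_id, row in scenarios.items():
        index[row["ft"]].append(scenario_id)
    return dict(index)
```

In this excerpt, the query shows that C8b is shared by scenarios 2.2.2 and 2.2.3, and that fault tree II-1 supports both 2.2.1 and 2.2.4. This is the kind of cross-scenario link the equivalency tables are meant to expose.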

Hazard scenario #2.2.1, “FOC does not detect a DDT fallback is required,” is discussed in detail as an example. This hazard is characterized by the main failure modes provided in Table 16 and the related indirect failure modes shown in Table 17. As the hazard derives from the failure path of the event 2-D2: “FOC detects DDT fallback is required” (Table 15), the success of this event is described by the task F3.1: “Determine if a DDT fallback plan is required” (see Appendix Table 3). To determine that a DDT fallback is required, the FOC operator must assess whether the vehicle has breached the ODD (F3.1.1). This requires the operator to be trained in what is contemplated within the vehicle’s ODD (F3.1.1.1) and to have the tools to assess the vehicle’s location and/or surroundings in real time (F3.1.1.2).

Table 16.

Example hazard scenario #2.2.1 main failure modes.

Risk Contributors	Failure Mode Fails to/Fails to provide:		Causal Factor leading to UCA
FOC safety operator	F1.1	Evaluate if the ODD is breached.	Not Provided, Incorrect Time/Sequence
	F1.2	Determine if there is an ADS vehicle failure.
	F1.3	Determine if a collision has occurred.
	F1.4	Determine if a passenger has requested an emergency stop.
	F1.5	Determine if external party asked for a stop.	Not Provided, Incorrectly Provided
	F1.6	Evaluate state of passengers and vehicle.	Not Provided, Incorrectly Provided
FOC communication	F3.1	Receive request from ADS.	Not Provided, Incorrect Time/Sequence, Incorrect Duration
FOC communication	F3.2	Receive the outcome of DDT fallback implementation.	Not Provided, Incorrect Time/Sequence, Incorrect Duration

This table is read as: The [Risk Contributor] presents the failure mode [Fails to/Fails to provide] potentially caused by [Causal Factor leading to UCA].

Table 17.

Risk contributors	Failure mode Fails to/Fails to provide:		Causal factor leading to UCA	Related hazard
ADS software	A2.1	Transmit outcome of self-diagnostic tests	Not Provided, Incorrect Time/Sequence	2.1.1
	A2.2	Detect a vehicle communication channel failure
	A2.3	Processed perception data for FOC operator supervision
	A2.4	Detect an external connectivity failure
	A2.5	Recorded diagnostic logs for FOC operator supervision
	A2.6	Determine if external party requested a stop	Incorrect Time/Sequence, Incorrect Duration
	A2.7	Determine if a passenger has requested an emergency stop	Not Provided, Incorrect Time/Sequence, Incorrectly Provided
	A2.8	Use updated/correct HD maps	Not Provided, Incorrect Time/Sequence, Incorrectly Provided
	A2.9	Enforce updated/correct ODD limits	Not Provided, Incorrectly Provided
ADS communication	A3.1	Request plan for DDT fallback from FOC	Not Provided, Incorrect Time/Sequence	2.1.2
	A3.2	Alert that DDT fallback is required
	A3.3	Request maintenance scheduling
	A3.4	Transmit communication from passenger to FOC	Not Provided, Incorrect Time/Sequence, Incorrect Duration
	A3.5	Respond to request for information
	A3.6	Transmit communication from vehicle to FOC
	A3.7	Transmit to FOC prescribed information		2.1.1
	A3.8	Maintain stable communication with FOC	Incorrect Time/Sequence, Incorrect Duration	2.1.1
FOC safety operator	F1.7	Monitor ADS vehicle operations	Not Provided, Incorrect Duration	–
	F1.8	Evaluate ADS vehicle safety	Not Provided, Incorrect Duration
	F1.9	Determine if more information is needed	Not Provided, Incorrect Time/Sequence
	F1.10	Evaluate information from ADS	Not Provided, Incorrect Time/Sequence, Incorrect Duration
	F1.11	Respond to ADS request	Not Provided, Incorrect Time/Sequence, Incorrect Duration
	F1.12	Confirm operational guidelines update	Not Provided, Incorrectly Provided
FOC service operator	F2.1	Receive requests from passengers	Not Provided, Incorrect Time/Sequence, Incorrect Duration	–
	F2.2	Communicate with passengers	Not Provided, Incorrect Time/Sequence, Incorrect Duration
	F2.3	Alert of passenger emergency stop request	Not Provided, Incorrect Time/Sequence, Incorrectly Provided
	F2.4	Alert DDT fallback is required, request secondary vehicle.	Not Provided, Incorrect Time/Sequence
MOC coordinators	M3.1	Update operational procedures.	Not Provided, Incorrect Time/Sequence, Incorrect Duration, Incorrectly Provided	–
MOC coordinators	M3.2	Confirm procedure update.	Not Provided, Incorrect Time/Sequence	–

This table is read as: The [Risk Contributor] presents the failure mode [Fails to/Fails to provide] potentially caused by [Causal Factor leading to UCA] and also contributes to the hazard scenario [Related Hazard].

Similarly, the operator is expected to assess whether the vehicle has failed to respond adequately if an on-board failure has been detected, a collision has occurred, an external party (i.e. law enforcement or first responders) has requested a stop, or a DDT fallback plan was not executed correctly. As detailed in Table 9, Task 3 is expected to be performed by the FOC operator in parallel with continuously monitoring the vehicle’s operation, receiving and requesting information from the vehicle when required, and assessing if the vehicle needs to be dispatched elsewhere (Tasks 1, 2, and 4). The redescription of these tasks (in particular, Tasks 1 and 2) highlights the importance of the reliability and safety of the communication channels (e.g. video, sensors, and alarms) between the FOC safety operators and the ADS vehicle.

This critical aspect is further indicated by the key control actions and feedback that, when disrupted, may lead to a hazard scenario. An error in these control actions may be expressed as Not Provided or Incorrectly Provided (wrong action), Incorrect Time/Sequence (too early or too late), or Incorrect Duration (too long or too short), depending on the type of action (see Table 16). For instance, the quality and completeness of the information recorded by the vehicle are also of great importance in providing the FOC operator with the necessary tools to detect that a DDT fallback is required (F2b and F5b – errors arise through Not Provided or Incorrect Time/Sequence). Note that this depends not only on the reliability of the communication network but also on the design of the FOC human-system interface (HSI) (II-1F, Appendix Table 4). While the failure of the communication channels (II-1B, Appendix Table 4) can occur unexpectedly during operation, it may be the result of imperfect inspection and maintenance procedures (BE2: MOC-related human errors) or of an improper frequency of maintenance activities (BE6: procedure design error) – both expressed through control actions Not Provided or Incorrectly Provided.
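The four error types and the reading rule given for Table 16 can be captured in a small lookup structure. The sketch below encodes three rows of Table 16 as an example; the `ErrorType` enumeration, its member names, and the sentence produced by `read_row` are illustrative choices rather than notation from the methodology.

```python
from enum import Enum


class ErrorType(Enum):
    NOT_PROVIDED = "Not Provided"
    INCORRECTLY_PROVIDED = "Incorrectly Provided"        # wrong action
    INCORRECT_TIME_SEQUENCE = "Incorrect Time/Sequence"  # too early or too late
    INCORRECT_DURATION = "Incorrect Duration"            # too long or too short


E = ErrorType

# Three rows transcribed from Table 16: (risk contributor, failure mode, causal factors).
FAILURE_MODES = {
    "F1.1": ("FOC safety operator", "evaluate if the ODD is breached",
             [E.NOT_PROVIDED, E.INCORRECT_TIME_SEQUENCE]),
    "F1.5": ("FOC safety operator", "determine if external party asked for a stop",
             [E.NOT_PROVIDED, E.INCORRECTLY_PROVIDED]),
    "F3.1": ("FOC communication", "receive request from ADS",
             [E.NOT_PROVIDED, E.INCORRECT_TIME_SEQUENCE, E.INCORRECT_DURATION]),
}


def read_row(mode_id):
    """Render one table row as a sentence, following the table's reading rule."""
    contributor, action, errors = FAILURE_MODES[mode_id]
    causes = ", ".join(error.value for error in errors)
    return f"The {contributor} fails to {action}, potentially caused by: {causes}."


def modes_with(error):
    """List the failure modes for which a given error type is a causal factor."""
    return [m for m, (_, _, errors) in FAILURE_MODES.items() if error in errors]
```

For instance, `modes_with(ErrorType.INCORRECT_DURATION)` returns only the communication failure mode F3.1 among these three rows, consistent with duration errors applying to continuous actions such as maintaining a channel rather than to discrete evaluations.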

On the other hand, the ability of the FOC operator to detect potential threats to the vehicle’s operation may be significantly reduced if the ADS self-diagnostic module fails to detect an on-board failure (I-1B). Further, training, shift hours, and other factors can affect the operator’s situational awareness (II-1I, Appendix Table 4). Additionally, the pre-defined DDT-fallback procedures may not be adequate for the specific scenario that triggered the DDT fallback. This issue might be of importance during the initial period of operation of the fleet or given changing road conditions (e.g. construction zones, changing traffic signals). These procedures are of great importance, particularly when discussing hardware failures that the ADS self-diagnostic system cannot monitor without additional and failure-specific sensor systems (e.g. broken windshield or brake lights). Moreover, the ADS vehicle may not be capable of detecting every failure that affects its performance. The ADS vehicle manufacturer and the fleet operator are expected to establish which components or subsystems require more frequent inspection to avoid unexpected operational failures.

This process is repeated until Step 10 is completed, and a hazard catalog compiles the list of contributing failure modes in the form presented in Tables 16 and 17 for each hazard scenario derived. Applying the proposed hazard identification methodology to the entire system yielded 43 high-level hazards associated with over 900 contributing failure modes. This enables the analyst to trace contributing failure modes and risk contributors across multiple hazards occurring during the same or distinct operational phases. Linking the contributing failure modes to specific risk contributors can be used to derive a list of operational safety responsibilities for each agent.

Safety responsibilities and risk mitigation measures

The results from the hazard identification methodology can be used to further explore the agents responsible for preventing the contributing failure modes. As presented in Table 17, while many operational hazards may be related to ADS failures or errors (e.g. a faulty sensor triggering an MRC), the underlying causes can potentially be detected, identified, and addressed before operation (e.g. through pre-shift inspections). It must be emphasized that this process of identifying operational safety responsibility aims to improve existing safety barriers or countermeasures at different levels – remote operator, maintenance crew, fleet operator, or other organizations involved in operating the ADS-equipped vehicles. While the operational safety responsibilities highlighted in Table 18 (for example, following established procedures or completing related tasks) are linked to specific agents at a functional level, each activity is supported by the fleet operator.

Table 18.

Example of operational safety responsibilities from hazard scenario #2.2.1.

Failure mode Fails to/Fails to provide:		Agent responsible	Agent responsibility
F1.1	Evaluate if the ODD is breached.	FOC safety operator	Follow established operating procedure
F1.2	Determine if there is an ADS vehicle failure.
F1.3	Determine if a collision has occurred.
F1.5	Determine if external party asked for a stop.
F1.7	Monitor ADS vehicle operations
F2.1	Receive requests from passengers	FOC service operator
F2.2	Communicate with passengers
F2.3	Alert of passenger emergency stop request
F2.4	Alert DDT fallback is required
F3.1	Receive request from ADS.	FOC safety operator	Report anomalies during the operation
F3.2	Receive the outcome of DDT fallback implementation.		Report anomalies during the operation
F3.4	Confirm operational guidelines update		Follow established operating procedure
A2.1	Transmit outcome of self-diagnostic tests	MOC inspection crew	Verify functionality of ADS software during inspection
A2.2	Detect a vehicle communication channel failure
A2.3	Processed perception data for FOC operator supervision
A2.6	Determine if external party requested a stop
A2.5	Recorded diagnostic logs for FOC operator supervision	MOC maintenance crew	Review state of ADS software
A2.9	Enforce updated/correct ODD limits	MOC maintenance crew	Review state of ADS software
A3.2	Alert that DDT fallback is required	MOC inspection crew	Verify the functionality of ADS communication
A3.4	Transmit communication from passenger to FOC
A3.7	Transmit to FOC prescribed information
A3.8	Maintain stable communication with FOC
M3.1	Update operational procedures.	MOC coordinators	Implement updates from MOC external operations
M3.2	Confirm procedure update.	MOC coordinators	Implement updates from MOC external operations

This table is read as: The [Failure Mode] can be prevented or mitigated by the [Agent Responsible] through a high-level [Agent Responsibility] that is supported by the fleet operator.

Given the relationship between the fleet operator and the ADS developer assumed for this analysis, the role of the fleet operator is generally limited to complying with operational requirements specified by the ADS developer. For instance, the ADS developer is expected to establish certain requirements for inspection and maintenance procedures. However, the fleet operator may be considered responsible for developing internal operational procedures and for providing, or certifying that their operators and crew members receive, adequate training and tools to perform their tasks. Some operational procedures or tools may require input from the ADS developer, depending on the fleet operator’s access to the system’s hardware and software components.

The operational safety responsibilities addressing each contributing failure mode were categorized by the resources the fleet operator requires to support each agent in performing their tasks (Table 5). This analysis resulted in a total of 140 risk mitigation activities covering the 211 unique contributing failure modes. This list contains activities that impact the tasks and performance of multiple target agents and can be aggregated into 81 distinct activities spanning operator and crew training, operational procedure development, software and hardware tools, and workplace adequacy factors. An example of these activities related to the hazard scenario #2.2.1 detailed in the previous section is presented in Table 19. In this example, while the operational procedures embedded in the ADS vehicle software play a direct role in detecting DDT fallback triggers (ODD breaches or system failures), these mainly depend on the ADS developer. Nonetheless, there may be other safety barriers in place that the fleet operator can enforce, particularly related to the remote operators and maintenance crew members. As L4 ADS developers begin to deploy larger-scale operations, it is crucial for fleet operators and regulatory entities to determine which activities, procedures, and requirements must be considered to ensure operational safety. Some of the most relevant risk mitigation activities identified target all agents (ADS vehicle, FOC, MOC). They include management of change, training remote supervisors to monitor and intervene in vehicles’ operation, providing adequate working conditions for operators, enforcing vehicle connectivity and dispatching requirements, and coordinating internal incident mitigation activities.

Table 19.

Risk mitigation activity related to hazard scenario #2.2.1.

Activity type	Activity purpose	Target Agent	Organization Responsible
Operator & Crew Training	Recognize HSI and connectivity failures.	FOC Safety Operator	Fleet Operator
	Use HSI to monitor and intervene in the vehicle’s operation.	FOC Safety Operator	Fleet Operator/ADS Developer
	Enforce vehicle software and communication devices safety checklist.	MOC Inspection & Maintenance Crew	Fleet Operator/ADS Developer
Operational Procedures	Determine DDT-fallback triggers, goals, and strategies.	ADS Vehicle	ADS Developer
	Enforce ODD and local road restrictions.	ADS Vehicle	ADS Developer
	Enforce self-diagnostic capabilities (vehicle hardware, software, connectivity).	ADS Vehicle	ADS Developer
	Interact with first responders/law enforcement.	ADS Vehicle	ADS Developer
	Enforce vehicle dispatching requirements.	FOC Safety Operator	Fleet Operator
	Locate and manage vehicles exhibiting abnormal behavior.	FOC Safety Operator	Fleet Operator
	Coordinate training activities with ADS Developer.	MOC Coordinators	Fleet Operator/ADS Developer
	Implement specified maintenance activities content and frequency.	MOC Coordinators	Fleet Operator/ADS Developer
Hardware & Software Tools	Provide and maintain adequate HSI design to support agent tasks.	FOC Safety & Service Operator, MOC Coordinators, Inspection & Maintenance Crew	Fleet Operator
Working Conditions	Determine the adequate length of shifts.	FOC Safety & Service Operator, MOC Coordinators	Fleet Operator
	Provide emergency procedure handbooks/guidelines.	FOC Safety & Service Operator, MOC Coordinators	Fleet Operator/ADS Developer
	Provide adequate working conditions.	FOC Safety & Service Operator, MOC Coordinators, Inspection & Maintenance Crew	Fleet Operator
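
The traceability expressed in Table 19 can be illustrated with a short sketch. The Python fragment below transcribes a subset of the table rows and groups the activity purposes by responsible organization. The record layout and the names `Activity`, `by_organization`, and `shared_activities` are illustrative assumptions introduced here; they are not part of the methodology or of any tool used in the study.

```python
from collections import defaultdict
from typing import NamedTuple


class Activity(NamedTuple):
    """One risk mitigation activity (hypothetical record layout)."""
    activity_type: str
    purpose: str
    target_agents: tuple
    responsible: tuple


# A subset of the Table 19 rows, transcribed for illustration only.
ACTIVITIES = [
    Activity("Operator & Crew Training",
             "Recognize HSI and connectivity failures.",
             ("FOC Safety Operator",), ("Fleet Operator",)),
    Activity("Operator & Crew Training",
             "Use HSI to monitor and intervene in the vehicle's operation.",
             ("FOC Safety Operator",), ("Fleet Operator", "ADS Developer")),
    Activity("Operational Procedures",
             "Determine DDT-fallback triggers, goals, and strategies.",
             ("ADS Vehicle",), ("ADS Developer",)),
    Activity("Operational Procedures",
             "Enforce vehicle dispatching requirements.",
             ("FOC Safety Operator",), ("Fleet Operator",)),
    Activity("Working Conditions",
             "Determine the adequate length of shifts.",
             ("FOC Safety & Service Operator", "MOC Coordinators"),
             ("Fleet Operator",)),
]


def by_organization(activities):
    """Map each responsible organization to the purposes it is involved in."""
    index = defaultdict(list)
    for activity in activities:
        for organization in activity.responsible:
            index[organization].append(activity.purpose)
    return dict(index)


def shared_activities(activities):
    """Return the purposes that require more than one organization."""
    return [a.purpose for a in activities if len(a.responsible) > 1]


if __name__ == "__main__":
    for organization, purposes in by_organization(ACTIVITIES).items():
        print(organization, len(purposes))
    print("Shared:", shared_activities(ACTIVITIES))
```

A structure of this kind makes it straightforward to list the activities that call for coordination between the fleet operator and the ADS developer, which is the type of question the table is meant to support.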

Discussion

L4 ADS case study

The hazard scenarios identified in the example highlight the complexity of the interactions between the ADS vehicle, FOC operators, and MOC crew during operations. The methodology makes it possible to trace hazard scenarios and contributing failure modes to the risk mitigation activities the fleet operator requires to support the agents in performing their tasks.

The reliability and trustworthiness of ADS software and hardware play a central role in ensuring a safe trip for passengers and other road users. Yet, if the ADS fails to mitigate risks autonomously during operation, the FOC operators may need to function as a layered safety barrier. This implies additional design requirements for HSI, ADS component redundancy, safety alarm systems, and the extensive human factors involved in the remote operator’s functions. It may also lead fleet operators to further restrict the ODDs prescribed by the ADS developer, for instance, regarding the reliability, stability, and quality of wireless communication systems (thereby controlling an external factor that may lead to the hazard scenarios described in the previous section). Additionally, even though the participation of the MOC is not explicitly shown in the example provided, preventive and corrective maintenance policies are expected to be key to reducing vehicle malfunctions during operation. Some key insights are summarized as follows:

The ODD usually includes physical restrictions regarding geographical locations, road conditions, and extreme weather. Wireless communication restrictions may also need to be strictly enforced to support the tasks of the FOC safety and service operators. This is important for the MaaS aspect, as the ADS vehicles are expected to operate independently under reasonably low-risk conditions. Communication with passengers must be available in an emergency; hence, fleet operators may wish to restrict the ODD further than designed by the ADS developers. This aspect raises critical cybersecurity and wireless communication stability requirements that need to be pursued in depth.^88,89 To address communication requirements in MaaS applications, it may be beneficial to leverage existing research regarding Connected Automated Vehicles (CAV) applications.^90–92 Further, it may be useful to introduce the concept of a Fleet Operational Design Domain (FODD), determining the conditions under which the ADS vehicle may operate for MaaS.

The role of the FOC safety operator requires dedicated analysis, particularly regarding the practicality of remote interventions in the ADS vehicle’s operation to address automation failures. In many scenarios, there might not be enough time for the remote operator to receive and adequately react to the information delivered by the vehicle or passengers.^93,94 This time constraint on the FOC’s subsequent tasks is integrated into the ESD (Figure 8). The event “2-D1: There is sufficient time for FOC operators to react to DDT-fallback requirement” is included so that the remote operators’ expected performance is physically possible and attainable considering HSI design and network latency. The effect of time-related constraints on operational boundaries is a design factor that requires further study.^82,95 Much like the safety drivers currently involved in the testing and validation of ADS vehicles, the FOC safety operators should know the ODD limits of the vehicle and how to recognize a potential breach. Further, fleet operators and ADS developers may design HSI, stricter ODD requirements (or FODD), or alarm systems to support the role of the FOC safety and service operators in performing driving assistance teleoperation functions.²⁹

Fleet operators will have a significant role in ensuring MOC crew training and adequate procedures. Inadequate inspection and maintenance operations may fail to prevent or correct hardware and software malfunctions. It may be the role of the ADS developer to work with the fleet operator and establish guidelines for inspection, maintenance, and incident investigation. Fleet operators may modify prescribed procedures (monitoring and intervening in vehicle operation, pre-shift inspection, and/or maintenance policies) based on operational experience. Depending on the information asymmetry between the ADS developer and the fleet operator, different reporting and oversight policies might be needed to ensure the vehicle’s safe operation. The fleet operator’s limited knowledge of the ADS software and hardware specifications, requirements, and maintenance procedures may imply the need for more active participation of the ADS developer, in terms of both operational and regulatory compliance.
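
The time constraint behind event 2-D1, discussed in the insights above, can be sketched as a simple time-budget check. The fragment below is a minimal illustration under stated assumptions: the decomposition into four delay components, the class `ReactionBudget`, the function `sufficient_time`, and every numerical value are hypothetical and are not taken from the case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReactionBudget:
    """Hypothetical time components (in seconds) of a remote intervention."""
    uplink_latency_s: float     # vehicle-to-FOC transmission delay
    hsi_alert_delay_s: float    # time for the HSI to present the alert
    operator_response_s: float  # operator diagnosis, decision, and action
    downlink_latency_s: float   # FOC-to-vehicle command delay

    def required_time_s(self) -> float:
        """Total time needed for the intervention to take effect."""
        return (self.uplink_latency_s + self.hsi_alert_delay_s
                + self.operator_response_s + self.downlink_latency_s)


def sufficient_time(available_s: float, budget: ReactionBudget) -> bool:
    """True if the intervention fits within the time available."""
    return budget.required_time_s() <= available_s


if __name__ == "__main__":
    # Placeholder values chosen only to exercise the check.
    budget = ReactionBudget(0.2, 0.3, 4.0, 0.2)
    print(sufficient_time(10.0, budget))  # intervention fits
    print(sufficient_time(3.0, budget))   # intervention does not fit
```

In an actual analysis, each component would be a distribution rather than a point value, and the available time would depend on the scenario context modeled in the ESD.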

Hazard identification methodology

Hazard identification constitutes a crucial step that can provide insight to develop more comprehensive procedures and guidelines to ensure operational safety. The proposed hazard identification methodology combines modern tools with traditional approaches that multiple analysts may be more familiar with.

The use of ESD to model the operational sequences plays a central role in constructing scenario-based or contextual analysis at a high level. While agent functions and interactions are developed in depth through the CoTA and STPA methodologies, context is fundamental to identifying hazards correctly. The ESD also provides a risk-based approach in which hazards are not only identified but also associated with a consequence and frequency, allowing risk-based prioritization. A combination of unsafe actions may or may not lead to system-level hazards depending on the context. As modeling all the potential variations of scenarios in which the ADS vehicle operates (denoted as “World” in Figure 11) is not within the scope of the present analysis, establishing what can occur and who is responsible for it depending on the context (i.e. operational phase) provides a structured approach to exploring the space of potential hazards. Depending on the focus of the analysis, the specificity of the initiating events modeled may be chosen to challenge specific functions of any of the agents involved in the system’s operation. For instance, contextual aggravators may be added to the initiating event “The vehicle is on-route to destination,” yielding, for example, “The vehicle is on-route to destination under reduced visibility conditions.” The list of relevant contextual aggravators may be elicited from the functional breakdown of the system, incident reports, or naturalistic driving datasets, among other sources. While these aggravators are challenging to identify and model when analyzing hazards at fleet-level operations, a higher specificity is crucial when expanding toward risk quantification efforts.

The proposed methodology leverages the synergy of the STPA and CoTA approaches, both aimed at explaining how unexpected interactions and task dependencies may propagate system-level failures. Both approaches are success-driven, that is, the models are constructed based on expected system behavior. Hence, the hazard identification process consists of interrogating each function, action, or task to identify sources of error, failure, or unsafe interactions. As the CoTA models are derived from the scenario-based ESDs, while the STPA system-level hierarchical control diagram is obtained from the functional definition of the system, the two methods may lead to different contributing failure modes for the same hazards or even aid the identification of different hazard scenarios. However, while the identification of consequences is embedded into the development of STPA and FT models, CoTA relies on the ESD end states to express potential consequences of task failures. Likewise, while the tasks derived from the CoTA method are directly linked to specific scenarios by design (through the ESD), the contextual analysis of causal factors in STPA is derived after identifying the UCAs and is strongly dependent on how explicitly the feedback loops of the system are modeled. In contrast, the proposed methodology begins by developing ESDs, a method specifically intended for modeling dynamic scenarios, providing the foundation for the rest of the analysis. The deductive and inductive approaches to context of CoTA and STPA, respectively, act as complementary forces within the hazard identification methodology. Another relevant aspect of the ESD and CoTA model construction is the use of the IDA model to analyze the tasks and interactions of the ADS, FOC remote operators, and MOC crew. STPA has demonstrated its effectiveness in identifying software-related hazards and provides a means to assess the correctness of control actions regarding content or timing.
However, employing STPA to design safety countermeasures may prove difficult as it requires detailed knowledge of the system, which in turn increases the difficulty of conducting the UCA analysis effectively.^49,96 In the proposed methodology, the IDA cognitive model is employed to identify potential hazard scenarios and contributing failure modes for each of its stages. Hence, the actions of each agent (machine or human) are modeled at the level of diagnostics and situation assessment rather than only as controllers and feedback emitters.³⁶ The underlying IDA approach is employed to identify the resources needed to perform said actions, for example, how the information is transmitted and presented to the agent, what previous knowledge the agent requires to decide the appropriate action, and what mechanisms are needed to perform the actions adequately. The CoTA and FT models provide insights into the dependency between tasks and functions that do not depend on the accuracy of the modeled feedback loops. Further, FTA is the only failure-oriented method employed, providing additional depth by exploring the failure path of key ESD events. However, the benefits of employing FT methods (as well as the ESD) increase significantly when more quantitative data are available. Similarly, the benefits from conducting STPA-based hazard identification increase when more information about the nature of the system is available.

Regarding the methodology’s efficiency in identifying failure modes contributing to hazard scenarios, only 13% of these were covered by all three methods – STPA, CoTA, and FTs (Table 13). In the case study presented, CoTA was the method that provided the most unique failure modes (33%), followed by STPA (20%) and FTs (11%), while the rest were identified through at least two methods. This resulted in a total of 36% redundant contributing failure modes, that is, failure modes identified by more than one method. However, this result is highly dependent on the analysis and will probably vary depending on the nature of the system studied.
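
The coverage figures above can be checked with a few lines of arithmetic. The sketch below assumes that all reported percentages share the same base (the set of unique contributing failure modes) and ignores rounding; under those assumptions, it also derives the share identified by exactly two methods, a value that is not reported in the text. The function name `coverage_breakdown` is introduced here for illustration.

```python
def coverage_breakdown(unique_share: dict, all_three: int) -> dict:
    """Derive overlap shares (in percent) from the reported coverage figures.

    unique_share: percentage of failure modes found by one method only.
    all_three: percentage of failure modes found by all three methods.
    """
    unique_total = sum(unique_share.values())
    redundant = 100 - unique_total       # found by at least two methods
    exactly_two = redundant - all_three  # derived value, not reported
    return {
        "unique_total": unique_total,
        "redundant": redundant,
        "exactly_two": exactly_two,
    }


if __name__ == "__main__":
    # Percentages as reported for the case study.
    reported = {"CoTA": 33, "STPA": 20, "FT": 11}
    print(coverage_breakdown(reported, all_three=13))
```

With the reported values, the unique shares add up to 64%, which leaves the 36% of redundant failure modes stated above.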

Given the tools’ flexibility, the proposed methodology is highly scalable, and the level of detail may be modified iteratively as desired. The strengths and limitations of the methods used in this approach have been leveraged to analyze the agents’ interactions from different angles. In this regard, analyzing highly complex cyber-physical systems requires experts with diverse perspectives, including system reliability (software, hardware, and human) and cybersecurity. As presented in this work, the proposed methodology has the inherent flexibility to be fully expanded for qualitative analysis. It may also be employed for quantitative risk assessments, as the modeling stages of the agents’ functions and tasks may be reduced or compressed at a higher level to comply with the available data. For instance, while specific hardware or software failure rates for the perception sensor suite may not be available, data representing a combined “detection failure” may be the lowest-level FT basic event to be quantified. Alternatively, semi-quantitative approaches can be leveraged to prioritize the development of prevention or mitigation policies addressing critical safety hazards. In this case study, the methodology’s results focus on identifying operational safety responsibilities and providing input to develop risk mitigation measures. Common cause failure analysis plays a key role in highly interconnected complex systems. In this work, the analysis is conducted at a sub-system level, so the study of component-level redundancies and dependencies to determine common cause failures is not within the scope of the application and must be carried out in detail at a later stage. However, linking the hazard scenarios to risk contributors and risk mitigation measures enables analysts to trace potential causes of contributing failure modes, providing evidence, for instance, to consider common cause failures in the system.
For example, if a high number of ADS vehicle sensor failures can be traced back to inappropriate inspection and maintenance policies, this may prompt the fleet operator to update procedures, training, or the tools available to the crew.
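
The idea of quantifying a compressed FT with a combined “detection failure” basic event can be sketched as follows. This is a minimal illustration, not the fault trees developed in this work: the tree structure, the event names, and all probabilities are hypothetical, and the gate formulas assume statistically independent events, which is precisely the assumption that common cause failures would violate.

```python
import math
from typing import Iterable


def or_gate(probabilities: Iterable[float]) -> float:
    """Probability that at least one independent input event occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


def and_gate(probabilities: Iterable[float]) -> float:
    """Probability that all independent input events occur."""
    return math.prod(probabilities)


if __name__ == "__main__":
    # Hypothetical basic events; values are placeholders, not data.
    p_detection_failure = 1e-3        # combined perception-suite failure
    p_planning_failure = 5e-4         # another ADS function failure
    p_foc_intervention_fails = 5e-2   # remote operator barrier fails

    # The ADS fails to handle the situation if either function fails.
    p_ads_fails = or_gate([p_detection_failure, p_planning_failure])
    # The hazard is unmitigated only if the FOC barrier also fails.
    p_unmitigated = and_gate([p_ads_fails, p_foc_intervention_fails])
    print(f"{p_ads_fails:.6f} {p_unmitigated:.8f}")
```

If finer-grained sensor failure data later become available, the combined basic event can be replaced by its own sub-tree without changing the gates above it.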

Implementing the methodology presents technical challenges in terms of available resources, time, and knowledge of the system. All the methods employed require significant time and expertise to be developed in a way that adequately represents the system operation and the purpose of the analysts’ investigation. In this regard, implementing this methodology can become a highly resource-intensive process even at early system design stages. Further, some potential limitations of the methods remain to be addressed, for example, scenario explosion in ESDs, the analyst-dependent identification of parallel tasks and the re-description details of CoTA tasks,⁴⁷ difficulty evaluating safety countermeasures to identified UCAs,⁹⁶ or difficulty expressing event dependencies.³⁴ While this work aimed to apply the methodology for a fleet-level analysis, these models would need further refinement to focus on highly specific scenarios, that is, contextual aggravators. Likewise, the assessment of the risks each hazard scenario poses is out of the scope of the methodology, as it is highly system-dependent. In this regard, the methodology and its application presented in this work are more suited for a system-wide exploration of hazards conducted by an organization responsible for operating or developing operational guidelines for a system rather than an in-depth component-level analysis for system designers.

Applying the proposed methodology led to identifying hazards and contributing failure modes not sufficiently explored in operating L4 ADS fleets for MaaS. By including human and organizational aspects in the system’s overall performance, this methodology highlights hazards from interactions not usually considered in functional safety studies, for example, between the fleet operator’s remote operators and crew members and the ADS vehicle. This approach and the main findings of this work have been validated with stakeholders during the development of a larger project exploring potential risk-reducing strategies for L4 ADS fleet operators. Further validation of the hazard identification methodology can be achieved through comparison with other methods, such as simulation-based, dynamic risk assessment methods, or driving simulator-based observational studies. As the testing and deployment of L4 ADS fleets become more widespread, so do the opportunities to conduct validation activities through expert opinion and data collected during operation. In particular, future efforts may focus on the role of external events, including actions of other road users.

Effective hazard identification plays a key role in designing safe systems and operations, as well as providing a robust basis for future risk quantification. When analyzing complex systems where multiple human and machine agents interact to achieve common goals, the number of risk scenarios to be analyzed can grow without bound with the complexity of the environment in which they operate. Identifying the safety hazards through a structured methodology is the first step to developing adequate safety barriers from a functional and procedure-based perspective. These barriers, in the form of operational procedures or system design, are needed to address human and organizational aspects still present in autonomous system operations. The hazard identification methodology proposed in this work addresses these key issues. This methodology provides a scalable and flexible approach to establish connections between hazard scenarios, agents’ tasks, and the corresponding operational safety responsibilities to contain, prevent, and mitigate the identified hazards. Further work focusing on quantifying event likelihoods, the severity of potential consequences, and the resources required to implement safety barriers can lead to assessing the effectiveness of the identified risk reduction activities.

Conclusions

This work presents a structured and layered hazard identification methodology designed to address the intricate dependencies and interactions between different elements in complex systems. The methodology is applied in the context of L4 ADS fleet operations for MaaS, where fleet operators manage and deploy these vehicles for passenger transport. It comprises three distinct stages (system modeling, scenario modeling, and hazard identification) for identifying and characterizing hazard scenarios. The methodology thoroughly analyzes the system’s operational landscape by employing a combination of traditional and modern hazard identification and modeling tools such as ESD, CoTA, STPA, and FT. It also offers flexibility to pursue risk quantification efforts and explore risk mitigation measures to address the identified hazards.

The implementation of each stage and the utilization of these models are described in detail, and a case study focused on the operation of an L4 ADS-equipped vehicle fleet is presented. This work considers the crucial role of the fleet operator, who is responsible for two agents: the remote operators at the FOC supervising the vehicle and the MOC crew dedicated to inspection and maintenance activities. By exploring the success and failure of pivotal ESD events through CoTA and FT, respectively, the methodology investigates the contributing factors leading to hazards occurring during the vehicle’s operation. Combined with the system-level STPA, this method establishes a clear link between operational scenarios, risk contributors, and safety-related agents’ tasks. This systematic approach enables a comprehensive characterization of the hazard scenarios and their contributing factors, providing insights for developing safety barriers through operational guidelines or procedures. An example is provided to display the use of the methodology, focusing on the hazard scenario “FOC does not detect a DDT fallback is required.” This hazard scenario encapsulates multiple contributing factors, including the design of monitoring, intervention, and maintenance procedures, and other human-related factors that should be further explored in future work.

The discussion and results presented in this paper are part of a larger project that aims to identify the safety responsibilities and risk mitigation activities of fleet operators of L4 ADS fleets operating as MaaS. The proposed methodology provides a structured approach to identify the main safety hazards that may arise during operation. Moreover, it provides the traceability required to derive operational safety responsibilities to address said hazards. A key remaining challenge is to fully leverage the proposed methodology’s strengths for quantifying risks. This stems primarily from the limited access to operational data of L4 ADS vehicles deployed as MaaS. However, the methodology can also be oriented toward deriving data collection activities to populate the developed models, paving the way for further research. Ultimately, the hazard identification methodology presented in this work is a compelling approach for analyzing and addressing risks in complex systems, particularly in the domain of L4 ADS fleet operations for MaaS.

Supplemental Material

Supplemental material, sj-docx-1-pio-10.1177_1748006X241233863 for Operational safety hazard identification methodology for automated driving systems fleets by Camila Correa-Jullian, Marilia Ramos, Ali Mosleh and Jiaqi Ma in Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by U.S. Department of Transportation National Highway Traffic Safety Administration [grant number 693JJ921D000018]. The work presented in this paper remains the sole responsibility of the authors.

ORCID iDs

Camila Correa-Jullian

Marilia Ramos

Ali Mosleh

Jiaqi Ma

Supplemental material

Supplemental material for this article is available online.

References

1. Ramos, Thieme, Utne, et al. Autonomous systems safety – state of the art and challenges. Proceedings of the First International Workshop on Autonomous Systems Safety, Trondheim, Norway, 2019.

2. Hyland, Mahmassani. Operational benefits and challenges of shared-ride automated mobility-on-demand services. Transp Res Part A Policy Pract 2020; 134: 251–270.

3. Bocca, Baek. Automated driving systems: key advantages, limitations and risks. In: 2019 AEIT international conference of electrical and electronic technologies for automotive (AEIT AUTOMOTIVE), Turin, Italy, 2–4 July 2019, pp.1–6. New York: IEEE.

4. Perumal, Sujasree, Chavhan, et al. An insight into crash avoidance and overtaking advice systems for autonomous vehicles: a review, challenges and solutions. Eng Appl Artif Intell 2021; 104: 104406.

5. SAE International. Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles. SAE Standard J3016_202104.

6. Wong, Hensher, Mulley. Mobility as a service (MaaS): charting a future context. Transp Res Part A Policy Pract 2020; 131: 5–19.

7. Narayanan, Chaniotakis, Antoniou. Shared autonomous vehicle services: a comprehensive review. Transp Res Part C Emerg Technol 2020; 111: 255–293.

8. National Highway Traffic Safety Administration. Automated driving system 2.0: a vision for safety, https://www.nhtsa.gov/vehicle-manufacturers/automated-driving-systems (2017, accessed 13 March 2023).

9. Thorn, Kimmel, Chaka. A framework for automated driving system testable cases and scenarios, www.ntis.gov (2018, accessed 27 July 2022).

10. Biever, Angell, Seaman. Automated driving system collisions: early lessons. Hum Factors 2020; 62: 249–259.

11. Shetty, Tavafoghi, Kurzhanskiy, et al. Automated vehicle safety and deployment: lessons from human crashes. In: ICCVE 2022 – IEEE international conference on connected vehicles and expo, Lakeland, FL, 7–9 March 2022. New York: IEEE. DOI: 10.1109/ICCVE52871.2022.9742994.

12. Gyllenhammar, Brännström, Johansson, et al. Minimal risk condition for safety assurance of automated driving systems. In: CARS 2021 6th international workshop on critical automotive applications: robustness & safety, http://ri.diva-portal.org/smash/get/diva2:1625422/FULLTEXT01.pdf (2021).

13. Stolte, Ackermann, Graubohm, et al. Taxonomy to unify fault tolerance regimes for automotive systems: defining fail-operational, fail-degraded, and fail-safe. IEEE Trans Intell Vehicles 2022; 7: 251–262.

14. Lee, Nayeer, Garcia, et al. Identifying the operational design domain for an automated driving system through assessed risk. In: 2020 IEEE intelligent vehicles symposium (IV), Las Vegas, NV, 19 October–13 November 2020, pp.1317–1322. New York: IEEE.

15. Merriman, Plant, Revell KMA, et al. What can we learn from automated vehicle collisions? A deductive thematic analysis of five automated vehicle collisions. Saf Sci 2021; 141: 105320.

16. National Highway Traffic Safety Administration. Summary Report: Standing General Order on Crash Reporting for Automated Driving Systems. US Dep Transp Summ Rep DOT HS 813 324, 2022, pp. 1–9.

17. Office of Defects Investigation, National Highway Traffic Safety Administration. INOA – opening resume approved (NHTSA Action Number: PE23018), https://static.nhtsa.gov/odi/inv/2023/INOA-PE23018-11587.pdf (2023).

18. Erhardt, Mucci, Cooper, et al. Do transportation network companies increase or decrease transit ridership? Empirical evidence from San Francisco. Transportation 2022; 49: 313–342.

19. Martinho, Herber, Kroesen, et al. Ethical issues in focus by the autonomous vehicles industry. Transp Rev 2021; 41: 556–577.

20. Moody, Bailey, Zhao. Public perceptions of autonomous vehicle safety: an international comparison. Saf Sci 2020; 121: 634–650.

21. Bennett, Challinor, Modesto, et al. Attribution of blame of crash causation across varying levels of vehicle automation. Saf Sci 2020; 132: 104968.

22. AVSC00006202103. AVSC best practice for metrics and methods for assessing safety performance of automated driving systems (ADS).

23. Sohrabi, Khodadadi, Mousavi, et al. Quantifying the automated vehicle safety performance: a scoping review of the literature, evaluation of methods, and directions for future research. Accid Anal Prev 2021; 152: 106003.

24. Khastgir, Brewerton, Thomas, et al. Systems approach to creating test scenarios for automated driving systems. Reliab Eng Syst Saf 2021; 215: 107610.

25. Zhao, Salako, Strigini, et al. Assessing safety-critical systems from operational testing: a study on autonomous vehicles. Inf Softw Tech 2020; 128: 106393.

26. Stadler, Montanari, Baron, et al. A credibility assessment approach for scenario-based virtual testing of automated driving functions. IEEE Open J Intell Transp Syst 2022; 3: 45–60.

27. Ramos, Moura, Lins, et al. The use of game theory for autonomous systems safety: an overview. In: Proceedings of the 31st European safety and reliability conference, Research Publishing, Singapore, 2021, pp.2494–2501.

28. Han, et al. Operational safety of automated and human driving in mixed traffic environments: a perspective of car-following behavior. Proc IMechE, Part O: J Risk and Reliability 2023; 237: 355–366. DOI: 10.1177/1748006X211050696.

29. Mutzenich, Durant, Helman, et al. Updating our understanding of situation awareness in relation to remote operators of autonomous vehicles. Cogn Res Princ Implic 2021; 6: 9.

30. Ramos, Mosleh. Human role in failure of autonomous systems: a human reliability perspective. In: Proceedings – annual reliability and maintainability symposium, Orlando, FL, 24–27 May 2021. New York: Institute of Electrical and Electronics Engineers Inc., 2021. DOI: 10.1109/RAMS48097.2021.9605790.

31. Moura, Beer, Patelli, et al. Learning from accidents: interactions between human factors, technology and organisations as a central element to validate risk studies. Saf Sci 2017; 99: 196–214.

32. Chang YHJ, Mosleh. Cognitive modeling and dynamic probabilistic simulation of operating crew response to complex system accidents. Part 1: overview of the IDAC model. Reliab Eng Syst Saf 2007; 92: 1076–1101.

33. International Organization for Standardization. ISO 26262:2018, Road vehicles – functional safety.

34. Kramer, Neurohr, Büker, et al. Identification and quantification of hazardous scenarios for automated driving. In: Zeller, Höfig (eds) Model-Based Safety and Assessment, IMBSA, Lecture Notes in Computer Science, Vol. 12297, 2020, Springer, Cham. https://doi.org/10.1007/978-3-030-58920-2_11

35. Swaminathan, Smidts. The event sequence diagram framework for dynamic probabilistic risk assessment. Reliab Eng Syst Saf 1999; 63: 73–90.

36. Ramos, Thieme, Utne, et al. A generic approach to analysing failures in human–system interaction in autonomy. Saf Sci 2020; 129: 104808.

37. Leveson, Thomas. STPA handbook, https://psas.scripts.mit.edu/home/get_file.php?name=STPA_handbook.pdf (2018).

38. Rejzek, Hilbes. Use of STPA as a diverse analysis method for optimization and design verification of digital instrumentation and control systems in nuclear power plants. Nucl Eng Des 2018; 331: 125–135.

39. Abdulkhaleq, Wagner, Leveson. A comprehensive safety engineering approach for software-intensive systems based on STPA. Procedia Eng 2015; 128: 2–11.

40. Yang, Utne, Sandøy, et al. A systems-theoretic approach to hazard identification of marine systems with dynamic autonomy. Ocean Eng 2020; 217: 107930.

41. Scarinci, Quilici, Ribeiro, et al. Requirement generation for highly integrated aircraft systems through STPA: an application. J Aerosp Inf Syst 2019; 16: 9–21.

42. Ventikos, Chmurski, Louzis. A systems-based application for autonomous vessels safety: hazard identification as a function of increasing autonomy levels. Saf Sci 2020; 131: 104919.

43. Chaal, Bahootoroody, Basnet, et al. Towards system-theoretic risk assessment for future ships: a framework for selecting risk control options. Ocean Eng 2022; 259: 111797.

44. Chaal, Valdez Banda, Glomsrud, et al. A framework to model the STPA hierarchical control structure of an autonomous ship. Saf Sci 2020; 132: 104939.

45. Bensaci, Zennir, Pomorski. A comparative study of STPA hierarchical structures in risk analysis: the case of a complex multi-robot mobile system. In: 2018 2nd European conference on electrical engineering and computer science (EECS), Bern, 20–22 December 2018, pp.400–405. New York: IEEE.

46. Kolln, Klicker, Schmidt. Comparison of hazard analysis methods with regard to the series development of autonomous vehicles. In: 2019 IEEE intelligent transportation systems conference (ITSC), Auckland, New Zealand, 27–30 October 2019, pp.2969–2975. New York: IEEE.

47. Ramos, Thieme, Utne, et al. Human-system concurrent task analysis for maritime autonomous surface ship operation and safety. Reliab Eng Syst Saf 2020; 195: 106697.

48. Ericson CA. Hazard Analysis Techniques for System Safety. Wiley, 2005. DOI: 10.1002/0471739421.

49. Sun, Zio. Comparison of the HAZOP, FMEA, FRAM, and STPA methods for the hazard analysis of automatic emergency brake systems. ASME J Risk Uncertainty Part B 2022; 8(3): 031104. https://doi.org/10.1115/1.4051940

50. Thieme, Rokseth, Utne. Risk-informed control systems for improved operational performance and decision-making. Proc IMechE, Part O: J Risk and Reliability 2023; 237: 332–354. DOI: 10.1177/1748006X211043657.

51.

Chen

Jiao

Zhao

. A novel hazard analysis and risk assessment approach for road vehicle functional safety through integrating STPA with FMEA. Appl Sci 2020; 10: 7400.

52.

Hirata

Nadjm-Tehrani

. Combining GSN and STPA for safety arguments. In: Romanovsky

Troubitsyna

Gashi

Schoitsch

Bitsch

, (eds) Computer Safety, Reliability, and Security. SAFECOMP 2019. Lecture Notes in Computer Science(), vol 11699. Springer, Cham. https://doi.org/10.1007/978-3-030-26250-1_1

53.

Yang

Utne

. Towards an online risk model for autonomous marine systems (AMS). Ocean Eng 2022; 251: 111100.

54.

Johansen

Utne

. Supervisory risk control of autonomous surface ships. Ocean Eng 2022; 251: 111045.

55.

Basnet

BahooToroody

Chaal

, et al. Risk analysis methodology using STPA-based Bayesian network- applied to remote pilotage operation. Ocean Eng 2023; 270.

56.

Wang

Araújo

. Reliability assessment of autonomous vehicles based on the safety control structure. Proc IMechE, Part O: J Risk and Reliability 2023; 237: 389–404. DOI: 10.1177/1748006X211069705

57.

Bao

Shorthill

Zhang

. Hazard analysis for identifying common cause failures of digital safety systems using a redundancy-guided systems-theoretic approach. Ann Nucl Energy 2020; 148: 107686.

58.

Weglian

Riley

Gibson

. Integrating fault tree analysis with system theoretic process analysis. In: 2023 Annual reliability and maintainability symposium (RAMS)Orlando, FL, 23–26 January 2023. New York: IEEE. DOI: 10.1109/RAMS51473.2023.10088187.

59.

Jung

Heo

Yoo

. A formal approach to support the identification of unsafe control actions of STPA for nuclear protection systems. Nucl Eng Technol 2022; 54: 1635–1643.

60.

Sokukcu

Sakar

. Risk analysis of collision accidents during underway STS berthing Maneuver through integrating fault tree analysis (FTA) into Bayesian network (BN). Appl Ocean Res 2022; 126: 103290.

61.

Mamdikar

Kumar

Singh

. Dynamic reliability analysis framework using fault tree and dynamic Bayesian network: a case study of NPP. Nucl Eng Technol 2022; 54: 1213–1220.

62.

Groth

Wang

Mosleh

. Hybrid causal methodology and software platform for probabilistic risk assessment and safety monitoring of socio-technical systems. Reliab Eng Syst Saf 2010; 95: 1276–1285.

63.

Thomas

Groth

. Toward a hybrid causal framework for autonomous vehicle safety analysis. Proc IMechE, Part O: J Risk and Reliability 2023; 237: 367–388.

64.

Ekanem

Mosleh

Shen

. Phoenix – a model-based human reliability analysis methodology: qualitative analysis procedure. Reliab Eng Syst Saf 2016; 145: 301–315.

65.

Hong

Shao

Guo

, et al. Dynamic Bayesian network risk probability evolution for third-party damage of natural gas pipelines. Appl Energy 2023; 333: 120620.

66.

Thieme

Mosleh

Utne

, et al. Incorporating software failure in risk analysis – Part 1: Software functional failure mode classification. Reliab Eng Syst Saf 2020; 197: 106803.

67.

Thieme

Mosleh

Utne

, et al. Incorporating software failure in risk analysis—part 2: risk modeling process and case study. Reliab Eng Syst Saf 2020; 198: 106804.

68.

Abilio Ramos

Utne

Mosleh

. Collision avoidance on maritime autonomous surface ships: operators’ tasks and human failure events. Saf Sci 2019; 116: 33–44.

69.

Zio

. The future of risk assessment. Reliab Eng Syst Saf 2018; 177: 176–190.

70.

Modarres

. Risk analysis in engineering: techniques, tools, and trends. Boca Raton: CRC Press, 2016. doi: 10.1201/b21429.

71.

Mosleh

. Pra: A perspective on strengths, current limitations, and possible improvements. Nucl Eng Technol 2014; 46(1): 1–10.

72.

Kaplan

Garrick

. On the quantitative definition of risk. Risk Anal 1981; 1: 11–27.

73.

Wall

. The Kaplan and Garrick definition of risk and its application to managerial decision problems. Monterey, California: Naval Postgraduate School. https://hdl.handle.net/10945/32571 (2011).

74.

Leveson

Fleming

Spencer

, et al. Safety assessment of complex, software-intensive systems. SAE Int J Aerosp 2012; 5: 233–244.

75.

Vesely

W. E.

Goldberg

F. F.

Roberts

N. H.

, et al. Fault Tree Handbook. Livermore, CA (United States), Jan. 1981. doi: 10.2172/4169124.

76.

Patriarca

Chatzimichailidou

Karanikas

, et al. The past and present of system-theoretic accident model and processes (STAMP) and its associated techniques: a scoping review. Saf Sci 2022; 146: 105566.

77.

Pacevicius

Ramos

Roverso

, et al. Managing heterogeneous datasets for dynamic risk analysis of large-scale infrastructures. Energies 2022; 15: 3161.

78.

Thieme

Ramos

Holte

, et al. New design solutions and procedures for ensuring meaningful human control and interaction with autonomy: automated ferries in profile. In: Johansson

Fernández

Dalaklis

Pastra

Skinner

(eds) Studies in national governance and emerging technologies. Cham: Palgrave Macmillan, 2023, pp.213–242.

79.

Khastgir

Sivencrona

Dhadyalla

, et al. Introducing ASIL inspired dynamic tactical safety decision framework for automated vehicles. In: 2017 IEEE 20th international conference on intelligent transportation systems (ITSC), Yokohama, Japan, 16–19 October 2017, pp.1–6. New York: IEEE.

80.

de Gelder

Elrofai

Saberi

, et al. Risk quantification for automated driving systems in real-world driving scenarios. IEEE Access 2021; 9: 168953–168970.

81.

Chaka

Blanco

Stowe

, et al. FMVSS considerations for vehicles with automated driving systems: volume 2. 2021; 1: 630p. [Online]. Available: https://rosap.ntl.bts.gov/view/dot/54442

82.

Tener

Lanir

. Driving from a distance: challenges and guidelines for autonomous vehicle teleoperation interfaces. In: Conference on human factors in computing systems – proceedings, pp. 1–13. New York, NY: ACM.

83.

Correa-Jullian

McCullough

Ramos

, et al. Modeling fleet operations of autonomous driving systems in mobility as a service for safety risk analysis. In: Leva

Patelli

Podofillini

, et al. (eds) 32nd European safety and reliability conference (ESREL 2022). Singapore: Research Publishing Services, 2022. DOI: 10.3850/978-981-18-5183-4_J03-06-566-cd.

84.

Correa-Jullian

McCullough

Ramos

, et al. Safety hazard identification for autonomous driving systems fleet operations in mobility as a service. Presented at Probabilistic Safety Assessment and Management (PSAM 16), 2022.

85.

Ramos

Correa Jullian

McCullough

, et al. Automated driving systems operating as mobility as a service: operational risks and SAE J3016 standard. In: 2023 Annual reliability and maintainability symposium (RAMS), Orlando, FL, 23–26 January 2023, pp.1–6. New York: IEEE.

86.

Polydoropoulou

Pagoni

Tsirimpa

. Ready for mobility as a service? Insights from stakeholders and end-users. Travel Behav Soc 2020; 21: 295–306.

87.

Butler

Yigitcanlar

Paz

. Barriers and risks of Mobility-as-a-Service (MaaS) adoption in cities: a systematic review of the literature. Cities 2021; 109: 103036.

88.

Bouali

Pinola

Karyotis

, et al. 5G for vehicular use cases: analysis of technical requirements, value repositions and Outlook. IEEE Open J Intell Transp Syst 2021; 2: 73–96.

89.

Tanshi

Soffker

. Determination of takeover time budget based on analysis of driver behavior. IEEE Open J Intell Transp Syst 2022; 3: 813–824.

90.

Leslie

Ghiasi

, et al. Empirical analysis of a freeway bundled connected-and-automated vehicle application using experimental data. J Transp Eng A Syst 2020; 146: 04020034.

91.

Raboy

Leslie

, et al. A proof-of-concept field experiment on cooperative lane change Maneuvers using a prototype connected automated vehicle testing platform. J Intell Transp Syst Technol Plan Oper 2021; 25: 77–92.

92.

Hasan

Girs

Uhlemann

. Characterization of transient communication outages into states to enable autonomous fault tolerance in vehicle platooning. IEEE Open J Intell Transp Syst 2023; 4: 101–129.

93.

Roe

Schulman

. A reliability & risk framework for the assessment and management of system risks in critical infrastructures with central control rooms. Saf Sci 2018; 110: 80–88.

94.

Utne

Schjølberg

Roe

. High reliability management and control operator risks in autonomous marine systems and operations. Ocean Eng 2019; 171: 399–416.

95.

Mutzenich

Durant

Helman

, et al. Situation awareness in remote operators of autonomous vehicles: developing a taxonomy of situation awareness in video-relays of driving scenes. Front Psychol 2021; 12: 727500.

96.

Gunaratnam

Abdullah

Bakar

. Hazard analysis techniques, methods and approaches: a review. Int J Adv Res Eng Innov 2022; 4: 23–34.

