Abstract
This study examines how the complexity of multi-vehicle operations scales due to the cost of coordinating multiple remotely piloted aircraft (RPA) between multiple operators. We conducted a desktop simulation study to investigate how task load (2 RPA vs. 4 RPA per person), crew size (2-person vs. 3-person), and work design (functional vs. structural) affect crew performance in multi-vehicle operations. Participants conducted search and rescue missions in crews using multiple RPA. We measured self-reported workload, number of images classified, and frequency of conflicts between RPA. Data were analyzed using Bayesian hierarchical models. Results suggest that coordination costs increase with task load and crew size, and that these costs stem from the interdependencies that each work design creates between crew members.
Keywords
Introduction
Remotely piloted aircraft systems (RPAS) are an emerging technology and have many applications within civil aviation, including search and rescue, agriculture, mining, energy, and transport (International Civil Aviation Organization [ICAO], 2011). Remote operations, particularly in safety critical systems, provide numerous advantages where human operators can perform their tasks away from hazardous environments. Work can also be conducted more efficiently, flexibly, and cost effectively compared to crewed operations (International Civil Aviation Organization [ICAO], 2011).
Currently, the growth of commercial RPAS operations is limited by the cost of personnel associated with single vehicle operations. Such operations involve one or more operators exercising supervisory command and control over a single RPA. One option to improve the commercial viability of RPAS operations is to employ multi-vehicle (m:N) operations. This refers to a ratio of multiple operators (m) exercising simultaneous supervisory command and control over multiple vehicles (N). However, although advances in RPAS control systems have made multi-vehicle operations possible, there is a relatively limited body of research that evaluates their safety and effectiveness (e.g., Calhoun et al., 2016; Cummings et al., 2007; Fincannon et al., 2011). The current study contributes to this literature by investigating how the complexity of multi-vehicle operations scales as a function of the number of vehicles being controlled, the number of personnel in a crew, and the allocation of work between personnel.
Command Complexity
The research that has been done on multi-vehicle operations suggests that operations can become increasingly complex as the number of vehicles under control and the number of personnel in a crew increase (Lewis et al., 2011). Lewis et al. (2011) introduced the concept of command complexity to understand this problem. They argue that the complexity of multi-vehicle operations scales depending on the degree of interdependency between tasks. In highly interdependent systems, moving from single vehicle to multi-vehicle operations can create cascading work due to the cost of coordinating, planning, and synchronizing interdependent tasks being performed simultaneously by multiple vehicles (Lewis et al., 2011).
The current operational solution to managing workload increase is to increase the size of the crew. The assumption is that tasks can be spread between more personnel which keeps individual workload within acceptable limits. However, this solution comes with a cost of additional coordination load. When crew size increases and tasks are split between additional personnel, information needs to be shared between more people to maintain a common understanding of the operational situation. Currently, there is limited research on this workload and coordination load trade-off for crews conducting multi-vehicle operations. There is a need to understand how work scales when increasing the number of vehicles or size of a crew, particularly in highly interdependent systems.
Work Design
Work design is the process by which work, in the form of tasks, activities and responsibilities, is divided among workers (Parker et al., 2015). The work design of a crew may influence the scaling of command complexity in multi-vehicle operations. This is because different work designs create different interdependencies between roles, which can change the degree of coordination required between personnel. There has been limited research to date that directly investigates how work design affects the scaling of complexity, and its associated impacts to crew performance, in multi-vehicle operations.
Two work designs that can be used in multi-vehicle operations are the functional design and the structural design (Marshall et al., 2023). In a functional work design, crew members are allocated responsibilities grouped by function type. Each crew member performs tasks specific to their function for all vehicles. For example, flight managers perform flight management functions and payload managers perform payload management functions for all vehicles. Coordination between personnel in a functional work design is required when vehicles perform sequences of tasks that involve different functions. In a structural work design, crew members are allocated responsibilities grouped by vehicle. Each crew member performs tasks for all functions, such as flight management and payload management, for their allocated vehicles. Coordination between personnel in a structural work design is required when tasks performed by one vehicle affect the tasks of other vehicles.
Method
We conducted a desktop simulation study to investigate how task load, crew size, and work design affect crew performance in multi-vehicle RPAS operations.
Simulation Microworld
The simulation microworld was designed to represent multi-vehicle operations in search and rescue contexts (see Figure 1). Participants worked as a crew to search for missing people in a bushland area using multiple RPA, prioritizing the safety, efficiency, and effectiveness of the mission. The mission consisted of two phases: a departure and transit phase, followed by a search phase.

Figure 1. Simulation user interface.
Departure and Transit
Each RPA required an initial flight plan to be developed prior to departure. The flight plan specified the route that the RPA would take to transit from the aerodrome to the search area. All RPA were required to depart in a specified sequence, with each RPA leaving as soon as the previous RPA had cleared the runway. All RPA were required to adhere to the separation minima once airborne (1,000 feet vertically and 1,000 feet laterally). A conflict detection tool identified RPA that were predicted to breach the separation minima. Participants were instructed to prioritize resolving conflicts above all other tasks.
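The separation rule above can be illustrated with a minimal sketch. The paper does not describe how the conflict detection tool generated its predictions, so the code below only checks a pair of positions at a single instant. It assumes a flat local coordinate frame in feet, and it assumes that a breach occurs when the lateral and vertical minima are both violated at the same time. The `Position` class and the `breaches_separation` function are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass

# Separation minima stated in the scenario description.
LATERAL_MIN_FT = 1000.0
VERTICAL_MIN_FT = 1000.0


@dataclass(frozen=True)
class Position:
    """Hypothetical RPA position in a flat local frame (all units in feet)."""
    x_ft: float
    y_ft: float
    alt_ft: float


def breaches_separation(a: Position, b: Position) -> bool:
    """Return True when two RPA are closer than both minima at once.

    Assumption: separation is lost only when the lateral AND vertical
    distances are both below their minima; the source does not spell this out.
    """
    lateral_ft = math.hypot(a.x_ft - b.x_ft, a.y_ft - b.y_ft)
    vertical_ft = abs(a.alt_ft - b.alt_ft)
    return lateral_ft < LATERAL_MIN_FT and vertical_ft < VERTICAL_MIN_FT
```

Under these assumptions, two RPA at the same altitude and 500 feet apart laterally would be flagged, whereas two RPA directly above one another with exactly 1,000 feet of vertical spacing would not.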
Search
Participants were informed that image processing software analyzed the video feed from each RPA inside the search area and identified points of interest (POI) for further analysis. When a POI was identified, a sequence of tasks was performed. The location of the POI had to be recorded, and the RPA’s flight plan had to be modified to perform a circle back maneuver to acquire a high-resolution image. Once the high-resolution image was acquired, it had to be inspected to identify whether a person was present or absent in the image.
Design
The study used a 2 × 2 × 2 mixed design. We manipulated crew size (2-person vs. 3-person) between subjects. We manipulated task load (2 RPA vs. 4 RPA per person) and work design (functional vs. structural) within subjects. In the functional condition, work was allocated to crew members by function. Each participant performed their allocated function (flight management or payload management) for all RPA. In the structural condition, work was allocated to crew members by RPA. Each participant performed all functions (flight management and payload management) for their allocated RPA.
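As a quick check on the structure of the design, the sketch below enumerates the eight cells and the four cells that a crew of a given size would experience. The factor levels come from the text; the function names `design_cells` and `cells_for_crew` are hypothetical.

```python
from itertools import product

# Factor levels taken from the design description.
CREW_SIZES = (2, 3)                           # between-subjects
TASK_LOADS = (2, 4)                           # RPA per person, within-subjects
WORK_DESIGNS = ("functional", "structural")   # within-subjects


def design_cells():
    """Return every (crew_size, task_load, work_design) combination."""
    return list(product(CREW_SIZES, TASK_LOADS, WORK_DESIGNS))


def cells_for_crew(crew_size):
    """Return the within-subjects cells experienced by one crew."""
    return [cell for cell in design_cells() if cell[0] == crew_size]


if __name__ == "__main__":
    for cell in design_cells():
        print(cell)
```

Because crew size is the only between-subjects factor, each crew contributes data to four of the eight cells, which corresponds to the four testing blocks.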
Procedure
The procedure was based on Marshall et al. (2023). Participants completed simulation training and four 45-minute testing blocks over two consecutive days. Training involved part-task training of all simulation tasks (2 hours) and practice scenarios (1 hour for each work design). Part-task training included editing flight plans, departing RPA, creating search legs, performing search task sequences, managing conflicts, and crew communication. Practice scenarios included a training mission, a conflict management exercise, and a weather event. Throughout training, the mission priorities (safety, efficiency, and effectiveness) were emphasized, and participants were shown examples of how the priorities could be achieved in each task. Participants completed two testing blocks on each day. Work design was held constant within each day, and task load was varied. The order in which the work design and task load conditions were presented was counterbalanced.
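The block ordering constraints can be sketched as follows. The paper does not report the exact counterbalancing scheme, so this is only one way to enumerate candidate schedules: it assumes that the order of work designs across days and the order of task loads within each day were all free to vary. The function name `block_schedules` is hypothetical.

```python
from itertools import permutations, product

WORK_DESIGNS = ("functional", "structural")
TASK_LOADS = (2, 4)  # RPA per person


def block_schedules():
    """Enumerate four-block schedules with work design fixed within each day.

    Assumption: task load order may differ between day 1 and day 2; the
    source only states that the condition order was counterbalanced.
    """
    schedules = []
    for day1_design, day2_design in permutations(WORK_DESIGNS):
        load_orders = list(permutations(TASK_LOADS))
        for day1_loads, day2_loads in product(load_orders, repeat=2):
            day1 = [(day1_design, load) for load in day1_loads]
            day2 = [(day2_design, load) for load in day2_loads]
            schedules.append(day1 + day2)
    return schedules
```

Under these assumptions there are eight candidate schedules, and every schedule covers each combination of work design and task load exactly once.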
Measures
We measured self-reported workload, number of images classified, and frequency of conflicts between RPA. The workload and conflicts measures are indicators of safety, and image classification is an indicator of mission effectiveness. Self-reported workload was measured using the Air Traffic Workload Input Technique (Stein, 1985), presented at 5-minute intervals, yielding nine responses per participant in each block. Participants responded to a single item (“Please rate your workload”) on a scale from 1 (Low) to 10 (Extreme).
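The workload sampling arithmetic can be checked with a short sketch. It assumes that the first prompt appeared 5 minutes into a block and the last at the end of the 45-minute block, which is one schedule consistent with nine responses; the paper does not state the exact prompt times. The function names are hypothetical.

```python
BLOCK_MINUTES = 45
PROMPT_INTERVAL_MINUTES = 5
SCALE_MIN, SCALE_MAX = 1, 10  # 1 = Low, 10 = Extreme


def prompt_times(block_minutes=BLOCK_MINUTES, interval=PROMPT_INTERVAL_MINUTES):
    """Return assumed prompt times in minutes from the start of a block."""
    return list(range(interval, block_minutes + 1, interval))


def mean_block_workload(ratings):
    """Average the ratings from one block after checking the scale range."""
    for rating in ratings:
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating {rating} is outside the 1-10 scale")
    return sum(ratings) / len(ratings)
```

Note that the analyses reported in the Results treat individual ratings as ordinal responses; the block mean here is only a descriptive summary.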
Participants
We recruited 86 participants (mean age = 22.42 years, 59 female) through the School of Psychology Research Participation System. Participants were reimbursed $160 in gift card vouchers. The study received ethics approval (2021/HE000587). Due to technical issues, we excluded data from two 3-person crews and an additional eight blocks. The final data set included 32 crews (16 two-person crews and 16 three-person crews, 80 participants in total).
Results
We used Bayesian generalized linear mixed models to examine the effects of task load, crew size, and work design on the three outcome measures. The analyses were conducted using the brms package in R (Bürkner, 2017). For the workload analyses, we used a cumulative model with a probit link because the outcome is ordinal. For the image classification and conflicts analyses, we used a Poisson distribution because the outcomes are counts. An effect was interpreted as credible when the 95% credible interval (CI) of the posterior distribution did not include zero. The CI quantifies the uncertainty in the estimated parameter values based on the posterior distribution. The graphs show the estimated posterior means and 95% CIs.
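The decision rule for a credible effect can be sketched in a few lines. The analyses themselves were run in R with brms; the Python below is only an illustration of the rule, and it assumes an equal-tailed interval computed from posterior draws. The function names are hypothetical.

```python
def quantile(sorted_draws, q):
    """Linearly interpolated quantile of an already sorted sequence."""
    position = q * (len(sorted_draws) - 1)
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(sorted_draws) - 1)
    fraction = position - lower_index
    return (sorted_draws[lower_index] * (1 - fraction)
            + sorted_draws[upper_index] * fraction)


def equal_tailed_interval(draws, level=0.95):
    """Return (lower, upper) bounds that cut off equal posterior tails."""
    ordered = sorted(draws)
    tail = (1 - level) / 2
    return quantile(ordered, tail), quantile(ordered, 1 - tail)


def excludes_zero(lower, upper):
    """An effect is treated as credible when the interval excludes zero."""
    return lower > 0 or upper < 0
```

Applied to intervals reported in the Results, `excludes_zero(0.46, 0.80)` is true for the effect of task load on workload, whereas `excludes_zero(-0.13, 0.17)` is false for the effect of task load on images classified.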
Self-Reported Workload
Overall, participants reported greater workload when operating more vehicles (β = .63, 95% CI [0.46, 0.80]). While there was no credible effect of work design on overall workload, the effect of increased task load was stronger in the structural condition than in the functional condition (β = .50, 95% CI [0.26, 0.75]; see Figure 2). A follow-up analysis revealed that in the functional condition, increases in task load had a stronger effect on workload for flight managers (2 RPA: M = 5.21 vs. 4 RPA: M = 6.32) than for payload managers (2 RPA: M = 3.86 vs. 4 RPA: M = 4.18; β = −.49, 95% CI [−0.82, −0.15]).

Figure 2. Workload by task load and work design.
While there was no credible effect of crew size on overall workload, the effect of increased task load was stronger for 2-person crews (2 RPA: M = 4.31 vs. 4 RPA: M = 5.69) than for 3-person crews (2 RPA: M = 4.82 vs. 4 RPA: M = 5.83; β = −.23, 95% CI [−0.45, −0.01]).
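To show how a coefficient on the probit scale relates to the 1 to 10 rating scale, the sketch below implements the standard cumulative probit mapping, in which the probability of a rating at or below category k is Φ(τ_k − η). The thresholds τ are not reported in the paper, so any values passed to the function are hypothetical placeholders, and the resulting probabilities are illustrative only.

```python
from statistics import NormalDist


def category_probabilities(thresholds, eta):
    """Map a latent predictor eta to probabilities for each rating category.

    thresholds: increasing cut points (9 are needed for a 10-point scale).
    These are hypothetical here; the fitted values are not in the source.
    """
    phi = NormalDist().cdf
    cumulative = [phi(t - eta) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


def expected_rating(probabilities):
    """Expected rating when categories are numbered from 1."""
    return sum(k * p for k, p in enumerate(probabilities, start=1))
```

With evenly spaced placeholder thresholds such as `[-2 + 0.5 * i for i in range(9)]`, shifting η from 0 to .63 moves probability mass toward higher ratings, which is the direction of the task load effect reported above.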
Image Classification
Three-person crews (M = 30.06) classified more images than 2-person crews (M = 26.16; β = .26, 95% CI [0.03, 0.50]). Crews classified more images in the structural condition (M = 31.35) than in the functional condition (M = 24.87; β = .22, 95% CI [0.09, 0.36]). There was no credible effect of task load on images classified (2 RPA: M = 28.30 vs. 4 RPA: M = 27.92; β = .02, 95% CI [−0.13, 0.17]).
Conflicts
More conflicts were created when crews were operating more vehicles (β = .88, 95% CI [0.67, 1.09]), and in 3-person crews than 2-person crews (β = .84, 95% CI [0.62, 1.06]). While there was no credible effect of work design on the overall number of conflicts, the effect of task load on conflicts was stronger in the structural condition than in the functional condition (β = .51, 95% CI [0.22, 0.80]). Furthermore, this interaction was more pronounced in 3-person crews compared to 2-person crews (β = −.53, 95% CI [−0.87, −0.19]; see Figure 3).

Figure 3. Conflicts by task load, work design, and crew size.
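The coefficients for the two count outcomes can be read on a multiplicative scale, because a Poisson model in brms uses a log link by default. The sketch below converts a coefficient to a rate ratio. It assumes that the coefficient represents a simple difference on the log scale with the other predictors held at their reference levels; the paper does not report the predictor coding, so the conversion is indicative rather than exact. The function name `rate_ratio` is hypothetical.

```python
import math


def rate_ratio(beta):
    """Convert a log-scale Poisson coefficient to a multiplicative rate ratio."""
    return math.exp(beta)


def rate_ratio_interval(lower, upper):
    """Convert the bounds of a log-scale interval to the rate-ratio scale."""
    return math.exp(lower), math.exp(upper)


if __name__ == "__main__":
    # Task load effect on conflicts reported in the text.
    print(round(rate_ratio(0.88), 2), rate_ratio_interval(0.67, 1.09))
```

Under this assumption, a coefficient of .88 corresponds to roughly 2.4 times as many conflicts, and a coefficient near zero corresponds to a rate ratio near 1.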
Discussion
This experiment was an empirical study of the scaling of complexity in multi-vehicle operations. The findings suggest that there are coordination costs associated with increasing task load and crew size when conducting multi-vehicle operations. These costs affect workload and performance in different ways, depending on how the work is designed.
The cost of coordination in the functional condition is reflected in the lower number of images classified. In the functional condition, crew members were dependent on each other to complete search task sequences, which affected their ability to acquire and classify images. Payload managers’ task demands depended on flight managers’ ability to control multiple RPA to sweep the search area. As flight managers had a higher workload than payload managers, they became a bottleneck when controlling more RPA, limiting the number of images available for the payload manager to work on. In the structural condition, crew members were not dependent on each other to complete search sequences. Each participant was able to manage their own workflow, enabling them to complete search sequences more effectively, and therefore classify more images.
The cost of coordination in the structural condition, by contrast, is reflected in increased conflicts and higher workload. More crew coordination was required for managing conflicts in the structural condition, because all crew members were responsible for creating flight plans. The coordination cost increased as a function of task load and crew size. These variables increased both the number and complexity of potential conflicts that could be generated by a single action. In the functional condition, the coordination costs associated with conflict management were minimized, because only flight managers were responsible for creating flight plans for all RPA. This meant that they were less likely to create a conflict in the first place, and that when a conflict was created, it was easier to resolve, because it required coordination with fewer people than in the structural condition. This explanation is consistent with the finding that increases in task load had a greater impact on workload in the structural condition than in the functional condition.
There are two potential limitations to this study. First, the total number of RPA varies across conditions. For example, when task load is 2 RPA per person, 2-person crews have 4 RPA in total whereas 3-person crews have 6 RPA in total. This means we cannot disentangle the effect of the total number of RPA from the effects of task load and crew size. For example, the increased frequency of conflicts in the 3-person condition could be because there were more people in the crew (more people to coordinate flight plans with), or because there were more RPA in the operation (more flight plans to be coordinated). Future studies could compare combinations of crew size and task load while holding total RPA constant, for example 2-person crews controlling 3 RPA each vs. 3-person crews controlling 2 RPA each. When total RPA is held constant, increasing crew size not only increases coordination load (more people to coordinate with) but also decreases task load (total RPA are split between more people). Such a design would allow a comparison of whether the effect of increasing crew size is stronger than the effect of decreasing task load. In practical terms, this could indicate whether, in a particular system, the benefit of splitting the work between more people outweighs the cost of coordinating work in a larger crew.
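The confound and the proposed remedy can be made concrete with a small sketch that computes total RPA for each combination of crew size and per-person task load, then groups combinations that share the same total. The per-person load of 3 is included only because it appears in the future-study example; the function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def total_rpa(crew_size, rpa_per_person):
    """Total vehicles in the operation for one crew configuration."""
    return crew_size * rpa_per_person


def matched_configurations(crew_sizes, loads_per_person):
    """Group (crew_size, rpa_per_person) pairs that share the same total RPA.

    Only totals reached by more than one configuration are returned, since
    those are the ones that allow crew size to vary with total RPA fixed.
    """
    by_total = defaultdict(list)
    for crew_size, load in product(crew_sizes, loads_per_person):
        by_total[total_rpa(crew_size, load)].append((crew_size, load))
    return {total: pairs for total, pairs in by_total.items() if len(pairs) > 1}
```

In the current design, the four combinations of crew size and task load give totals of 4, 6, 8, and 12 RPA, so no two share a total. Calling `matched_configurations((2, 3), (2, 3, 4))` identifies a total of 6 RPA as the only matched comparison, pairing 2-person crews controlling 3 RPA each with 3-person crews controlling 2 RPA each.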
Second, it is possible that the use of novices limits the generalizability of the results. Experts would be expected to perform multi-vehicle search and rescue operations better than novices because they possess a higher degree of skill and knowledge in this domain. To address this, we designed the training so that novices could develop sufficient knowledge of, and expertise in, the microworld system within the time available. The fidelity of the simulation tasks and elements was designed to match the level of expertise that novices could develop with this training. This meant that novice participants could strategize, adapt, and make decisions based on their expertise in the microworld system, much as experts do in the field.
The findings from this study may inform how and when interdependencies in complex systems can be minimized to manage the scaling of work. In many cases, it may not be possible to eliminate interdependencies from multi-vehicle operations in complex systems. For example, in the current study, work designs that eliminate interdependencies in one phase of the mission can create interdependencies in another phase. However, it may be possible to minimize interdependencies through the design of work roles or the design of control systems. Flexible work roles could allow the allocation of responsibilities to change over different phases of a mission (Marshall et al., 2023), or the automation of highly interdependent tasks could reduce coordination load for operators.
Understanding how command complexity scales is critical to the safety and effectiveness of multi-vehicle operations. We need to consider the interdependencies within a system when moving from single-vehicle to multi-vehicle operations. The degree of interdependency affects how command complexity scales when increasing the number of vehicles under control and the size of the crew in an operation. Different work designs also create different interdependencies between roles, which means that there are tradeoffs when considering the allocation of work in multi-vehicle operations. The current study contributes to the literature on the safety and effectiveness of multi-vehicle operations and to building a safety case for the future regulation of multi-vehicle RPAS operations.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research project was funded by the Australian Research Council [ARC LP190100188], and by industry partners Boeing Research & Technology Australia, and Boeing Defence Australia.
