Abstract
We present quantitative metrics to assess the effectiveness of shepherding-based human-swarm guidance. Our swarms are flocks of sheep. Our human-swarm teaming arrangement contrasts a sheepdog and a human shepherd trained using traditional methods with a drone and a human pilot trained using computational models of abstracted mustering skills. We design two effectiveness attributes for mustering: one measures the level of influence a sheepdog/drone exerts on the sheep (swarm-guidance-controllability), and the other measures the level of anticipation that assists sheep to modulate their response (swarm-guidance-predictability). We collect field data where the sheep are mustered by a sheepdog in one setting and by a drone in another. We contrast the field data against data collected from simulation models. Our analysis reveals that while collective behaviours in simulations may appear similar to field data, the dynamics of influence differ vastly between the two. The findings are particularly important when assessing the suitability of contemporary swarm shepherding and collective behaviour models produced by the research community.
Introduction
If a robot acts like an ant, does it objectively experience what an ant does? While this question has long-standing roots in artificial life, its importance and relevance to this paper are twofold. First, when models are developed to understand a phenomenon, it is important to have objective metrics to validate whether these models have truly captured the dynamics of the phenomenon on a deeper level or not. Second, when models get transferred between contexts, these metrics need to be widely applicable to generalise across contexts.
Take, for example, sheep mustering. Mustering is not a process of inducing stress in sheep so that they escape from sheepdogs. In fact, proper mustering requires order and stress-free guidance because stress can have a significant negative impact on animal welfare, and subsequently on meat and milk production. Therefore, a modeller needs to ensure that the model does not merely show that a biological or artificial sheepdog successfully collects sheep, but also conserves the properties of mustering in the real world. Moreover, these properties should generalise whether the sheepdog is a biological dog, an uncrewed aerial vehicle (UAV) in the sky attempting to collect sheep, or a point mass in an abstract simulation attempting to model the mustering phenomenon.
This paper tackles the above challenge by designing two metrics for widely understood concepts in animal welfare. However, these two concepts have not yet been transformed into quantitative metrics appropriate for both current biological systems and artificial/robotic contexts. The two concepts are swarm-guidance-controllability (SGC) (based on controllability defined by Lee et al. (2018)) and swarm-guidance-predictability (SGP) (based on predictability defined by Lee et al. (2018)). The two behaviour-based metrics, SGP and SGC, are derived from predictability and controllability to assess the effectiveness of shepherding-based swarm guidance, and can later inform machine education towards supervisory control (Abbass et al. (2021)).
SGC assesses whether the sheepdog is effectively guiding the sheep; that is, sheep go where the sheepdog wants them to go. SGP is about anticipation and situation awareness; that is, sheep “know” where to go. A flee behaviour is when the animal panics while attempting to run away. The direction the animal takes is chosen spontaneously with the only goal of escaping the perceived threat. In mustering, flee is undesirable because high stress is in conflict with animal welfare. Instead, smooth guidance from the sheepdog creates a level of anticipation, where sheep collectively move while easily anticipating each other’s movements.
To illustrate both measures, we use three groups of data: first, field data we collected in which a sheepdog performs mustering; second, field data we collected in which the same experimental set-up was repeated with the sheepdog replaced by a UAV attempting to muster the sheep; and third, simulated data generated using a well-established model of shepherding from the literature.
Before we delve into the details of this work, we first summarise selected work from the interdisciplinary literature covering different areas important for this paper. We then present our newly designed measures and demonstrate the advantages of their use on field and simulated data. A discussion, including future work, is presented before the Conclusion.
Background
Shepherding-based swarm guidance systems are characterised by a swarm system guided to a defined goal, usually only known by the guiding agent. A swarm system is characterised by a team of individuals acting in synchronisation without any central coordination (Abbass and Hunjet (2021); Bonabeau et al. (1999)). A guiding agent may be an individual within the swarm, or in the context of shepherding-based swarm guidance, external to the swarm system and often with differing characteristics from the individuals in the swarm system. An example of a shepherding-based swarm guidance system is a flock of sheep (swarm system) being guided to a defined goal by a sheepdog (guiding agent); the task is commonly called mustering.
Farmers use mustering to support animal welfare outcomes, including animal health monitoring, reallocating animals to alternate food sources, and specific tasks, such as administering medication (Goddard (2008); Coppinger and Coppinger (2014)). On a broadacre sheep farm, mustering is often supported by a sheepdog as a herding agent to influence a flock to the desired goal point (Goddard (2008)). Since 2019, there have been examples of farmers using alternate mustering technologies (Burry (2019); Sexton-McGrath (2023)), with Yaxley et al. (2023) reporting the differences between a well-trained sheepdog and a novel drone, indicating further work was required to realise an autonomous agent. Qualitative metrics used by Yaxley et al. (2023) to assess the mustering activity effectiveness included elements of animal welfare, specifically predictability and controllability (Lee et al. (2018)).
Shepherding-based swarm guidance systems have been an area of research since the 1990s (Long et al. (2020)), with the earliest contributions based on the shepherding of ducks (Vaughan et al. (2000)) and the most recent contribution involving new field experiments (Jadhav et al. (2024)). The Sky Shepherds research project began experiments with sheep in 2019 (Yaxley, Joiner, and Abbass (2021)), pursuing the development of a future autonomous shepherding-based swarm guidance system (Yaxley et al. (2020)). Foundational to achieving the outcome was to demonstrate adherence to relevant ethics and regulations by measuring the effectiveness of the human-UAV team in achieving the shepherding-based swarm guidance task. Consequently, it is necessary to understand the roles of agents within the shepherding-based swarm guidance task and, therefore, how to measure effectiveness.
In Yaxley et al. (2023), two types of shepherding-based swarm guidance were compared, specifically a human-sheepdog team and a human-UAV team. In both cases, the intermediate (human-X) teams were assessed using effectiveness, efficiency, and timeliness measures. The effectiveness metrics used to assess the shepherding-based swarm guidance teams were based on animal welfare principles. Animal welfare fosters positive welfare while minimising negative welfare events (Goddard (2008); Rault et al. (2022)). Two considerations within animal environments are predictability and controllability (Lee et al. (2018)), with a balanced system supporting both dimensions.
Skills (Hussein et al. (2022)), Onto4MAT Actions (Hepworth et al. (2022)), and Associated Sheep Research that map to Support understanding Knowledge of Sheep Behaviour
As presented in Yaxley et al. (2020), the development of an autonomous system capable of shepherding-based swarm guidance suggests a curriculum-based approach to realise meaningful human-UAV-swarm teaming. Curriculum-based approaches are used to support the training of sheepdogs (Williams (2007g); Keil (2015)), and would support both the development of an autonomous guiding agent, as well as the development of a human-UAV-animal team to understand the tasks required. To support such an approach, simulations representing the swarm and shepherding agent have the potential to form part of the curriculum and have been used by Wade and Abbass (2019); Clayton and Abbass (2019); Hussein et al. (2022) with some promising early approaches.
One model developed based on observed sheep response to an external influencing agent is presented by Hepworth et al. (2020). The agents in the model are reactive, based on boids (Reynolds (1987)), while also responding to an external guiding agent with principles from Strömbom et al. (2014), and do not reflect the autonomous development presented in Hussein et al. (2022). Hepworth et al. (2020) also introduced a mixture of entropy and behaviour-based metrics (synchronicity, predation risk, situational awareness) to reflect swarm agent interactions.
Extending on this work, Hepworth et al. (2023) presented a series of information markers to support swarm agent behaviour characterisation for an external observer to support shepherding-based swarm guidance. Hepworth et al. (2023) define information markers as information that can reveal the presence or absence of a particular behaviour. Consequently, the information markers are not used to support an understanding of herding effectiveness but to identify behaviours exhibited by agents during the herding task. Identified behaviours enable classification of agent types, allowing the external observer to support the herding agent to use alternate shepherding-based guidance strategies. Such work is similar to the experimental set-up in Yaxley et al. (2023), where an external controller (sheepdog trainer or UAV pilot) supported the swarm guiding agent (sheepdog or UAV) to guide the swarm to the designated goal.
Researchers have explored different metrics to assess the effectiveness of abstracted behaviours within a synthetic environment to further build upon an understanding of how to emulate shepherding-based swarm guidance. In Yao et al. (2021), the efficiency of the shepherding simulation was improved by developing a more defined switching algorithm between collect and drive behaviours of the shepherd. Efficiency was introduced by considering not only the dispersion of the flock but also the destination (goal) of the shepherding task. Further, at the commencement of the shepherding-based swarm guidance task, the shepherd would follow a path similar to a sheepdog approaching the flock to prepare for the mustering task, that is, more circular compared to the direct path to the furthest sheep exhibited in Strömbom et al. (2014). This act of preparing the flock, or initiating an alert response, for the mustering task reflects behaviours reported in field experiments of sheep mustering, where Yaxley, Joiner, and Abbass (2021) reported that initiating an alert response elicited lower physiological stress response to a drone guiding agent in sheep.
The effectiveness of the Yao et al. (2021) algorithm was measured in terms of the time step (timeliness), flock dispersion (effectiveness), and sheepdog trajectory distance and flock global centre of mass (GCM) distance (efficiency), outperforming the Strömbom et al. (2014) algorithm in all metrics. However, while the effectiveness metric considers how the swarm responds to the guiding agent by measuring dispersion, the rate of change in the dispersion relative to sheep behaviour (predation response) is not considered. Further, while the timeliness of the task is improved, it is unknown whether this faster completion is a result of improved cohesion, as only distance is considered, and not heading.
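The gap identified above can be made concrete with a minimal sketch. It assumes dispersion is the mean distance of agents from the GCM, and it uses heading polarisation (the length of the mean unit heading vector) as one possible heading-based cohesion quantity. Neither definition is taken from Yao et al. (2021), and the function names are hypothetical.

```python
import math

def centre_of_mass(positions):
    """Global centre of mass (GCM) of a list of (x, y) positions."""
    n = len(positions)
    return (sum(x for x, _ in positions) / n,
            sum(y for _, y in positions) / n)

def dispersion(positions):
    """Mean distance of agents from the GCM (assumed definition)."""
    gcm = centre_of_mass(positions)
    return sum(math.dist(p, gcm) for p in positions) / len(positions)

def dispersion_rate(frames, dt=1.0):
    """Change in dispersion between consecutive frames, per unit time."""
    values = [dispersion(frame) for frame in frames]
    return [(b - a) / dt for a, b in zip(values, values[1:])]

def polarisation(headings):
    """Length of the mean unit heading vector: 1 when all agents share a
    heading, near 0 when headings cancel (illustrative assumption)."""
    n = len(headings)
    c = sum(math.cos(h) for h in headings) / n
    s = sum(math.sin(h) for h in headings) / n
    return math.hypot(c, s)
```

A flock can therefore show a small, steady dispersion while its polarisation is low, which is the situation a distance-only metric cannot distinguish from cohesive movement.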
Auletta et al. (2022) presented herding effectiveness metrics for herders gathering non-cooperative, non-flocking agents. While the agents in Auletta et al. (2022) are non-flocking, the performance metrics consider gathering time, distance travelled by herders, herd distance from the containment region (goal), and herd spread. Herders were considered more effective if they achieved a lower gathering time, a shorter distance travelled, closer proximity to the goal, and a tighter spread. Overall, Auletta et al. (2022) demonstrated herders were most effective if leader-follower strategies were supported during the task.
Syme and Elphick (1982) reported sheep with lower heart rates displayed more calmness during handling and were also reported as leading the flock during movement. In contrast, shy sheep would vary between followership and uncomfortable followership, making driving more difficult during moving tasks. Similarly, Michelena et al. (2010) modelled bold and shy sheep decision-making, with bold sheep more likely to select a grazing area based on the preference of grazing options over whether other sheep were located in the area. Importantly, bold sheep were characterised by their curiosity towards novel items and calmness while investigating items within a pen (Sibbald et al. (2009)), with shy sheep moving faster when driven. The faster movement of shy sheep while driving is consistent with Syme and Elphick (1982), and was also reported in Yaxley, McIntyre, et al. (2021), where sheep that displayed curiosity towards a UAV moved more steadily than those surprised by the UAV. Consequently, promoting leadership sheep (bold personality type) to translate shepherding agent cues to move towards the goal, or leader-follower relationships within flock movement during shepherding, is more efficient than follower-leader relationships, which is consistent with Schaerf et al. (2021).
Most recently, Jadhav et al. (2024) demonstrate the importance of supporting the social networks of sheep during the mustering task. Analysis of experiments conducted between a well-trained sheepdog and flock of sheep revealed the importance of supporting leader-follower relationships within the flock, with the sheepdog communicating cues to the leader through movement. Deeper analysis of the leader-follower relationship within the flock revealed cohesion relied upon information flows from the lead sheep to follower sheep at the rear of the flock. The identification of information flows from lead sheep to follower sheep is consistent with network controllability presented by Liu et al. (2011), revealing the importance of fostering both leader-follower behaviours (swarm-guidance-predictability) and flock direction (swarm-guidance-controllability) in order to effectively guide the flock to the desired goal.
Within the literature, swarm effectiveness metrics are often associated with task optimisation, including operational effectiveness or task improvement. The presented swarm effectiveness metrics aim to support the development of autonomous human-swarm teams through a machine education curriculum. As such, these metrics offer an opportunity to assess whether commands have been executed effectively at both a macro- and micro-level. The macro effectiveness metric is SGC, whereas the micro effectiveness metric is SGP; later research users may seek a composite objective function of the two.
SGP and SGC as effectiveness measures must be underpinned by livestock welfare management during shepherding-based swarm guidance, such as mustering and training, rather than sheepdog trials. Although sheepdog trials do rely on defined rules, the traits of a sheepdog successful at trial differ greatly from a sheepdog successful at mustering (Williams (2007b)). Consequently, rather than considering sheepdog trial rules to inform the developed effectiveness measures, principles that consider the quality of livestock care during handling by Goddard (2008) have been used. Although Goddard (2008) considers both the human and the sheep in recommendations for human-animal interactions, the common animal elements required for livestock care training are as follows:
• Knowledge of sheep behaviour
• Knowledge of the effective animal welfare fostering actions
• Knowledge of specific techniques to be performed
• Knowledge of specific breed and health problems likely to be burdened
In other shepherding-based swarm guidance systems, particularly with an external influencing agent, the change in behaviour between collect and drive is considered to understand whether the swarm agents are being effectively guided towards the goal (Strömbom et al. (2014)). Specifically, collect indicates the shepherding agent must position itself behind the furthest agent relative to the global centre of mass and goal, such that the agent forms a flock with other flocking agents. The behaviour drive indicates the shepherding agent positions itself behind the formed flock to guide the flock towards the goal. By such definitions, a shepherding agent predominately performing the drive behaviour is considered effective. However, this effectiveness metric does not consider the behaviour of the swarm agents and, therefore, is unable to meet all of the elements presented in Table 4. Consequently, while collect and drive remain essential to shepherding-based swarm guidance systems, using them without effectiveness metrics may lead to unethical robotics systems, particularly within agriculture.
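A minimal sketch of the collect/drive switch described above may help make the definitions concrete. It is written in the spirit of Strömbom et al. (2014) but is not their implementation: the switching radius `collect_radius` and the stand-off distance `offset` are hypothetical parameters chosen for illustration.

```python
import math

def centre_of_mass(flock):
    """Global centre of mass (GCM) of a list of (x, y) positions."""
    n = len(flock)
    return (sum(x for x, _ in flock) / n, sum(y for _, y in flock) / n)

def _unit(dx, dy):
    """Unit vector of (dx, dy); the zero vector maps to (0, 0)."""
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm) if norm > 0 else (0.0, 0.0)

def shepherd_target(flock, goal, collect_radius, offset):
    """Return (behaviour, target position) for the shepherding agent.

    collect: stand behind the furthest agent, on the line from the GCM
             through that agent, so it is pushed back towards the flock.
    drive:   stand behind the GCM, on the line from the goal through the
             GCM, so the flock is pushed towards the goal.
    """
    gcm = centre_of_mass(flock)
    furthest = max(flock, key=lambda p: math.dist(p, gcm))
    if math.dist(furthest, gcm) > collect_radius:
        ux, uy = _unit(furthest[0] - gcm[0], furthest[1] - gcm[1])
        return "collect", (furthest[0] + offset * ux, furthest[1] + offset * uy)
    ux, uy = _unit(gcm[0] - goal[0], gcm[1] - goal[1])
    return "drive", (gcm[0] + offset * ux, gcm[1] + offset * uy)
```

Note that the switch depends only on flock geometry and the goal. Nothing in it observes how the swarm agents respond, which is the limitation raised in the paragraph above.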
Definitions of swarm agents used as part of development of both SGP and SGC metrics
Definitions of swarm Behaviours, of Both Swarm Agents and Influencing Agents, Used as Part of Development of Both SGP and SGC metrics
Sky Shepherds Field Research
Elements of Livestock Care and Training, as Presented by Goddard (2008), and Measured by Yaxley et al. (2023) Between a Human-Sheepdog Team and Human-UAV team
In Yaxley et al. (2023), the breed remained the same and all sheep were assessed as healthy for the duration of the testing; the study therefore compared the remaining three common elements between the human-sheepdog team and the human-UAV team. Table 4 shows how the elements were measured and qualitatively assessed by the Trial Director.
Table 4 shows the human-sheepdog team outperformed the human-UAV team across the common elements of livestock care training. The purpose of the experiments reported in Yaxley et al. (2023) was to assess whether a novel technology could effectively muster flocks of sheep when compared to a well-trained sheepdog. Of note, the human-sheepdog team was well-trained in the mustering of sheep, and therefore, both cognitive agents within the team had developed an understanding of the common elements of sheep behaviour, animal welfare fostering actions, and specific actions performed. In contrast, the human-UAV team consisted of a pilot, co-pilot, and novel Sky Shepherd. While the pilot and co-pilot were experienced UAV pilots, they were novices regarding mustering sheep. Both pilots understood the theory of sheep behaviour and animal welfare and had practised the expected manoeuvres to muster the flocks of sheep. However, they had not accumulated field experience with mustering sheep that the sheepdog handler would have with a working sheepdog. The results presented in Table 4 support the need to develop a curriculum for human-UAV teams to succeed in shepherding-based swarm guidance (Yaxley et al. (2020)).
Methodology
The effectiveness metrics and information markers used to measure the effectiveness of swarm-guidance in both field and simulation results
The effectiveness metric, swarm-guidance-predictability (SGP), is defined as how an individual agent has responded to a herding agent’s intent and is calculated for individual agents (micro-metric).
The effectiveness metric, swarm-guidance-controllability (SGC), is defined as how the swarm responds to the guidance task relative to a goal position, and is calculated on the swarm’s perimeter or convex hull (macro-metric).
SGP is the primary metric of the two effectiveness metrics, while SGC is the secondary metric. SGP is considered the primary metric to promote the leading sheep in leading the flock, rather than the guiding agent controlling the flock (Williams (2007c); Jadhav et al. (2024)). Consequently, a flock with established and supported leader-follower behaviours will more readily respond to the guiding agent’s cues to reach the defined goal. Hence, SGP should be consulted first, followed by SGC. However, when considering the different cognitive systems that exist between an observing agent and guiding agent(s) (Keil (2015)), the primary and secondary metric use will likely reflect the scaffolding in the system and outcomes sought from the defined task. Here, we will present a mathematical descriptor of effectiveness metrics as a step towards bridging the knowledge gap between traditional scaffolding for human-swarm teaming (sheep mustering) and future biologically inspired shepherding-based swarm guidance applications (e.g. robotic swarm guidance).
As detailed in Table 5, the swarm effectiveness metrics are assessed over an observation period (τ). To determine the observation period, characteristics of the measurement system used by Yaxley et al. (2023) and the properties of a typical experiment were considered. To capture the movement of sheep during the experiments conducted by Yaxley et al. (2023), smartphones were used to collect time-space-position information (TSPI) for each agent from a networked global positioning system (GPS) at a rate of 1 Hz, minimising data synchronisation challenges during post-trial analysis. Consequently, for the observation period to be meaningful, a period greater than 1 s was considered. Similarly, Jadhav et al. (2024) used observation periods of 5 s to evaluate flock cohesion relative to barycentre speed.
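The reason a period longer than the sampling interval is needed can be sketched as follows. The sketch assumes non-overlapping windows and drops any trailing partial window. These are illustrative choices, not the procedure used in the field analysis, and the function names are hypothetical.

```python
import math

def observation_windows(samples, tau_s, rate_hz=1.0):
    """Split a per-agent track into non-overlapping windows of tau_s seconds.

    At least two samples per window are needed to measure any movement,
    so at 1 Hz the observation period must exceed 1 s.
    """
    per_window = int(round(tau_s * rate_hz))
    if per_window < 2:
        raise ValueError("observation period must span at least two samples")
    last_start = len(samples) - per_window
    return [samples[i:i + per_window] for i in range(0, last_start + 1, per_window)]

def mean_speed(window, rate_hz=1.0):
    """Mean speed over one window of (x, y) positions, in metres per second
    when positions are in metres."""
    steps = [math.dist(a, b) for a, b in zip(window, window[1:])]
    return sum(steps) / len(steps) * rate_hz
```

For example, a 12 s track sampled at 1 Hz and windowed with τ = 5 s (the value reported by Jadhav et al. (2024), used here only as an example) yields two complete windows.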
Swarm-Guidance-Predictability (SGP)
Schaerf et al. (2021) model speed changes between animal groups with leader-follower relationships. The speed and direction changes indicate greater efficiency between leader-follower over follower-leader relationships. Specifically, greater efficiency is possible when a shepherding agent supports leader-follower relationships to respond to guidance cues. In developing shepherding-based swarm guidance metrics to evaluate field and simulated data, we propose speed and predation risk most accurately reflect supporting leader-follower relationships within the swarm, due to the characteristics of sheep (Syme and Elphick (1982); Michelena et al. (2010)) and predation risk response of sheep (Lima and Dill (1990); Hepworth et al. (2020); Yaxley, McIntyre, et al. (2021)), leading to the definition of the effectiveness of SGP within shepherding-based swarm guidance systems.
Predation Risk (PR) as a means of modelling sheep behaviour in shepherding-based swarm guidance simulations was first presented in Hepworth et al. (2020), based on observations of the behavioural responses of sheep to the presence of a novel swarm guidance agent. As presented by Lima and Dill (1990), feeding animals control their risk of predation through decision-making, which is observed in the resulting behaviour of the animal when responding to the predation risk. In the context of mustering on farms, the decision-making response of sheep to an influencing agent or perceived predation risk is leveraged to achieve animal welfare outcomes associated with moving livestock. More generally, knowledge of sheep behaviour is leveraged to influence the flock towards a defined goal.
At each time step, equivalent to t = 1 s, the PR for each agent (π_i) within a flock of size N, i ∈ {1, …, N}, is calculated using the equation first presented in Hepworth et al. (2020).
A system with good SGP will support agents to trade PR with awareness of the herding agent (moving in and out of O b < B), such that there is a natural and regular change in the PR during the mustering task. Speed during this process is consistent, so agents do not exhibit a flee response. While a flee response (unpredictability) may be an appropriate influence for some herding-agent-swarm systems (e.g. Paranjape et al. (2018)), these effectiveness measures have been developed based on a system where SGP is prioritised.
The SGP effectiveness metric is calculated using the algorithm presented in Algorithm 1 and illustrated in Figure 1.
Pictorial Representation of the SGP Effectiveness Metric Presented in Algorithm 1 (Left) and Pictorial Representation of the SGC Effectiveness Metric Presented in Algorithm 2 (Right)
The SGP effectiveness metric offers an opportunity for future research to understand how an individual swarm agent completes the sense, decide, and act cycle associated with the task. For example, sheep will display curiosity when exposed to novel stimuli to understand the environmental implications. Such behaviours are often associated with sheep being stationary to sense the novel stimuli. Similarly, sheep that have been overstressed can change their behaviour, removing themselves from the flock (Kilgour and de Langen (1970)). Consequently, the SGP of a system with an agent that has been overstressed will be influenced by this change in behaviour. In the context of robotic swarm systems, such individual agent changes may indicate the state of the system’s health, which can be used to improve effectiveness after a failure or to trigger a countermeasure.
Although SGP is calculated at the individual agent level, a user may wish to consider SGP at the overall flock level, depending on the level of granularity required to support transparency (Hepworth et al. (2021)). To achieve this, the median (Mdn) of the SGP measure may be used. To reflect the SGP assessments completed in Yaxley et al. (2023),
Swarm-Guidance-Controllability (SGC)
A system with good SGC will have agents at the front of the convex hull with a heading towards the goal (σ_g), such that the distance to the goal (d_g) is also decreasing. The heading (σ_g) towards the goal (P_g) is established from the start point (P_s), with a defined tolerance.
The SGC effectiveness metric is calculated using the algorithm presented in Algorithm 2 and illustrated in Figure 1.
Importantly, calculating the SGC effectiveness measure does not require knowledge of the influencing agent; it requires only that a goal be defined. The goal may be either a waypoint or a final destination. The metric has been developed to assess effectiveness by considering only the swarm agents. The benefit of this metric is that it may be possible to apply the SGC metric in human-swarm teaming systems without an intermediary influencing agent.
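The two conditions that characterise good SGC can be illustrated with a short sketch. This is not Algorithm 2. It assumes the front of the convex hull is the hull vertex nearest the goal, that agents are listed in the same order in consecutive frames, and it returns the two conditions separately rather than a score. All function names are hypothetical.

```python
import math

def convex_hull(points):
    """Convex hull of (x, y) points via Andrew's monotone chain."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def angle_difference(a, b):
    """Smallest signed difference between two angles, in radians."""
    return (a - b + math.pi) % (2 * math.pi) - math.pi

def sgc_conditions(prev, curr, start, goal, tolerance):
    """Return (heading_ok, distance_decreasing) for the front hull agent.

    prev and curr are positions of the same agents, in the same order,
    at consecutive observations. Only swarm positions and the goal are
    used; the influencing agent does not appear.
    """
    front = min(convex_hull(curr), key=lambda p: math.dist(p, goal))
    i = curr.index(front)
    sigma_g = math.atan2(goal[1] - start[1], goal[0] - start[0])
    heading = math.atan2(curr[i][1] - prev[i][1], curr[i][0] - prev[i][0])
    heading_ok = abs(angle_difference(heading, sigma_g)) <= tolerance
    decreasing = math.dist(curr[i], goal) < math.dist(prev[i], goal)
    return heading_ok, decreasing
```

How the two conditions are combined into a controlled, semi-controlled, or uncontrolled assessment is left open here, since that mapping belongs to the algorithm itself.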
Field Data
Field data analysis yielded examples of predictable and controlled swarm movement (Program 13) and unpredictable and uncontrolled swarm movement (Program 17). Program 13 used a human-sheepdog team, while Program 17 used a human-UAV team. The field movements for both Programs are shown in Figure 2.
Field Movement of Program 13 (Human-Sheepdog Team) and Program 17 (Human-UAV Team), With the Experimental Set-Up of a Simulated Obstacle. The Simulated Obstacle was Used as the Goal Location for the Purposes of Analysis
Comparison Methodology
To contrast the effectiveness metrics using field data fairly against simulation results, simulations were defined to reflect the open field environment used in Yaxley et al. (2023), with a simulation environment as depicted in Figure 3.
Simulation Set-up to Reflect Field Set-up and Data Collection, Used in Yaxley et al. (2023). All Agents are Clustered at the Start Position, and the Herding Agent Begins Behind the Clustered Flocking Agents, Reflecting the Recorded Field Data, which was Captured From the Start Point after the Flock had Become Aware of the Herding Agent’s Presence
Agent weightings for comparison methodology, as presented in Hepworth et al. (2023) and Strömbom et al. (2014)
After the simulation run, the data was truncated to include the same sample run data and flock size for a fair comparison against field data. Given that all field data examples were captured once the herding agent (sheepdog or UAV) had commenced the program, the simulation was set to ensure all flock agents were clustered, with the guiding agent positioned behind the flock. The simulation was considered complete once a single agent was in the goal, or, in the case of failure to reach the goal, once the flock began to disperse around the goal area. One time step in the simulation case was equated to one second of field data.
For analysis, flock agents were randomly selected from the original 48 using the MATLAB function randperm, reflecting the experimental set-up in Yaxley et al. (2023), where 16 sheep were instrumented and 32 were non-instrumented. Analysis was completed for each valid field data example, allowing comparative analysis of simulation and field results.
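For readers working outside MATLAB, the selection step can be sketched in Python. The sketch assumes 16 of the 48 simulated agents are retained to mirror the instrumented sheep, and the seed is a hypothetical parameter added only for reproducibility.

```python
import random

def select_agents(n_total=48, n_selected=16, seed=None):
    """Randomly choose agent indices without replacement.

    Shuffling the full index list and keeping a prefix mirrors taking
    the leading entries of a random permutation, as randperm does.
    """
    rng = random.Random(seed)
    order = list(range(n_total))
    rng.shuffle(order)
    return sorted(order[:n_selected])
```

Fixing the seed allows the same subset of simulated agents to be reused when a simulation run is compared against more than one field data example.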
The effectiveness measures are non-normal, which informs the choice of comparison and statistical analyses. Variance was assessed using Levene’s Test, and the median was assessed using Mood’s Median Test, reflecting the existence of outliers in the data.
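The hypotheses for the Levene’s Test are as follows:
• H0: the variances of the compared samples are equal
• H1: the variances of the compared samples are not equal
The hypotheses for the Mood’s Median Test are as follows:
• H0: the medians of the compared samples are equal
• H1: the medians of the compared samples are not equal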
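The test statistics behind these two tests can be sketched in plain Python. The sketch is an illustration only and does not reproduce the Quantum XL procedure: it assumes the median-centred variant of Levene’s statistic, counts values tied with the grand median as not above it, applies no continuity correction, and leaves the p-value lookup (F and chi-square distributions) to a statistics package.

```python
import statistics

def levene_w(groups, centre=statistics.median):
    """Levene's W statistic for equality of variances.

    Each value is replaced by its absolute deviation from the group
    centre; W is then a one-way ANOVA F statistic on those deviations.
    Raises ZeroDivisionError if every deviation equals its group mean.
    """
    z = [[abs(y - centre(g)) for y in g] for g in groups]
    n_total = sum(len(g) for g in z)
    k = len(z)
    z_means = [statistics.fmean(g) for g in z]
    z_grand = sum(sum(g) for g in z) / n_total
    between = sum(len(g) * (m - z_grand) ** 2 for g, m in zip(z, z_means))
    within = sum((v - m) ** 2 for g, m in zip(z, z_means) for v in g)
    return (n_total - k) / (k - 1) * between / within

def moods_median_chi2(groups):
    """Chi-square statistic and degrees of freedom for Mood's median test.

    Builds a 2 x k table of counts above / not above the grand median
    and compares it with the counts expected under equal medians.
    """
    grand = statistics.median([v for g in groups for v in g])
    above = [sum(v > grand for v in g) for g in groups]
    not_above = [len(g) - a for g, a in zip(groups, above)]
    n_total = sum(len(g) for g in groups)
    chi2 = 0.0
    for row in (above, not_above):
        row_total = sum(row)
        for g, observed in zip(groups, row):
            expected = row_total * len(g) / n_total
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return chi2, len(groups) - 1
```

Two samples with the same spread but shifted medians give a W near zero and a large chi-square, which is the pattern of similar variance but differing median discussed for the simulations.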
To further assess the accuracy of the simulation results reflecting the captured field results, the average standardised residual (ASR) was calculated for all simulations, as per equation (2).
Keane and Joiner (2020) used the ASR metric to build trust in simulations of autonomous underwater vehicle recovery localisation accuracy and homing behaviour. In the context of shepherding-based swarm guidance, localisation accuracy is the ability to understand where the flock is in the context of the task, whereas homing behaviour is the completion of the task at the goal. Given the shepherding-based swarm guidance effectiveness metrics support understanding how well the swarm is being guided towards the goal (localisation over homing), using ASR allows us to understand how well the simulations reflect the field results. Of note, Keane and Joiner (2020) found the ASR metric to be more accurate in assessing the localisation performance than the homing performance of the simulations when compared to the field trials.
Results
Using the developed SGP and SGC effectiveness metrics, analysis of field data from Yaxley et al. (2023) yielded examples of predictable and controlled swarm movement (Program 13) and unpredictable and uncontrolled swarm movement (Program 17). The field movement, SGP effectiveness, and SGC effectiveness for both Programs are shown in Figures 4 and 5.
SGP Effectiveness of Program 13 (Human-Sheepdog Team) and Program 17 (Human-UAV Team), Showing the Human-Sheepdog Team Achieving Overall Higher SGP Effectiveness Compared to the Human-UAV Team. The SGP Effectiveness is Communicated as Mdn(SGP). In Yaxley et al. (2023), the Trial Director Assessed the Human-Sheepdog Team as 9/10 for SGP and the Human-UAV Team as 3/10
SGC Effectiveness of Program 13 (Human-Sheepdog Team) and Program 17 (Human-UAV Team), Showing the Human-Sheepdog Team Achieving Overall Higher SGC Effectiveness Compared to the Human-UAV Team. In Yaxley et al. (2023), the Trial Director Assessed the Human-Sheepdog Team as 10/10 for SGC and the Human-UAV Team as 3/10

Applying the same metrics to the simulation data generated as part of the comparison methodology showed that no simulation reached the upper scale of SGP or SGC effectiveness, with the maximum level of effectiveness achieved for all simulation types being semi-predictable (SGP=1) or semi-controlled (SGC=1).
Field Effectiveness Results, Compared to Field Trial Director Qualitative Assessment of Program 13 (Human-Sheepdog Team) and Program 17 (human-UAV team). The Normality Test Results Presented are the Shapiro–Wilk t-Test Completed Using Quantum Excel (XL) (2016) [Version 5.29.1700]
To compare against the field data, summary results for the simulations are presented in Table 7, indicating the variances of the Dispersed Search and Classic simulations are statistically similar for SGP, with no null hypotheses rejected for Levene’s Test. While the variance (shape) of the distribution may be statistically similar to the Program 13 (human-sheepdog team) field results, the simulations were unable to demonstrate the upper scale of SGP, reflected in the rejection of the null hypothesis in all Mood’s Median tests for Dispersed Search and Classic. Consequently, the widely used simulation for shepherding-based swarm guidance developed by Strömbom et al. (2014) does not accurately represent sheep behaviour during shepherding-based swarm guidance. The use of reactive-based shepherding simulations, such as Dispersed Search or Classic, to support machine education curriculum development would be limited to low-level knowledge development of shepherding-based swarm guidance.
Summary of rejected null hypothesis results for variance and median of simulated results for SGP and SGC effectiveness. Tests were completed using Quantum XL plug-in
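The median comparison described above can be illustrated with a minimal two-sample sketch of Mood's median test. This is not the Quantum XL procedure used in the paper: the function name `moods_median_test`, the convention of counting ties with the grand median as "not above", and the absence of a continuity correction are all assumptions made for illustration, and the sample values in the usage note are invented ordinal scores rather than trial data.

```python
import math
from statistics import median


def moods_median_test(sample_a, sample_b):
    """Two-sample Mood's median test without continuity correction.

    Returns (chi2, p_value). Values equal to the grand median are counted
    as "not above" (an assumed convention; other conventions exist).
    """
    samples = (list(sample_a), list(sample_b))
    grand_median = median(samples[0] + samples[1])

    # Per sample: (count above grand median, count not above).
    counts = []
    for sample in samples:
        above = sum(1 for x in sample if x > grand_median)
        counts.append((above, len(sample) - above))

    n_total = len(samples[0]) + len(samples[1])
    total_above = counts[0][0] + counts[1][0]
    total_not_above = counts[0][1] + counts[1][1]
    if total_above == 0 or total_not_above == 0:
        raise ValueError("all values fall on one side of the grand median")

    # Pearson chi-square statistic on the 2x2 contingency table.
    chi2 = 0.0
    for (above, not_above), sample in zip(counts, samples):
        n = len(sample)
        for observed, side_total in ((above, total_above),
                                     (not_above, total_not_above)):
            expected = n * side_total / n_total
            chi2 += (observed - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `moods_median_test([0, 0, 0, 0, 1, 1], [2, 2, 2, 2, 1, 1])` compares two hypothetical sets of ordinal effectiveness levels; a small p-value would lead to rejecting the null hypothesis that both samples share a median, which is the sense in which the simulations failed to reach the upper scale of SGP.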
The ASR results reported in Figure 6 indicate the Traditional Sky Shepherd simulations most accurately reflect captured field results for the human-UAV team, demonstrating that the homogeneous reactive shepherding model developed by Strömbom et al. (2014) most accurately reflects unpredictable and uncontrolled shepherding-based swarm guidance.
ASR of SGP and SGC Effectiveness Measures for Weighted and Traditional Simulations for Both Sky Shepherd (Human-UAV Team) and Sheepdog (Human-Sheepdog Team)
Discussion
No simulations achieved SGP effectiveness, with all simulations emulating behaviour measured as exhibiting unpredictable effectiveness. While this reflects the field observation for the human-UAV team field result (Program 17), it does not reflect the example human-sheepdog team field result (Program 13). Given the development of simulations has focused on the movement of influencing agents and the development of reactionary flocking agents, the behaviours of curiosity, leadership, and followership have been oversimplified, which is reflected in poor SGP effectiveness. Consequently, using widely accepted reactive agent-based modelling does not promote teaching a guiding agent how to effectively guide a swarm of cognitive flocking agents by promoting leader-follower relationships during the task.
Although the widely used traditional simulations used to compare field results were developed using a sheepdog to muster a flock of sheep (Strömbom et al. (2014)), the effectiveness measures developed in this paper indicate the simulated behaviour does not reflect effective swarm-guidance-control and would not support an agent in effectively cueing the leading sheep on where to guide the flock. Consequently, it is recommended that such models be limited to lower-level machine education curriculum outcomes. Further, using reactive agents only to execute shepherding-based swarm guidance is not recommended, as this study’s results indicate such agents cannot emulate SGP or SGC effectiveness in the swarm agents they are guiding. The development of these metrics will likely help to enhance shepherding models by improving their match to field data and thus their biomimicry.
Effectiveness Metrics for Mustering Sheep
Currently, effectiveness measures for mustering sheep on farms are mostly subjective and rely on the activity owner (farmer) to have built a knowledge of sheep behaviour through exposure to the task (Goddard (2008)). While sheepdog trials offer guidelines for demonstrating skills in mustering, such tasks are not conducted for the purpose of livestock care or welfare fostering actions (Williams (2007b)). Consequently, combining the effectiveness guidelines used in Yaxley et al. (2023), with guidelines presented by Goddard (2008), and the animal welfare framework developed by Lee et al. (2018) has enabled the development of effectiveness metrics informed by swarm information markers. As agricultural practices evolve to integrate more technology (Ojo et al. (2021)), mapping of multi-disciplinary knowledge to a common language (Hepworth et al. (2021)) is vital to assuring the effectiveness and quality of such outcomes (Li et al. (2024)). These new metrics exemplify how this fusion can support future smart agricultural systems for livestock care and farming.
Specific to the mustering task, SGP focuses on how the herding agent influences the flock, supporting each swarm agent to respond in a manner reflecting the herding agent’s intent while minimising undue stress (Yaxley et al. (2023); Williams (2007d)). SGC focuses on how the flock responds to cues from the herding agent, such that the swarm moves in the direction of the goal in a manner that supports flock autonomy over path correctness (Yaxley et al. (2023); Williams (2007e)). Consequently, SGP reflects individual swarm agents’ effectiveness response to shepherding-based swarm guidance (micro-level effectiveness measure), while SGC reflects collective swarm response to shepherding-based swarm guidance (macro-level effectiveness measure). The shepherding-based swarm guidance metrics presented in this paper are a mapping of the knowledge used by an observing agent to support effective sheep mustering. As presented by Hepworth et al. (2021), SGP and SGC effectiveness metrics support the interpretability of the shepherding-based swarm guidance system for human-swarm teaming (HST). Combined, the effectiveness measures reflect how well the external agent supports the flock in achieving team tactics, such that the flock behaves as a swarm (Abbass and Hunjet (2021)).
A potential limitation of the effectiveness metrics is that they have been developed from use cases that are relatively low in complexity. While a simulated obstacle was used in the trials conducted by Yaxley et al. (2023), in general, the nature of the task was designed to measure performance between a human-sheepdog team and a human-UAV team, and not the ability to perform different types of mustering tasks on a farm. Examples where multiple sheepdogs were used to complete the task have not been considered. While Williams (2007f) describes changing the guiding agent to complete the task, thereby only using a single agent for sub-tasks, the impact of multiple guiding agents on effectiveness metric performance has not been considered. Consequently, while the effectiveness metrics have the potential to support understanding shepherding-based swarm guidance of sheep in simple situations, they may not be effective alone for assessing effectiveness in more complex tasks.
Effectiveness Metrics for Swarming
While these effectiveness metrics have been informed by field trials involving sheep mustering, the general nature of the metrics warrants investigation for any shepherding-based swarm guidance system. Hepworth et al. (2023) demonstrated how improvements to swarm guidance were possible by characterising swarm agents and strategies using information markers. However, the computational cost would become significant if used throughout the swarm guidance task. Conversely, understanding the effectiveness of the shepherding-based swarm guidance task and only applying agent classification or strategy when necessary has the potential to improve shepherding-based swarm guidance systems beyond those inspired by human-sheepdog-sheep systems.
For shepherding-based swarm guidance tasks that increase in complexity, such as the inclusion of obstacles, El-Fiqi et al. (2020) demonstrated improvement by increasing the number of guiding agents to complete the task. While the effectiveness metrics do not consider the guiding agent information markers, future work needs to assess whether the effectiveness metrics would generalise to multi-sheepdog situations and can demonstrate SGP or SGC for a complex swarm task.
Although the effectiveness metrics have been developed by considering shepherding-based swarm guidance with an external guiding agent, the metrics do not rely on knowledge of the guiding agent, so using them for swarm systems with a designated leader may be possible. This is particularly the case when considering that the SGP effectiveness metric provides an indication of leader-follower effectiveness within the swarm. For example, Wang et al. (2020) developed a guiding algorithm for UAV swarms using a designated virtual leader. Applying the SGP effectiveness metric in this application has the potential to highlight how effectively the UAV swarm is following the guidance algorithm. However, the guiding algorithm developed by Wang et al. (2020) requires the swarm to adopt formations dependent on the task. Given the SGC effectiveness metric has been developed to consider the convex hull of a flock of sheep, it is unknown whether the SGC effectiveness metric would support the understanding of abstract swarm formations.
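To make the convex hull dependence concrete, the following is a minimal sketch of how the hull of a set of agent positions, and its area, could be computed. It is not the SGC metric itself; the function names `convex_hull` and `polygon_area` and the planar (x, y) position format are assumptions for illustration. The hull uses Andrew's monotone chain algorithm and the area uses the shoelace formula.

```python
def convex_hull(points):
    """Return the convex hull of 2D points in counter-clockwise order.

    Uses Andrew's monotone chain algorithm. Collinear boundary points
    are dropped, and duplicate positions are ignored.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0
```

A formation such as a line or a ring illustrates the concern raised above: a line of agents has a degenerate hull with zero area, and a ring has a large hull with an empty interior, so hull-based quantities may describe such formations poorly.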
Abbass and Hunjet (2021) contrast reactive and cognitive agents. Reactive agents are parameterised to respond and, through the articulation of equations, model systems based on Event/Condition/Actions. Cognitive agents, however, are developed to model learning and knowledge (through lessons or feedback) and to plan and adapt to features in an environment. Consequently, reactive agents are less computationally costly, yet are less sophisticated than cognitive agents. As such, the Shepherds and Sheepdogs Autonomy Architecture presented by Abbass and Hunjet (2021) combines the attributes of reactive and cognitive agents to support the development of an autonomous agent capable of executing missions that have been defined to include elements beyond simple equations and parameterisation. Consequently, the effectiveness measures presented in this paper would support evaluating missions executed by an autonomous shepherd or sheepdog, including animal welfare and knowledge of sheep behaviour.
A learning system is presented by Zhi and Lien (2022), where guiding agents are trained to herd Reynolds’ boids groups between obstacles. Zhi and Lien (2022) assume the guiding agent does not know the group behaviour and use a fear force to reward/penalise the shepherd during training, showing that high stress guidance (low SGP) was less successful. Expanding the reward to consider the effectiveness measure of SGP could support the guiding agent in progressing more quickly and increase the flock size that can be successfully herded through obstacles. Further, including learning for SGC would likely support the guiding agent in developing basic swarm-guidance abilities, with limitations introduced if only reactive agents form the exemplar swarm.
Future Work
The presented shepherding-based swarm guidance metrics map the knowledge used by an observing agent to support effective sheep mustering. It is recommended that the SGP and SGC effectiveness measures be used to improve the interpretability of human-swarm teaming applications. However, the SGC effectiveness measure may need further refinement to improve its ability to measure SGC effectiveness as the swarm manoeuvres through obstacles (which changes convex hull properties) or approaches a defined goal. More examples of expertly assessed, predictable, and controlled movement by an observing agent are recommended to support this outcome.
To further refine the effectiveness metrics, it is also recommended that the observation period τ be investigated. While the current setting of τ = 5 s is appropriate, there may be an opportunity to optimise for different tasks and applications within the context of shepherding-based swarm guidance across task complexity, primary (SGP), and secondary (SGC) effectiveness metrics. Walton et al. (2018) tested sampling rates of 8, 16 and 32 Hz over time windows of 3, 5 and 7 s, and for three behaviours: lying, standing and walking. More accurate positioning sensors fixed to an animal body are expected to reduce the effective sampling frequency. In Yaxley et al. (2023), reliance on GPS overcomes data synchronisation and misalignment problems, while averaging over a five-second window (τ = 5 s) smooths noise in the trend.
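The role of the observation period can be sketched as a simple windowed average. The function below is illustrative only: the name `window_means`, the assumption of uniformly sampled data, the use of consecutive non-overlapping windows, and the choice to drop a trailing partial window are all assumptions, not details taken from Yaxley et al. (2023).

```python
from statistics import mean


def window_means(samples, sample_rate_hz, tau_s=5.0):
    """Average a uniformly sampled signal over consecutive tau-second windows.

    samples: sequence of numeric measurements at a fixed sampling rate.
    sample_rate_hz: samples per second (assumed constant).
    tau_s: observation period in seconds (5 s by default, as in the text).
    A trailing partial window is dropped (an assumed convention).
    """
    per_window = int(round(sample_rate_hz * tau_s))
    if per_window < 1:
        raise ValueError("each window must contain at least one sample")
    full_windows = len(samples) // per_window
    return [
        mean(samples[i * per_window:(i + 1) * per_window])
        for i in range(full_windows)
    ]
```

Re-running such a function with `tau_s` set to 3, 5 and 7 on the same recording is one simple way to explore how sensitive a windowed measure is to the observation period, in the spirit of the window lengths tested by Walton et al. (2018).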
Using the average standardised residual metric to measure refinements in reactive modelling can support researchers in understanding whether new models demonstrate improved swarm agent behaviours and guiding actions that promote leader-follower behaviours and cues. Using the effectiveness metrics, in combination with improved simulations and the average standardised residual metric, has the potential to demonstrate trust in any autonomous systems developed through a curriculum.
With more consistent effectiveness metrics, it is recommended these swarm effectiveness measures be used to assess smart autonomous system learning of shepherding-based swarm guidance activities to support the continued development of Farmer and Sky Shepherd Teaming, progressing the research from "users operate the system" to "users monitor the system" (Handley (2020); Yaxley et al. (2020)).
Conclusion
This paper presents shepherding-based swarm guidance effectiveness metrics of swarm-guidance-predictability (SGP) and swarm-guidance-controllability (SGC) developed using exemplar field experiments between a human-sheepdog team and a human-UAV team. The effectiveness metrics have been developed by combining knowledge of subject matter experts in farming (Yaxley et al. (2023)), sheep welfare (Goddard (2008)), animal welfare (Lee et al. (2018)), and swarm information markers (Hepworth et al. (2023)). The mapping of knowledge to a common language for multiple disciplines represents an improvement in the interpretability of shepherding-based swarm guidance for human-swarm teaming (Hepworth et al. (2021)), both in the context of sheep mustering and swarm systems research.
Through simulations and fieldwork of Sky Shepherds research, we have shown a swarm system with a guiding agent supporting SGP and SGC will be shepherded to a defined goal where leader-follower relationships are prioritised. Therefore, the swarm can guide itself towards the goal with minimum guiding agent input. Further, the current reactive simulation model developed by Strömbom et al. (2014), used broadly within shepherding-based swarm guidance research, requires work to reflect field results demonstrating effective SGP and SGC.
The SGP effectiveness metric is maximum when swarm agents can respond naturally to an external guiding agent’s cues, representing a swarm system promoting leader-follower behaviour during movement. The SGC effectiveness metric is maximum when swarm agents can respond to guiding agent cues towards the defined goal and is therefore secondary to the SGP effectiveness metric. Combined, a shepherding-based swarm guidance system demonstrating high SGP effectiveness and high SGC effectiveness demonstrates mapping of interdisciplinary knowledge of farming, sheep welfare, animal welfare, and swarm information markers.
Such an insight into the dynamics of a swarm goes beyond the mere success indicators of a task to the deeper causal behaviours that guide the influencing agent. Researchers could then validate their understanding of the behaviours in a domain against the behaviours a simulation exhibits. Moreover, this understanding can inspire the design of swarm systems (simulated or physical) while contrasting such knowledge against those held by subject matter experts in mustering. Information markers of swarms act as the intermediary between designers and knowledge-holders to ensure that the right behaviours have been detected, assessed and modelled.
As neither effectiveness metric relies on guiding agent position, applications may consider both intermediate (human-X-team) and direct control of the sheepdog in shepherding-based swarm guidance applications. Such generalised use cases extend the application of the metrics to enable a broader understanding of human-swarm teaming across multiple applications beyond mustering.
Acknowledgements
The authors wish to acknowledge Adam Hepworth and Daniel Baxter for sharing the foundational code for the simulation we used in this paper and for critical discussions in the shepherding group.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Author Biographies
