
Abstract

In this paper, we use an agent-based model (ABM) to run (counter)mobility scenarios in order to explore which characteristics of intermediate force capabilities (IFC) are relevant to such scenarios and how they can affect outcomes in gray zone conflicts. Using an ABM called Map-Aware Non-Uniform Automata (MANA), developed by the New Zealand Defence Technology Agency, we implemented two scenarios in which the friendly forces’ mobility was limited by large groups of civilians. We then employed data farming and analytics methods to analyze the data and identify the key parameters influencing the outcomes. The main parameters appeared to be the IFC Range, IFC Power (a measure of the duration of the effect), and Crowd Density. Future research could include a wider range of mobility scenarios and possibly a more detailed IFC representation.

Keywords

Intermediate force capabilities, agent-based modeling, (counter)mobility, armed-conflict, permutation feature importance, Shapley additive explanations, Map-Aware Non-Uniform Automata

1. Introduction

In 2021, the NATO Military Commission tasked Allied Command Transformation and Systems Analysis Study (SAS) 151 to develop an intermediate force capability (IFC) concept. SAS 151 conducted a series of distributed table-top wargames to identify key areas where IFC would enhance NATO mission effectiveness. One of the identified areas was mobility-countermobility scenarios, in particular scenarios in which the freedom of movement of military units is impeded by crowds of non-combatants. These wargame results are corroborated by numerous historical examples of such scenarios, ranging from Somalia to Ukraine.¹

IFCs are a set of military capabilities below the threshold of lethal intent. They are meant to provide NATO forces with an expanded range of options between presence and the use of lethal force.² These capabilities would give NATO commanders additional options for managing operations in complex environments, including scenarios involving non-combatants and civilian infrastructure. IFCs span a wide range of capabilities, including cyber, information warfare, and a variety of electromagnetic capabilities. They also include non-lethal directed energy (DE) weapons such as acoustic hailers and laser warning devices, as well as microwave, millimeter-wave, or radio-frequency devices intended to stop, degrade, disable, suppress, or move targets, both people and vehicles.

The need for IFCs arose from NATO adversaries leveraging their awareness of NATO’s conventional lethal capabilities to exploit the space below the lethal threshold in pursuit of their objectives.³ Examples of states that have used these hybrid warfare tactics are Russia, China, and Iran. To close this gap in its deterrence, NATO has been working to develop IFCs that deny its adversaries the freedom to act below the threshold of conventional conflict. Previous studies have used scenarios such as the following to assess IFC effectiveness: a maritime task force’s ability to counter hybrid threats in the gray zone;¹ naval force protection scenarios outside of conventional warfare;⁴ and crowd confrontation and convoy dynamics during an insurgency.⁵ One scenario that has not yet been simulated, however, is one in which the friendly force needs to regain mobility in an environment where the adversary uses civilians to block traffic.

In analyzing the use of IFCs in scenarios where an adversary uses crowds of civilians against friendly forces, this project raises the following questions: Do IFCs contribute to maintaining and regaining mobility? Which scenario parameters have the most significant impact on mission effectiveness when IFCs are used? What are the impacts of IFC Range and mobility against a set of crowd and convoy sizes moving at different speeds? And, most importantly, do IFCs prevent unnecessary civilian and friendly force casualties? (Friendly force refers to individuals on the same side as NATO forces.)

To explore the parametric space for non-lethal DE counter-personnel systems, we used the New Zealand Defence Technology Agency ABM called Map-Aware Non-Uniform Automata (MANA).⁶ In the IFC context, MANA has been used to simulate a wide range of conflict scenarios, including naval vessel self-defence⁷ and advanced combat situations.⁵ MANA simulates possible scenario outcomes based on input parameters; these outcomes are then visualized and analyzed to determine which characteristics of IFC technologies will prove most effective, for example, whether the range or mobility of an IFC technology enhances crowd control abilities. Within the context of this project, MANA is used to develop strategically predictable but operationally unpredictable solutions to scenarios where an adversary uses crowds of civilians to block traffic. Our project consists of (1) generating synthetic data using MANA and (2) analyzing scenario outcomes using model-agnostic feature importance methods to establish which IFC characteristics are most important.

MANA was originally developed to explore intangibles in warfare and was used for data farming under the US Marine Corps data farming project Albert.⁸,⁹ Within the context of this project, data farming is an analytical approach that leverages data generated artificially through parameterized scenarios implemented in ABMs. The concept originated in military applications, and it produces a large amount of data that requires post-processing and analysis.¹⁰,¹¹ Machine learning methods are well suited to this task because they handle large data sets well. However, the predictions of high-performance models are difficult to understand because of over-parameterization and general complexity.¹²,¹³ Model-agnostic methods have been developed to explain or interpret the results of complex machine learning models by providing easily visualized summary statistics.¹⁴,¹⁵ These methods include techniques that determine feature importance, such as SHapley Additive exPlanations (SHAP) and permutation feature importance.¹⁵,¹⁶

2. Scenarios

To assess the IFCs’ effectiveness, we developed two scenarios involving tactical movement through an urban area. Each is designed to demonstrate the use of IFCs to preserve freedom of movement (tactical mobility) through unarmed civilian crowds used by adversaries to block NATO convoys. The first scenario served as a proof of concept; in this scenario, a NATO convoy (BLUE) moved along a relatively straight line and was stopped by civilians (RED) (Figure 1, Scenario 1: Road). The second scenario involved a NATO convoy/quick response force (BLUE) that moved across a populated market to support troops in contact with armed insurgents (RED) and was stopped by a crowd of civilians (YELLOW) (Figure 2, Scenario 2: Market).

Figure 1.

Scenario 1: “Road.” BLUE forces at the bottom of the map attempt to reach the blue flag at the top of the map. RED adversaries attempt to block tank mobility.

Figure 2.

Scenario 2: “Market.” BLUE forces in the upper-right-hand corner of the map navigate civilians (YELLOW) and physical barriers (market in dark brown) to reach comrades (BLUE) at the exit in the bottom-left-hand corner.

In both cases, BLUE needs to maintain mobility without causing any civilian casualties. In the second scenario, BLUE also needs to move fast enough to suppress the insurgents and minimize BLUE casualties. BLUE uses IFCs to facilitate the passage. Each simulation ends when either (1) the maximum time allowance has elapsed or (2) BLUE reaches the end objective. Because this experiment tested the ability of IFCs to regain mobility, a run lasting longer than 10,000 steps was viewed as unsuccessful. The second stop condition was met when the BLUE team successfully navigated the crowd using IFC technology and reached either the BLUE flag (Scenario 1) or the exit of the market in the lower-left corner (Scenario 2).
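The two stop conditions can be illustrated with a minimal sketch. It is not MANA code: `step_fn` is a hypothetical callback standing in for one simulation step, and treating arrival at exactly step 10,000 as a success is an assumption made here.

```python
from typing import Callable, Tuple

MAX_STEPS = 10_000  # time allowance stated in the text


def run_until_stop(step_fn: Callable[[int], bool],
                   max_steps: int = MAX_STEPS) -> Tuple[bool, int]:
    """Advance a simulation until BLUE reaches the objective or time runs out.

    step_fn(t) is a hypothetical callback that advances the model by one step
    and returns True once BLUE has reached its end objective.
    Returns (reached_goal, steps_taken).
    """
    for t in range(1, max_steps + 1):
        if step_fn(t):
            return True, t      # stop condition (2): objective reached
    return False, max_steps     # stop condition (1): time allowance elapsed


# Toy stand-ins for a simulation: one convoy arrives at step 4200, one never does.
print(run_until_stop(lambda t: t >= 4200))  # (True, 4200)
print(run_until_stop(lambda t: False))      # (False, 10000)
```

The returned pair corresponds to a binary success flag and a step count for a single run.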

3. Methods

To determine which IFC parameters most heavily influenced the simulation outcome, MANA outputs were generated for each combination of parameters. Many other features could have been included, such as ammunition levels, firing rates, armor penetration characteristics, agents’ influence over one another, and agents’ vision; however, our initial list was chosen based on the parameters’ prevalence in crowd confrontation models. Table 1 lists the parameters selected for each scenario to help determine the usefulness and limitations of IFCs in crowd confrontation operations (CCOs):

IFC Range describes the maximum distance the directed energy weapon can be used on a target.

IFC Power describes how effective the weapon is at deterring a target. The integer value is how many time steps a target will then run away before re-engaging with BLUE forces.

Number of Tanks in Convoy is the size of BLUE convoy (friendly forces) trying to reach the end target. These tanks are equipped with IFCs.

Crowd Size describes the number of civilians represented by the simulation.

Table 1.

Scenario parameters.

Parameter	Range	Increment
IFC Range	10–200	10
IFC Power	0–100	10
Number of Tanks in Convoy	3–6	1
Crowd Size	10–200	10

IFC: intermediate force capabilities.
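As an illustration of how the ranges and increments in Table 1 translate into design points, the following sketch enumerates a full-factorial grid. The dictionary keys and the helper name are hypothetical, and the study's actual experimental design (sampling and replication per design point) is not described here, so the enumeration is illustrative only.

```python
from itertools import product

# Ranges and increments copied from Table 1 (upper bounds inclusive).
PARAMETERS = {
    "ifc_range": range(10, 201, 10),
    "ifc_power": range(0, 101, 10),
    "tanks_in_convoy": range(3, 7, 1),
    "crowd_size": range(10, 201, 10),
}


def build_design_points(parameters):
    """Return every combination of parameter values as a list of dicts."""
    names = list(parameters)
    return [dict(zip(names, values)) for values in product(*parameters.values())]


design_points = build_design_points(PARAMETERS)
print(len(design_points))  # 20 * 11 * 4 * 20 = 17600
print(design_points[0])
```

Each dictionary would then be passed to the simulation as one parameter setting.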

3.1. Data farming

MANA was used to data farm 12,000 simulation outputs describing two measures of effectiveness (MOEs). The first MOE is binary: whether the friendly force (BLUE) agents reached their target location. The second MOE is the number of steps the friendly agents took to reach this target. These MOEs were selected to quantify successful IFC intervention in regaining and maintaining convoy mobility.

3.2. Model training and testing

Previous research predicting MOEs of simulated military scenarios and exploring the feature importance of IFC parameters has shown promising results using the decision forest machine learning algorithm.¹¹ More recently, however, XGBoost, a tree boosting method, has become widely used by data scientists to achieve high classification and regression performance.¹⁸ In this study, XGBoost classifiers and regressors were used to predict the previously listed MOEs. Data analysis was implemented using open-source Python libraries, including scikit-learn, XGBoost, NumPy, and pandas. Random search hyperparameter tuning with fivefold cross-validation was applied to determine the hyperparameters that resulted in the best performance.
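The tuning procedure can be sketched in a few lines of standard-library Python. This is an illustration of random search scored by fivefold cross-validation, not the study's pipeline: a small k-nearest-neighbour classifier with a single hyperparameter stands in for XGBoost, the data are synthetic rather than MANA output, and all function names are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0


def cv_accuracy(X, y, k_neighbors, folds):
    """Mean accuracy over the folds, each fold serving once as the validation set."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        correct = sum(knn_predict(tX, ty, X[i], k_neighbors) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return mean(scores)


def random_search(X, y, candidates, n_iter, rng, n_folds=5):
    """Sample n_iter hyperparameter values at random; keep the best by CV accuracy."""
    folds = k_fold_indices(len(X), n_folds, rng)
    best = None
    for k_neighbors in rng.sample(candidates, n_iter):
        score = cv_accuracy(X, y, k_neighbors, folds)
        if best is None or score > best[1]:
            best = (k_neighbors, score)
    return best


rng = random.Random(0)
# Synthetic stand-in data: (crowd size, IFC Power) -> success flag. Not MANA output.
X = [(rng.randrange(10, 201, 10), rng.randrange(0, 101, 10)) for _ in range(200)]
y = [1 if 2 * power - crowd + rng.gauss(0, 20) > 0 else 0 for crowd, power in X]
best_k, best_score = random_search(X, y, candidates=list(range(1, 30, 2)), n_iter=5, rng=rng)
print(best_k, round(best_score, 3))
```

Random search evaluates only a sample of the candidate settings, which keeps the cost manageable when a model has many hyperparameters.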

For each scenario and MOE, a model with the hyperparameters determined through tuning was trained and tested using a fivefold cross-validation approach. In each fold, in addition to training the model on the training set and testing it on the test set, the importance of each feature was determined using two methods, SHAP and permutation feature importance, both applied to the test set. Analysis and visualizations were produced using the open-source shap Python library. The SHAP scores indicate the contribution of a feature to the prediction, whereas the permutation feature importance scores indicate how much the feature matters to accurately predicting the true value.
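Permutation feature importance can be illustrated with a short standard-library sketch: one feature column is shuffled at a time and the resulting drop in accuracy is recorded. The `toy_model` function is a hypothetical stand-in for a fitted classifier, the data are synthetic, and the helper names are assumptions rather than the study's code.

```python
import random
from statistics import mean


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return mean(1.0 if model(x) == t else 0.0 for x, t in zip(X, y))


def permutation_importance(model, X, y, n_repeats, rng):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [x[j] for x in X]
            rng.shuffle(column)
            # Rebuild each row with feature j replaced by a shuffled value.
            X_perm = [x[:j] + (v,) + x[j + 1:] for x, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(mean(drops))
    return importances


# Hypothetical stand-in for a fitted classifier: success when IFC Power
# outweighs crowd size; convoy size (third feature) is ignored entirely.
def toy_model(x):
    crowd, power, tanks = x
    return 1 if 2 * power - crowd > 0 else 0


rng = random.Random(0)
X = [(rng.randrange(10, 201, 10), rng.randrange(0, 101, 10), rng.randrange(3, 7))
     for _ in range(300)]
y = [toy_model(x) for x in X]  # labels the toy model predicts perfectly
scores = permutation_importance(toy_model, X, y, n_repeats=10, rng=rng)
for name, s in zip(["crowd_size", "ifc_power", "tanks_in_convoy"], scores):
    print(f"{name}: {s:.3f}")
```

A feature the model never uses, such as the convoy size in this toy example, receives an importance of exactly zero because shuffling it leaves every prediction unchanged.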

4. Results

4.1. Model performance

Table 2 shows the performance of each tuned model under cross-validation. In Scenario 2, additional MOEs included BLUE and RED casualties from a firefight in the lower-left-hand corner of the map, with the goal of minimizing BLUE casualties.

Table 2.

Model performance for both scenarios and applicable MOEs.

MOE	Performance metric	Scenario 1: Road	Scenario 2: Market
BLUE reach goal	Accuracy	0.865 ± 0.005	0.757 ± 0.335
Steps	Mean absolute error	896 ± 6.87	686 ± 91.0
BLUE casualties	Mean absolute error	N/A	1.46 ± 0.398
RED casualties	Mean absolute error	N/A	20.5 ± 6.23

MOE: measure of effectiveness.
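For reference, the two metrics in Table 2 and a "mean ± spread" summary can be computed as in the sketch below. It assumes that the ± values are standard deviations across the five cross-validation folds, which the text does not state; the per-fold numbers are made up for illustration and are not the study's results.

```python
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize(fold_scores):
    """Format per-fold scores as 'mean ± sample standard deviation'."""
    return f"{mean(fold_scores):.3f} ± {stdev(fold_scores):.3f}"


# Made-up per-fold values for illustration only; not the study's results.
fold_accuracies = [0.80, 0.85, 0.90, 0.85, 0.85]
print(summarize(fold_accuracies))
print(mean_absolute_error([4000, 5200, 10000], [4300, 5000, 9100]))  # (300 + 200 + 900) / 3
```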

4.2. Scenario 1: BLUE reach goal

Figure 3 shows the SHAP feature importance for whether BLUE reaches its goal. The results of the two feature importance methods were generally consistent, apart from some variation in the ranking of the parameters. In Figure 4, the crowd size is the most important parameter, with IFC Power a close second. Given the dominance of crowd size in this MOE, we recomputed the SHAP values for the minimum (10), medium (100), and maximum (200) crowd sizes to gauge its effect. For the minimum and medium crowd sizes, IFC Power (how long, and therefore how far, individual crowd members were pushed away) was the most important parameter in determining whether BLUE reached its goal, whereas for the maximum crowd size, IFC Range was the most important. This is somewhat intuitive. For smaller crowds, the power (i.e., the impact on individuals) was dominant. For larger crowds (greater crowd density), however, affected individuals would be replaced by other crowd members. Therefore, the range (i.e., the ability to push all the crowd members further, albeit for a shorter time) was more important.

Figure 3.

Scenario 1: Mean SHAP scores for the measure of effectiveness: BLUE reach goal.

Figure 4.

Scenario 1: SHAP beeswarm plot for the measure of effectiveness: BLUE reach goal.

Figure 4 shows a SHAP summary plot, which represents the range and distribution of the parameter impacts on the model output. Each dot represents the SHAP value of one feature for one prediction, with blue representing low and red representing high feature values. Positive SHAP values indicate positive contributions to the prediction, while negative SHAP values indicate negative contributions. Figure 4 shows that a feature’s value (the color of each point) is highly correlated with its impact on the model output. For example, as crowd size decreases (shown in blue), the likelihood of BLUE reaching its goal increases (a positive SHAP value).
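The sign convention of SHAP values can be made concrete with a brute-force computation of exact Shapley values for a toy scoring function. This is a sketch of the underlying definition, not of the estimators in the shap library: `toy_score` and its coefficients are hypothetical, and a single reference run is assumed as the baseline for features left out of a coalition.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a single baseline input.

    Features outside a coalition S take their baseline value; features in S
    take their value from x. Cost grows as 2**n, so this only suits small n.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                total += weight * (value(set(S) | {i}) - value(set(S)))
        phi.append(total)
    return phi


# Hypothetical score: larger crowds hurt, IFC Power and Range help, and
# Range matters more when the crowd is large (an interaction term).
def toy_score(v):
    crowd, power, ifc_range, tanks = v
    return -0.02 * crowd + 0.03 * power + 0.0001 * crowd * ifc_range + 0.1 * tanks


x = [200, 50, 150, 4]         # run being explained
baseline = [100, 20, 100, 4]  # reference run
phi = shapley_values(toy_score, x, baseline)
print([round(p, 3) for p in phi])
# The contributions sum to the difference between the two scores (efficiency property).
print(round(sum(phi), 3), round(toy_score(x) - toy_score(baseline), 3))
```

In this toy case the larger crowd receives a negative value (it lowers the score relative to the reference run), while the higher IFC Power and IFC Range receive positive values.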

4.3. Scenario 1: steps

Figure 5 shows the SHAP feature importance for the number of steps it takes BLUE to reach its goal. As in Figure 4, crowd size is the most important parameter, with IFC Power some distance behind. Given the dominance of crowd size in this MOE, we recomputed the SHAP values for the minimum (10), medium (100), and maximum (200) crowd sizes to gauge its effect. As with the BLUE reach goal MOE, IFC Power was the most important parameter for the minimum and medium crowd sizes, whereas IFC Range was the most important for the maximum crowd size. This is again consistent with the ability of the systems either to affect individuals for a longer time or to clear the area further. Figure 6 shows a smooth gradation in color, indicating a smooth increase in the model’s output.

Figure 5.

Scenario 1: Mean SHAP scores for the measure of effectiveness: steps.

Figure 6.

Scenario 1: SHAP beeswarm plot for the measure of effectiveness: steps.

4.4. Scenario 2: BLUE reach goal

Figure 7 shows the SHAP feature importance for whether BLUE reaches its goal in Scenario 2. IFC Range is the most important parameter. This finding is most likely because the crowd is distributed over a large area, so an increase in crowd size was less important: only a small part of the crowd interacted with the BLUE force at any given time. The IFC parameters therefore exerted greater influence. Figure 8 shows a smooth, yet tapered and broken, gradation in color, indicating a mixed gradation in the model’s output.

Figure 7.

Scenario 2: Mean SHAP scores for the measure of effectiveness: BLUE reach goal.

Figure 8.

Scenario 2: SHAP beeswarm plot for the measure of effectiveness: BLUE reaches goal.

4.5. Scenario 2: steps

Figure 9 shows the SHAP feature importance for the number of steps it takes BLUE to reach its goal. The methods were generally consistent in the ranking of the parameters. IFC Power is the most important parameter, with crowd size having the second-largest impact. This suggests that while crowd size had a limited impact on the ability of BLUE to reach its objective, it did affect how quickly BLUE was able to do so. Figure 10 shows a clustered gradation in color, indicating a clustered gradation in the model’s output.

Figure 9.

Scenario 2: Mean SHAP scores for the measure of effectiveness: steps.

Figure 10.

Scenario 2: SHAP beeswarm plot for the measure of effectiveness: steps.

4.6. Scenario 2: BLUE and RED squad casualties

Since Scenario 2 involved an armed component (insurgents), we also looked at the number of squad casualties (BLUE and RED) and analyzed their relationship with the scenario parameters, investigating how the parameters would affect the arrival of the quick reaction force and, in turn, the rate at which the insurgents are eliminated. Figure 11 shows the SHAP feature importance of the most important parameters in predicting BLUE and RED casualties. The methods were generally consistent in the ranking of the parameters, except for the SHAP results in predicting BLUE casualties. In Figure 11, IFC Power is the most important feature in predicting BLUE and RED casualties, closely followed by IFC Range and Crowd Size. Figures 12 and 13 show a smooth gradation in color, indicating a smooth gradation in the model’s output.

Figure 11.

Scenario 2: Mean SHAP scores for the measure of effectiveness: (a) BLUE and (b) RED casualties.

Figure 12.

Scenario 2: SHAP beeswarm plot for the measure of effectiveness: BLUE casualties.

Figure 13.

Scenario 2: SHAP beeswarm plot for the measure of effectiveness: RED casualties.

5. Discussion

Pair plots in Figures 14 and 15 visualize the relationships between any two variables and the specified MOEs. Each dot represents a single scenario run: green dots represent a successful run (BLUE reaches its goal) and red dots an unsuccessful run (BLUE does not reach its goal within 10,000 steps). The distribution of each variable is shown along the main diagonal (upper left to lower right) of the matrix. In Figure 14, the plots describe how the parameters interact in Scenario 1. As expected, as the crowd size increases, BLUE is less likely to reach its goal, and even if it does, it takes longer. Once the crowd size exceeds 150 civilians, the BLUE convoy speed is reduced significantly, and the simulation requires at least 4000 steps to complete (steps are a measure of scenario time; a run is deemed unsuccessful at 10,000 steps). Figure 14 also suggests that with larger crowds (>150), a greater IFC Range (>100) increases the likelihood of scenario success; that is, it mitigates the effects of the crowd. Interestingly, while an increase in IFC Power or IFC Range does increase the likelihood of the BLUE convoy reaching its goal, it does not significantly reduce the number of steps taken. As shown in the Number in Convoy histogram in Figure 14, increasing the convoy size (and consequently the number of IFCs) slightly increased simulation success. The findings conclusively show that the presence of IFCs improved the ability of the BLUE force to reach the objective (and reach it faster).

Figure 14.

Scenario 1: SHAP pair plot for the measure of effectiveness: BLUE reach goal.

Figure 15.

Scenario 2: SHAP pair plot for the measure of effectiveness: BLUE reach goal.

In Figure 15, which describes Scenario 2, the additional MOEs (BLUE and RED casualties) were considered as well. For instance, an increase in IFC Range led to a slight decrease in the number of steps, as the BLUE convoy was able to navigate to its target faster; as a result, RED casualties increased because BLUE tanks were able to join the firefight quickly. Crowd size had very little influence in Scenario 2, as depicted by the uniform distribution of success in the IFC Power versus Crowd Size and IFC Range versus Crowd Size plots in Figure 15. An increase in IFC Power did, however, seem to shorten the scenario time (fewer steps taken). This demonstrates that in a more complex spatial layout with lower crowd density, more powerful IFCs increase the likelihood of success.

The results conclusively showed that IFCs improved the ability of the BLUE force to achieve its objective (and to maintain freedom of movement). Increasing the number of available IFCs improved mission success. Two other important characteristics were the range and the duration of the effect on individuals. The range was more important as the crowd density increased, while the effects on individuals were more important for smaller, sparser crowds.

While we simulated the presence of crowds using MANA, we did not factor in any differences in crowd demographics such as the age or gender of the individuals. While it was possible to factor some of these considerations into scenarios within MANA, doing so was beyond the scope of our study and would have significantly expanded the parameter search space, drastically increasing the computational cost. Due to the abstraction intrinsic to agent-based models, we assumed that the effectiveness of the IFCs would be comparable for all demographics. In addition, we did not model the operational and strategic impacts of the IFCs. For instance, would the use of IFCs create greater friction and distrust between NATO forces and local populations? Such analysis was beyond the scope of the present study.

6. Conclusion

We used data farming employing an agent-based model called MANA and machine learning methods to assess the mission effectiveness of IFC technologies in two mobility/counter-mobility scenarios. In Scenario 1, the crowd size, or more likely crowd density, had the greatest influence on the simulation outcomes. The IFC Power, as a measure of the strength of the IFC effects on individuals, was the second most important feature for lower density crowds. As crowd size increased, the IFC Range became a more dominant influence. This was likely due to the easy replacement of the individuals as the crowd became denser. However, in Scenario 2, IFC Power had the greatest influence on the simulation outcomes. Unlike in Scenario 1, increasing crowd size did not increase the importance of the IFC Range, possibly due to the limited distances between buildings. In summary, the results of our study show the value of the IFCs to achieve mobility. Increased numbers of IFCs improved the effectiveness, as did the IFC Range. For lower density crowds or where the range is limited, the duration of the effect on individuals becomes important.

Footnotes

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Jessica Afara

Author biographies

Jessica Afara is an employee of the Canadian Federal Government. She graduated with a Masters in International Affairs with specializations in Data Science, Foreign Policy and Diplomacy from the Norman Paterson School of International Affairs at Carleton University. She also received a bachelor’s degree in Political Science with a concentration in Western Thought and Civilization from King’s at Western University. Her research areas include gauging the impacts of large-scale mines on local perspectives toward foreign countries, the use of non-lethal weapons in defense strategies, and so on. The views expressed in this paper are the author’s alone, and do not reflect the policies or views of the Government of Canada.

Victoria Ajila is a second-year Master of Applied Science student in Electrical and Computer Engineering with a specialization in Data Science at Carleton University, Department of Systems and Computer Engineering. She has received a bachelor’s degree in Biomedical and Electrical Engineering from Carleton University. Her research areas include biomedical informatics specifically genomics and proteomics in addition to broad data science and machine learning topics with applications in defense.

Hannah Macdonell is a second-year Masters student at Carleton University studying Geography with a specialization in Data Science. Her research is primarily focused on statistical approaches to model evaluation in permafrost regions. She graduated from St. Francis Xavier University in 2021 with a Bachelor of Arts and Science in Climate and Environmental Sciences, and a Bachelor of Sciences in Computer Science.

Peter Dobias is a Section Head for the Land and Operational Commands section of Defense Research & Development Canada, Center for Operational Research and Analysis (DRDC CORA), Ottawa, Canada. He is responsible for operational research and strategic analysis support provided by five teams embedded with the Canadian Armed Forces’ operational commands and the Canadian Army. Previously, he led several teams in DRDC CORA and at US Central Command. His research background includes analysis of complex adaptive and self-organized systems, deterrence and threat assessment, wargaming and constructive simulations, and strategic and operational mission assessment.

References

1. Dobias, Christensen. Wargaming the use of intermediate force capabilities in the gray zone. J Defense Model Simulat 2021. https://cradpdf.drdc-rddc.gc.ca/PDFS/unc372/p813670_A1b.pdf

2. Berger. Intermediate force capabilities: bridging the gap between presence and lethality. U.S. Department of Defense. https://mca-marines.org/wp-content/uploads/DoD-NLW-EA-Planning-Guidance-March-2020.pdf (2020, accessed 4 April 2022)

3. Dobias, Christensen. The “Grey Zone” and hybrid activities. Ottawa, ON, Canada: Defence Research and Development Canada, Centre for Operational Research and Analysis.

4. Dobias, Eisler. Modeling a naval force protection scenario in MANA. Ottawa, ON, Canada: Centre for Operational Research and Analysis Defence Research and Development Canada, 2017.

5. Dobias, Wanliss. Interdisciplinary applications of statistical physics to modeling decisions in combat situations. Int J Arts Sci 2014; 7: 271.

6. McIntosh. MANA (Map Aware Non-Uniform Automata) version 4.0 user manual. Auckland: New Zealand Defence Technology Agency, 2007.

7. Dobias, Eisler, Liu. Use of agent-based models in support of coastal surveillance planning and assessment. Defence Research & Development Canada, Centre for Operational Research and Analysis, 2016, https://cradpdf.drdc-rddc.gc.ca/PDFS/unc219/p803348_A1b.pdf

8. Moffat, Smith, Witty. Emergent behaviour: theory and experimentation using the MANA model. J Appl Math Decision Sci 2006; 3, https://downloads.hindawi.com/archive/2006/054846.pdf

9. Forsyth, Horne, Upton. Marine corps applications of data farming. In: Proceedings of the winter simulation conference, Orlando, FL, 4 December 2005.

10. Horne, Meyer. Data farming: discovering surprise. In: Proceedings of the 2004 winter simulation conference, Orlando, FL, 4 December 2004.

11. Serre, Amoyt-Bourgeois, Astles. Use of shapley additive explanations in interpreting agent-based simulations of military operational scenarios. In: Annual modeling and simulation conference, Fairfax, VA, 19–22 July 2021.

12. Belle, Papantonis. Principles and practice of explainable machine learning. Front Big Data 2021; 4: 688969.

13. Xuhong. Interpretable deep learning: interpretation, interpretability, trustworthiness, and beyond. Knowl Inf Syst 2022; 64: 3197–3234.

14. Ribeiro, Sameer, Guestrin. Model-agnostic interpretability of machine learning, 2016, https://arxiv.org/pdf/1606.05386.pdf?source=post_page

15. Carvalho, Pereira, Cardoso. Machine learning interpretability: a survey on methods and metrics. Electronics 2019; 8: 832.

16. Lundberg, Lee. A unified approach to interpreting model predictions, 2017, https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf

17. Breiman. Random forests. Mach Learn 2001; 45: 5–32.

18. Chen, Guestrin. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 2016, pp. 785–794, https://www.kdd.org/kdd2016/papers/files/rfp0697-chenAemb.pdf

Use of agent-based modeling to model intermediate force capabilities in (counter)mobility crowd scenarios
