Abstract
In this paper, we use an agent-based model (ABM) to run mobility and countermobility scenarios in order to explore which characteristics of intermediate force capabilities (IFC) are relevant to such scenarios and how they can affect outcomes in gray zone conflicts. Using an ABM called Map-Aware Non-Uniform Automata (MANA), developed by the New Zealand Defence Technology Agency, we implemented two scenarios in which the friendly forces’ mobility was limited by large groups of civilians. We then employed data farming and analytics methods to analyze the data and identify the key parameters influencing the outcomes. The main parameters appeared to be the IFC Range, IFC Power (a measure of the duration of the effect), and Crowd Density. Future research could include a wider range of mobility scenarios and possibly a more detailed IFC representation.
1. Introduction
In 2021, NATO Military Commission tasked Allied Command Transformation and Systems Analysis Study (SAS) 151 to develop an intermediate force capability (IFC) concept. SAS 151 conducted a series of distributed table-top wargames to identify key areas where IFC would enhance NATO mission effectiveness. One of the identified areas was mobility-countermobility scenarios, in particular scenarios in which the freedom of movement of military units is impeded by crowds of non-combatants. These wargame results are corroborated by numerous historical examples of such scenarios, ranging from Somalia to Ukraine. 1
IFCs are a set of military capabilities below the threshold of lethal intent. They are meant to provide NATO forces with an expanded range of options between presence and the use of lethal force. 2 These capabilities would give NATO commanders additional options to manage operations in complex environments, including scenarios involving non-combatants and civilian infrastructure. IFCs span a wide range of capabilities, including cyber, information warfare, and a variety of electromagnetic capabilities. They include non-lethal directed energy (DE) weapons such as acoustic hailers and laser warning devices, as well as microwave, millimeter-wave, or radio-frequency devices intended to stop, degrade, disable, suppress, or move targets, both people and vehicles.
The need for IFCs arose from NATO adversaries leveraging their awareness of NATO’s conventional lethal capabilities to use the space below the lethal threshold to further their objectives. 3 Examples of states that have used these hybrid warfare tactics are Russia, China, and Iran. To close this gap in its deterrence, NATO has been working to deny its adversaries the ability to act freely below the level of conventional conflict through the development of IFCs. Previous articles have simulated scenarios such as the following to determine IFC effectiveness: a maritime task force’s ability to counter hybrid threats in the gray zone; 1 naval force protection scenarios outside of conventional warfare; 4 and crowd confrontation and convoy dynamics during an insurgency. 5 One scenario that has not yet been simulated, however, is one in which the friendly force needs to regain mobility in an environment where the adversary uses civilians to block traffic.
In analyzing the use of IFCs in scenarios where an adversary uses crowds of civilians against friendly forces, this project raises the following questions: Do IFCs contribute to maintaining and regaining mobility? Which scenario parameters have the most significant impact on mission effectiveness when using IFCs? What are the impacts of IFC Range and mobility against a set of crowd and convoy sizes moving at different speeds? And most importantly, do IFCs prevent unnecessary civilian and friendly force casualties? (Friendly force refers to individuals on the same side as NATO forces.)
In order to explore the parametric space for non-lethal DE counter-personnel systems, we used the New Zealand Defence Technology Agency ABM called Map-Aware Non-Uniform Automata (MANA). 6 In the IFC context, MANA has been used to simulate a wide range of conflict scenarios, including naval vessel self-defence 7 and advanced combat situations. 5 MANA simulates possible scenario outcomes based on input parameters; these outcomes are then visualized and analyzed to determine which characteristics of IFC technologies will prove most effective, for example, whether the range or mobility of an IFC technology enhances crowd control abilities. Within the context of this project, MANA is used to develop strategically predictable but operationally unpredictable solutions to scenarios where an adversary uses crowds of civilians to block traffic. Our project consists of (1) generating synthetic data using MANA and (2) analyzing scenario outcomes using model-agnostic feature importance methods to establish which IFC characteristics are most important.
MANA was originally developed to explore intangibles in warfare and was used for data farming under the US Marine Corps data farming project Albert.8,9 Within the context of this project, data farming is an analytical approach that leverages artificially generated data through parameterized scenarios implemented in ABMs. The concept originated in the field of military applications, and it produces a large amount of data that requires post-processing and analysis.10,11 Machine learning methods are applicable to this task, as they handle large data sets well. However, it is challenging to understand the predictions of high-performance models due to over-parameterization and general complexity.12,13 Model-agnostic methods have been developed to explain or interpret the results of complex machine learning models by providing easily visualized summary statistics.14,15 These methods include techniques that determine feature importance, such as SHapley Additive exPlanations (SHAP) and permutation feature importance.15,16
2. Scenarios
To assess the effectiveness of IFCs, we developed two scenarios involving tactical movement through an urban area. Each is designed to demonstrate the use of IFCs to preserve freedom of movement (tactical mobility) through unarmed civilian crowds used by adversaries to block NATO convoys. The first scenario served as a proof of concept; in this scenario, a NATO convoy (BLUE) moved along a relatively straight line and was stopped by civilians (RED) (Figure 1, Scenario 1: Road). The second scenario involved a NATO convoy/quick response force (BLUE) moving across a populated market to support troops in contact with armed insurgents (RED) and being stopped by a crowd of civilians (YELLOW) (Figure 2, Scenario 2: Market).

Figure 1. Scenario 1: “Road.” BLUE forces at the bottom of the map attempt to reach the blue flag at the top of the map. RED adversaries attempt to block tank mobility.

Figure 2. Scenario 2: “Market.” BLUE forces in the upper-right-hand corner of the map navigate civilians (YELLOW) and physical barriers (market in dark brown) to reach comrades (BLUE) at the exit in the bottom-left-hand corner.
In both cases, BLUE needs to maintain mobility without any civilian casualties. In the second scenario, BLUE also needs to move fast enough to suppress insurgents and minimize BLUE casualties. BLUE uses IFCs to facilitate the passage. A simulation ends when either (1) the maximum time allowance has elapsed or (2) BLUE reaches the end objective. This experiment tested the ability of IFCs to regain mobility; therefore, a run lasting longer than 10,000 steps was viewed as unsuccessful. The second stop condition was met when the BLUE team successfully navigated the crowd using IFC technology to reach either the BLUE flag (Scenario 1) or the exit of the market in the lower-left corner (Scenario 2).
3. Methods
To determine which IFC parameters most heavily influenced the simulation outcome, MANA outputs were generated for each combination of parameters. Many other features could have been included, such as ammunition levels, firing rates, armor penetration characteristics, agents’ influence over one another, and agents’ vision; however, our initial list was chosen based on the parameters’ prevalence in crowd confrontation models. Table 1 lists the selected parameters used in each scenario to help determine the usefulness and limitations of IFCs in crowd confrontation operations (CCOs):
IFC Range describes the maximum distance at which the directed energy weapon can be used on a target.
IFC Power describes how effective the weapon is at deterring a target. The integer value is the number of time steps for which a target will retreat before re-engaging with BLUE forces.
Number of Tanks in Convoy is the size of BLUE convoy (friendly forces) trying to reach the end target. These tanks are equipped with IFCs.
Crowd Size describes the number of civilians represented by the simulation.
Table 1. Scenario parameters.
IFC: intermediate force capabilities.
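Generating outputs "for each combination of parameters" corresponds to a full-factorial design. The following is a minimal sketch of how such a design could be enumerated, not the authors' MANA setup. The parameter names and most level values are illustrative assumptions; only the crowd sizes 10, 100, and 200 appear in the text.

```python
from itertools import product

# Hypothetical parameter levels. Apart from the crowd sizes 10/100/200
# quoted in the text, these values are placeholders, not the study's settings.
LEVELS = {
    "ifc_range": [25, 50, 100, 150],
    "ifc_power": [10, 50, 100],
    "n_tanks": [1, 2, 4],
    "crowd_size": [10, 100, 200],
}


def full_factorial(levels):
    """Return one dict per design point, covering every combination of levels."""
    names = list(levels)
    combos = product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


design = full_factorial(LEVELS)
print(len(design))  # number of design points: 4 * 3 * 3 * 3
print(design[0])    # first design point
```

Each design point would then be replicated with different random seeds, since an agent-based model is stochastic and a single run per point says little about the outcome distribution.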
3.1. Data farming
MANA was used to data farm 12,000 simulation outputs describing two measures of effectiveness (MOEs). The first MOE is binary: whether the friendly force (BLUE) agents reached their target location. The second MOE is the number of steps taken for friendly agents to reach this target. These MOEs were selected to quantify successful IFC intervention in regaining and maintaining convoy mobility.
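As an illustration of the two MOEs, the sketch below derives them from per-run records. The record fields and the example values are hypothetical and are not MANA output; recording the step limit for unsuccessful runs is also an assumption, since the text does not say how steps were recorded in that case.

```python
MAX_STEPS = 10_000  # step limit after which a run counts as unsuccessful


def compute_moes(final_step, blue_reached_goal):
    """Return (reached_goal, steps) for a single run.

    Assumption: an unsuccessful run is recorded with the step limit.
    """
    reached = bool(blue_reached_goal) and final_step <= MAX_STEPS
    steps = final_step if reached else MAX_STEPS
    return int(reached), steps


# Made-up run records for illustration only.
runs = [
    {"final_step": 3200, "blue_reached_goal": True},
    {"final_step": 10_000, "blue_reached_goal": False},
    {"final_step": 7400, "blue_reached_goal": True},
]

moes = [compute_moes(r["final_step"], r["blue_reached_goal"]) for r in runs]
success_rate = sum(reached for reached, _ in moes) / len(moes)
print(moes, success_rate)
```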
3.2. Model training and testing
Previous research predicting MOEs of simulated military scenarios and exploring the feature importance of IFC parameters has shown promising results using the decision forest machine learning algorithm. 11 More recently, however, XGBoost, a tree boosting method, has become more widely used by data scientists to achieve high classification and regression performance. 18 In this study, classifiers and regressors applying XGBoost were used to predict the previously listed MOEs. Data analysis was implemented using open-source Python libraries including scikit-learn, XGBoost, NumPy, and pandas. Random search hyperparameter tuning with fivefold cross-validation was applied to determine the hyperparameters that resulted in the best performance.
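The tuning procedure can be sketched as follows: sample candidate hyperparameters at random, score each candidate by fivefold cross-validation, and keep the best. To stay dependency-free, this sketch swaps XGBoost for a small k-nearest-neighbour classifier and uses synthetic data; the model, the search space, and all names are assumptions for illustration, not the study's configuration.

```python
import random


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train, x, k):
    """Majority vote among the k nearest training rows (stand-in model)."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > len(nearest))


def cv_accuracy(data, k_neighbors, folds):
    """Mean held-out accuracy over the given folds."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [data[i] for i in range(len(data)) if i not in held]
        correct = sum(
            knn_predict(train, data[i][0], k_neighbors) == data[i][1]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def random_search(data, n_iter=10, n_folds=5, seed=0):
    """Return (best_params, best_score) over randomly sampled candidates."""
    rng = random.Random(seed)
    folds = kfold_indices(len(data), n_folds, rng)
    best = None
    for _ in range(n_iter):
        candidate = {"k_neighbors": rng.choice([1, 3, 5, 7, 9, 11])}
        score = cv_accuracy(data, candidate["k_neighbors"], folds)
        if best is None or score > best[1]:
            best = (candidate, score)
    return best


# Synthetic stand-in data: success is more likely when the (normalized)
# IFC range exceeds the (normalized) crowd size, plus noise.
data_rng = random.Random(1)
data = []
for _ in range(100):
    crowd, ifc_range = data_rng.random(), data_rng.random()
    label = int(ifc_range - crowd + data_rng.gauss(0, 0.1) > 0)
    data.append(((crowd, ifc_range), label))

best_params, best_score = random_search(data)
print(best_params, round(best_score, 3))
```

Reusing the same folds for every candidate keeps the comparison between candidates fair, because each one is scored on identical train/test splits.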
For each scenario and MOE, a model with the hyperparameters determined through tuning was trained and tested using a fivefold cross-validation approach. In each fold, the model was trained on the training set and tested on the test set, and the importance of each feature was determined using two methods, SHAP and permutation feature importance, both applied to the test set. Analysis and visualizations were produced using the open-source shap Python library. The SHAP scores indicate the contribution of each feature to the prediction, whereas the permutation feature importance scores indicate how much the accurate prediction of the true value depends on the feature.
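Permutation feature importance can be illustrated with a short from-scratch sketch: shuffle one feature column of the test set at a time and record how much the score drops. The toy model and the synthetic test set below are assumptions made for illustration; the study applied the method to its tuned XGBoost models.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which the prediction matches the target."""
    return sum(predict(row) == target for row, target in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            X_perm = [row[:j] + (v,) + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model: success if IFC range exceeds crowd size."""
    crowd_size, ifc_range, _n_tanks = row
    return int(ifc_range > crowd_size)


# Synthetic test set: (crowd_size, ifc_range, n_tanks) per row.
data_rng = random.Random(42)
X_test = [
    (data_rng.uniform(10, 200), data_rng.uniform(10, 200), data_rng.randint(1, 4))
    for _ in range(200)
]
y_test = [toy_model(row) for row in X_test]

importances = permutation_importance(toy_model, X_test, y_test)
print(importances)  # order: crowd_size, ifc_range, n_tanks
```

Because the toy model ignores the number of tanks, shuffling that column cannot change any prediction, so its importance is exactly zero, while the two features the model uses receive positive scores.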
4. Results
4.1. Model performance
Table 2 shows the performance of each of the tuned models under cross-validation. In Scenario 2, additional MOEs included BLUE and RED casualties from a firefight in the lower-left-hand corner of the map, with the goal of minimizing BLUE casualties.
Table 2. Model performance for both scenarios and applicable MOEs.
MOE: measure of effectiveness.
4.2. Scenario 1: BLUE reach goal
Figure 3 shows the SHAP feature importance for whether BLUE reaches its goal. The two feature importance methods gave generally consistent results, except for variations in the ranking of the parameters. In Figure 4, the crowd size is the most important parameter, with IFC Power a close second. Given the dominance of crowd size in this measure of effectiveness, we re-ran the SHAP analysis for the minimum (10), medium (100), and maximum (200) crowd size settings to gauge its effect. For the minimum and medium crowd sizes, IFC Power (how long the individual crowd members were kept away) was the most important parameter in determining whether BLUE reached its goal, whereas for the maximum crowd size, IFC Range was the most important. This is somewhat intuitive. For smaller crowds, the power, that is, the impact on the individuals, was dominant. For larger crowds (higher crowd density), however, the affected individuals would be replaced by other crowd members. Therefore, the range (i.e., the ability to push all the crowd members further, albeit for a shorter time) was more important.

Figure 3. Scenario 1: Mean SHAP scores for the measure of effectiveness: BLUE reach goal.

Figure 4. Scenario 1: SHAP beeswarm plot for the measure of effectiveness: BLUE reach goal.
Figure 4 is a SHAP summary plot, which represents the range and distribution of the parameter impacts on model output. Each dot represents the SHAP value of a feature for a single simulation run, with blue representing low and red high feature values. Positive SHAP values indicate positive contributions to the prediction, while negative SHAP values indicate negative contributions. Figure 4 shows that a feature’s value (shown as the color of each point) is highly correlated with its impact on model output. For example, as crowd size decreases (shown in blue), the likelihood of BLUE reaching its goal increases (a positive SHAP value).
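To make the sign convention concrete, the sketch below computes exact Shapley values by brute force for a toy linear model. It is only an illustration: the shap library uses far more efficient algorithms for tree models, and the model, background rows, and explained instance here are hypothetical. A feature's contribution is measured relative to the average prediction over the background rows.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, background):
    """Exact Shapley values of model f at point x, relative to background rows."""
    n = len(x)

    def value(subset):
        # Mean model output when features in `subset` are fixed to x
        # and the remaining features take their background values.
        total = 0.0
        for b in background:
            mixed = tuple(x[i] if i in subset else b[i] for i in range(n))
            total += f(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi


def toy_model(row):
    """Hypothetical success score: falls with crowd size, rises with IFC range."""
    crowd_size, ifc_range, _n_tanks = row
    return 0.9 - 0.003 * crowd_size + 0.002 * ifc_range


# Hypothetical background rows and instance: (crowd_size, ifc_range, n_tanks).
background = [(10, 50, 1), (100, 100, 2), (200, 150, 4)]
x = (10, 150, 2)  # a small crowd and a long IFC range

phi = shapley_values(toy_model, x, background)
print(phi)  # order: crowd_size, ifc_range, n_tanks
```

In this toy case, the small crowd size receives a positive Shapley value because it raises the predicted score above the background average, which mirrors the blue dots on the positive side of the beeswarm plot. The values also sum to the difference between the prediction at `x` and the mean background prediction.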
4.3. Scenario 1: steps
Figure 5 shows the SHAP feature importance for the number of steps it takes for BLUE to reach its goal. As in Figure 4, the crowd size is the most important parameter, with IFC Power some distance behind. Given the dominance of crowd size in this measure of effectiveness, we re-ran the SHAP analysis for the minimum (10), medium (100), and maximum (200) crowd size settings to gauge its effect. Similar to the results for the BLUE reach goal MOE, IFC Power was the most important parameter for the minimum and medium crowd sizes, whereas IFC Range was the most important for the maximum crowd size. This is again consistent with the ability of the systems to affect individuals for a longer time or to clear the area further. Figure 6 shows a smooth gradation in color, indicating a smooth increase in the model’s output.

Figure 5. Scenario 1: Mean SHAP scores for the measure of effectiveness: steps.

Figure 6. Scenario 1: SHAP beeswarm plot for the measure of effectiveness: steps.
4.4. Scenario 2: BLUE reach goal
Figure 7 shows the SHAP feature importance for whether BLUE reaches its goal in Scenario 2. The IFC Range is the most important parameter. This finding is most likely because the crowd is distributed over a large area; an increase in the crowd size was therefore less important, since only a small part of the crowd was interacting with the BLUE force at any given time. As a result, the IFC parameters had greater influence. Figure 8 shows a smooth, yet tapered and broken, gradation in color, indicating a mixed gradation in the model’s output.

Figure 7. Scenario 2: Mean SHAP scores for the measure of effectiveness: BLUE reach goal.

Figure 8. Scenario 2: SHAP beeswarm plot for the measure of effectiveness: BLUE reach goal.
4.5. Scenario 2: steps
Figure 9 shows the SHAP feature importance for the number of steps it takes for BLUE to reach its goal. The methods were generally consistent in the ranking of the parameters. IFC Power is the most important parameter, with crowd size having the second largest impact. Figure 10 shows a clustered gradation in color, indicating a clustered gradation in the model’s output. This suggests that while the crowd size had a limited impact on the ability of BLUE to reach its objective, it had an impact on how quickly BLUE was able to do so.

Figure 9. Scenario 2: Mean SHAP scores for the measure of effectiveness: steps.

Figure 10. Scenario 2: SHAP beeswarm plot for the measure of effectiveness: steps.
4.6. Scenario 2: BLUE and RED squad casualties
Since Scenario 2 involved an armed component (insurgents), we also looked at the number of squad casualties (BLUE and RED) and analyzed their relationship with the scenario parameters. We investigated how the parameters would affect the arrival of the quick reaction force and, in turn, the rate at which the insurgents were eliminated. Figure 11 shows the SHAP feature importance of the most important parameters in predicting BLUE and RED casualties. The methods were generally consistent in the ranking of the parameters, except for the SHAP results in predicting BLUE casualties. In Figure 11, IFC Power is the most important feature in predicting BLUE and RED casualties, but it is closely followed by the IFC Range and Crowd Size. Figures 12 and 13 show a smooth gradation in color, indicating a smooth gradation in the model’s output.

Figure 11. Scenario 2: Mean SHAP scores for the measure of effectiveness: (a) BLUE and (b) RED casualties.

Figure 12. Scenario 2: SHAP beeswarm plot for the measure of effectiveness: BLUE casualties.

Figure 13. Scenario 2: SHAP beeswarm plot for the measure of effectiveness: RED casualties.
5. Discussion
The pair plots in Figures 14 and 15 visualize the relationships between any two variables and the specified MOEs. Each dot represents a single simulation run: green dots represent successful runs (BLUE reaches its goal) and red dots unsuccessful runs (BLUE does not reach its goal within 10,000 steps). Distributions for each variable are shown along the main diagonal (upper left to lower right) of the matrix. In Figure 14, the plots describe how parameters interact in Scenario 1. As expected, as the crowd size increases, BLUE is less likely to reach its goal, and even if it does, it takes longer. Once the crowd size exceeds 150 civilians, BLUE convoy speed is reduced significantly, and the simulation requires at least 4000 steps to complete (a run is deemed unsuccessful at 10,000 steps). Figure 14 also suggests that with larger crowds (>150), a greater IFC Range (>100) increases the likelihood of scenario success, that is, it mitigates the effects of the crowd. Interestingly, while an increase in IFC Power or IFC Range does increase the likelihood of the BLUE convoy reaching its goal, it does not significantly reduce the number of steps. As shown in the Number in Convoy histogram in Figure 14, increasing the convoy size (and consequently the number of IFCs) slightly increased simulation success. The findings conclusively show that the presence of IFCs improved the ability of the BLUE force to reach the objective (and reach it faster).
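The kind of conditional comparison read off the pair plots can also be computed directly. The sketch below compares success rates for large crowds (>150) with long (>100) and short IFC ranges, using the thresholds quoted above. The run records are made up for illustration and are not study output; the field names are hypothetical.

```python
def success_rate(runs, condition):
    """Fraction of runs matching `condition` in which BLUE reached its goal.

    Returns None when no run matches, so an empty subset is not read as 0%.
    """
    selected = [r for r in runs if condition(r)]
    if not selected:
        return None
    return sum(r["reached_goal"] for r in selected) / len(selected)


# Made-up run records for illustration only.
runs = [
    {"crowd_size": 180, "ifc_range": 120, "reached_goal": 1},
    {"crowd_size": 190, "ifc_range": 150, "reached_goal": 1},
    {"crowd_size": 170, "ifc_range": 40, "reached_goal": 0},
    {"crowd_size": 200, "ifc_range": 60, "reached_goal": 1},
    {"crowd_size": 50, "ifc_range": 40, "reached_goal": 1},
]

large_crowd_long_range = success_rate(
    runs, lambda r: r["crowd_size"] > 150 and r["ifc_range"] > 100
)
large_crowd_short_range = success_rate(
    runs, lambda r: r["crowd_size"] > 150 and r["ifc_range"] <= 100
)
print(large_crowd_long_range, large_crowd_short_range)
```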

Figure 14. Scenario 1: SHAP pair plot for the measure of effectiveness: BLUE reach goal.

Figure 15. Scenario 2: SHAP pair plot for the measure of effectiveness: BLUE reach goal.
Figure 15 describes Scenario 2, for which the additional MOEs, that is, BLUE and RED casualties, were considered as well. For instance, an increase in IFC Range led to a slight decrease in the number of steps, as the BLUE convoy was able to navigate to its target faster; as a result, it led to an increase in RED casualties, as BLUE tanks were able to join the firefight quickly. Crowd size had very little influence in Scenario 2, as depicted by the uniform distribution of success shown in the IFC Power versus Crowd Size and IFC Range versus Crowd Size plots in Figure 15. An increase in IFC Power did, however, seem to shorten the scenario time (fewer steps taken). This suggests that in a more complex spatial layout with lower crowd density, more powerful IFCs will increase the likelihood of success.
The results conclusively showed that the IFCs improved the ability of the BLUE force to achieve its objective (and to maintain freedom of movement). Increasing the number of available IFCs improved the mission’s success. Two other influential characteristics were the range and the duration of the effect on individuals. The range was more important as the crowd density increased, while the effects on individuals were more important for smaller, sparser crowds.
While we simulated the presence of crowds using MANA, we did not factor in any differences in crowd demographics such as the age or gender of the individuals. Although it was possible to factor some of these considerations into scenarios within MANA, doing so was beyond the scope of our study and would have significantly expanded the parameter search space, drastically increasing the computational cost. Due to the abstraction intrinsic to agent-based models, we assumed that the effectiveness of the IFCs would be comparable for all demographics. In addition, we did not model the operational and strategic impacts of the IFCs. For instance, would the use of IFCs create greater friction and distrust between NATO forces and local populations? Such analysis was beyond the scope of the present study.
6. Conclusion
We used data farming employing an agent-based model called MANA and machine learning methods to assess the mission effectiveness of IFC technologies in two mobility/counter-mobility scenarios. In Scenario 1, the crowd size, or more likely crowd density, had the greatest influence on the simulation outcomes. The IFC Power, as a measure of the strength of the IFC effects on individuals, was the second most important feature for lower density crowds. As crowd size increased, the IFC Range became a more dominant influence. This was likely due to the easy replacement of the individuals as the crowd became denser. However, in Scenario 2, IFC Power had the greatest influence on the simulation outcomes. Unlike in Scenario 1, increasing crowd size did not increase the importance of the IFC Range, possibly due to the limited distances between buildings. In summary, the results of our study show the value of the IFCs to achieve mobility. Increased numbers of IFCs improved the effectiveness, as did the IFC Range. For lower density crowds or where the range is limited, the duration of the effect on individuals becomes important.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
