
Abstract

Modern military decision-making emphasizes the efficient conversion of information to action under uncertainty. Canonical frameworks such as Boyd’s Observe–Orient–Decide–Act (OODA) loop and Endsley’s model of Situational Awareness (SA) describe this process. To supplement these models, a measure called the HIT score is proposed, which functions as a proxy for both OODA and SA. The HIT score is a quantitative, nonintrusive, post hoc metric for information-processing efficiency. It is domain-agnostic and measures how effectively a system compresses environmental uncertainty into context-appropriate action within time constraints or other domain-specific costs. HIT depends on three observable or inferable components: H (Shannon entropy of the environment), I (mutual information between context and response), and T (decision time or cost). Formalisms for computing HIT are presented for single agents and networked settings. Empirical examples, including an iterated prisoner’s dilemma simulation and command and control (C2)-like scenarios, illustrate that HIT distinguishes adaptive strategies from those exhibiting mere random variability. Moreover, the metric is expanded by Petri Net-based approaches that are computationally tractable and practically applicable in both simulations and real-world scenarios. The HIT score provides a minimal quantitative bridge between information theory and operational decision-making in information-rich environments.

Keywords

Information theory, situational awareness, command and control (C2), Petri nets, decision efficiency, OODA loop, cybernetic metrics

1. Introduction

Effective command and control (C2) in dynamic environments requires assessing the degree to which decision-makers process information. Seminal frameworks, such as Boyd’s¹ Observe–Orient–Decide–Act (OODA) loop and Endsley’s² three-level situational awareness (SA) model, qualitatively describe the cycle of observing, orienting to context, deciding, and acting under uncertainty. However, these canonical frameworks have notable shortcomings, namely that they do not provide any non-intrusive quantitative means for assessing decision-making either in near real-time or post hoc. Given two systems, which had a tighter OODA loop or higher SA in a given situation? The HIT score proposed in this paper can act as a proxy metric for such measurements, especially in simulations where ground truth is accessible. The proposed metric is, in essence, about timely compression of environmental entropy into actionable information. Hence, it can also be considered to be a metric of adaptive intelligence. In that regard also, the HIT score has certain advantages in comparison to other metrics. Many existing attempts to measure adaptive intelligence fix a suite of tasks and tally performance,³ rely on detailed internal models of cognition,⁴ or invoke theoretical constructs such as Kolmogorov complexity that are impractical to compute in practice.⁵ Instead of necessitating modeling of internal processes, the HIT score is calculable using observable or inferable variables.

In this paper, the HIT score, named for its three core elements $H$, $I$, and $T$, is introduced as a minimalist, externally observable or inferable metric of information-processing efficiency. (It is not an acronym, but self-referential to the concept’s mathematical formulation. It can be read as “H-Information-per-Time,” or “hit.”) It is a general metric that captures the essence of adaptive decision-making: how much uncertainty an agent faces and how effectively it turns that uncertainty into timely, appropriate actions. The HIT score asks how efficiently an agent absorbs the entropy of its environment and responds with relevant information in its actions, given time or resource constraints. The metric is domain-agnostic, requiring only observations or estimates of the system’s inputs, outputs, and decision latency. By quantifying the reduction of uncertainty achieved per unit time, the HIT score (later also simply “HIT”) enables “apples-to-apples” comparisons of disparate systems (whether human or machine, individual or networked) facing decision-making situations. The formalisms of HIT are derived from first principles in information theory. At its core, HIT combines three measurable components: (1) the entropy of the inputs (a measure or estimate of environmental uncertainty or variability), (2) the mutual information between the system’s context and its outputs (a measure of how well the system’s actions reflect relevant situational information), and (3) the latency or cost incurred in producing those outputs.

Per Endsley’s² classic SA definition: “Situation awareness is the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future.” HIT operationalizes core aspects of Endsley’s SA model through the lens of information theory. Perception, comprehension, and projection are directly and meaningfully reflected in $H (C)$ (the Shannon entropy of the observed context), $I (C; R)$ (the mutual information between context and response), and $I (R; A)$ (the mutual information between response and subsequent environmental state or outcome). Here, $C$ denotes the situational context, $R$ the agent’s response, and $A$ the ensuing state of the environment. The key advantage is that HIT does not require cognitive modeling or introspective self-report instruments. It relies solely on observable or inferable context–action–latency triplets.

Intuitively, a high HIT score would indicate that a system is both informed and efficient—it encounters a wide variety of situations and reliably acts in context-appropriate ways with minimal delay. This formulation captures desirable adaptive behavior in line with OODA and SA. For example, an agent with a fast and contextually precise response cycle will score higher than one that either responds slowly or ignores important aspects of the situation. Conversely, an agent that reacts randomly or rigidly (ignoring context) will have a low score despite achieving possibly high action variability. In short, HIT distinguishes information-driven adaptability from mere activity.

The performance of the HIT score is tested in a sensitivity analysis and validated in game-theoretic and military decision-making-inspired simulated settings. In an iterated prisoner’s dilemma (IPD) tournament,⁶ strategies that condition their actions on the opponent’s behavior (thus leveraging context) achieve higher HIT scores than unconditional or random strategies. A similar pattern is observed in simplified agent-based air-policing models and C2-like simulations: an autonomous agent that dynamically balances new information with timely action outperforms one that either rushes decisions or hesitates excessively. These in silico case studies illustrate HIT’s applicability and its alignment with intuitive notions of decision-making quality.

Contributions:

A novel metric—the HIT score—for adaptive information compression, grounded in Shannon entropy and mutual information with explicit costing, is proposed.

A formal definition of HIT is developed and applied for individual agents and networked systems.

It is demonstrated conceptually that HIT operationalizes key aspects of both OODA and SA and, thus, may be applicable as their post hoc proxy.

Examples showing how the HIT score can empirically distinguish effective strategies in dynamic simulated scenarios are provided.

2. Related work

In terms of situational awareness, Endsley’s model remains the dominant account of what must be known to act effectively in dynamic work domains, particularly in aviation. The model defines SA as the state of knowledge about the situation, organized into Level 1 (perception of elements), Level 2 (comprehension of their meaning), and Level 3 (projection of their future status).² It explicitly distinguishes SA (a state) from decision and action (processes), arguing that SA mediates performance in time-pressured domains such as aviation and air traffic control.

Two major families of SA measures have emerged: probe-based methods, chiefly SAGAT,⁷ and self-report instruments, exemplified by SART⁸ and SPAM.⁹ Probe methods generally better predict performance but are intrusive, whereas self-ratings are lightweight yet noisier.¹⁰ These instruments have also been used as post hoc techniques, as in the application of SPAM to simulated maritime decision-making.¹¹

The main differences among these approaches stem from research design trade-offs: SAGAT can be intrusive because it requires task interruption, while self-report and post hoc measures (such as SART or SPAM) avoid disruption at the cost of higher variance.^10,12 Further work has extended SA to the group level (Team SA) and to automated proxies (e.g. gaze-tracking or task-state estimation approaches), continuing a long-running debate about how to best quantitatively measure SA.^13,14

The proposed HIT score acts as a novel post hoc metric—one especially suited to cybernetic systems, whether human or artificial. Because HIT relies on observable or inferable environmental entropy and employs temporal windowing for stable estimation, it is particularly applicable in simulation environments or after-action analysis. Moreover, in relation to Team SA, HIT’s networked extension yields insight into collective decision-making efficiency by attributing information leverage to adjacency structure. In sum, HIT provides an information-theoretic, nonintrusive proxy for situational awareness, most convenient in computerized settings rich in contextual data. As a potential proxy, it might complement other SA metrics with suitable operationalization—an important subject for future studies.

In parallel, Boyd’s OODA, a model concerning processes rather than states, is treated more in doctrine and practice than in academia. The crux of OODA is the feedback loop of Observe–Orient–Decide–Act, where the orientation phase is considered central to agents’ world-model building.¹ OODA is primarily a military scientific concept, but there is a scholarly line that places Boyd’s model inside cybernetics and control.¹⁵ One notable example relevant for HIT is a C2-centric dynamic OODA model with delays and feedback proposed by Brehmer.¹⁶ While not strictly related to OODA, recent efforts to quantify decision tempo and quality with explicit information-theoretic¹⁷ or network models¹⁸ have been undertaken. On another note, Boyd’s and Endsley’s models have been bridged in information warfare (IW) literature, where a loose mapping between OODA phases and SA levels has been identified.¹⁹

HIT relates to OODA in three principal ways. First, the HIT score’s context–action–latency triplets aptly map to OODA’s core concepts: where OODA is an indefinitely continuing feedback loop, HIT is a measure of ongoing entropy compression and timely reaction. The proposed HIT score thus operationalizes OODA’s phases and accounts for the latency (or cost function) inherent to dynamic decision-making. Second, HIT’s components also align to SA levels, thus supporting the view that OODA and SA can be bridged. Finally, and arguably most importantly, HIT provides a post hoc information-theoretic proxy for OODA-like decision-making performance in high-information contexts, like simulations.

Algorithmic Information Theory (AIT) is a classic proposal for a metric of agency. Solomonoff’s inductive inference and Hutter’s AIXI agent define optimal prediction via Kolmogorov complexity and algorithmic probability.^5,20 While elegant in principle, these notions are in general uncomputable.²¹ HIT retains AIT’s spirit of compression but replaces Kolmogorov complexity with Shannon estimators that require only observable or inferable modes. In a vein similar to AIT, Tononi’s Integrated Information Theory (IIT) gauges consciousness via the irreducible cause–effect power of a network.^22,23 Tononi’s metrics are also notoriously hard to compute for anything beyond trivial networks.^24,25 In contrast, HIT trades philosophical ambition for computational tractability.

The free-energy principle (FEP) as described by Friston models biological agents as Bayesian filters that minimize variational free energy—an information-theoretic upper bound on surprise.^4,26 HIT mirrors the entropy-reduction motif yet abandons the need to specify the agent’s internal generative model, making it potentially applicable when internals are opaque (e.g., proprietary black-box AI). In the realm of behavioral benchmarks, general game-playing and reinforcement-learning (RL) benchmark suites excel at measuring achieved performance. They, however, generally remain silent on how efficiently agents process their inputs.^27,28 HIT could diagnose two agents with identical scores but divergent information economies by contrasting $I (C; R)$ per unit time under matched task performance.

Finally, purely quantitative network-centrality metrics—like degree, betweenness, and eigenvector centralities—capture solely structural importance.^29,30 They overlook semantic relevance and cost.³¹ The HIT score supplies a complementary functional centrality rooted in information flow. In addition, HIT enables exploitation of traditional graph-theoretic metrics in its networked extension. Therefore, the analysis of multiagent environments can employ both current quantitative network metrics and the proposed information-theoretic HIT score.

3. Theoretical foundation

3.1. Single-agent formulation

Consider a single adaptive agent operating in a stochastic environment. Let $X$ be a random variable representing the state of the environment as perceived by the agent (the observations or signals it receives). Let $C$ denote the underlying context or relevant environmental features that dictate what the appropriate response should be—for instance, the true state of the world or adversary’s move that the agent is responding to. The agent produces an output or action denoted by random variable $R$ . Throughout, $X$ denotes observations and $C$ the evaluative context (true or task-relevant state). $H (C)$ is reported when the context distribution is known by construction (e.g., toy models) and $H (X)$ when only observable signals are available—both quantify environmental variety relevant to decision-making.

Only access to the observable pairs $(X, R)$ and, when evaluating, to $C$ (or a defensible proxy) is assumed, without any access to the agent’s internals. Write $X, C, R$ for the supports and $p (x, c, r)$ for the joint distribution with the usual marginals and conditionals. First, the potential information available to the agent is quantified. The Shannon entropy of the observation variable $X$ (base-2 logarithms) is³²

H (X) = - \sum_{x \in X} p (x) \log_{2} p (x),

(1)

which, in bits, quantifies the uncertainty or variety in the situations the agent encounters.³³ Larger $H (X)$ implies a wider range of possibilities, while $H (X) = 0$ indicates completely predictable input.

Next, the contextual information captured in the agent’s response is measured. The conditional entropy of $R$ given $C$ is³²

H (R ∣ C) = - \sum_{c \in C} p (c) \sum_{r \in R} p (r ∣ c) \log_{2} p (r ∣ c),

(2)

the expected residual uncertainty in the response once the context is known. The mutual information between context and response is then³²

I (C; R) = H (R) - H (R ∣ C),

(3)

which measures the expected reduction in uncertainty about $R$ once the context $C$ is known. Equivalently, it measures how much of the context is reflected in the response, and by symmetry $I (C; R) = H (C) - H (C ∣ R)$ .

If actions are perfectly tailored to context (e.g., a unique action for each relevant context state), $I (C; R)$ approaches $H (C)$ , and if actions ignore context (random or fixed), $I (C; R) = 0$ . $I (C; R)$ is interpreted as the agent’s contextual acuity, or the degree to which its behavior is situation-aware.
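As a minimal sketch of how Equations (2) and (3) might be estimated from logged data, the following Python snippet uses simple plug-in (frequency-count) estimators. The function names and the toy data are illustrative assumptions, not the authors' published code, and plug-in estimators are only one of several possible choices.

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def conditional_entropy(responses, contexts):
    """Plug-in H(R | C) in bits: context-weighted entropy of the responses."""
    n = len(contexts)
    by_context = {}
    for c, r in zip(contexts, responses):
        by_context.setdefault(c, []).append(r)
    return sum((len(rs) / n) * entropy(rs) for rs in by_context.values())


def mutual_information(contexts, responses):
    """Plug-in I(C; R) = H(R) - H(R | C) in bits."""
    return entropy(responses) - conditional_entropy(responses, contexts)


# Toy data (hypothetical): four equally likely contexts.
contexts = ["a", "b", "c", "d"] * 25
tailored = [c.upper() for c in contexts]  # one unique action per context
fixed = ["A"] * len(contexts)             # action ignores the context

print(entropy(contexts))                       # H(C)
print(mutual_information(contexts, tailored))  # approaches H(C)
print(mutual_information(contexts, fixed))     # no contextual acuity
```

With perfectly tailored actions the estimate of $I (C; R)$ reaches $H (C)$, and with a fixed action it drops to zero, matching the two limiting cases described above.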

Finally, the cost or latency required for the agent to perceive the context and execute its response is incorporated. Let $T$ denote the average time delay—or more generally, a resource cost—for the mapping from observation to action. In many settings, especially in military contexts, there is an inherent penalty for slowness: delayed decisions can reduce the effectiveness or relevance of the action. Thus, a smaller $T$ is often favorable. $T$ could, for instance, be the mean number of decision cycles or seconds taken to respond to changes in the environment.

To combine these elements into a single efficiency index, they need to be normalized to comparable scales. Normalized entropy, mutual information, and time cost are here denoted as $\bar{H}$, $\bar{I}$, and $\bar{T}$, respectively. $H$ and $I$ are converted to dimensionless fractions by dividing by their maximum possible values (in bits) for the given scenario. For example, if $X$ has $|X|$ equally likely states, $H(X)_{\max} = \log_{2} |X|$. Similarly, the theoretical maximum $I_{\max}$ would be $\min (H (C), H (R))$ (achieved when the agent’s actions perfectly indicate the context). Empirical estimates of $\bar{H} (X)$ or $\bar{I} (C; R)$ could potentially exceed their respective theoretical maxima due to finite-sample noise. In such cases, the estimates must be clamped to their theoretical bounds before computing the normalized scores $\bar{H}$ and $\bar{I}$ to maintain interpretability and avoid distortion.

Moreover, the time cost $T$ is expressed as a ratio $T / T_{0}$ relative to a reference time $T_{0}$ (such as a maximum tolerable delay or some domain-specific standard), so that $\bar{T} = 1$ represents the worst acceptable delay, and values below $1$ indicate faster responses.

The HIT score for a single agent is the product of the normalized entropy and mutual information, penalized by the normalized time:

HIT score = \frac{\bar{H} \bar{I}}{\bar{T}}

(4)

where

\bar{H} = \frac{H (X)}{H_{\max}}, \bar{I} = \frac{I (C; R) I (R; A)}{I_{\max}}, \bar{T} = \frac{T}{T_{0}} .

In essence, $\bar{H}$ and $\bar{I}$ range from $0$ to $1$ , indicating the fraction of potential uncertainty present and utilized, respectively, while $\bar{T} \geq 0$ represents the relative delay (with lower being better). Throughout the experiments presented in Section 4, the reference delay is set to $T_{0} = T_{\max}$ , the worst-case cost observed in the scenario. The notation max rather than $0$ is thus purely contextual and all derivations remain unchanged if $T_{0}$ is chosen differently.
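As a purely illustrative calculation of Equation (4), consider made-up component values (not taken from the experiments): $H (X) = 1.5$ bits against $H_{\max} = 2$ bits, a normalized mutual information of $0.8$, and a mean latency equal to half the reference time.

```latex
\begin{aligned}
\bar{H} &= \frac{1.5}{2} = 0.75, \qquad \bar{I} = 0.8, \qquad \bar{T} = \frac{0.5\,T_{0}}{T_{0}} = 0.5, \\
\text{HIT score} &= \frac{\bar{H}\,\bar{I}}{\bar{T}} = \frac{0.75 \times 0.8}{0.5} = 1.2 .
\end{aligned}
```

Because $\bar{T}$ can fall below $1$, the score is not capped at $1$: an agent that responds faster than the reference delay is rewarded beyond the product $\bar{H} \bar{I}$.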

Intuitively:

More potential variety ( $H$ ) widens the canvas,

More compression ( $I$ ) shows skillful shedding of excess entropy, and

Less delay ( $T$ ) indicates rapid mobilization of that knowledge.

3.1.1. Rudimentary edge-case sanity checks

Deterministic environment: If $H (X) = 0$, then $\bar{H} = 0$, and therefore $\text{HIT score} = 0$ regardless of $I$ or $T$, since no uncertainty exists to compress. For instance, a thermostat regulating a room held at a perfect constant temperature cannot display adaptive intelligence when there is nothing to determine.

Random actor: If the agent’s response is independent of context, $I (C; R) I (R; A) = 0$ and $\text{HIT score} = 0$ even if the environment is rich ($H (X) > 0$).

Omniscient but sluggish brain: Suppose $H \approx I$ (near-perfect prediction) but $T$ is large. As $T \to \infty$, $\text{HIT score} \to 0$. More generally, when $T \gg HI$, the score becomes vanishingly small due to temporal inefficiency.

A high HIT score therefore requires jointly high environmental entropy $H$ and contextual information $I$, coupled with bounded or small latency $T$ such that $T$ does not dominate the $HI$ term. Conversely, the HIT score approaches zero if any of these terms degenerate: when $H \to 0$ or $I \to 0$, or when $T \to \infty$. Hence, an agent in a trivial environment ($H \approx 0$), an agent that ignores context ($I \approx 0$), or an agent that is too slow (large $\bar{T}$) will all yield low HIT scores.
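The edge cases above can be checked numerically with a small sketch of Equation (4). This is a simplified reading under stated assumptions rather than the authors' implementation: it omits the $I (R; A)$ factor (no outcome data is modeled), takes $H_{\max}$ as $\log_{2}$ of the observed support size, takes $I_{\max} = \min (H (C), H (R))$, and clamps the normalized terms to $[0, 1]$. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy in bits."""
    n = len(samples)
    return -sum((k / n) * math.log2(k / n) for k in Counter(samples).values())


def clamp01(value):
    """Clamp a normalized estimate to [0, 1]."""
    return max(0.0, min(1.0, value))


def hit_score(observations, contexts, responses, latencies, t0=None):
    """Simplified single-agent HIT: H_bar * I_bar / T_bar (I(R; A) omitted)."""
    support = len(set(observations))
    h_max = math.log2(support) if support > 1 else 0.0  # assumption: observed support
    h_c, h_r = entropy(contexts), entropy(responses)
    mi = h_c + h_r - entropy(list(zip(contexts, responses)))  # I(C; R)
    i_max = min(h_c, h_r)
    h_bar = clamp01(entropy(observations) / h_max) if h_max > 0 else 0.0
    i_bar = clamp01(mi / i_max) if i_max > 0 else 0.0
    t_ref = max(latencies) if t0 is None else t0  # default: worst observed latency
    t_bar = (sum(latencies) / len(latencies)) / t_ref
    return h_bar * i_bar / t_bar


ctx = [0, 1, 2, 3] * 50   # four equally likely contexts, observed directly
unit = [1.0] * len(ctx)   # every response takes one reference time unit

print(hit_score(ctx, ctx, ctx, unit, t0=1.0))                # informed and timely
print(hit_score([0] * 200, ctx, ctx, unit, t0=1.0))          # deterministic environment
print(hit_score(ctx, ctx, [0] * 200, unit, t0=1.0))          # context-blind actor
print(hit_score(ctx, ctx, ctx, [1000.0] * 200, t0=1.0))      # omniscient but sluggish
```

Passing an explicit `t0` keeps the reference delay fixed across agents, so that the sluggish case is penalized relative to the same standard as the timely one.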

It is worth noting that the HIT score reflects efficiency rather than raw performance. For example, an agent dealing with a very low-entropy environment might solve its task easily, but $\bar{H}$ will be small, limiting the HIT score. This aligns with the intuition that just performing well in an overly simple scenario does not demonstrate adaptive capacity: chess requires far more adaptation than tic-tac-toe. On the other hand, an agent producing arbitrarily diverse actions without regard to context might have high $H (C)$ but low $I (C; R) I (R; A)$, also resulting in a low HIT score.

The metric therefore captures the balance emphasized by Ashby’s³⁴ law of requisite variety: to effectively control (or adapt to) a complex environment, an agent’s internal variety, as evidenced by its range of responses, must match the variety of the environment. The HIT score quantifies this match and adds the requirement of doing so efficiently in time. These limit properties ensure that the HIT score behaves monotonically with respect to each component: it increases with greater contextual information and environmental entropy, and decreases with increasing latency.

3.1.2. Relation to OODA and SA

The HIT score can be interpreted in terms of the OODA loop phases. High $H (X)$ implies the agent is observing a wide array of inputs, high $I (C; R) I (R; A)$ indicates the agent’s actions are oriented and decided based on the true situation, and low $T$ reflects swift action. In other words, the HIT score essentially quantifies how well an agent’s OODA loop is functioning in a given environment. A system with a high HIT score is effectively compressing observational data into actionable knowledge and doing so rapidly—achieving a high level of situational awareness and agility. In contrast, a low HIT score may signal breakdowns in observation (missing information), orientation (misinterpreting context), or insufficient decision or action speed. By condensing these aspects into a single variable, the proposed HIT score provides a tractable measure to compare systems and track improvements in decision-making processes quantitatively and analytically.

The HIT components are mapped onto OODA and SA constructs in Table 1. In brief, entropy relates to Observe (OODA phase 1)/Perception (SA Level 1), mutual information to Orient (OODA phase 2)/Comprehension (SA Level 2), and policy consistency to Decide–Act (OODA phases 3 and 4)/Projection (SA Level 3). $I (C; R)$ indexes orientation/comprehension (OODA-P2/SA-L2), while $I (R; A)$ indexes projection (OODA-P3 and OODA-P4/SA-L3) by quantifying action–outcome coupling. This approach of mapping OODA and SA is similar to other approaches in information warfare literature, such as Poisel’s.¹⁹ The cost term $T$ penalizes slow or overly expensive loops.

Table 1.

Qualitative–quantitative mapping of HIT.

HIT component	OODA phase	SA level
$H (X)$	Observe (P1)	Perception (L1)
$I (C; R)$	Orient (P2)	Comprehension (L2)
$I (R; A)$	Decide (P3)–Act (P4)	Projection (L3)
$T$	Latency across phases	Delay/friction

3.2. Networked extension

Many real-world systems consist of multiple interacting agents or components (e.g., teams of humans, human-machine interaction, distributed sensors, or swarms of autonomous drones). The HIT score is extended to such networked settings by accounting for information flow between agents.

Consider a network of $k$ agents. An adjacency matrix $A \in \mathbb{R}^{k \times k}$ is defined, where each element $A_{ij}$ represents the effective information throughput from agent $i$ to agent $j$ over some interval. This could be measured, for instance, by the mutual information between $i$’s messages or actions and $j$’s observations, or by a normalized communication volume if each message is assumed equally informative. For a given focal agent $i$, let ${in}_{i} = \sum_{j = 1}^{k} A_{ji}$ be the total incoming information to $i$ from others, and ${out}_{i} = \sum_{j = 1}^{k} A_{ij}$ be the total information that $i$ provides to the network.

A leverage ratio for agent $i$ is defined as

λ_{i} = \frac{{out}_{i}}{{in}_{i}} .

(5)

When ${in}_{i} = 0$ (an agent with no inputs), an arbitrarily small value must be substituted to avoid division by zero (in the experiments, $10^{- 6}$ was used). The factor $λ_{i}$ characterizes the agent’s role in the network: $λ_{i} > 1$ indicates that agent $i$ disseminates more information than it assimilates (a net information source or leader), whereas $λ_{i} < 1$ implies it primarily consumes information from others (a follower or sink). $λ_{i} \approx 1$ suggests balanced information exchange.

For each agent $i \in \{1, \dots, k\}$, let $X_{i}$, $C_{i}$, and $R_{i}$ denote its observations, context, and responses, respectively. The network-augmented HIT score for agent $i$ is defined as

{HIT}_{i}^{net} = λ_{i} {HIT}_{i},

(6)

where ${HIT}_{i}$ is the single-agent HIT score for agent $i$ , computed from its own $H_{i} (X_{i})$ , $I_{i} (C_{i}; R_{i})$ , and $T_{i}$ . Thus, an agent’s score in a network is boosted if it effectively contributes information to others (high $λ_{i}$ ) and tempered if it relies heavily on others for information (low $λ_{i}$ ). The network HIT score still ranges over $[0, \infty)$ , but it adds a new dimension of evaluation: two agents with identical individual performance might differ in network HIT score, if one also facilitates team situational awareness better.

This extension aligns with intuitive C2 notions of force-multiplying agents or nodes that improve the whole network’s performance. A command node that quickly disseminates useful information to peers would achieve $λ_{i} > 1$ , raising its HIT score to reflect its broader impact. In contrast, an agent that requires extensive input from others to act ( $λ_{i} < 1$ ) would see a lower network-adjusted score, indicating reliance.
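A minimal sketch of Equations (5) and (6) follows, assuming the adjacency matrix is given as a nested Python list with `adjacency[i][j]` holding the throughput from agent `i` to agent `j`. The function names, the three-agent matrix, and the per-agent HIT values are hypothetical; only the $10^{- 6}$ substitute for zero inputs is taken from the text.

```python
def leverage_ratios(adjacency, eps=1e-6):
    """Leverage ratio out_i / in_i per agent (Equation (5)).

    eps replaces in_i when an agent has no inputs, avoiding division by zero.
    """
    k = len(adjacency)
    ratios = []
    for i in range(k):
        out_i = sum(adjacency[i][j] for j in range(k))  # row sum: what i provides
        in_i = sum(adjacency[j][i] for j in range(k))   # column sum: what i receives
        ratios.append(out_i / (in_i if in_i > 0 else eps))
    return ratios


def network_hit(adjacency, hit_scores, eps=1e-6):
    """Network-augmented score lambda_i * HIT_i per agent (Equation (6))."""
    ratios = leverage_ratios(adjacency, eps)
    return [lam * hit for lam, hit in zip(ratios, hit_scores)]


# Hypothetical three-agent network: agent 0 broadcasts 2 units to each peer,
# and each peer reports 1 unit back.
adjacency = [
    [0, 2, 2],
    [1, 0, 0],
    [1, 0, 0],
]
single_agent_hit = [0.5, 0.5, 0.5]  # identical individual performance

print(leverage_ratios(adjacency))
print(network_hit(adjacency, single_agent_hit))
```

In this toy network the broadcasting agent has a leverage ratio above 1 and the two peers have ratios below 1, so three agents with identical individual scores end up with different network-adjusted scores.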

3.3. Sensitivity analysis

Quantitative indices that combine several stochastic components, such as the proposed HIT score, require systematic examination of their numerical stability and interpretability under controlled variation of inputs. Sensitivity analysis serves this role by identifying how changes in model parameters influence outputs, thus revealing both robustness and latent fragility.^35,36 In decision and control research, such analyses are a common prerequisite for establishing construct validity and operational reliability.^37,38 For HIT, the purpose is twofold: first, to confirm that the metric behaves monotonically and proportionally with respect to its theoretical components—entropy $H$ , mutual information $I (C; R) I (R; A)$ , and latency $T$ —and second, to determine the boundaries within which HIT remains interpretable when estimated from finite samples or under noisy conditions.

Given HIT’s formulation (Equation (4)), its stability depends on how each normalized component reacts to realistic variations in context diversity, policy reliability, and decision timing. Sensitivity analysis provides an empirical complement to the theoretical edge-case checks presented in Section 3.1. It also supports reproducibility and transparency: by reporting explicit parameter effects, researchers applying HIT to new domains (e.g., C2 experiments, adaptive autonomy, or simulated decision loops) can anticipate potential distortions arising from sparse or noisy data.

A sensitivity analysis of the HIT score was conducted using synthetic context–response–latency triplets. The reproducible Python code, made available on GitHub,³⁹ computes the HIT score under controlled parameter sweeps, emulating the key variables that shape adaptive decision-making: environmental variety, response fidelity, and temporal cost. The objective was to examine numerical stability, locate potential fragility points, and provide guidance for practical use. The analysis examined the influence of (1) context and response space cardinality, (2) context distribution skew, (3) policy stochasticity, (4) latency variation, (5) sample size, and (6) network asymmetry on HIT and its normalized extensions. Together, these experiments establish the empirical boundaries within which HIT maintains its intended meaning as a tractable, information-theoretic efficiency measure.

Per Equation (4), $C$ denotes the set of observed contexts, $R$ the set of system responses, and $T$ the observed latency in producing each response. Normalized quantities were used: $\bar{H} = H (C) / \log_{2} | C |$ (normalized Shannon entropy), $\bar{I} = I (C; R) / \log_{2} \min (| C |, | R |)$ (normalized mutual information), and $\bar{T} = E [T] / T_{\max}$ ( $E [\cdot]$ denoting statistical expectation). The HIT score is given by $HIT = (\bar{H} \bar{I}) / \bar{T}$ . A network-adjusted form ${HIT}^{net} = λ_{i} {HIT}_{i}$ is further defined in Equation (6), and the logarithmic leverage factor

λ_{i}^{\log} = \log_{2} (1 + {out}_{i}) - \log_{2} (1 + {in}_{i}),

(7)

is used to prevent runaway scaling in asymmetric networks. When ${out}_{i} = {in}_{i} = 0$, $λ_{i}^{\log} = 0$ and ${HIT}_{i}^{net} = {HIT}_{i}$.
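To illustrate why the logarithmic form in Equation (7) tames asymmetric networks, the short sketch below compares it with the raw ratio of Equation (5) for a pure broadcaster. It computes only the leverage factor itself; the function names and the example throughput values are assumptions for illustration.

```python
import math


def raw_leverage(out_i, in_i, eps=1e-6):
    """Raw ratio out_i / in_i, with eps substituted for a zero denominator."""
    return out_i / (in_i if in_i > 0 else eps)


def log_leverage(out_i, in_i):
    """Logarithmic leverage factor: log2(1 + out_i) - log2(1 + in_i)."""
    return math.log2(1 + out_i) - math.log2(1 + in_i)


# Hypothetical pure broadcaster: sends 100 units, receives nothing.
print(raw_leverage(100, 0))   # explodes because of the eps denominator
print(log_leverage(100, 0))   # grows only logarithmically with output

# Balanced and isolated agents both map to zero under the log form.
print(log_leverage(5, 5))
print(log_leverage(0, 0))
```

The log form is antisymmetric: swapping an agent's inputs and outputs flips the sign of the factor, so sources and sinks are placed symmetrically around zero.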

Synthetic data were generated for parameter sweeps using $n = 300$ and 1,000 samples per configuration. The parameter grid was as follows:

Context cardinality $|C| \in \{3, 10\}$,

Response cardinality $|R| \in \{3, 10\}$,

Context distribution: uniform, or Zipf-like with exponent $α = 1.2$ (the probability of observing the $r$-th most common context was proportional to $r^{- α}$),

Policy accuracy $P (\text{correct}) \in \{1.0, 0.8, 0.6\}$, and

Latency mode: fixed, jittered ( $\pm 1$ integer steps), or Poisson-distributed ( $λ = 1$ ).
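One way such synthetic triplets could be generated is sketched below. This is not the published GitHub code: the mapping from context to "correct" response (context index modulo the number of responses), the uniform choice among wrong responses, the jitter values (taken from Section 3.3.5), and all function names are assumptions made for illustration.

```python
import math
import random


def zipf_weights(size, alpha):
    """Unnormalized Zipf-like weights: rank r gets weight r ** -alpha."""
    return [rank ** -alpha for rank in range(1, size + 1)]


def poisson(rng, lam):
    """Poisson sample via Knuth's multiplication method (fine for small lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def synthesize(n, n_contexts, n_responses, skew, p_correct, latency_mode, seed=0):
    """Return n (context, response, latency) triplets; needs n_responses >= 2."""
    rng = random.Random(seed)
    if skew == "uniform":
        weights = [1.0] * n_contexts
    else:
        weights = zipf_weights(n_contexts, 1.2)
    triplets = []
    for _ in range(n):
        c = rng.choices(range(n_contexts), weights=weights)[0]
        correct = c % n_responses  # assumed ground-truth policy
        if rng.random() < p_correct:
            r = correct
        else:
            r = rng.choice([x for x in range(n_responses) if x != correct])
        if latency_mode == "fixed":
            t = 1.0
        elif latency_mode == "jittered":
            t = rng.choice([0.5, 1.0, 1.5, 2.0])
        else:  # "poisson", shifted by 0.5 to avoid zero latency
            t = poisson(rng, 1.0) + 0.5
        triplets.append((c, r, t))
    return triplets


sample = synthesize(300, 10, 10, "zipf", 0.8, "jittered", seed=42)
print(sample[:5])
```

Seeding a local `random.Random` instance keeps each configuration reproducible without touching the global random state, which is convenient when sweeping the full grid.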

3.3.1. Context space ( $| C |$ )

HIT remained effectively stable (<5% variance) across $| C | = 3$ and $| C | = 10$ . Normalization by $\log_{2} | C |$ eliminates most cardinality dependence. No special handling is required unless the context distribution becomes degenerate.

3.3.2. Response space ( $| R |$ )

The empirical results show HIT to be approximately stable or mildly increasing with larger response alphabets. When $|R|$ was raised from $3$ to $10$ (with $|C|$ matched), the mean HIT increased by roughly $50\%$ (median $+53\%$, mean $+71\%$). Thus, while the normalization denominator $\log_{2} \min (|C|, |R|)$ can constrain $\bar{I}$, information gain from richer response mappings would offset this. Maintaining $|R|$ roughly commensurate with $|C|$ remains good practice.

3.3.3. Context distribution skew

Switching from a uniform to a Zipf($α = 1.2$) distribution altered HIT by less than $1\%$ (mean relative change $-0.3\%$). The metric is robust to moderate non-uniformity and requires no skew correction under typical frequency profiles.

3.3.4. Policy noise/accuracy

HIT declines sharply and near-linearly with reduced policy accuracy. A 20% reduction in correctness (from 1.0 to 0.8) lowered mean HIT by about 45%, while a 40% reduction (1.0 to 0.6) reduced it by roughly 70%. This confirms that HIT is a faithful proxy for adaptive fidelity and discriminates stochastic or inattentive policies without parameter tuning.

3.3.5. Latency noise

Three latency modes were tested: fixed ($T = 1$), jittered ($T \in \{0.5, 1, 1.5, 2\}$), and Poisson($λ = 1$) shifted by $0.5$ to avoid zeros. Under the default normalization $\bar{T} = E [T] / T_{\max}$, larger latency excursions increase the denominator $T_{\max}$, which lowers $\bar{T}$ and inadvertently inflates HIT. To maintain the intended inverse relation between timeliness and HIT, it is recommended to redefine $T = {(1 + E [T])}^{- 1}$, which restores monotonicity (higher latency → lower HIT score). With this correction, jittered and Poisson delays yield $\approx 13$–$25\%$ lower HIT compared to fixed latency. Smoothing (e.g., median) can also mitigate volatility in $\bar{T}$.
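A small worked example may make the inflation effect under the default normalization concrete. It assumes, purely for illustration, that the jittered latencies are drawn uniformly from the four listed values and that $T_{\max}$ is the largest latency observed within each mode; neither assumption is stated explicitly in the text.

```latex
\begin{aligned}
\text{Fixed:}\quad & E[T] = 1,\; T_{\max} = 1 \;\Rightarrow\; \bar{T} = \frac{1}{1} = 1, \\
\text{Jittered:}\quad & E[T] = \frac{0.5 + 1 + 1.5 + 2}{4} = 1.25,\; T_{\max} = 2 \;\Rightarrow\; \bar{T} = \frac{1.25}{2} = 0.625, \\
& \frac{\text{HIT}_{\text{jittered}}}{\text{HIT}_{\text{fixed}}} = \frac{1}{0.625} = 1.6 \quad (\bar{H} \text{ and } \bar{I} \text{ held equal}).
\end{aligned}
```

Under these assumptions the jittered agent is slower on average (mean latency up by 25%) yet scores 60% higher, which is the kind of non-monotonic behavior the recommended correction is meant to remove.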

3.3.6. Sample size

For $n \geq 300$ , HIT estimates were stable. At smaller samples ( $n = 100$ ), mutual information exhibited an upward bias of about $+ 8 %$ , leading to overestimation. Bias could be reduced using corrected MI estimators such as Miller–Madow⁴⁰ or by bootstrap resampling to estimate and correct finite-sample bias. Where feasible, $n \geq 500$ is recommended for live or streaming evaluations—supporting the suggested post hoc use cases.
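A bias-corrected estimate of this kind can be sketched as follows. The example applies the Miller–Madow correction, which adds $(m - 1) / (2 n \ln 2)$ bits to a plug-in entropy estimate with $m$ occupied bins, to each entropy term of $I (C; R) = H (C) + H (R) - H (C, R)$ . It is an illustrative sketch with hypothetical function names, not the implementation from the paper's repository; clipping negative estimates to zero is a choice made here.

```python
import math
from collections import Counter


def entropy_mm(samples):
    """Plug-in Shannon entropy in bits plus the Miller-Madow bias term."""
    n = len(samples)
    counts = Counter(samples)
    plug_in = -sum((k / n) * math.log2(k / n) for k in counts.values())
    # Correction: (occupied bins - 1) / (2n), converted from nats to bits.
    return plug_in + (len(counts) - 1) / (2 * n * math.log(2))


def mutual_information_mm(contexts, responses):
    """I(C;R) = H(C) + H(R) - H(C,R), each term Miller-Madow corrected."""
    joint = list(zip(contexts, responses))
    mi = entropy_mm(contexts) + entropy_mm(responses) - entropy_mm(joint)
    return max(mi, 0.0)  # clip small negative estimates (a choice made here)


# A perfectly responsive policy versus a context-independent one.
contexts = [0, 1] * 50
print(mutual_information_mm(contexts, contexts))                      # close to 1 bit
print(mutual_information_mm([0, 0, 1, 1] * 25, [0, 1, 0, 1] * 25))    # 0.0
```

Because the joint distribution usually occupies more bins than either marginal, the correction lowers the mutual information estimate, which counteracts the upward bias seen at small $n$ .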

3.3.7. Network asymmetry

The logarithmic leverage factor $λ^{\log}$ (Equation (7)) effectively bounds amplification in highly asymmetric networks, capping ${HIT}^{net}$ when nodes act as pure broadcasters. It is advisable to report both $HIT$ and ${HIT}^{net}$ separately and use the log-scaled form for interagent comparisons.

The sensitivity risks and guidelines are summarized in Table 2. In summary, HIT remains interpretable, stable, and responsive under a wide range of conditions, particularly when paired with proper normalization and component reporting. These sensitivities should be considered when applying the HIT score in real-world environments.

Table 2.

Empirical sensitivity summary for HIT, as derived from the reproducible parameter sweep.

Factor	Observed effect	Recommended safeguard
Context cardinality $| C |$	Stable ( $< 5 %$ variance)	None
Response alphabet $| R |$	+50% mean HIT (3 → 10)	Keep $| R | \approx | C |$ or re-scale MI
Zipfian skew ( $α = 1.2$ )	$< 1 %$ deviation	None
Policy accuracy	45 … 70% drop for 20 … 40% accuracy loss	No correction, interpret as adaptive fidelity
Latency jitter/Poisson	13 … 25% HIT reduction (after inversion)	Smooth or invert $\bar{T}$ definition
Small $n$ ( $< 300$ )	Upward bias ( $\approx + 8 %$ )	Miller–Madow or bootstrap correction
Network imbalance	Potential inflation	Use $λ^{\log}$ bounded leverage

3.4. Petri nets as a HIT substrate

While the HIT score is by design substrate-agnostic, its interpretability and structural fidelity are highly dependent on the modeling framework used to represent system behavior. In this context, Petri nets (PNs) emerge as a particularly well-suited computational substrate for HIT. A PN consists of a finite set of places $P$ , transitions $T$ , and arcs connecting them.⁴¹ A transition $t \in T$ represents a discrete event that may change the system’s state by consuming and producing tokens in its input and output places. A transition is said to fire when all its input places contain the required number of tokens. Firing removes those tokens and deposits new ones into the transition’s output places, producing a new marking (state) of the net.
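The firing rule described above can be sketched in a few lines. This is a minimal illustration, assuming a marking stored as a dictionary from place names to token counts. The `Transition` class and the place names `context`, `belief`, and `action` are hypothetical and are not taken from the paper's code.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    """A transition with token requirements on input and output places."""
    name: str
    inputs: dict = field(default_factory=dict)   # place -> tokens consumed
    outputs: dict = field(default_factory=dict)  # place -> tokens produced


def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= k for p, k in transition.inputs.items())


def fire(marking, transition):
    """Return the new marking produced by firing an enabled transition."""
    if not enabled(marking, transition):
        raise ValueError(f"transition {transition.name!r} is not enabled")
    new_marking = dict(marking)
    for place, k in transition.inputs.items():
        new_marking[place] = new_marking.get(place, 0) - k
    for place, k in transition.outputs.items():
        new_marking[place] = new_marking.get(place, 0) + k
    return new_marking


# Hypothetical two-step decision loop: observe a context, then act on it.
observe = Transition("observe", {"context": 1}, {"belief": 1})
act = Transition("act", {"belief": 1}, {"action": 1})

marking = {"context": 1}
after_observe = fire(marking, observe)
after_act = fire(after_observe, act)
print(after_observe, after_act)
```

In this toy net, `act` cannot fire before `observe` has deposited a token in `belief`, which is the sense in which token flow enforces ordering between observation and action.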

As a computational substrate, PNs may potentially surpass traditional tools such as process mining, causal inference, or partial information decomposition (PID),⁴² especially in tactical and cyber-physical environments. PNs model system dynamics as state-transition systems, driven by token flow across places and transitions. This structure inherently preserves causal ordering. Transitions cannot fire without the required tokens, so causality is enforced by construction. Furthermore, tokens are not implicitly duplicated or erased, ensuring data provenance and conservation. Finally, multiple transitions can compete or synchronize on shared places, capturing complex control logic and concurrency. In contrast, methods such as PID⁴² or Granger causality⁴³ require statistical estimation of directed dependencies, which may become error-prone in sparse or highly dynamic systems.

The context–action–latency triplet needed for HIT could in simulations be directly and unambiguously extracted from PN firings, and no inference step is required. There are many ways by which latency could be encoded in PNs: structurally (the number of transitions fired from context observation to action), temporally (using timed or stochastic transitions), or compositionally (e.g., via intermediate decision places and fusion transitions). This allows HIT to faithfully reflect both tactical latency, like short reaction loops, and higher echelon latency, such as multiagent fusion cycles, while retaining traceability of delay sources. PNs also provide semantic anchoring—HIT’s denominator thus becomes more than a clock difference, and is structurally meaningful.

In addition, PNs have many compelling features for multiagent compositions. For example, each agent or subsystem can be modeled as a subnet, where interagent interactions are defined via shared places or message-passing arcs. Shared resources (e.g. bandwidth, tokens, and decision rights) can also be modeled explicitly. This supports granular measurement of HIT at multiple scales:

Local HIT: per agent or subnet, reflecting internal adaptiveness,

Network HIT: using $λ^{\log}$ derived from structural arc counts or message flow, and

System HIT: via composition or aggregation of per-agent HITs.

In summary, PNs potentially enable micro-, meso- and macro-level evaluation of information processing efficiency using HIT.

Tokens in PNs are inspectable, and transitions are observable. It follows that traditional PN metrics can be interpreted in relation to HIT: dead (unfired) transitions signal unused policy modes, overused arcs signal bottlenecks or load imbalance, and token traces reveal decision flow under specific contexts. Combined with HIT computation, this enables diagnosis of:

Why a node’s HIT is low—poor context diversity, low MI, or latency inflation,

Where in the system flow information is lost or delayed, and

How to restructure the decision loop for improvement.

Compared, for example, to the aforementioned PID, which attempts to quantify shared and unique information retrospectively, PNs do not require probabilistic modeling of latent variables. PNs also provide explicit, visual, and stateful representations of decision logic, and support simulation, verification, and replay rather than estimation alone. In terms of mathematical formulation, adjacency matrices (as discussed in Section 3.2) can be used as a basis for PN formulation. Therefore, the graph-theoretic extensions of HIT provide a means by which PN-based formalisms could be used. This avenue is computationally tractable, provided that the information flow in the system can be modeled meaningfully to derive the HIT score, or a perfect information ground state is available, as is the case in simulations. It is worth mentioning, however, that HIT and PNs do not compete with PID (or any other metric) but preempt it: they offer potentially similar insight with potentially greater transparency and lower inferential cost.

A summary of the discussion concerning PNs and HIT is presented in Table 3. PNs potentially offer a powerful, interpretable, and simulation-friendly substrate for computing and analyzing HIT in complex and distributed decision-making systems.

Table 3.

How PNs complement HIT.

Property	Benefit to HIT
Token-based causality	Avoids need for inferred information flow
Latency modeling	Embeds delay meaningfully into system structure
Modular agents	Supports per-node and network-level HIT
Observability	Enables introspection and fault diagnosis
Replayability	Allows scenario-based HIT stress testing

4. Empirical evaluation

4.1. Seven-agent Iterated Prisoner’s Dilemma

To evaluate the practical behavior of the HIT score, a canonical environment from game theory was implemented: the IPD.⁴⁴ This testbed allows controlled comparison of strategic policies under uncertainty, repeated interaction, and incomplete knowledge. It is widely used to study adaptive and cooperative dynamics, and serves here as an ideal benchmark to investigate HIT in action. The IPD discussed here substantiates the single-agent formulation of HIT presented in Section 3.1.

A closed seven-agent tournament was constructed, where each agent plays a bilateral IPD with every other agent for $2 000$ rounds, resulting in $21$ pairwise match-ups (42 directed agent pairs). The reproducible Python code is made available on GitHub.³⁹ Each game proceeds as follows: on each round $t$ , agent $i$ observes the previous action of its opponent $j$ and selects a response. This sequence is recorded, and entropy, mutual information, and the resulting HIT per agent are computed over all its match-ups. All agent actions are binary: $1$ for cooperation, $0$ for defection. The context is defined as the opponent’s previous action, $C_{t} = a_{t - 1}^{(j)}$ , and the response as the agent’s current action, $R_{t} = a_{t}^{(i)}$ . This makes the input and output spaces binary, bounding theoretical entropy at $H_{\max} = 1$ bit. For each directed agent pair $(i, j)$ , the Shannon entropy $H (C)$ of the opponent’s behavior is estimated, along with the mutual information $I (C; R) I (R; A)$ between that context and the agent’s responses. Frequency-based plug-in estimators with a Miller–Madow correction for bias are used. Latency cost is fixed at $T = 1$ to isolate information-processing differences.

The simulation implements seven canonical strategies, selected to span different degrees of responsiveness, context sensitivity, and internal logic within the IPD framework:⁶,⁴⁵

Always Cooperate (All-C): cooperates ( $C = 1$ ) in every round.

Always Defect (All-D): defects ( $D = 0$ ) in every round.

Grim Trigger (GT): cooperates until the opponent defects once, then defects permanently.

Random (Rand): independently samples each action from a Bernoulli $(p_{c} = 0.5)$ distribution, yielding stochastic play.⁴⁶

Tit-for-Tat (TFT): begins with cooperation and thereafter mirrors the opponent’s previous action.

Win–Stay, Lose–Shift (reactive, WSLS-R): repeats its previous action if it matched the opponent’s last move, otherwise switches.

Win–Stay, Lose–Shift (payoff-based, WSLS-P or “Pavlov” per Nowak & Sigmund⁴⁵): repeats its previous action if the preceding outcome was a reward ( $R = 3$ ) or punishment ( $P = 1$ ), and switches after temptation ( $T = 5$ ) or sucker payoff ( $S = 0$ ).

All strategies except Rand are deterministic. WSLS-R and WSLS-P differ in their conditioning variable: the former reacts to the symbolic match between moves, while the latter uses the payoff ordering $T > R > P > S$ satisfying the standard inequality $2 R > T + S$ .
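The context and response definitions above can be illustrated with a single match-up. The sketch below plays Tit-for-Tat and Always Defect against a Random opponent and computes a HIT value as $\bar{H} \cdot \bar{I} / \bar{T}$ with $H_{\max} = 1$ bit and $T = 1$ . It uses plain plug-in estimators without the Miller–Madow correction, a single opponent, and hypothetical function names, so its values are not expected to reproduce the tournament averages.

```python
import math
import random
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy in bits."""
    n = len(samples)
    return -sum((k / n) * math.log2(k / n) for k in Counter(samples).values())


def mutual_information(contexts, responses):
    """Plug-in I(C;R) = H(C) + H(R) - H(C,R) in bits."""
    joint = list(zip(contexts, responses))
    return entropy(contexts) + entropy(responses) - entropy(joint)


def tit_for_tat(opponent_previous):
    return opponent_previous  # mirror the opponent's last move


def always_defect(opponent_previous):
    return 0  # defect regardless of context


def play_against_random(strategy, rounds=2000, seed=42):
    """Record (context, response) pairs against a Bernoulli(0.5) opponent."""
    rng = random.Random(seed)
    opponent_previous = 1  # assumed cooperative opening context
    contexts, responses = [], []
    for _ in range(rounds):
        contexts.append(opponent_previous)
        responses.append(strategy(opponent_previous))
        opponent_previous = rng.randint(0, 1)
    return contexts, responses


def hit_score(contexts, responses, t_bar=1.0, h_max=1.0):
    """HIT as normalized entropy times normalized MI, divided by latency."""
    h_bar = entropy(contexts) / h_max
    i_bar = mutual_information(contexts, responses) / h_max
    return h_bar * i_bar / t_bar


hit_tft = hit_score(*play_against_random(tit_for_tat))
hit_alld = hit_score(*play_against_random(always_defect))
print(hit_tft, hit_alld)
```

Against a high-entropy opponent, the mirroring strategy carries nearly all of the context entropy into its responses, while the unconditional strategy carries none.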

The 7-IPD results are presented in Table 4. It is evident that responsiveness drives HIT. TFT and both WSLS variants achieve the highest HIT scores by reliably mapping opponent behavior into predictable, context-sensitive responses. Their mutual information matches their observed entropy. It is also found that unconditional strategies are blind—as intuitively expected. All-C, All-D, and GT exhibit nonzero entropy due to diverse inputs, but respond identically regardless of context. Consequently, in these strategies, $I (C; R) I (R; A) = 0$ and $HIT = 0$ . Finally, the HIT score reveals correctly that randomness is not intelligence (when understood as dynamical adaptation). Although Random exhibits the highest marginal action entropy across match-ups, its unconditioned responses retain little mutual information with opponents' actions. It fails to exploit contextual structure.

Table 4.

Average normalized entropy ( $\bar{H}$ ), normalized mutual information ( $\bar{I}$ ), and HIT score per agent over all pairwise games ( $2000$ rounds, arbitrarily chosen seed $42$ for reproducibility).

Strategy	$\bar{H}$	$\bar{I}$	HIT
Always Cooperate	$0.167$	$0.000$	$0.000$
Always Defect	$0.171$	$0.000$	$0.000$
Grim Trigger	$0.167$	$0.000$	$0.000$
Random	$0.501$	$0.014$	$0.007$
Tit-for-Tat	$0.167$	$0.167$	$0.028$
WSLS-reactive	$0.167$	$0.167$	$0.028$
WSLS-payoff	$0.167$	$0.167$	$0.028$

Despite differing logic, TFT and both WSLS variants achieve identical HIT scores. This illustrates a key property: the HIT score measures functional behavior, not internal code. If two strategies compress environmental entropy equally well into responses, they are equally “intelligent”—or more precisely, equally informed—under HIT.

The 7-IPD illustrates HIT’s ability to separate information-aware behavior from naïve or context-insensitive action. It further confirms HIT’s alignment with cybernetic and OODA principles: agents must both perceive variation and act on it in a timely, relevant manner. The fact that distinct policies converge to the same HIT when they exploit entropy identically suggests the metric is consistent, robust, and reflective of behavioral capacity rather than implementation detail.

4.2. Air policing toy models

To evaluate HIT in operationally flavored decision-making scenarios, two toy models inspired by air policing activities were designed. These simulations aim to illustrate HIT’s responsiveness to uncertainty, contextual reasoning, and delay under quasi-realistic constraints such as noise, adversarial spoofing, and network communication factors. The purpose of these simulations is to illustrate both the single-agent HIT and its networked extensions presented in Sections 3.1 and 3.2, respectively. It is strongly emphasized that any values and decision criteria used in these air policing models are arbitrary—they do not reflect any doctrinally or politically relevant criteria, and must not be interpreted as such under any circumstances. Air policing scenarios were selected for empirical evaluation owing to their inherent relevance to both OODA and SA, not to impose HIT in military decision-making or to refute standing processes.

4.2.1. Single-agent air policing simulation

In the single-agent model, a rule-based autonomous agent interacts with sequential incoming air contacts. For each contact, the agent selects among graded actions based on noisy sensor features. The reproducible Python implementation is provided on GitHub.³⁹

Each contact possesses a hidden ground-truth identity $Y \in \{H, A, F\}$ , denoting Hostile, Ambiguous, or Friendly classification. These classes are drawn uniformly. Feature vectors, representing typical air policing inputs such as squawk code, IFF response, radar mode, flight behavior, and apparent cooperation, are sampled from class-conditioned distributions designed to loosely mimic realistic cues.⁴⁷,⁴⁸ Sensor noise is modeled via a $20 %$ misread probability and $5 %$ dropout rate, simulating degraded observation quality under quasi-realistic constraints.

The agent updates a belief vector over $Y$ by combining the noisy features through a voting-based heuristic that emulates Bayesian classification. On each timestep, the agent chooses an action from a fixed repertoire:

Ignore: minimal cost, permissible only if the contact appears clearly friendly.

Interrogate: incurs low cost but no engagement.

Intervene: a limited-action response appropriate to ambiguous situations.

Intercept: a full-force response intended for confirmed threats.

Each action incurs a cumulative cost determined by the contact’s apparent identity, its geographic context (own, international, or foreign airspace), and a Rules of Engagement (ROE) risk weighting, approximating the decision logic described in standing NATO air policing policies.⁴⁹ The agent continues probing until (1) the most probable class exceeds a confidence threshold of $0.7$ , or (2) four cycles elapse. To evaluate the system’s overall efficiency, the HIT score is computed per contact and then averaged. The inputs are:

$H_{\max} = \log_{2} 3 \approx 1.585$ , the theoretical entropy of the uniformly distributed classes.

$I (C; R)$ , the mutual information between the true class $C$ and the predicted class $R$ .

$I (R; A)$ , the mutual information between the predicted class and the executed action.

$T$ , the average cost accrued across contacts.

The HIT score is computed via Equation (4), with $T_{0} = T_{\max}$ and the binary alphabets presented in the code. $T_{\max} = 90$ is defined to fix an upper bound for the total cost of a poorly managed contact.

4.2.2. Results

Running the simulation on $5 000$ independent contacts yields the following:

$I (C; R) = 0.005$ bits: the agent extracts only minimal information about the true contact class.

$I (R; A) = 0.258$ bits: actions remain reasonably consistent with the internal class prediction.

$\bar{T} = 0.162 (T = 14.57)$ : the average cumulative cost per contact, approximately $16.2 %$ of the maximum possible.

$HIT = 0.0126$ : a low but nonzero value reflecting modest internal consistency but poor environmental responsiveness.

Final classification accuracy is $34.96 %$ , only marginally above the $33.3 %$ random baseline. This low discriminability is by design: the distributions are intentionally under-informative, serving to test the HIT score’s performance rather than to emulate real air-policing efficiency.

4.2.3. Interpretation

These results reflect the expected limitations of a noisy, low-resolution sensor environment. The ground-truth entropy $H (C)$ is constant by construction, but low mutual information $I (C; R)$ indicates the agent struggles to extract reliable information from inputs. Nonetheless, a fair degree of policy coherence ( $I (R; A)$ ) implies that actions are consistent with internal beliefs—even if those beliefs are wrong. The HIT score captures this asymmetry: the system is internally aligned but externally misinformed. This diagnostic split is crucial in air policing contexts, where decision latency and semantic consistency may obscure situational misapprehension. Real-world parallels include spoofing, electronic deception, or misconfigured sensor fusion.

In this simulation, realism was not an objective, but future work could examine how richer sensors, ensemble classifiers, or causal estimators influence $I (C; R)$ , and thus potentially improve HIT scores. Alternatively, if realism is required, tuning action thresholds or introducing adaptive confidence mechanisms may further reduce $T$ , yielding a more responsive decision loop.

4.2.4. Multiagent air policing

In a multiagent air policing variant, four homogeneous agents observe the same stream of contacts but exchange their probabilistic belief vectors over intraflight communication links. Such shared-belief architectures are central to the Network-Centric Warfare (NCW) concept and the broader notion of networked situational awareness.¹⁹,⁵⁰ The Python implementation is available on GitHub.³⁹ Two network topologies are compared: a unidirectional ring, in which each node forwards its belief to its clockwise neighbor with a one-step delay, and a fully connected mesh, where all nodes broadcast to all peers each step with the same latency. This structure echoes Moffat’s⁵¹ models of how network connectivity shapes information diffusion and decision quality.

At time $t$ , agent $i$ updates its posterior belief through multiplicative Bayesian fusion,

$$P_{i}^{(t + 1)} (c) \propto P_{i, \mathrm{local}}^{(t)} (c) \prod_{j \in N (i)} P_{j}^{(t)} (c), \tag{8}$$

followed by normalization over all classes $c$ . Here, $N (i)$ denotes the set of neighbor agents whose beliefs are accessible to $i$ under the given topology. Equation (8) implements a logarithmic opinion pool, a multiplicative consensus model that assumes conditional independence among contributors and consistent priors across the network.⁵²,⁵³ In dense or cyclic graphs these assumptions are only approximate, as shared evidence can be inadvertently reused (“data incest”), yielding overconfident posteriors. Here, the fusion rule is interpreted as an idealized information-flow operator that exposes how topology alone influences uncertainty compression.
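The multiplicative fusion step and the two neighbor sets can be sketched as follows. This is an illustrative sketch rather than the repository code: the function names are hypothetical, agent indices are assumed to increase clockwise, and a small floor `eps` is an added safeguard against taking the logarithm of zero.

```python
import math


def fuse(local_belief, neighbor_beliefs, eps=1e-12):
    """Multiply the local belief by each neighbor belief, then normalize.

    The product is accumulated in log space for numerical stability.
    """
    log_post = [math.log(max(p, eps)) for p in local_belief]
    for belief in neighbor_beliefs:
        log_post = [lp + math.log(max(p, eps)) for lp, p in zip(log_post, belief)]
    shift = max(log_post)
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def ring_neighbors(i, n):
    """Unidirectional ring: node i hears only its counter-clockwise neighbor."""
    return [(i - 1) % n]


def mesh_neighbors(i, n):
    """Fully connected mesh: node i hears every other node."""
    return [j for j in range(n) if j != i]


# Beliefs over the three classes (Hostile, Ambiguous, Friendly); values are made up.
local = [0.5, 0.3, 0.2]
neighbor = [0.6, 0.3, 0.1]
fused = fuse(local, [neighbor])
print(fused)
print(ring_neighbors(0, 4), mesh_neighbors(0, 4))
```

With four agents, a node in the mesh multiplies in three neighbor beliefs per step while a node in the ring multiplies in one, which is why the posterior sharpens faster in the mesh and also why reused evidence can make it overconfident.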

A more conservative alternative is innovation (likelihood-increment) fusion: agents exchange only the new likelihood contribution since the previous step rather than full posteriors, or employ correlation-robust pooling (e.g. Kullback–Leibler average consensus or covariance intersection) to avoid double-counting.⁵⁴,⁵⁵ While such schemes prevent evidence reuse, they also converge more slowly in homogeneous networks. In the scope of this paper, for demonstration purposes, the simpler multiplicative form is retained to highlight the topological contrast captured by the HIT score.

Node-wise HIT is computed from each agent’s local observation, decision, and cost streams. The mean and standard deviation $σ_{HIT}$ across the four agents are reported in Table 5, under identical sensor noise (20% misread, 5% dropout).

Table 5.

Topology impact on normalized mean node metrics ( $5 000$ contacts, seed $0$ ).

Topology	$\bar{H}$ [b]	$\bar{I} (C; R)$ [b]	$I (R; A)$ [b]	$T$ [cost]	$HIT (σ)$
Ring	$1.585$	$0.011$	$1.004$	$28.6$	$0.079 (0.003)$
Mesh	$1.585$	$0.067$	$1.006$	$14.3$	$0.645 (0.013)$

The multiagent simulation yields insightful interpretations related to previous discussions on degraded mutual information and distributed latency effects.⁵¹,⁵⁶ In the ring topology, effective mutual information collapses to $\bar{I} (C; R) \approx 0.01$ bits because evidence trickles only one hop per cycle, and is further degraded by noise before reaching distant nodes. The mesh maintains a roughly sixfold higher context coupling of $\approx 0.07$ bits in comparison to the ring. Slow evidence propagation also causes cost inflation. Ring topology forces agents to interrogate longer ( ${\bar{T}}_{ring} \approx 0.32$ vs ${\bar{T}}_{mesh} \approx 0.16$ ), directly penalizing the denominator of HIT. The combined penalty produces a nearly order-of-magnitude efficiency gap, as seen by the contrasted HIT scores: ${HIT}_{mesh} / {HIT}_{ring} \approx 8$ . Policy consistency remained unchanged, as expected. $I (R; A) \approx 1$ bit in both cases, confirming that agents execute internally coherent actions once an identification is declared. Efficiency losses are therefore network-induced, not policy-induced. This pattern echoes Watts’⁵⁷ critique that connectivity alone does not eliminate friction: information delays and losses reintroduce Clausewitzian constraints on tempo.

Interpreted through OODA and SA, the mesh topology exemplifies the benefit of rapid, shared orientation, while the ring topology demonstrates how limited bandwidth throttles contextual coupling and drives down efficiency. Mesh architectures achieve higher HIT scores because they accelerate entropy reduction and lower decision cost—an effect generalizable to systems where contextual information must traverse multiple hops to reach all decision-makers.

The observed behavior aligns with NCW analyses that link robust networking to superlinear gains in situational awareness and operational tempo by expediting information flow and fusion.⁵⁰,⁵¹,⁵⁶ Formally, these gains manifest as faster entropy reduction across the force: connectivity compresses dispersed uncertainty into a shared operational picture. Conversely, degraded coupling in the ring topology recreates the friction described by Watts, a reminder that bandwidth and topology bound the achievable rate of entropy reduction.³³,⁵⁷ Within this framing, the HIT score quantitatively expresses how efficiently a network converts distributed observations into coordinated, low-entropy action. HIT does not prescribe engagement decisions but diagnoses, post hoc, how well information was compressed into timely action.

4.3. PN of a five-agent C2 system

To empirically validate the HIT score using PNs within a realistic, interpretable, and structured decision environment, a simulation of a five-agent command and control (C2) system was implemented. This simulation substantiates utilizing PNs as discussed in Section 3.4. The scenario captures the adaptive and hierarchical nature of tactical decision-making under routine operational tempo. The simulation highlights how the HIT score can reveal role-dependent efficiency and information dynamics in a distributed multiagent system, with results grounded in quantifiable, observable system structure. Reproducible Python code is available on GitHub.³⁹

The modeled C2 system includes five agents organized in a classic military-style hierarchy. Alpha serves as the top-level command node. It does not perceive the environment directly, but synthesizes reports from subordinate agents and issues commands to all nodes in the system. Bravo and Charlie act as mid-tier echelons: both perceive their environment with some noise, fuse their own observations with upstream reports, and issue commands to one subordinate each. Delta and Echo are field agents with direct environmental access. They act immediately based on local context and receive commands from their respective mid-tier controllers.

The simulated environment produces one of three possible context states—calm, suspicious, or hostile—according to a discrete distribution, where calm occurs with probability $0.5$ , suspicious with $0.3$ , and hostile with $0.2$ . Delta and Echo observe the context with a $10 %$ probability of error. Each agent selects one of three possible actions: monitor, query, or engage, with the choice governed by policies that range from deterministic to probabilistic, depending on the node’s function. Bravo and Charlie fuse upstream reports with noisy local impressions, each with their own biases inspectable in the code.³⁹ Alpha fuses only the reports from Bravo and Charlie. The simulation involves two subnets: one for information sharing (SA proxy) and another for issuing commands. The HIT scores are calculated for each.
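The entropy of this context distribution can be worked out directly. The sketch below computes the entropy of the true context and of a field agent's noisy observation. The true-context figure follows from the stated probabilities; the observed-context figure additionally assumes that a $10 %$ error is spread evenly over the two incorrect states, which is an assumption of this example and may differ from the repository code.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


PRIOR = {"calm": 0.5, "suspicious": 0.3, "hostile": 0.2}


def observed_distribution(prior, error=0.10):
    """Distribution of the observed context under a symmetric error model.

    Assumption: an erroneous reading is equally likely to be either of the
    two incorrect states.
    """
    spill = error / (len(prior) - 1)
    return {
        state: (1 - error) * p
        + spill * sum(q for other, q in prior.items() if other != state)
        for state, p in prior.items()
    }


h_true = entropy_bits(PRIOR.values())
observed = observed_distribution(PRIOR)
h_observed = entropy_bits(observed.values())
h_max = math.log2(len(PRIOR))

print(round(h_true, 3), round(h_observed, 3), round(h_observed / h_max, 3))
```

Under this error model the observed context has slightly higher entropy than the true context (about 1.513 versus 1.485 bits), because observation noise flattens the distribution toward uniform.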

Latency is structurally embedded in the PN model: field agents act with a latency of one transition step, mid-tier nodes act with two steps (one for fusion and one for action), and Alpha acts with three steps, accounting for the additional fusion layer. Transitions are triggered by token firings, preserving the causal dependencies between observations and actions. The simulation runs for $3 000$ iterations, generating context–action–latency triplets for each agent.

PNs offer several compelling advantages in this setting, many of which were discussed in Section 3.4. First, they preserve causal structure by design. Transitions do not fire unless all required tokens are present, ensuring that actions are conditionally dependent on prior states and inputs. Second, latency is not inferred or timestamped, but instead emerges naturally from the transition structure. Each hop in the network introduces a delay by design, allowing HIT to reflect timing in semantically grounded ways. Third, PNs allow modular representation of each agent as a subnet, facilitating both localized HIT analysis and compositional modeling. Finally, PNs are replayable, inspectable, and support debugging and visualization, features particularly valuable when diagnosing performance anomalies or validating real-time decision systems. In sum, PNs are well suited to simulation settings where ground truth and other HIT components are readily accessible.

Table 6 presents the HIT scores and associated metrics for each agent. HIT is calculated as the product of normalized entropy and mutual information, divided by normalized latency. To account for each agent’s role in the larger system, the network leverage terms are also computed using the log-scaled difference in upstream and downstream flow. These are denoted as $λ_{info}$ for information leverage and $λ_{cmd}$ for command leverage. The product of HIT and leverage yields a network-weighted HIT contribution, allowing performance to be contextualized not only in terms of internal efficiency but also systemic importance.

Table 6.

HIT results for the five-agent C2 PN simulation.

Agent	$H$	$I$	$\bar{H}$	$\bar{I}$	$\bar{T}$	HIT	$λ_{info}$	$λ_{cmd}$	${HIT}_{info}$	${HIT}_{cmd}$
Alpha	$1.510$	$1.510$	$0.953$	$0.953$	$1.000$	$0.907$	$- 12.551$	$13.551$	$- 11.387$	$12.295$
Bravo	$1.506$	$1.128$	$0.950$	$0.712$	$0.667$	$1.014$	$0.000$	$0.000$	$0.000$	$0.000$
Charlie	$1.509$	$0.843$	$0.952$	$0.532$	$0.667$	$0.759$	$0.000$	$0.000$	$0.000$	$0.000$
Delta	$1.512$	$0.830$	$0.954$	$0.523$	$0.333$	$1.498$	$11.551$	$- 12.551$	$17.309$	$- 18.808$
Echo	$1.503$	$0.833$	$0.948$	$0.526$	$0.333$	$1.495$	$11.551$	$- 12.551$	$17.266$	$- 18.761$
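The relationship between the columns of Table 6 can be checked with a few lines of arithmetic. The sketch below recomputes HIT as $\bar{H} \cdot \bar{I} / \bar{T}$ and the network-weighted contributions as HIT times leverage, using the rounded table entries as inputs. Small discrepancies in the third decimal place are expected because the inputs are already rounded. The variable and function names are hypothetical.

```python
# Rounded entries from Table 6: (H_bar, I_bar, T_bar, lambda_info, lambda_cmd).
TABLE_6 = {
    "Alpha":   (0.953, 0.953, 1.000, -12.551, 13.551),
    "Bravo":   (0.950, 0.712, 0.667, 0.000, 0.000),
    "Charlie": (0.952, 0.532, 0.667, 0.000, 0.000),
    "Delta":   (0.954, 0.523, 0.333, 11.551, -12.551),
    "Echo":    (0.948, 0.526, 0.333, 11.551, -12.551),
}


def hit(h_bar, i_bar, t_bar):
    """Normalized entropy times normalized MI, divided by normalized latency."""
    return h_bar * i_bar / t_bar


results = {}
for agent, (h_bar, i_bar, t_bar, lam_info, lam_cmd) in TABLE_6.items():
    score = hit(h_bar, i_bar, t_bar)
    # Network-weighted contributions: local HIT scaled by each leverage term.
    results[agent] = (score, score * lam_info, score * lam_cmd)
    print(f"{agent:8s} HIT={score:.3f} info={score * lam_info:.3f} cmd={score * lam_cmd:.3f}")
```

The recomputation makes the role stratification easy to trace: the field agents' low $\bar{T}$ is what lifts their HIT above that of the mid-tier nodes despite a lower $\bar{I}$ , and the sign of each weighted contribution comes entirely from the leverage term.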

The results show a distinct stratification by role. Alpha demonstrates high internal HIT, with a score of $0.907$ , indicating effective fusion of input data into timely and context-sensitive decisions. Its information leverage is markedly negative ( $λ_{info} = - 12.551$ ), while its command leverage is strongly positive ( $λ_{cmd} = 13.551$ ), consistent with its structural role as a top-level command node that consumes but does not generate upstream information while issuing directives downstream. Bravo exhibits a similarly strong internal HIT of $1.014$ , operating as a clean mid-tier relay with zero net leverage in both information and command channels ( $λ_{info} = 0.000$ , $λ_{cmd} = 0.000$ ). Charlie’s performance is noticeably weaker, with a mutual information of just $0.843$ and a corresponding HIT of $0.759$ , suggesting less informative or more stochastic policy execution. This is due to Charlie’s policy differing slightly from Bravo’s, which models heterogeneous tactical preferences: Bravo is assertive and Charlie hesitant. Bravo monitors in calm conditions, tends to query when suspicious ( $p = 0.7$ ), and engages directly when hostile. Charlie is more cautious, randomizing even in calm conditions ( $p = 0.8$ monitor), shows less decisiveness under suspicion ( $p = 0.6$ query), and only commits to engagement when hostile.

In contrast, Delta and Echo achieve the highest HIT scores at $1.498$ and $1.495$ respectively, a consequence of their direct environmental access and minimal latency ( $\bar{T} = 0.333$ ). Their strong information leverage ( $λ_{info} = 11.551$ ) and corresponding ${HIT}_{info}$ contributions ( $17.309$ and $17.266$ ) confirm their critical role in generating situational awareness, while their negative command leverage ( $λ_{cmd} = - 12.551$ ) and large-magnitude ${HIT}_{cmd}$ values ( $- 18.808$ and $- 18.761$ ) reflect their purely receptive position in the command hierarchy.

In summary, the five-agent PN simulation illustrates that HIT, when paired with a structured substrate like a PN, offers a powerful diagnostic lens on distributed command and control. It captures both local efficiency and systemic contribution, enables comparative benchmarking across agents, and supports operational inference from non-intrusive sources (e.g. network logs) rather than intrusive data collection. These claims are, however, only relevant in the context of the current simulation, and require further justification in real-life contexts. Still, the practical relevance of this simulation extends beyond abstract validation. Real-world systems such as air policing operations, C2-networks, distributed ISR cells, or autonomous swarm controllers could adopt a similar modeling approach. Constructing a PN from standard operating procedures, message routing logs, and mission trace data is feasible and non-intrusive, especially within simulations. The required information kernels include timestamped event logs, coarse context and action labels (e.g. procedural outcomes, mission state tags), and minimal routing metadata to approximate message flow. Crucially, the approach does not require deep packet inspection or intrusive instrumentation. HIT scores could be computed post hoc from existing audit trails or operational exercise logs, enabling quantitative assessment of decision efficiency at both the agent and system level.

5. Discussion

The proposed HIT score has clear limitations and caveats. Reliable MI estimation demands sufficient data, so the score is necessarily constrained by sample complexity. It may be practically achievable in simulation environments, in which ground truth is accessible. Moreover, all results in this paper treat $T$ as latency (e.g. time per decision), though the HIT score formalisms accommodate other resource dimensions without modification: in cyber-physical systems, $T$ might instead quantify energetic cost (in joules), computational burden (in, say, FLOPs), or opportunity cost in constrained scheduling domains. This flexibility allows HIT to reflect domain-relevant trade-offs while preserving its core structure. Therefore, HIT’s context-agnostic nature is a clear advantage, as it could be realistically applied in a variety of operational and experimental scenarios. On the flip side, and also related to the normalized nature of HIT scores, there is no straightforward way to compare quantitative HIT scores across heterogeneous systems without qualitative understanding of the system topology, policy, and resource constraints in question. Normalization does not guarantee comparability across systems.

It is important to note that HIT, as formulated here, does not in itself imply directional causality. Mutual information (MI) is symmetric by definition: $I (C; R) = I (R; C)$ (Equation (3)). Thus, HIT quantifies the degree of statistical dependency between context and response, but not whether the agent is responding to the environment or vice versa. Disentangling causal directionality may be addressed in future extensions using PID,⁴² which explicitly separates shared, unique, and synergistic contributions in information flow. Hence, multistep causal reasoning or prediction of future contact states remains outside HIT’s scope at present. Time-series HIT, trust-weighted fusion, and further exploiting the leverage factor $λ$ in multinode networks are promising extensions.
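For readers who want the symmetry spelled out, the short derivation below uses the standard textbook form of MI for discrete variables. The notation may differ from that of Equation (3), so it should be read as a reminder rather than as a restatement of the paper's formalism.

```latex
\begin{aligned}
I(C;R) &= \sum_{c,r} p(c,r)\,\log_2 \frac{p(c,r)}{p(c)\,p(r)} \\
       &= H(C) + H(R) - H(C,R) \\
       &= H(R) - H(R \mid C) \;=\; H(C) - H(C \mid R) \;=\; I(R;C).
\end{aligned}
```

Because the expression is unchanged when $C$ and $R$ are swapped, the same value results whether the context drives the response or the response drives the context.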

The use of PNs as a substrate for HIT in modeling networked systems was proposed in Section 3.4 and demonstrated in Section 4.3. PNs offered structural clarity and composability for representing multiagent decision systems. Unlike statistical or machine-learned models, PNs embed latency and causality directly into their transition architecture. This allowed each token to represent a contextual update or decision trigger, flowing through well-defined transitions that respect policy logic, synchronization points, and concurrency structures.

In the simulated five-agent C2 system, the PN model enabled precise control over latency paths, fusion behavior, and interagent coordination. The use of modular subnets per agent allowed for clean separation of responsibilities and straightforward attribution of HIT and leverage values. This architectural alignment provided meaningful metrics that surfaced both internal efficiency (via HIT) and systemic contribution (via ${HIT}_{info}$ and ${HIT}_{cmd}$ ). In addition, PNs offer excellent compatibility with real-world operational data streams. Using non-intrusive telemetry such as event logs, sensor classifications, action labels, and routing metadata, one could approximate PN execution traces well enough for HIT computation, possibly even in near real time. Thus, PNs serve as a transparent, inspectable, and interpretable bridge between HIT theory and tactical or cyber-physical applications. As the simulations demonstrated quantitatively, PNs are therefore a compelling alternative to other information-theoretic approaches, such as the aforementioned PID.

Overall, the HIT score furnishes a computable, interpretable, and substrate-neutral measure of efficiency grounded in classic information theory. By uniting capacity, compression, and cost, it advances Ashby’s³⁴ cybernetic program and offers a diagnostic complementary to reward curves. HIT also appears to be an attractive quantitative OODA proxy. Across the empirical studies, drops in $I (C; R)$ aligned with (simulated) orientation failures, while increases in $T$ marked delayed decide–act phases. In mesh topologies, shared evidence shortened loops, lowering $T$ and boosting HIT. This mirrors Boyd’s advocacy for rapid, shared implicit guidance⁵⁸ and, with all its caveats acknowledged,⁵⁹ lends partial support to the NCW paradigm.⁵⁰ Similarly, HIT’s three factors map neatly onto Endsley’s canonical SA levels: sensor noise degraded L1 and L2 by lowering $H (X)$ and $I (C; R)$, yet L3 (projection), as reflected in $I (R; A)$, remained high because actions were policy-consistent once an identification class was chosen. This split explains how systems may appear decisive while lacking accurate comprehension, a nuance that surfaces when HIT is paired with confusion matrices.

Future work includes the following:

Extending HIT estimation to continuous observation and action spaces using nonparametric entropy estimators, such as $k$ -nearest-neighbor methods,⁶⁰ enabling application to continuous-control and sensorimotor domains,

Formal exploration of PNs as a HIT substrate under various network topologies,

Incorporating energy costs for hardware and computational solutions,

Deploying HIT in real-world environments (C2, UAVs, sensor nets)—simulated or otherwise—to monitor emergent coordination, and

Testing HIT as a predictor of out-of-distribution robustness in deep reinforcement learning, thus extending the work by Sedlmeier et al.⁶¹
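The first item in the list above, $k$-nearest-neighbor estimation for continuous variables, can be sketched in a few lines of pure Python. The code below is a brute-force rendering of the first estimator described by Kraskov et al.⁶⁰ for scalar samples; it is an illustration under stated assumptions (max-norm distances, no ties, $O(N^2)$ search, names such as `ksg_mi` chosen here), not the author's implementation and not a full continuous-space HIT, which would also need an entropy estimate.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def ksg_mi(xs, ys, k=4):
    """k-NN estimate of I(X;Y) in nats for paired scalar samples.

    Brute-force O(N^2) neighbor search with the max-norm; adequate for a sketch.
    """
    n = len(xs)
    acc = 0.0
    for i in range(n):
        # Distance from point i to its k-th nearest neighbor in the joint space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )
        eps = dists[k - 1]
        # Count marginal neighbors strictly closer than eps.
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        acc += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - acc / n


# Synthetic check on a bivariate Gaussian, where MI has a closed form.
rng = random.Random(0)
rho = 0.9
xs = [rng.gauss(0, 1) for _ in range(500)]
ys = [rho * x + math.sqrt(1 - rho**2) * rng.gauss(0, 1) for x in xs]
estimate = ksg_mi(xs, ys)
exact = -0.5 * math.log(1 - rho**2)  # closed form for a bivariate Gaussian
print(f"k-NN estimate: {estimate:.3f} nats, closed form: {exact:.3f} nats")
```

The estimate is in nats; dividing by $\ln 2$ converts it to bits for comparison with discrete HIT components. For larger samples, a spatial index would replace the quadratic neighbor search.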

6. Conclusion

The proposed HIT score was demonstrated to be a feasible quantitative proxy for the qualitative canonical decision-making frameworks, namely OODA and SA. The sensitivity analysis and controlled simulations confirm that HIT quantifies the efficiency of information-driven decision loops in C2-like settings, especially within simulated environments where unambiguous ground truth is available. It neither guarantees correctness nor dictates doctrine, but it offers a transparent gauge of whether entropy is being compressed into action at a justifiable cost.

Incorporating PNs as a modeling substrate further strengthened HIT’s utility. The PN-based simulation of a five-agent command structure illustrated how HIT can expose both internal efficiency and systemic influence across roles. The effects that PN-enabled HIT scores surfaced as OODA and SA proxies would be challenging to disentangle using traditional statistical metrics or unstructured simulations.

It is stressed that HIT is a quantitative diagnostic of informational efficiency, not a normative framework for action selection. It does not override tactical judgment, rules of engagement, or higher-order mission objectives. As with all abstract metrics, its role is to support, not supplant, decision-making in complex environments.

Future work should further couple HIT scores with temporal and causal metrics to capture richer aspects of OODA recursion, team cognition, and Team SA, as Boyd and Endsley intended in their seminal works. Nonintrusive real-world applications might yield further fruitful lines of inquiry in researching HIT.

Footnotes

Acknowledgements

The author wishes to thank several colleagues for their encouragement and insightful discussions. While most prefer to remain anonymous, special thanks go to Riitta Penttinen, Ph.D., Matti Puranen, Ph.D., and Kimmo Halunen, Ph.D. for their generous support throughout these unconventional endeavors.

Author Note

During the preparation of this work, the author used OpenAI’s language models in order to prepare parts of the simulation Python code. After using this service, the author reviewed and edited the content as needed, and takes full responsibility for the content of the published work.

ORCID iD

A Artturi Juvonen

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Author biography

A Artturi Juvonen is a Finnish Air Force officer (callsign “Binary”) with an M.M.Sc. (mil. tech.) and an M.Sc. (cybersecurity). He is an educator specializing in electromagnetic and cyber warfare, and a doctoral candidate in military sciences. His career spans both operational and academic fields, including fighter controller service, tactical leadership, applied radio technology, and critical infrastructure research—with work that moves between the operations room, classroom, and codebases.

References

1. Boyd. Destruction and creation, 3 September 1976. https://www.coljohnboyd.com/static/documents/1976-09-03__Boyd_John_R__Destruction_and_Creation.pdf

2. Endsley MR. Toward a theory of situation awareness in dynamic systems. Hum Factors 1995; 37(1): 32–64.

3. Legg, Hutter. Universal intelligence: a definition of machine intelligence. Minds Mach 2007; 17(4): 391–444.

4. Friston. The free-energy principle: a unified brain theory. Nat Rev Neurosci 2010; 11(2): 127–138.

5. Solomonoff. A formal theory of inductive inference. Part I. Inf Control 1964; 7(1): 1–22.

6. Axelrod. The evolution of cooperation. New York: Basic Books, 1984.

7. Endsley MR. Measurement of situation awareness in dynamic systems. Hum Factors J Hum Factors Ergon Soc 1995; 37(1): 65–84.

8. Taylor. Situational Awareness Rating Technique (SART): the development of a tool for aircrew systems design. In: Salas (ed.) Situational awareness. New York: Routledge, 2017, pp. 111–128.

9. Durso, Hackworth, Truitt, et al. Situation awareness as a predictor of performance in en route air traffic control. Air Traffic Control Q 1999; 7(1): 1–20.

10. Endsley MR. A systematic review and meta-analysis of direct objective measures of situation awareness: a comparison of SAGAT and SPAM. Hum Factors J Hum Factors Ergon Soc 2019; 63(1): 124–150.

11. Loft, Bowden, Braithwaite, et al. Situation awareness measures for simulated submarine track management. Hum Factors J Hum Factors Ergon Soc 2014; 57(2): 298–310.

12. Endsley, Jones DG. Designing for situation awareness. 3rd ed. London: CRC Press, 2025.

13. Stanton, Stewart, Harris, et al. Distributed situation awareness in dynamic systems: theoretical development and application of an ergonomics methodology. Ergonomics 2006; 49(12–13): 1288–1311.

14. Stanton NA. Distributed situation awareness. Theor Issues Ergonomics Sci 2015; 17(1): 1–7.

15. Osinga FPB. Science, strategy and war: strategy and history. London: Routledge, 2007.

16. Brehmer. The dynamic OODA loop: amalgamating Boyd’s OODA loop and the cybernetic approach to command and control. In: Proceedings of the 10th international command and control research and technology symposium, McLean, VA, 13–16 June 2005, pp. 1–15. U.S. Department of Defense Command and Control Research Program.

17. Ortega, Braun DA. Thermodynamics as a theory of decision-making with information-processing costs. Proc R Soc A Math Phys Eng Sci 2013; 469(2153): 20120683.

18. Saad, Meyer. Quantifying levels of influence and causal responsibility in dynamic decision making events. ACM Trans Intell Syst Technol 2023; 15(1): 1–22.

19. Poisel RA. Information warfare and electronic warfare systems. Norwood, MA: Artech House, 2013.

20. Hutter. Universal artificial intelligence. Berlin and Heidelberg: Springer, 2005.

21. Vitányi. An introduction to Kolmogorov complexity and its applications. Cham: Springer, 2019.

22. Tononi. An information integration theory of consciousness. BMC Neurosci 2004; 5(1): 42.

23. Tononi, Boly, Massimini, et al. Integrated information theory: from consciousness to its physical substrate. Nat Rev Neurosci 2016; 17(7): 450–461.

24. Barrett, Seth AK. Practical measures of integrated information for time-series data. PLoS Comput Biol 2011; 7(1): e1001052.

25. Oizumi, Albantakis, Tononi. From the phenomenology to the mechanisms of consciousness: integrated information theory 3.0. PLoS Comput Biol 2014; 10(5): e1003588.

26. Friston. A free energy principle for a particular physics, 2019. https://arxiv.org/abs/1906.10184

27. Genesereth, Love, Pell. General game playing: overview of the AAAI competition. AI Mag 2005; 26(2): 62.

28. Brockman, Cheung, Pettersson, et al. OpenAI gym, 2016. https://arxiv.org/abs/1606.01540

29. Freeman LC. Centrality in social networks conceptual clarification. Soc Netw 1978; 1(3): 215–239.

30. Bonacich. Power and centrality: a family of measures. Am J Sociol 1987; 92(5): 1170–1182.

31. Borgatti SP. Centrality and network flow. Soc Netw 2005; 27(1): 55–71.

32. Cover, Thomas JA. Elements of information theory. Hoboken, NJ: Wiley, 2005.

33. Shannon CE. A mathematical theory of communication. Bell Syst Tech J 1948; 27(3): 379–423.

34. Ashby WR. An introduction to cybernetics. London: Chapman & Hall, 1956.

35. Saltelli, Ratto, Andres, et al. Global sensitivity analysis: the primer. Hoboken, NJ: Wiley, 2007.

36. Saltelli, Aleksankina, Becker, et al. Why so many published sensitivity analyses are false: a systematic review of sensitivity analysis practices. Environ Model Softw 2019; 114: 29–39.

37. Pawitan. In all likelihood. London: Oxford University Press, 2013.

38. Kleijnen JP. Sensitivity analysis and related analyses: a review of some statistical techniques. J Stat Comput Simul 1997; 57(1–4): 111–142.

39. Juvonen AA. Simulations to assess the feasibility and performance of the HIT metric. https://github.com/aajuvonen/hitsim (2025, accessed 14 May 2025).

40. Miller. Note on the bias of information estimates. In: Quastler (ed.) Information theory in psychology: problems and methods. Glencoe, IL: Free Press, 1955, pp. 95–100.

41. Murata. Petri Nets: properties, analysis and applications. Proc IEEE 1989; 77(4): 541–580.

42. Williams, Beer RD. Nonnegative decomposition of multivariate information, 2010. https://arxiv.org/abs/1004.2515

43. Granger CWJ. Investigating causal relations by econometric models and cross-spectral methods. Econometrica 1969; 37(3): 424.

44. Axelrod. The evolution of cooperation. New York: Basic Books, 1984.

45. Nowak, Sigmund. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner’s dilemma game. Nature 1993; 364(6432): 56–58.

46. Feller. An introduction to probability theory and its applications (Wiley Series in Probability and Statistics), vol. 1. 3rd ed. Nashville, TN: John Wiley & Sons, 1968.

47. International Civil Aviation Organization. Annex 10—aeronautical telecommunications: volume IV—surveillance and collision avoidance systems. 5th ed. Montréal, QC, Canada: International Civil Aviation Organization, 2014.

48. North Atlantic Treaty Organization. STANAG 4193: identification friend or foe (IFF), mark XIIA / mode 5 system (Standard STANAG 4193). 3rd ed. Brussels: North Atlantic Treaty Organization, 2016.

49. North Atlantic Treaty Organization. AJP-3.3: allied joint doctrine for air and space operations (Allied Joint Publication AJP-3.3). Brussels: North Atlantic Treaty Organization, 2016.

50. Alberts, Garstka, Stein FP. Network centric warfare: developing and leveraging information superiority. Washington, DC: DoD Command and Control Research Program, 1999. https://www.dodccrp.org/files/Alberts_NCW.pdf (accessed 25 October 2025).

51. Moffat. Complexity theory and network centric warfare. Reston, VA: Forty One Cooperative Research, 2003.

52. Genest, Zidek JV. Combining probability distributions: a critique and an annotated bibliography. Stat Sci 1986; 1(1): 114–135.

53. Olfati-Saber. Distributed Kalman filtering for sensor networks. In: 2007 46th IEEE conference on decision and control, New Orleans, LA, 12–14 December 2007, pp. 5492–5498. New York: IEEE.

54. Battistelli, Chisci. Kullback–Leibler average, consensus on probability densities, and distributed state estimation with guaranteed stability. Automatica 2014; 50(3): 707–718.

55. Bar-Shalom, Campo. The effect of the common process noise on the two-sensor fused-track covariance. IEEE Trans Aerosp Electron Syst 1986; AES-22(6): 803–805.

56. Perry, Moffat. Information sharing among military headquarters: the effects on decisionmaking. Santa Monica, CA: Monograph, RAND Corporation, 2004. https://www.rand.org/pubs/monographs/MG226.html

57. Watts BD. Clausewitzian friction and future war. Technical Report McNair paper 52. Washington, DC: Institute for National Strategic Studies, National Defense University, 1996. https://clausewitzstudies.org/readings/Watts-Friction3.pdf (accessed 25 October 2025).

58. Boyd. Organic design for command and control, 1987. https://www.coljohnboyd.com/static/documents/1987-05__Boyd_John_R__Organic_Design_for_Command_and_Control__PPT-PDF.pdf (accessed 7 January 2026).

59. Wilson, Congressional Research Service. Network centric operations: background and oversight issues for congress. CRS Report RL32411, 15 March 2007. Washington, DC: Congressional Research Service, Library of Congress. https://apps.dtic.Mil/sti/pdfs/ADA466624.pdf (accessed 27 October 2025).

60. Kraskov, Stögbauer, Grassberger. Estimating mutual information. Phys Rev E 2004; 69(6): 066138.

61. Sedlmeier, Gabor, Phan, et al. Uncertainty-based out-of-distribution classification in deep reinforcement learning. In: Proceedings of the 12th international conference on agents and artificial intelligence—volume 2: ICAART, Valletta, Malta, 22–24 February 2020, pp. 522–529. Setúbal: Science and Technology Publications, Lda.

HIT: a minimalist metric for adaptive information compression in dynamic environments
