Sage Journals: Discover world-class research

Abstract

Knowing how to traverse complex unstructured environments is a difficult challenge that humans achieve through logic, reasoning, and experience; yet some of the most beneficial use cases for autonomous systems require them to operate in complex environments without regular human intervention. Furthermore, for machines to support humans in such use cases, trust in decision making will be crucial, ensuring operators have the confidence to deploy the capabilities. Despite its importance, enabling autonomous agents to navigate effectively and reliably in complex terrain remains an unsolved challenge. Advances in neurosymbolic artificial intelligence present an opportunity to enhance performance in complex, explainable, and uncertain decision making, such as autonomous traversability analysis. The challenge of complex environments is compounded by their non-deterministic nature; terrain changes across domains, and its properties can change rapidly based on external factors such as weather or nearby objects, so what is true for one location on one day will not persist. This article presents a new neurosymbolic model structure designed specifically for this task. It uses experience to build a world model, similar to that of a neural network, but with some key distinguishing features such as full explainability, through-life adaptation or evolution, and zero-shot capability. This provides the reasoning backbone for an autonomous agent to determine the level of risk each object presents based on its context and therefore determine the best possible route.

Keywords

neurosymbolic AI, machine learning, knowledge-based learning, autonomous systems, complex environments

1. Introduction

Autonomous systems present an opportunity to transform the way humans complete some of the most dangerous, unpleasant, or persistent tasks, especially within domains such as Defence or Search and Rescue. These use cases stand to benefit most from autonomous systems, but have some of the most demanding requirements, most notably the ability to operate reliably in very complex terrain and dynamic domains, while maintaining a high degree of operator trust in completing the task at hand. Robustly operating in complex environments requires platforms to operate in both unstructured and uncertain terrain, where clear transition points between features may not exist, with high variation in slope, roughness and unpredictable terrain features like holes or depressions (Seraji & Howard, 2002; Silver et al., 2010; Siva et al., 2019). Furthermore, the characteristics of an object cannot be determined effectively without understanding the context in which it is found. Navigating requires inductive and deductive reasoning, an understanding of the environmental conditions, probabilistic judgement, and the ability to handle uncertainty. When considering autonomous agents, neither a symbolic nor a neural approach replicates all of these capabilities sufficiently. Neural approaches generally fail to reason effectively and suffer from a lack of explainability but can be adaptive to out-of-distribution data, while symbolic approaches can reason but require a significant upfront knowledge base and cannot effectively generalize. Fundamentally, performing these activities within an autonomous platform is not a simple extrapolation of either approach. The traversal of complex environments remains an outstanding challenge in the field of autonomous systems (Fan et al., 2021; UK MOD, 2022).

Navigating complex terrain can be considered across a number of fields of research, such as perception, localization, cognition and motion control (Panigrahi & Bisoy, 2022). This article focuses on cognition and specifically on how to enable an agent to determine the traversability of a target object. When predicting traversability for a given object within complex environments, the continuous, layered structure of individual objects means that assessing an object in isolation is insufficient for making an accurate prediction. The images in Figure 1 show two instances of the same trail object that differ in traversability as a result of the adjacent objects. Furthermore, terrain characteristics are not consistent in all domains and environments; the performance of grass changes if it has rained in the last hour, and in winter, the relevant window may be the last day. As a result, such environmental information forms part of the context and parameters of the overall prediction. Consideration must also be given to the rapid domain evolutions and inconsistencies encountered when operating within such an environment, meaning traversability predictions must have a broad generalization capacity, enabling routine handling of previously unseen situations.

Figure 1.

An example of two instances of trail objects with their surrounding context, resulting in separate risk assessments.

Vehicles will operate in close proximity to humans who, for both safety and functional reasons, need confidence in actions and to understand why a decision is to be made. As a result, operator trust must be considered within any agent cognition. This constraint makes any potential for context prediction within a conventional neural network-based solution (Gasparino et al., 2024; Jung et al., 2024; Vecchio et al., 2024; Visca et al., 2021) a barrier to practical deployment, as any decision justification would be concealed within the black-box nature of the model, inhibiting explainability. Conversely, explicit reasoning and logic are easier to interpret than neural architectures (von Eschenbach, 2021), as reasoning transcends the originator; it is communicable and can be understood externally (Mercier & Sperber, 2011). Through explicit reasoning, operators can interpret decisions and understand errors, making the prediction more deterministic and increasing trust (Ingram et al., 2021).

This concept paper builds on the concept of the World Model (LeCun, 2022), using neurosymbolic methods (Kautz, 2022) to develop a human-like approach to solving the challenge of autonomy in complex environments. It presents a model structure which enables an agent to make traversability predictions that account for an object’s context, learn dynamically with new experiences, and use causal relationships to generalize across evolving domains. This article outlines BeliefNet, a model designed to support explainable context-based prediction for complex environments.

BeliefNet uses a symbolically built neural architecture to form experience-based beliefs, generating causal relationships over time between a target object, its context and traversability risk. The model is designed to train through life, learning with an agent’s experience, enabling adaptation to new environments. The model seeks to extrapolate causal relationships, enabling it to generalize effectively to different domains. It uses belief-based inference to form deterministic and explainable predictions, improving prediction accuracy and domain adaptation, and enhancing operator trust.

The contributions of this article are as follows:

– The proposal of BeliefNet, a Neuro[Symbolic] model structure for context-based traversability prediction for autonomous systems.

– A demonstration of the performance of BeliefNet in an adapted version of the Yamaha CMU (Wang, 2021) dataset, to increase the performance of agent cognition in complex environments.

– Comparison of context-based terrain traversability prediction and object-based prediction.

– A high-level traversability taxonomy for ground platforms based on risk and speed.

2. Existing Work

2.1. Traversability Assessment

The field of traversability has received significant attention in recent years, leading to three primary approaches to traversability assessment: terrain mapping, terrain classification, and end-to-end solutions (Beycimen et al., 2023). Lidar analysis has been used extensively in traversability mapping approaches, both in direct obstacle avoidance (Larson & Trivedi, 2011; Su et al., 2021) and in more complex feature segmentation (Agha et al., 2021; Himmelsbach et al., 2010; Xie et al., 2023). While delivering promising results, such approaches are spatial in nature, potentially oversimplifying the traversability calculus by ignoring the environmental and situational semantics. Furthermore, the active nature of Lidar presents challenges in use cases where light emissions have negative secondary effects.

Terrain classification presents a method to incorporate semantics. Advances in computer vision, with the introduction of models such as YOLO (Redmon & Farhadi, 2017) and approaches such as vision transformers (Dosovitskiy et al., 2020) and panoptic segmentation (Yang et al., 2022), have made this increasingly feasible, allowing real-time inference on edge-based devices. The use of computer vision enables terrains to be segmented into constituent objects, from which semantic labels and classes can be subsequently assigned. As the terrain classification of complex environments is non-trivial, resulting from the discontinuous nature of objects, feature overlap, and environmental conditions (Manduchi et al., 2005), this continues to be an area of active research (Agha et al., 2021; Castaño et al., 2017; Chavez-Garcia et al., 2018; Filitchkin & Byl, 2012; Fritz et al., 2023; Hodgdon et al., 2021; Siva et al., 2019; Valada et al., 2017). Vision and Lidar modalities have been combined to integrate visual semantics with the spatial representations of Lidar (Maturana et al., 2018).

Terrain classification is formed of two distinct components: first detecting and isolating a specific object within the scene, then assessing the traversability of the object. Although some, such as Agha et al. (2021), have integrated both components, most of the research focuses mainly on accurately determining the object, not assessing the traversability. One challenge in this approach is that it can neglect the need to consider the environment and context of a specific object, ignoring that some objects will directly impact the traversability of others. Without such context, it can be challenging to make an accurate and reasoned determination, which is exacerbated as the complexity of the environment increases.

End-to-end deep learning approaches have had success in classifying the traversability of an image (LeCun et al., 2005; Visca et al., 2021). Self-supervised approaches, in which a platform trains a model based on self-extracted features to predict the traversability of the terrain (Seo et al., 2023; Sevastopoulos et al., 2019), reduce the volume requirements for labeled data to some degree. Such approaches can be limited in generalization performance and, crucially, limit explainability due to the conventional neural architecture of end-to-end deep learning approaches. The use of image segmentation, coupled with self-supervised learning, presents a method to increase explainability, but the computational requirements are prohibitive and the impact of object context is not made explicit (Jung et al., 2024). Although research into traversability assessment for complex environments has been significant, it remains an open area of research and one in which significant advances are required to enable autonomous systems to complete the desired tasks.

2.2. Agent Cognition

Agent cognition within the field of robotics has had significant success using probabilistic methods to model actions with high levels of uncertainty. Markov logic networks (MLNs) in event modeling have been used successfully to classify events from images based on their context (Kardaş et al., 2013), but suffer in complex environments due to cross-domain adaptation, handling partial rule activation, and probabilistic complexity (Chen et al., 2024; Khot et al.). Markov decision processes (MDPs) have also been used in path planning and path trajectory, with partially observable MDPs (POMDPs) used to handle increasing levels of uncertainty and complex environments (Lauri et al., 2023). Belief states have been integrated into POMDPs to extend planning horizons and mitigate the impact of partial observability (Gürtler & Kaminski, 2025; Kaelbling & Lozano-Pérez, 2013). Applications in robotics and autonomous systems focus primarily on action selection, in which state estimation is used to determine the best action to take across a given planning horizon; the formulation of the reward for a given action (such as its predicted traversability value) is often adjacent to the model. Probabilistic programming using tools such as ProbLog has sought to overcome the complexity limitations faced by MDPs and MLNs, such as probability modeling and handling uncertainties in predictions (Nitti et al., 2015; Sztyler et al., 2018), but requires a firm logical foundation, which may not be easily available in complex environments.

The concept of a belief within Bayesian epistemology considers that beliefs are not consistent, and the degree to which a belief is believed is adapted to the available body of evidence (Lin, 2024). Pearl emphasizes the centrality of causality in beliefs, expanding beyond correlation (Pearl, 2009). Pearl introduces a three-layer model that supports causality in the context of machine learning: association, intervention, and counterfactual; progression through these layers supports the classification of causal information and improves the degree of confidence in a given belief (Pearl, 2019). Graph structures have presented a method to effectively communicate causal relationships between entities, noting that such relationships are not hierarchical or linear (Lipsky & Greenland, 2022). The application of causality to machine learning has recently been identified as a method to significantly increase generalization and cross-task adaptation, although it presents a complex challenge due to the nature of feature extraction from data. Schölkopf et al. presented a number of potential approaches, such as self-supervised learning and reinforcement learning, and outlined the importance of observation and intervention in learning causal relationships (Schölkopf et al., 2021). Causal methods have been used to learn causal relationships and apply them within inference (Rakesh et al., 2018; Yang et al., 2021; Zhao & Liu, 2023). Causality within artificial intelligence (AI) presents significant promise, though it is limited within this challenge by the data volumes required by approaches such as autoencoders. Komanduri et al. applied a priori knowledge to autoencoders as a method of reducing upfront data requirements (Komanduri et al., 2022).

Advances in probabilistic modeling and decision making have had a significant impact on robotics, but the high levels of variability in object class, prediction confidence, object separation, and adaptive domains seen in complex environments make logical grounding and the application of finite rule sets insufficient to effectively model the complexity. The consideration of belief states and partial observability is very relevant to context-aware traversability. Here, state estimation concerns not the state of the agent given an action, but the state of the beliefs held about an object, its context, and the resulting traversability. The ability to make predictions with partially observed inputs and an incomplete understanding of object interactions, while being able to update understanding when new information is available, is very pertinent to complex environments and domain adaptation. The concepts of beliefs and causality are also of benefit to this problem set, supporting an agent both in learning from new experiences and in generalizing when facing uncertainty, both of which are common in complex environments.

2.3. Neurosymbolic AI

Neurosymbolic AI is a promising area of research in machine decision making, explainability, and reasoning (Bhuyan et al., 2024). This area of research presents architectures to integrate the reasoning performance of symbolic reasoning and the learning power of sub-symbolic, connectionist or neural network-based approaches (Kautz, 2022), based on the system-1/system-2 approach defined by Kahneman (2011). The field is still growing and there remains diversity in approaches, but all have in common the structure of perception, integrated with existing knowledge (Sheth et al., 2023), and their explainability and reasoning performance make them particularly beneficial in use cases with high levels of human–machine interaction (Barnes & Hutson, 2024). Within the broader field of autonomous system navigation, neurosymbolic architectures have been used to integrate physics rules into a neural network to determine the vehicle dynamics required to traverse a given path (Zhao et al., 2024). Neurosymbolic architectures are commonly represented by six core approaches (DeLong et al., 2025; Kautz, 2022); the two of particular relevance to this article are defined as NEURO;SYMBOLIC and NEURO[SYMBOLIC] (Kokel, 2020).

NEURO;SYMBOLIC represents a system in which symbolic and neural systems work in concert with each other, communicating and passing information between them, to achieve a common objective (Kautz, 2022). Examples of this are knowledge graph integration with neural networks (DeLong et al., 2025) that allow a neural network to query, input to, and validate symbolic knowledge graphs. NSNnet, which passes between neural and symbolic modules in an attempt to solve hand-written Sudoku challenges, presents a unique perspective that maps both input and output to a non-symbolic output, with a central symbolic reasoning engine (Agarwal et al., 2021). Both examples are dependent on a core level of symbolic reasoning. The neuro-symbolic concept learner (NSCL) is designed to unify text and visual concepts through the learning of image and question-answer pairs (Mao et al., 2019). This model presents an interesting advance as it enables symbolic concepts to be learned, without implicit knowledge being defined upfront.

In contrast, the NEURO[SYMBOLIC] system is one in which a neural network learns to reason about relationships between neural entities (Kautz, 2022; Lamb et al., 2020), in effect forming a neural network of symbolic entities. This is perhaps the most complex and least mature area of research within the field. Examples include logic tensor networks (LTNs) and logic neural networks (LNNs), which form networks from symbolic relationships and enable weighted training of the relationships using back-propagation based on a set of first-order logic statements (Badreddine et al., 2022; Riegel et al., 2020). The pLogicNet model mostly precedes the core definitions of neurosymbolic AI and represents a method similar to LTNs based on the application of MLNs (Qu & Tang, 2019). The LTN and pLogicNet are designed to improve, validate, or deconflict a set of a priori logical statements. The challenge with these approaches when applied to an agent-based approach is that they require upfront knowledge that may not be practical to obtain. Models such as the neurosymbolic reinforcement learner, INSIGHT, by Luo et al. use a neural network to learn symbolic policies that support the agent in its decision-making, enabling reasoning to be learned from the environment (Luo et al., 2024).

Neurosymbolic systems have shown significant promise in vision and multimodal tasks, such as visual question answering and scene graph generation (Bauer et al., 2025; Junaid Khan et al., 2025; Li et al., 2024). Despite success in these areas, multimodal neurosymbolic systems remain challenged in ensuring consistency between modalities as the deployment domain evolves (Lu et al., 2024). From an autonomous system perspective, this might have the most impact at the point that multiple sensors, such as the camera and Lidar, are combined into a single collaborative neurosymbolic architecture, which, as shown previously, is a common approach. While this remains an open challenge, it could limit the breadth of neurosymbolic architecture application in autonomous vehicles.

The current state of neurosymbolic AI presents significant advances in both reasoning and explainability, and the NEURO[SYMBOLIC] concept of a single neural network that encapsulates symbolic reasoning presents an opportunity to represent an agent’s world model. As with probabilistic approaches, current methods often rely on a set of a priori logic statements, leading to similar constraints on domain adaptation. As a result, BeliefNet has taken the concept of a symbolic network trained using sub-symbolic approaches, but in a manner that reflects the domain learning capabilities of models such as INSIGHT or the NSCL, in which beliefs can be inferred from training. By generating beliefs rather than rules, BeliefNet provides the ability to learn continuously from an agent’s experience, avoiding the constraint of domain-specific logic that fails to support out-of-distribution inference.

3. Approach

3.1. Overview

BeliefNet is a graph-based directed network in which nodes represent symbolic information and, unlike a neural network, the edges are not fully connected, but instead form relationships based on observation and counterfactual evidence. The network nodes and edges then act as neurons and connections in a neural network supporting weight optimization. This structure enables the model to make traversability adaptations with very small amounts of data when compared with a conventional neural network, while retaining absolute explainability in the model’s deduction. The relationship between a given set of input predicates and output results represents a belief within the model. Beliefs are something the system has some degree of confidence in being true (Newman, 2023), based on its own experiences. Conceptually, human beliefs continually evolve and adapt to our experiences and our current domain, and we learn through life. When we face something unknown, we find the set of closest beliefs, use them to make a prediction, then create a new belief that captures the separation between the prediction and the truth, often captured within the concept of predictive coding (Millidge et al., 2023). It is this function that the BeliefNet model seeks to replicate; conventional neural networks struggle with this approach, requiring full validation after each evolution. In contrast, the graph structure of BeliefNet makes domain adaptation and counterfactual generation a deterministic function of the model throughout its use.

The BeliefNet model is designed to operate post perception, so it can be agnostic to the object classification model, or even the modality. It is also capable of integrating new predicates into the model, which means new classes can be added to a perception model, and these will be incorporated into BeliefNet as they are experienced. The model is built logically before training, in which connections between objects, their context, or existing relationships (in the case of counterfactuals) are generated dynamically. A forward pass through the model is then made, followed by optimization and back-propagation using a conventional loss function. This can be achieved using a conventional upfront training set and continued further as the agent experiences its environment, providing an intervention mechanism that generates causal beliefs over time. Combined with the symbolism retained within each node, the structure provides the ability to activate only relevant sections of the model during inference, aiding explainability and providing reasoning in unknown situations. This approach acts as a zero-shot domain adaptation model, without the need for the high data volumes conventionally required by existing zero-shot approaches.

The model is designed for human interaction. The symbolic nature of the nodes and deliberate relationships means that any prediction can be traced through the model directly and that contributing nodes can be clearly identified. This enables operators to interact with the agent’s cognition in novel methods, which are likely to significantly enhance trust. Operators can clearly determine why a decision was made, and can actively correct the result and use this to directly train the model. Furthermore, if they hold logic that had not yet been experienced by the model, this can be integrated as testimonial knowledge, within the network directly. As a result, BeliefNet presents an approach capable of context-based traversability prediction and the ability to generate trust between the agent and the operator.

3.2. High Level Structure

The model is formed of a number of components, some of which are adaptations of existing deep-learning approaches and some of which are specific to BeliefNet. At a high level, the model should be considered as post-processing of a perception model. It initializes by taking the perception output and transforming this into a graph structure, known as an instance graph. The instance graph is generated from the output of a semantic segmentation model, such as YOLO (Jocher, 2020). The predictions are further enhanced through a depth perception model (Bhat et al., 2023) and the estimated three-dimensional (3D) separation between objects, and augmented with environmental tags that represent the weather, light, and domain. The instance graph is a dense symbolic representation of a given image. During training, the instance graphs are converted to a series of context graphs, each representing a target object for which a prediction is made together with its surrounding objects, distances, and environmental tags. Context graphs are passed to the building algorithm, which is a custom training method designed to extract causal relationships between objects, context, and a traversability value. This will occur even during inference, enabling new relationships to be formed as they are identified. This forms the basis of the network; each node has an activation function and a bias parameter, and each edge has a connection weight. To make a prediction, the predicates within the context graph are activated and propagated through relationships in the graph, generating values at the output nodes. When training or when provided with feedback, the optimization step occurs, which uses conventional back-propagation with an optimizer such as the Adam algorithm (Kingma & Ba, 2015) to adapt the weights in a supervised manner. It is the logical build process that precedes back-propagation which provides the reasoning capacity and explainability of the structure. The architecture in Figure 2 shows how these components fit together within the model.
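The prediction step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the node names, weights, sigmoid activation, output labels (`high_caution`, `low_caution`), and the rule that a belief node fires only when all of its predicates are present are assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    """Assumed activation function; the article does not fix a specific choice here."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical belief nodes: name -> (incoming predicate weights, bias).
BELIEFS = {
    "grass_low ∧ weather:light_rain": (
        {"grass_low": 1.0, "weather:light_rain": 1.0}, 0.0),
    "grass_low ∧ hardcore_smooth": (
        {"grass_low": 1.0, "hardcore_smooth": 1.0}, 0.0),
}

# Hypothetical output nodes: name -> (incoming belief weights, bias).
OUTPUTS = {
    "high_caution": ({"grass_low ∧ weather:light_rain": 2.0,
                      "grass_low ∧ hardcore_smooth": -2.0}, 0.0),
    "low_caution": ({"grass_low ∧ weather:light_rain": -2.0,
                     "grass_low ∧ hardcore_smooth": 2.0}, 0.0),
}


def predict(active_predicates):
    """Propagate active predicates through the sparse graph.

    Returns the output node values and the sorted names of the belief
    nodes that contributed, which serves as a simple explanation trace.
    """
    active = set(active_predicates)
    belief_values = {}
    for name, (weights, bias) in BELIEFS.items():
        # Assumption: a belief is activated only if all its predicates are present.
        if set(weights) <= active:
            z = sum(weights.values()) + bias  # each active predicate inputs 1.0
            belief_values[name] = sigmoid(z)
    output_values = {}
    for name, (weights, bias) in OUTPUTS.items():
        z = sum(w * belief_values[b] for b, w in weights.items()
                if b in belief_values) + bias
        output_values[name] = sigmoid(z)
    return output_values, sorted(belief_values)


outputs, trace = predict(["grass_low", "weather:light_rain"])
```

Because only the belief nodes matching the observed predicates are evaluated, the returned trace names exactly the symbolic nodes that produced the prediction.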

Figure 2.

Model architecture. The high-level architecture of the model is based on the structure of a neural network, but with adaptations to enable the symbolism to be retained throughout training and inference.

3.3. Data Structures

3.3.1. Context Graph

The context graph represents a target object for which a prediction is to be made, the relevant objects and environmental tags detected in proximity to the target object (the context), and how they each relate in proximity and position. The context graph acts, in effect, as the input data to the core BeliefNet model. Within a given instance, there may be multiple objects for which a traversability assessment is required. For each of these, a context graph ($G$) is generated, representing all objects ($V$) with relationships ($E$) to the target, such that $G = (V, E)$. As a subgraph of the overall instance, it captures the target object ($t$), context object ($c$), relationship type ($r$), and the strength ($s$). For the traversability use case, the relationship is the positional relationship of the two objects, and strength represents the 3D Euclidean distance, which is generated as post-processing from semantic segmentation. To ensure that this remains a subgraph, a relationship threshold ($\tau$) is established. The relationship threshold and category ranges are parameters that can be tuned within the model.

Each edge $e \in E$ within the context graph is defined as follows:

$$e = (t, c, r, s). \tag{1}$$

By way of an example, consider a target object $t = \text{grass\_low}_{1}$ at position $\mathrm{pos}(t) = (1, 1, 0)$. This is within the context of two proximal objects:

$$\begin{aligned}
c_{1} & = \text{tree}_{1}, & \mathrm{pos}(c_{1}) & = (100, 50, 300), \\
c_{2} & = \text{puddle}_{1}, & \mathrm{pos}(c_{2}) & = (11, 4, 0),
\end{aligned}$$

and has the environmental tags of:

$$\begin{aligned}
c_{3} & = \text{season:winter}, \\
c_{4} & = \text{weather:light\_rain}.
\end{aligned}$$

The graph vertices become:

$$V = \{\text{grass\_low}_{1}, \text{tree}_{1}, \text{puddle}_{1}, \text{season:winter}, \text{weather:light\_rain}\}.$$

Each edge is computed as follows; note that strength and relation are not meaningful for tags, which use the placeholder relation "etag" and a strength of 1:

$$\begin{aligned}
e_{1} & = (\text{grass\_low}_{1}, \text{tree}_{1}, r_{1} = \text{``high right''}, s_{1} = \lVert (1, 1, 0) - (100, 50, 300) \rVert_{2} \approx 320), \\
e_{2} & = (\text{grass\_low}_{1}, \text{puddle}_{1}, r_{2} = \text{``low left''}, s_{2} = \lVert (1, 1, 0) - (11, 4, 0) \rVert_{2} \approx 10), \\
e_{3} & = (\text{grass\_low}_{1}, \text{season:winter}, r_{3} = \text{``etag''}, s_{3} = 1), \\
e_{4} & = (\text{grass\_low}_{1}, \text{weather:light\_rain}, r_{4} = \text{``etag''}, s_{4} = 1).
\end{aligned}$$

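Edges and vertices are only included in the context graph if the relationship strength is below the relationship threshold. If $\tau = 30$, the tree is excluded because its strength of approximately 320 exceeds the threshold, giving:

$$G = (V = \{c_{2}, c_{3}, c_{4}\}, E = \{e_{2}, e_{3}, e_{4}\}).$$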
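The worked example above can be reproduced with a short Python sketch. The `Edge` class and `build_context_edges` function are hypothetical names introduced here for illustration, and the relation labels are passed in directly because the rule for deriving them from positions is not specified in this section.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    target: str      # t
    context: str     # c
    relation: str    # r
    strength: float  # s


def build_context_edges(target, target_pos, objects, tags, tau):
    """Return the context graph edges whose strength is below tau.

    objects maps a context object name to (position, relation label);
    tags are environmental tags, given relation "etag" and strength 1
    as in the worked example.
    """
    edges = []
    for name, (pos, relation) in objects.items():
        strength = math.dist(target_pos, pos)  # 3D Euclidean distance
        edges.append(Edge(target, name, relation, strength))
    for tag in tags:
        edges.append(Edge(target, tag, "etag", 1.0))
    # Keep only relationships below the relationship threshold.
    return [e for e in edges if e.strength < tau]


edges = build_context_edges(
    target="grass_low_1",
    target_pos=(1, 1, 0),
    objects={
        "tree_1": ((100, 50, 300), "high right"),
        "puddle_1": ((11, 4, 0), "low left"),
    },
    tags=["season:winter", "weather:light_rain"],
    tau=30,
)
for e in edges:
    print(e.context, e.relation, round(e.strength))
```

With a threshold of 30, the tree is dropped and the puddle and both tags remain, matching the filtered graph in the example.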

3.3.2. Data Labels

BeliefNet is fundamentally a supervised model, relying on labeled samples from which to learn. Context graphs are labeled with a traversability index value. Such a value could be infinitely complex and very specific to an individual agent’s performance characteristics. To increase generalized performance, a level of abstraction was selected which outlined the behavioral impact, rather than physical or mechanical. The developed traversability index categorizes expected speed (relative to an agent’s default) and the level of caution the agent will require in their traversability. The traversability risk analysis framework proposed by Fan et al. (2021), in which multiple metrics such as risk of collision, slippage, and contact loss are combined into a single risk measure, serves as the basis for a unitary caution value. Although traversability risk can be a regression problem (Inotsume & Kubota, 2022), discrete values are required for classification. Through the abstraction of metrics into behavioral categories, 11 distinct values were defined enabling relative traversability across platforms to be compared. These values are shown in the diagram in Figure 3.

Figure 3.

Traversability index. There are 11 discrete traversability values, which increase in complexity as defined by the variables in the right-hand table. These are categories that dictate the relative speed, the level of caution the platform requires, and the mobility of an object. In this context, caution can be represented as the frequency of cognitive analysis: low-caution objects can tolerate a greater frame separation between detailed processing than high-caution objects, for which every frame may be analyzed. The categories are assessed based on the individual perception of a single platform and can therefore be considered relative to the performance characteristics of a specific platform.

Labels are assigned to a context graph in two ways, depending on the phase of training. Firstly, human labeling enables a foundational training set to be developed, in which the target objects are assigned a relevant label based upon their context. This is used for initial supervised learning, where a large dataset is of value. Secondly, the agent can self-label the target objects based upon direct traversability experience. BeliefNet provides a prediction of the behavior expected when traversing a given object. Once traversed, using methods such as those outlined by Zhao et al. (2024), the separation from expected behavior is used to generate appropriate labels for the context graph of objects. This method provides the data structures to support through-life learning of the model and the domain adaptation.

3.4. Model Structure

3.4.1. BeliefNet Nodes

BeliefNet is fundamentally based on a graphical structure of nodes and edges. As with a conventional neuron (Popescu et al., 2009), each node $n \in N$ has both an input value from each edge $in_{e} \in In$ and a weight $w_{e} \in W$; there is a bias term $b$ and an activation function $act$ to account for nonlinearity. That is, the output value of a belief node is as follows:

n' = act\left(\left(\sum_{e} in_{e} \, w_{e}\right) + b\right).

(2)
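As a minimal sketch of Equation (2), the node output can be written as a weighted sum over incoming edges followed by an activation. The function names `belief_node_output` and `leaky_relu` are hypothetical, and the leaky ReLU is only an assumed example, since the equation leaves $act$ unspecified.

```python
def leaky_relu(x, slope=0.01):
    """Leaky ReLU, used here only as an assumed example activation."""
    return x if x > 0 else slope * x


def belief_node_output(inputs, weights, bias, act=leaky_relu):
    """Compute n' = act(sum(in_e * w_e) + b) over a node's incoming edges."""
    if len(inputs) != len(weights):
        raise ValueError("each incoming edge needs exactly one weight")
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return act(weighted_sum + bias)


# Two incoming edges: 0.5 * 0.8 + 1.0 * (-0.2) + 0.1, approximately 0.3.
example_output = belief_node_output([0.5, 1.0], [0.8, -0.2], bias=0.1)
```

Because the sum runs only over the edges a node actually holds, no fixed-width weight matrix is needed, which matches the sparse, per-node formulation described above.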

There are three types of nodes within the model, each loosely equivalent to a single neuron within a neural network: an input node $n_{input} \in I$, a belief node $n_{belief} \in B$, and an output node $n_{output} \in O$. An $n_{input}$ represents an atomic predicate; it can exist only once within the model and holds only outgoing relationships to a set of $n_{belief}$. The belief nodes represent a logical grouping of context predicates $C$, akin to an AND relationship, and act as the connection between the predicates and the output nodes. When predicates are combined into belief nodes, the predicates and their logical relationship are retained within the name of the node. This enables individual nodes to be referenced directly and the contributing predicates to be directly identified.

\begin{aligned} & \text{Let } c_{1} = \text{grass\_low}, \; c_{2} = \text{hardcore\_smooth}. \\ & \text{If } \langle c_{1}, e, n_{belief}^{1} \rangle \text{ and } \langle c_{2}, e, n_{belief}^{1} \rangle, \text{ then} \\ & n_{belief}^{1} = \text{grass\_low} \land \text{hardcore\_smooth}. \end{aligned}

(3)

The belief nodes can have relationships with other belief nodes, indicating counterfactual or divergent beliefs. Over time, this component enables complex reasoning and causal relationships to emerge. If a third predicate $c_{3}$ were identified with a different output value when combined with the previous predicates, the following would be true.

\begin{aligned} n_{belief}^{2} & = (c_{1} \land c_{2}) \land c_{3}, \\ E & = \{\langle c_{1}, e, n_{belief}^{1} \rangle, \langle c_{2}, e, n_{belief}^{1} \rangle, \langle c_{3}, e, n_{belief}^{2} \rangle, \langle n_{belief}^{1}, e, n_{belief}^{2} \rangle\}. \end{aligned}

(4)

The output nodes represent a specific output categorization. Output nodes are combined into layers, in which each node represents a traversability index value, and the layer is indexed to the object being classified. This provides the model with the ability to classify multiple different objects with the same model backbone. As they are a multiclass classification output, each output layer is combined with a Softmax function (Bridle, 1989). It is important to note that the Softmax applies only to the output layer of the specific prediction object, not to all outputs. This approach also sets the foundation for cross-task generalization, in which separate layers can exist for multiple tasks. Currently, the layers are used for object traversability risk; however, this could be more granular, with layers for variables like speed, roughness, and traction, each using the common model backbone.

3.4.2. The Model

The model $M$ can be represented as a combination of nodes $N$ and edges $E$, in line with any conventional graph. The nodes are formally grouped into layers based on their type. The input layer $L_{input}$ represents all possible atomic predicates; the output layers are 3D, with a separate output layer for each prediction object or prediction task $tn$; and the beliefs sit between them. As a result:

L_{input}, \; L_{belief}, \; L_{output} = I, \; (B + E), \; \{O_{tn^{1}}, O_{tn^{2}}, \dots, O_{tn^{n}}\}.

(5)

It is important to note that $L_{belief}$ is not a linear layer, as relationships frequently exist between belief nodes, forming chains. There are no constraints on the chain length of these relationships, and chain lengths will vary throughout the model; as such, the belief nodes $B$ are logically grouped into a 3D layer for simplicity.

By way of an example, suppose that the input layer is a set of three atomic predicates:

I = \{n_{input}(\text{grass\_low}), \; n_{input}(\text{puddle}), \; n_{input}(\text{tree})\}.

The model build process may then define two belief nodes and two output nodes:

\begin{aligned} B & = \{n_{belief}(\text{grass\_low AND puddle}), \; n_{belief}(\text{grass\_low AND puddle AND tree})\}, \\ O & = \{n_{output}(1), \; n_{output}(5)\}. \end{aligned}

(6)

Then edges are defined between the nodes (details explained in the model build section).

\begin{aligned} e_{1} & = (n_{input}(\text{grass\_low}); \; n_{belief}(\text{grass\_low AND puddle})), \\ e_{2} & = (n_{input}(\text{puddle}); \; n_{belief}(\text{grass\_low AND puddle})), \\ e_{3} & = (n_{belief}(\text{grass\_low AND puddle}); \; n_{belief}(\text{grass\_low AND puddle AND tree})), \\ e_{4} & = (n_{input}(\text{tree}); \; n_{belief}(\text{grass\_low AND puddle AND tree})), \\ e_{5} & = (n_{belief}(\text{grass\_low AND puddle}); \; n_{output}(1)), \\ e_{6} & = (n_{belief}(\text{grass\_low AND puddle AND tree}); \; n_{output}(5)). \end{aligned}

(7)
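The worked example in Equations (6) and (7) can be written down directly as a small graph. The following is an illustrative sketch only: the container layout and the helper names `incoming` and `contributing_predicates` are assumptions, not part of the published model.

```python
# Node sets from the example; belief names retain their predicates as readable strings.
inputs = {"grass_low", "puddle", "tree"}
beliefs = {"grass_low AND puddle", "grass_low AND puddle AND tree"}
outputs = {1, 5}  # traversability index values used in the example

# Directed edges e1..e6 as (source, target) pairs.
edges = [
    ("grass_low", "grass_low AND puddle"),
    ("puddle", "grass_low AND puddle"),
    ("grass_low AND puddle", "grass_low AND puddle AND tree"),
    ("tree", "grass_low AND puddle AND tree"),
    ("grass_low AND puddle", 1),
    ("grass_low AND puddle AND tree", 5),
]


def incoming(node):
    """Return the set of nodes that hold an edge into `node`."""
    return {src for src, dst in edges if dst == node}


def contributing_predicates(node):
    """Recursively recover the atomic predicates that feed a node."""
    if node in inputs:
        return {node}
    found = set()
    for src in incoming(node):
        found |= contributing_predicates(src)
    return found
```

Walking back from output node 5 recovers all three predicates, whereas output node 1 depends only on `grass_low` and `puddle`, which illustrates how a chain of belief nodes keeps the contributing predicates identifiable.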

4. Model Training and Inference

4.1. Model Build Process

The model is designed to be persistent and adaptive throughout the lifecycle of an autonomous agent, meaning that it can be trained from no beliefs or can use new instances, gained through experience, to update existing beliefs; both use the same build methodology. Conventionally, neural networks have an initialized architecture that remains constant throughout the lifecycle of the model, enabling the use of matrix multiplication. However, this inhibits adaptability and explainability. As a result, BeliefNet integrates a build phase prior to weight optimization, in which relationships between predicates, beliefs, and outputs are dynamically formed, based upon their presence within a supplied context graph. This occurs on the presentation of each context graph, which means that bulk training and experiential interventions retain the same capability.

The build process uses individual context graphs, or instances, $ins_{m} \in Ins$, each formed of a target object $tn$, a set of context objects $C^{ins_{m}}$, and a set of target labels $\{\phi_{tn}^{ins_{m}}\}$, where a target label is a subject object and value representing the traversability of the object about which a prediction was made. The model first establishes that each $\phi_{tn}^{ins_{m}} \in \Phi$ and that all $c_{x}^{ins_{m}} \in I$; otherwise, new predicate nodes are created. It then seeks an existing belief that matches the exact context, such that $b_{y} \equiv C^{ins_{m}}$. If one is found, it confirms that $\phi_{tn}^{ins_{m}} \in W_{b_{y}}$; otherwise, it creates a new relationship $w(b_{y}, \phi_{tn}^{ins_{m}})$. If no direct match is found, the function searches for existing beliefs that host partial matches, such that $b_{y} \subset C^{ins_{m}}$; it then creates a new belief $b_{z}$ formed of $w(b_{y}, b_{z})$ and $w(C', b_{z})$, where $C' = C^{ins_{m}} \setminus b_{y}$. If no partially matching beliefs are found, it creates the belief from the relevant input nodes directly. For each relationship, the parameters are randomly initialized to prevent biasing the model to a local minimum. This process is set out in Algorithm 1. For an instance in which partial matches were not required, the result is as follows:

\begin{aligned} ins_{m} & = (tn_{1}, \{c_{1}^{ins_{m}}, c_{2}^{ins_{m}}, c_{3}^{ins_{m}}\}, \phi_{(tn^{1})}^{ins_{m}}), \\ I & = \{c_{1}^{ins_{m}}, c_{2}^{ins_{m}}, c_{3}^{ins_{m}}\}, \\ O_{tn} & = \{\phi_{(tn^{1})}^{ins_{m}}, \dots\}, \\ B & = \{b_{1}\}, \quad b_{1} = c_{1}^{ins_{m}} \land c_{2}^{ins_{m}} \land c_{3}^{ins_{m}}, \\ E & = \{\langle c_{1}^{ins_{m}}, e, b_{1} \rangle, \langle c_{2}^{ins_{m}}, e, b_{1} \rangle, \langle c_{3}^{ins_{m}}, e, b_{1} \rangle, \langle b_{1}, e, \phi_{(tn^{1})}^{ins_{m}} \rangle\}. \end{aligned}

(8)

However, in the event that a partial belief node already existed, the build would create edges with the existing partial node, such that:

\begin{aligned} b_{1} & = c_{1}^{ins_{m}} \land c_{2}^{ins_{m}}, \\ b_{2} & = (c_{1}^{ins_{m}} \land c_{2}^{ins_{m}}) \land c_{3}^{ins_{m}}, \\ B & = \{b_{1}, b_{2}\}, \\ E & = \{\langle c_{1}^{ins_{m}}, e, b_{1} \rangle, \langle c_{2}^{ins_{m}}, e, b_{1} \rangle, \langle c_{3}^{ins_{m}}, e, b_{2} \rangle, \langle b_{1}, e, b_{2} \rangle, \langle b_{1}, e, \phi_{(tn^{2})}^{ins_{m}} \rangle, \langle b_{2}, e, \phi_{(tn^{1})}^{ins_{m}} \rangle\}. \end{aligned}

(9)
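The build logic described in this section (exact match, partial match, or creation from the input nodes) can be sketched as follows. This is a hedged illustration rather than the published Algorithm 1: the dictionary-based `model`, the function name `add_instance`, the choice of the largest existing subset as the partial match, and the uniform initialization range are all assumptions.

```python
import random


def add_instance(model, target, context, label, rng=random):
    """Fold one context-graph instance into the model structure."""
    context = frozenset(context)
    out = (target, label)            # output node: target object plus index value
    model["inputs"] |= context       # create any missing predicate nodes
    model["outputs"].add(out)

    def connect(src, dst):
        # Every new relationship receives a randomly initialized weight.
        model["weights"].setdefault((src, dst), rng.uniform(-0.1, 0.1))

    if context not in model["beliefs"]:
        # Assumption: reuse the largest existing belief that is a strict subset.
        partial = [b for b in model["beliefs"] if b < context]
        base = max(partial, key=len, default=frozenset())
        if base:
            connect(base, context)
        for predicate in context - base:   # C' = C \ b_y
            connect(predicate, context)
        model["beliefs"].add(context)
    connect(context, out)
    return model


model = {"inputs": set(), "beliefs": set(), "outputs": set(), "weights": {}}
add_instance(model, "grass", {"c1", "c2"}, 2)
add_instance(model, "grass", {"c1", "c2", "c3"}, 1)
```

After the two calls, the model holds six relationships that mirror the edge set of Equation (9): the second belief is attached to the first belief and to the one new predicate, rather than to all three predicates again.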

The model build can be augmented with a priori knowledge during the build phase, where testimonial knowledge can be represented, in effect, in first-order logic. Relationships between specific predicates can be unified as knowledge with a direct relationship to the output node. This alone would not be sufficient to capture knowledge; therefore, knowledge nodes are initiated with high default parameter values for the weights and biases, often 1. This value has an obvious impact on the model, so it must be tested for the domain. These parameters can be included in or excluded from the optimizer, meaning they can be fixed or can adapt with back-propagation. This represents the fact that knowledge could be permanently infallible, which is useful for human-defined “red lines,” or could feasibly be disproved by future evidence. Both are viable options within the model. This feature enables the model to draw on some of the benefits of tools such as the LTN (Badreddine et al., 2022), which reasons over a corpus of provided knowledge, while allowing the system to add or adapt this knowledge based on induction. Unlike comparative models, this is optional and not a prerequisite; the model can be very performant without the addition of knowledge.

\begin{aligned} & \text{Knowledge predicates are defined as follows:} \\ & KP = \{c_{1}, c_{2}, \dots, c_{n}\}. \\ & \text{A relationship is formed between the knowledge predicates and the output node as follows:} \\ & \langle KP, e_{k}, \phi_{tn} \rangle. \\ & \text{The relationship weight is dictated by the knowledge type as follows:} \\ & w_{e_{k}} = \begin{cases} w_{m}, & \text{if knowledge is mutable, trainable = True}, \\ w_{n}, & \text{if knowledge is immutable, trainable = False}, \end{cases} \quad \text{where } w_{m} < w_{n}. \end{aligned}

(10)
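A small sketch of Equation (10) is given below, showing how mutable and immutable knowledge could be distinguished at update time. The class `KnowledgeEdge`, the helpers `add_knowledge` and `apply_gradient`, the default values of $w_{m}$ and $w_{n}$, and the example predicate names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeEdge:
    predicates: frozenset
    output: str
    weight: float
    trainable: bool


def add_knowledge(predicates, output, mutable, w_m=0.8, w_n=1.0):
    """Create a knowledge relationship <KP, e_k, phi_tn> with w_m < w_n."""
    if not w_m < w_n:
        raise ValueError("Equation (10) requires w_m < w_n")
    weight = w_m if mutable else w_n
    return KnowledgeEdge(frozenset(predicates), output, weight, trainable=mutable)


def apply_gradient(edge, grad, lr=0.001):
    """Plain gradient step that leaves immutable ('red line') knowledge untouched."""
    if edge.trainable:
        edge.weight -= lr * grad
    return edge.weight


red_line = add_knowledge({"water_deep"}, "highest_risk", mutable=False)
soft_rule = add_knowledge({"puddle"}, "medium_risk", mutable=True)
```

Excluding an edge from the update is the simplest way to express fixed knowledge; in a framework with automatic differentiation, the same effect would typically be obtained by leaving the parameter out of the optimizer.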

4.2. Dynamic Activation

The concept of relevant beliefs is also a separation from conventional ML, which has been seen in neurosymbolic AI through the freezing of specific input nodes and network dissection (Mileo, 2024). The input layer is considered to be all atomic beliefs (those of the lowest fidelity) from a given context graph; only the atomic beliefs represented in the graph are activated; this is propagated through the network. Conventionally, layers in a model are defined by depth; however, as each union of predicates adds additional information to a belief, this is referred to as the fidelity of a belief. Activated atomic beliefs are combined recursively to activate or partially activate higher-fidelity beliefs. Any node that has been activated or partially activated can be considered a relevant belief. In the output layer, all relevant beliefs are passed to the activation function (Figure 4).

\begin{aligned} & \text{First, the atomic predicates are activated in the input layer. For } a_{i} \in A: \\ & A_{input} = \begin{cases} s, & \text{if } a_{i} \in C, \\ 0, & \text{if } a_{i} \notin C. \end{cases} \\ & \text{Higher-fidelity beliefs are recursively activated:} \\ & \forall b \in B, \quad \text{Activation}(b) = w_{b} \sum_{c \in b} \text{Activation}(c) \quad \text{if } b \cap C \neq \emptyset. \\ & \text{This results in a set of relevant beliefs that are used to calculate the output node values:} \\ & R = \{b \in A \cup B \mid \text{Activation}(b) > 0\}. \end{aligned}

(11)
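The recursion in Equation (11) can be sketched in a few lines. The function name `activate`, the `parents` mapping from each belief to the nodes feeding it, and the example weights and distances are assumptions made for illustration; bias terms and activation functions are omitted to keep the sketch aligned with the equation as written.

```python
def activate(context, parents, weights):
    """Return the activation of every relevant node for one context graph.

    context: {predicate: scaled distance s} for the sensed predicates
    parents: {belief: list of predicates or beliefs that feed it}
    weights: {belief: w_b}
    """
    cache = {}

    def activation(node):
        if node not in cache:
            if node in parents:  # belief node: weighted sum over its children
                cache[node] = weights[node] * sum(activation(c) for c in parents[node])
            else:                # atomic predicate: s if sensed, otherwise 0
                cache[node] = context.get(node, 0.0)
        return cache[node]

    every_node = set(parents) | {c for cs in parents.values() for c in cs}
    return {n: activation(n) for n in every_node if activation(n) > 0}


parents = {
    "grass_low AND puddle": ["grass_low", "puddle"],
    "grass_low AND puddle AND tree": ["grass_low AND puddle", "tree"],
}
weights = {"grass_low AND puddle": 0.5, "grass_low AND puddle AND tree": 2.0}
relevant = activate({"grass_low": 0.4, "puddle": 0.6}, parents, weights)
```

In this example `tree` is not sensed, so it is absent from the relevant set, while the three-predicate belief is still partially activated through its two-predicate parent.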

Figure 4.

An example of the dynamic activation based on relevant beliefs, and how this propagates through the model.

4.3. Optimization

Once the model is built, the model weights are then optimized using conventional back-propagation techniques. Relevant beliefs are activated by passing a scaled distance value, represented within the context graph, where $i_{n} \equiv c_{x}^{ins_{m}}$, then propagating the resulting values to the output nodes. The truth value is the target label, which is compared with the output values such that:

loss(t) = \text{CrossEntropy}\left(\phi_{tn}^{ins_{m}}, \max(\phi_{t})\right).

(12)

The loss is then propagated using an optimization algorithm, such as Adam (?), against the parameters existing within the nodes and edges. As only the relevant nodes were activated, the gradients outside these nodes will be zero, and the corresponding parameters are therefore not affected. The model can hold multiple output targets (represented as output layers per target), but an individual forward pass through the model is assessed against a single target object; as such, the loss is taken from $\phi_{tn}$ and not $\Phi$. The model, as with all learning networks, is heavily influenced by the learning rate, which is managed by a scheduling function. This approach to optimization is applied regardless of whether this is an initial bulk supervised training, self-analysis of the traversal of a predicted object, or a sample manually labeled by an operator to correct erroneous behavior.
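To illustrate why the loss is taken from a single output layer, the sketch below applies a softmax and cross-entropy to the layer of the target object only. The function names and the example scores are hypothetical, and the label is assumed to be given as a position within the layer.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one output layer."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def target_layer_loss(output_layers, target, label_index):
    """Cross-entropy computed on the target object's output layer only."""
    probs = softmax(output_layers[target])
    return -math.log(probs[label_index])


layers = {"grass": [2.0, 0.5, 0.1], "sand": [0.0, 3.0, 0.0]}
loss_grass = target_layer_loss(layers, "grass", 0)

# Changing another object's layer leaves this loss unchanged.
layers["sand"] = [9.0, 9.0, 9.0]
loss_grass_again = target_layer_loss(layers, "grass", 0)
```

Because the loss depends only on the scores of the target layer, parameters that feed other layers or inactive beliefs receive no gradient from this forward pass.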

4.4. Prediction Formulation

The generation of an output also has some key separations from a conventional neural network. As mentioned above, no matrix multiplication is conducted as part of the inference process. Although this could have a performance impact, it is offset by the overall sparsity of the model; for a given inference, only a small proportion of the overall model may be activated at any time. However, the output nodes still need to draw on the precursor nodes to formulate an output. This is done through recursive activation of nodes, in which each node calls back through the network, extracting the output of the relevant beliefs $rb \in RB$ and calculating the output of the node. This function is called each time inference is run, similarly to a conventional predict function. By integrating this function, the model is able to account for new predicates, beliefs, and relationships as they are integrated into the model. Uniquely, BeliefNet uses an output layer per prediction object, which provides the model with its generalization performance. Each layer has weighted relationships with the belief nodes, which means that new output layers can be integrated into the model and make generalized assessments without any direct experience of an object. The algorithm detailing how the outputs are generated is shown in Algorithm ??. The output formulation for a given target $tn$ is such that:

\begin{aligned} O_{tn} & = \{\phi_{(tn^{1})}, \phi_{(tn^{2})}, \dots, \phi_{(tn^{n})}\}, \\ \phi_{(tn^{n})} & = \sum_{i = 1}^{n} n'\left(rb_{\phi_{(tn^{n})}}^{i}\right), \\ tn' & = \max(O_{tn}). \end{aligned}

(13)
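Equation (13) can be read as summing, for each output node, the contributions of the relevant beliefs connected to it, and then selecting the highest-scoring node. The sketch below follows that reading; the function name `predict`, the explicit edge weights, and the example values are assumptions, and returning the index value of the maximum is an interpretation of $\max(O_{tn})$.

```python
def predict(relevant_outputs, output_edges, index_values):
    """Score one target's output layer from the relevant beliefs.

    relevant_outputs: {belief: stored output n'}
    output_edges: {(belief, index_value): weight} for this target's layer
    index_values: the traversability index values in the layer
    """
    scores = {value: 0.0 for value in index_values}
    for (belief, value), weight in output_edges.items():
        if belief in relevant_outputs:  # beliefs that were not activated are skipped
            scores[value] += weight * relevant_outputs[belief]
    best = max(scores, key=scores.get)
    return best, scores


relevant_outputs = {"grass_low AND puddle": 0.5, "grass_low AND puddle AND tree": 1.0}
output_edges = {
    ("grass_low AND puddle", 1): 1.0,
    ("grass_low AND puddle AND tree", 5): 0.9,
    ("rock_large", 9): 5.0,  # belief not activated in this context
}
prediction, scores = predict(relevant_outputs, output_edges, [1, 5, 9])
```

Only the activated part of the graph is visited, which is the sparsity argument made above: the cost of a prediction scales with the number of relevant beliefs rather than with the size of the whole model.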

4.5. Explainability

A key feature of the model structure is the inherent traceability through the model to determine the factors that have led to a given prediction. This can be advantageous in highly regulated domains or environments where human–machine collaboration may be high. The traceability is a direct by-product of avoiding fully connected layers, meaning that an individual belief or input node can be simply and deterministically assessed for its contribution to a given output. The model nodes retain their previous outputs in state, meaning a critical path to prediction can be traced from each output node to the input node by recursively presenting the highest $n$ contributors. This has been integrated directly into the model as an explainability function. A representation of this can then be visualized, as shown in Figure 5. The contribution $Cn$ for a given node $i$ to a subsequent node $j$ is defined as the combination of the output $\phi$ and the weight in the node $w$:

Cn_{ij} = \phi_{i} \times w_{ij}.

(14)
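A minimal sketch of this critical-path trace is shown below. It ranks the incoming contributions $Cn_{ij}$ of each node and recurses towards the inputs. The function name `explain`, the nested `incoming` mapping, and the stored example outputs are hypothetical; the published explainability function may differ in detail.

```python
def explain(node, incoming, outputs, top_n=5, depth=0, lines=None):
    """Trace the highest contributors Cn_ij = phi_i * w_ij back towards the inputs.

    incoming: {node: {source: weight}}
    outputs: {node: stored output phi from the last forward pass}
    Returns a list of (depth, source, target, contribution) tuples.
    """
    lines = [] if lines is None else lines
    ranked = sorted(
        ((outputs[src] * w, src) for src, w in incoming.get(node, {}).items()),
        key=lambda pair: pair[0],
        reverse=True,
    )[:top_n]
    for contribution, src in ranked:
        lines.append((depth, src, node, contribution))
        explain(src, incoming, outputs, top_n, depth + 1, lines)
    return lines


incoming = {
    "output_5": {"gpt": 0.9},
    "gpt": {"gp": 2.0, "tree": 0.3},
    "gp": {"grass_low": 0.5, "puddle": 0.5},
}
outputs = {"gpt": 1.0, "gp": 0.5, "tree": 0.0, "grass_low": 0.4, "puddle": 0.6}
trace = explain("output_5", incoming, outputs)
```

Each tuple records one edge on the path, so the trace can be printed as an indented tree or passed to a plotting routine to produce a view similar to Figure 5.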

Figure 5.

The model’s graph explanation function, showing the top five critical-path contributors to the overall output, visualized graphically. Contributions are calculated recursively, with each layer showing the contribution to the subsequent node.

5. Experimentation

To test the BeliefNet approach, we applied the model to a traversability scenario in which it was presented with a pre-segmented and labeled image, and sought to correctly classify the traversability of specific objects within the image. Within this scenario, we sought to test three factors:

– Terrain classification comparison: How does BeliefNet compare to a static value approach, a graph neural network approach, and a random forest classifier?
– Data size comparison: How does BeliefNet compare to a graph neural network and random forests as the size of the training data increases?
– Activation function comparison: How does the model adapt with different combinations of activation functions across the model layers?

A comparative test against an end-to-end model was not conducted, because most such approaches rely on both Lidar and imagery, and translating between a segmented classification and a pixel/voxel classification is not straightforward.

5.1. Dataset

The experimentation dataset is made up of a layered ontology used to label the Yamaha CMU data set (Wang, 2021) to better reflect the complexity of the environment. This enables object class, environmental meta-data, and class properties to be analyzed by BeliefNet. The ontology is hierarchical, with classes, subclasses, and types, increasing the overall number of classes from 11 to 72 and enabling fidelity such as vegetation_grass_low, vegetation_complex_tall, and trail_hardcore_smooth. Instances were generated from this dataset, and a baseline traversability value (as shown in Figure 3) was assigned to each object class. This served as a baseline, as it accurately represents the current terrain classification approach to the assessment of traversability, in which a value is directly assigned to a given class. To build a ground truth dataset, 300 images were re-labeled with human assessments of the traversability value, enabling humans to extract the image context and make a reasoned assessment of the relative risk associated with each object. This ground truth data is used as the basis for training BeliefNet. The re-labeled samples are then randomly split to provide a training and a test set, with all test metrics calculated on the test set. This established the framework from which the experimentation was conducted.

5.2. Methodology

The test was targeted at generating responses for the “grass,” “hardcore,” “soil,” “sand,” and “paved” ontology objects, which are the primary traversable objects. Multiple instances existed per training image, meaning that in total there were approximately 350 training samples. This is a small amount for a traditional complex network, but represents a reasonable amount of varied terrain data that an autonomous system could realistically gather about a given domain. It enables us to test the ability of the model to adapt to smaller perturbations in the domain and data. The test set was extracted as 20% of the overall training set. At all points in the test, this was used to ensure comparability. Random samples were then taken from the training set in increasing increments from 25 samples to the entire data set, and the models for each set were trained. Each model was then tested against the test set, and the accuracy was judged on the correct categorization of the risk value against the human-adjusted value. This was repeated 15 times and averaged for each model, with a new random test set identified for each iteration. The data holds large variability; given its size, randomly selecting test data through multiple iterations ensures that a broad set of complex challenges, especially zero-shot prediction, is represented in the test.
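The protocol above (a fresh 20% test split per repetition, training subsets of increasing size, and averaging over 15 repetitions) could be organized as in the following sketch. The function `repeated_evaluation` and the `train_and_score` callback are hypothetical placeholders for whichever model is being evaluated, and the seed is arbitrary.

```python
import random
from statistics import mean


def repeated_evaluation(samples, train_and_score, sizes,
                        repeats=15, test_fraction=0.2, seed=0):
    """Average a score over repeated random splits for each training size."""
    rng = random.Random(seed)
    results = {size: [] for size in sizes}
    for _ in range(repeats):
        shuffled = list(samples)
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test, pool = shuffled[:n_test], shuffled[n_test:]  # new test set each repeat
        for size in sizes:
            train = rng.sample(pool, min(size, len(pool)))
            results[size].append(train_and_score(train, test))
    return {size: mean(scores) for size, scores in results.items()}


# Dummy scorer used only to show the call shape; a real one would fit and test a model.
summary = repeated_evaluation(
    list(range(100)),
    lambda train, test: len(train) / len(test),
    sizes=[25, 50],
)
```

Holding the test set fixed across the training sizes within one repetition keeps the scores for different sizes comparable, while redrawing it between repetitions exposes the models to different unseen cases.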

The comparison models selected were a random forest and a graph neural network (GraphSAGE) (Hamilton et al., 2017) combined with an XGBoost (Chen & Guestrin, 2016) classification head. These models were selected because they separately present a neural and a symbolic approach fit for the problem, and because, in initial testing against a broad set of models, they demonstrated the most potential for extracting sufficient information from a context-based data structure when compared to models such as a linear neural network or XGBoost alone. A key contributor to this was the volume of data to which the experiment was constrained; this was a conscious choice to ensure any outcomes represented how the model could feasibly be used on a platform. Existing end-to-end approaches favor a continuous traversability classification, rather than a discrete one, making direct comparison infeasible. The input to BeliefNet is segmented objects from a detection model; in a conventional terrain classification model using state-of-the-art segmentation capabilities such as YOLO architectures (Jocher, 2020) or vision transformers, each such object would be allocated a direct traversability value, which is directly equivalent to the baseline traversability value. As a result, the baseline comparison value represents the performance of such approaches.

5.3. Metrics

As the overall classification metrics in this case are risk-based and incremental, performance can also be assessed by the distance between the predicted and actual values. A model whose predictions are closer to the actual classification performs better than a model whose predictions are further away. To capture this, we report both an absolute accuracy and a fuzzy accuracy, which scores a prediction as correct if it is within $\pm 1$ of the actual value.
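The two metrics can be stated compactly in code. This is a minimal sketch with hypothetical function names, assuming that the traversability index values are integers so that a tolerance of 1 corresponds to one category either side.

```python
def absolute_accuracy(predicted, actual):
    """Fraction of predictions that match the labeled index value exactly."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def fuzzy_accuracy(predicted, actual, tolerance=1):
    """Fraction of predictions within +/- tolerance of the labeled index value."""
    return sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual)) / len(actual)


predicted = [3, 5, 7, 2]
actual = [3, 4, 9, 2]
abs_acc = absolute_accuracy(predicted, actual)   # two of four exact matches
fuzzy_acc = fuzzy_accuracy(predicted, actual)    # three of four within one category
```

Fuzzy accuracy is always at least as high as absolute accuracy, since every exact match also falls within the tolerance.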

5.4. Variables

The baseline accuracy uses the default values for an object, based on its ontological class and value, compared with the human-edited values. This would be heavily skewed by sampling, so a consistent baseline was taken from the full dataset: 23% absolute accuracy and 43% fuzzy accuracy.

In addition to the baseline values, we tested three additional approaches:

– BeliefNet model as described in this article.
– A random forest classifier (Breiman, 2001), which was chosen as a comparator due to its reasoning capacity with small datasets and its ability to explain its results, making it the most similar in output to BeliefNet.
– A graph neural network, GraphSAGE, which uses an LSTM-based architecture (Hawthorne, 2021) to learn and generate context graph embeddings and then passes the embeddings to an XGBoost algorithm, acting as a classification head, to conduct supervised classification (Chen & Guestrin, 2016).

5.5. Results

When BeliefNet was trained to predict the outputs of the traversable object classes in the ontology (grass, hardcore, soil, sand, complex, and rock) using the full dataset, it achieved 47% absolute and 81% fuzzy accuracy; this did not include a priori knowledge. When scaled with the dataset, the model performed as shown in Figure 7. This test was repeated with only grass objects, as these represent the highest proportion of the dataset and are the terrain class with the highest variation in the traversability index; the results are shown in Figure 8. The average results for each model are shown in Table 1. The comparison between the baseline and BeliefNet across multiple prediction objects can be seen in Figure 6, noting that the number of samples is not consistent between object types, which is related to the increased variance in some objects over others; for example, the low grass distribution is significantly lower than tall grass. In the more challenging object, tall grass, due to the higher variation, BeliefNet outperformed the baseline in both absolute and fuzzy accuracy.

Figure 6.

Experiment comparison of BeliefNet and the baseline absolute and fuzzy accuracy for individual grass prediction objects. The variance in prediction value increases across objects from left to right.

Figure 7.

Experiment comparison of BeliefNet and a random forest, with a scaled dataset comparing classification of objects “grass,” “sand,” “hardcore,” “complex,” and “soil” risk classifications.

Figure 8.

Experiment comparison of BeliefNet and a random forest, with a scaled dataset comparing classification of grass objects (grass_low, grass_medium, and grass_tall) risk classifications.

Table 1.

The Summary Results Using a Full Dataset Over 15 Iterations With Random Test Sets for Each of the Test Models.

Model	Prediction Objects	Absolute Accuracy	Fuzzy Accuracy
Baseline	All	23%	43%
GraphSAGE + XGBoost	All	33%	52%
Random Forest	Grass	35%	72%
Random Forest	All	49%	79%
BeliefNet	Grass	35%	75%
BeliefNet	All	47%	81%

The graph embedding model failed to learn effective patterns within the data; this is likely due to the additional abstraction generated by the embeddings and the small amount of data for a given prediction, which prevented the model from generalizing effectively. This resulted in the model returning the same value for all instances of a given terrain and not identifying any factors that would shift the risk. Even after training using the full dataset, the model returned an absolute score of 33% and a fuzzy score of 52%. The GraphSAGE model is the comparator to a conventional neural network; its inability to converge on a solution demonstrates the importance of a neurosymbolic approach in a complex reasoning task.

The random forest was more successful and was able to make comparable predictions in both absolute and fuzzy accuracy, with the full training data achieving 79% fuzzy accuracy, compared to 81% for BeliefNet, as shown in Figure 8. However, random forests present two additional downsides compared to the BeliefNet model.

The nature of random forests means that it is challenging for them to form predictions across multiple classifications and classification objects. As a result, each classification object, such as grass_low, required its own model. While this is standard practice, it comes with a number of drawbacks. Firstly, it prevents generalized concepts from being formed across multiple terrain types, in effect reducing the training data available to each model, and this will impact domain adaptation. Secondly, in practice, there will be an I/O cost to loading new models, which could be a bottleneck in situations with more than one traversable object. Given the volume of assessments required in a continuous terrain classification, this will have a significant cost. Some instances in the dataset had five target objects, meaning five separate models would need to be loaded for one image. In contrast, BeliefNet is capable of having multiple output layers simultaneously for a single model backbone. This means that the model is able to draw generalized concepts rather than terrain-specific ones, which provides significant advantages when the domain ontology adapts. This can be seen in the data: a random forest was trained for each object, meaning that throughout training it has always previously seen a representation of the object, whereas BeliefNet may make classifications with no prior knowledge of an object. In all evaluation runs, BeliefNet made a prediction on at least one class that was not in its training distribution. This represents a trade-off between accuracy and generalization and is demonstrated clearly by the separation between absolute accuracy in all prediction objects. Although this is a separation of 3%, it is likely that this is the benefit of having a specific model for each class. While this is beneficial, it is outweighed significantly by the ability of the model to make predictions on unseen data classes, as BeliefNet demonstrates.

An additional advantage of BeliefNet over a random forest is related to the fixed inputs required for a random forest. The input data for the model are a fixed shape array with each item in the array reflecting a possible context object and the distance from that object. This has two drawbacks; firstly, in an ontology such as the one used in this model, with more than 70 objects, this results in a very sparse set of input data, which can lead to overfitting (Zhang & Lu, 2022) and may be a contributor to the flat learning profile. Secondly, the fixed nature means that the model cannot adapt to new objects identified within the domain. If a new object was identified, based on a new or adapted sensor classifier, the model would require retraining. In contrast, BeliefNet has a dynamic input length requiring only the predicates that are sensed to be passed, and it is designed to be extensible, and when a new predicate is identified, this can be directly integrated into the model. In this case, weights are initialized with a default value, but can then be fine-tuned, but in a manner which constrains the adaptation only to the relevant predicates, as only they are activated. This prevents having an adverse impact on existing and unrelated concepts. This flexibility and adaptive structure is core to BeliefNet’s domain generalization and establishes it as a through-life model, which grows with the agent’s understanding of the world.
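The difference between the two input styles can be illustrated with a short sketch. The class names are placeholders rather than the real ontology, the helper names are hypothetical, and encoding an absent object as 0.0 is an assumption about how the fixed-shape array was filled.

```python
def fixed_input(ontology, context, missing=0.0):
    """Fixed-shape array: one slot per ontology class, in ontology order."""
    return [context.get(name, missing) for name in ontology]


def dynamic_input(context):
    """Variable-length input: only the sensed predicates are passed."""
    return dict(context)


ontology = [f"class_{i}" for i in range(72)]   # placeholder names for 72 classes
context = {"class_3": 0.4, "class_40": 0.9}    # two sensed objects with distances

dense = fixed_input(ontology, context)
sparse = dynamic_input(context)

# A newly identified object has no slot in the fixed array but is kept as a predicate.
new_context = {"class_new": 0.5}
dense_new = fixed_input(ontology, new_context)
sparse_new = dynamic_input(new_context)
```

With two sensed objects, 70 of the 72 slots in the fixed array are empty, and the unseen class is silently dropped unless the array layout is changed and the model retrained; the dictionary form carries exactly the sensed predicates in both cases.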

To validate the performance characteristics of the model, we tested the grass sample set using a number of activation functions; in doing so, we were able to see how the model adapts over different combinations. Activation functions were assigned to the input layer and belief nodes separately, noting that they each had separate behaviors. Several functions were used:

– Leaky-rectified linear unit.
– Linear activation, in effect the identity of the input.
– Hard-sigmoid.
– Learnable rectified linear unit (ReLU), which was generated with a learnable scalar parameter $p$ such that the output is $ReLU(x) \cdot p$.
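For reference, the four activation functions listed above can be sketched as follows. The leaky slope of 0.01 and the piecewise-linear hard-sigmoid form $\min(1, \max(0, x/6 + 0.5))$ are assumed conventions, since the exact variants used in the experiments are not stated, and `LearnableReLU` is a hypothetical name for the scaled ReLU.

```python
def leaky_relu(x, slope=0.01):
    """Leaky ReLU with an assumed negative slope of 0.01."""
    return x if x > 0 else slope * x


def linear(x):
    """Linear activation: the identity of the input."""
    return x


def hard_sigmoid(x):
    """Piecewise-linear sigmoid approximation (one common convention, assumed here)."""
    return min(1.0, max(0.0, x / 6.0 + 0.5))


class LearnableReLU:
    """ReLU scaled by a scalar p, which would be updated during training."""

    def __init__(self, p=1.0):
        self.p = p

    def __call__(self, x):
        return max(0.0, x) * self.p


scaled = LearnableReLU(p=2.0)
```

A linear input activation passes the scaled distance through unchanged, whereas the hard-sigmoid saturates it at 0 and 1, which is the kind of behavioral difference the experiment compares.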

The experiment sought to identify any key variations in the results from the separate activation functions. Each combination was repeated 15 times and the mean results are shown in Figure 9, using a consistent learning rate of 0.001 and 15 epochs of learning. The model performed consistently across the combinations. The best performing combinations were those that involved a linear function at the input layer. As the input is a function of distance, this suggests that the model benefits from retaining the symbolic information. The learnable activation functions performed well but did not significantly outperform the alternatives, suggesting that there are sufficient model parameters without the requirement to augment them (Table 2).

Figure 9.

Experiment comparison of BeliefNet using separate activation functions (input and belief node) when classifying the grass objects (grass_low, grass_medium, and grass_tall).

Table 2.

The comparison of varying activation functions across the accuracy metrics.

Model	Abs Acc Mean	Abs Acc Var	Abs Acc Std Dev	Fuzzy Acc Mean	Fuzzy Acc Var	Fuzzy Acc Std Dev
Linear-HardSigmoid	0.353033	0.000431	0.020770	0.801267	0.003302	0.057464
Learnable-LeakyReLU	0.327467	0.000934	0.030568	0.735500	0.000445	0.021083
LeakyReLU-LeakyReLU	0.324967	0.004002	0.063263	0.713000	0.007640	0.087407
Linear-Learnable	0.367900	0.004646	0.068162	0.814167	0.001807	0.042511
HardSigmoid-Learnable	0.368600	0.002058	0.045362	0.751433	0.004052	0.063652

Most performant function in each metric is shown in bold.

6. Discussion

BeliefNet presents an opportunity to provide a unified reasoning engine to support terrain traversal, in a manner that enables an agent to make an informed decision about risk and traversability. It is inherently extensible, meaning that it can use what it has learnt within one domain and adapt this to unknown environments, and its inherent explainability means that operators can interpret, understand, and influence decision making. This approach significantly increases performance when compared with the static value approach, and enhances flexibility and explainability when compared to an end-to-end model. This article demonstrated the application of the BeliefNet model within an autonomous agent traversability reasoning task; however, this model structure has the potential to be applied more widely across similar tasks with high complexity and an underlying logic that may not be immediately accessible. This could be particularly relevant to domains with high-assurance or regulatory requirements, which AI has traditionally struggled to meet.

6.1. Deployment Considerations

When the deployment of BeliefNet to an autonomous system is considered, there are a number of topics worthy of discussion. A key challenge of capability deployment to an autonomous system is that power and space are finite and broadly shared, and there are competing priorities. As a result, the computational overhead of any single system must be considered in the context of the system as a whole. As a component of the navigational system, BeliefNet will be expected to have comparable performance to a perception system likely running at >10 fps. Although explicit benchmarking of speed performance was out of scope for this research, the removal of matrix multiplication could have a negative impact on inference speed as the model scales. This was not seen within this experimentation, but could be mitigated by the set-based nature of nodes within the model, meaning that model size will grow logarithmically with experience. This could also be further enhanced through the addition of more complex conjunctives, such as NOT and OR, which could aggregate beliefs more succinctly.

When considering how BeliefNet fits within a deployed platform, it is valuable to consider the full information processing pipeline. This research explored the cognition element alone, but that element depends on both the perception module and the low-level control of the platform. BeliefNet is not tied to any given perception model, nor does it require retraining if the perception model is retrained. However, there is a critical dependency between the two models: to make accurate context-based predictions, BeliefNet depends on accurate classification of the objects within the scene. Upstream classification errors could therefore have a potentially greater impact on the cognition output. While vision was the focus of this research, BeliefNet is intended to work with any classification modality. Low-level platform control both depends on BeliefNet and provides input to it. It relies on abstracted traversability values to predict the platform kinematics required to traverse an object of a given value effectively. Furthermore, once the platform is traversing an object, its actual performance compared with its expected performance provides a valuable feedback mechanism that can be used directly to inform optimization.
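The feedback mechanism described above can be illustrated with a simple error-driven update, in which the expected traversability value is nudged towards the value observed during traversal. This is a sketch under stated assumptions, not the update rule used by BeliefNet: the function name, the [0, 1] value scale, and the `learning_rate` parameter are all hypothetical.

```python
def update_traversability(expected: float, observed: float,
                          learning_rate: float = 0.1) -> float:
    """Move the expected traversability value towards the observed one.

    Assumes values on a [0, 1] scale (an assumption for this sketch).
    A learning_rate of 0 ignores feedback; 1 replaces the expectation outright.
    """
    if not 0.0 <= learning_rate <= 1.0:
        raise ValueError("learning_rate must be in [0, 1]")
    updated = expected + learning_rate * (observed - expected)
    # Clamp to the assumed valid range.
    return min(1.0, max(0.0, updated))
```

A small learning rate limits how far a single unusual traversal can shift the stored value, at the cost of slower adaptation.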

6.2. Further Work

This research outlines the potential for BeliefNet in the domain of complex environment traversability, but there are opportunities for further development that could enhance its applicability. First, connecting a single-modality or multimodal sensor module would be the next step towards platform integration. This would also provide an opportunity to use Lidar/vision combinations to test BeliefNet's ability to work across modalities. Second, BeliefNet could be applied to the conversion of the platform's exteroceptive and interoceptive sensing outputs into a traversability assessment, creating a full learning loop for the agent. Another area to consider is experimentation with the learning rate for manual learning or human intervention, so that learning is effective without adversely skewing model outputs. Finally, the model's performance was evaluated against a single objective function; extending the research to support multi-objective optimization would enable additional agency in more complex situations, for example by allowing BeliefNet to support the risk/time trade-off in tactical route planning.
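One common way to express a risk/time trade-off is weighted-sum scalarization, where each candidate route is scored by a weighted combination of its risk and its normalised traversal time. The sketch below illustrates that general technique only; it is not taken from the article. The `Route` fields, the [0, 1] risk scale, and the `risk_weight` parameter are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    risk: float    # assumed normalised to [0, 1]; lower is safer
    time_s: float  # estimated traversal time in seconds


def select_route(routes, risk_weight: float = 0.5) -> Route:
    """Pick the route with the lowest weighted sum of risk and normalised time."""
    if not routes:
        raise ValueError("no candidate routes")
    if not 0.0 <= risk_weight <= 1.0:
        raise ValueError("risk_weight must be in [0, 1]")
    slowest = max(r.time_s for r in routes)

    def cost(route: Route) -> float:
        # Normalise time by the slowest candidate so both terms are on [0, 1].
        time_norm = route.time_s / slowest if slowest > 0 else 0.0
        return risk_weight * route.risk + (1.0 - risk_weight) * time_norm

    return min(routes, key=cost)


candidates = [Route("direct", risk=0.8, time_s=60.0),
              Route("detour", risk=0.2, time_s=120.0)]
cautious = select_route(candidates, risk_weight=0.9)  # favours low risk
urgent = select_route(candidates, risk_weight=0.1)    # favours low time
```

With these example values, a high risk weight selects the slower but safer detour, while a low risk weight selects the faster direct route. A weighted sum cannot reach every point on a non-convex Pareto front, so other multi-objective methods may be preferable in practice.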

7. Conclusion

In this article, we have defined the challenge of traversability assessment for autonomous systems operating in complex environments, demonstrated the importance of context within predictions, and detailed BeliefNet as a novel neurosymbolic model capable of generating context-based traversability predictions. BeliefNet is capable of through-life learning from the experience of an autonomous agent, which provides a method to enhance domain adaptation, and it uses causal beliefs to support predictions in unknown situations. By retaining a symbolic structure within the network, it remains explainable and gives operators the ability to interact with model training directly, enhancing trust. BeliefNet represents an advance towards enabling autonomous system deployment and performance in complex, demanding environments.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Footnotes

ORCID iDs

Tom Scott

Argyrios Zolotas

Yang Xing

