
Abstract

Although neural-symbolic techniques for symbol grounding in sensory data have shown significant effectiveness, they still require substantial training. This article revisits symbolic-only approaches by introducing a new algorithm for creating hierarchical concept structures from spatial sensory data. The method is based on Bateson’s idea of difference as the fundamental element of concept formation. By leveraging this principle, the algorithm extracts atomic features from raw data through basic sequential comparisons within a stream of multivariate numerical values. Experiments carried out on a set of common objects indicate that the method can successfully discriminate and assimilate categories as needed without training. The results show that the model improves on the neural networks it has been tested against, which required more than 400 training examples for the same task. The results also show that the model can generate rich conceptual structures and human-like representations, which (i) facilitate high composability, (ii) support formal reasoning, (iii) inherently enable generalisation, and (iv) possess potential for generative behaviour. Consequently, this approach offers a compelling contribution to symbol grounding and neural-symbolic research by providing a seamless algorithm to bridge perception and conceptual knowledge.

Keywords

Bateson, symbol grounding, sensory data, spatial data, multivariate time series, concept emergence, concept extraction, concept generation, formal concept analysis (FCA)

1. Introduction

The motivation behind this research is to investigate new symbolic methods capable of generating rich and composable conceptual representations of spatial objects. Humans possess remarkably sophisticated encodings of the world, characterised by their exceptional ability to differentiate and assimilate the realities they perceive. Even without extensive training, these encodings enable the formation of highly flexible and composable concept structures that support logical operations and exhibit generative capabilities. Additionally, these structures can be expressed semantically through symbols. Therefore, bridging the gap between sensory data and such internal models of the world is a pivotal challenge in artificial intelligence (AI). Symbolic or logic-based AI systems are often credited with replicating certain crucial aspects of human thought, such as compositionality, causality, and reasoning capacity (Newell, 1980; Simon, 1995). However, they face substantial hurdles when confronted with raw spatial data, including the symbol grounding problem (SGP) as formulated by Harnad (1990) (see Section 2.2), and a computational complexity bottleneck. In contrast, the connectionist approach, which emerged from the work of early cyberneticians Wiener (1948) and Pitts & McCulloch (1947), has proven exceptionally adept at handling this type of input, exemplified by pixel-based images. This proficiency has been firmly established through neural network paradigms, particularly deep learning (Lecun et al., 2015), which employ error optimisation as their core learning mechanism. However, those strengths featured by symbolic models turn out to be important limitations in connectionist methods.

In this context, the emergence of neural-symbolic models (NSMs) (Garcez et al., 2015) has drawn attention as a promising approach to address the limitations in both symbolic and connectionist AI paradigms. Indeed, the literature on NSMs has witnessed substantial growth in recent years (Hitzler et al., 2022; Sarker et al., 2021), although much earlier contributions can also be found (Fdez-Riverola & Corchado, 2003; Zhou et al., 2003). These hybrid models aim to overcome important shortcomings previously identified by prominent figures in AI research (Bengio, 2017; Cristianini, 2014; Lake et al., 2015; Xia et al., 2021). However, while NSMs are very promising, several challenges remain, particularly with regard to the SGP.

One of the primary challenges is the requirement for neural networks to be trained on labelled data, which introduces external symbolic information, potentially compromising the SGP’s objective of grounding symbols in sensory data. This is evident in the case of neural networks used for classification tasks, where labelled data is essential for training. In contrast, certain NSM approaches may appear to circumvent the reliance on external symbolic information. Notably, neural autoencoders can be harnessed to extract clusters from complex data in an unsupervised or self-supervised manner. Then, arbitrary symbols may be assigned to these clusters for further processing. However, a closer examination of some of these works (Section 3) suggests that they may not fully adhere to the zero semantical commitment condition (Z condition) proposed by Taddeo & Floridi (2005). The ‘Z condition’ asserts that an AI agent should autonomously elaborate its own semantics for the symbols it manipulates, without relying on external symbolic knowledge. Another important limitation of these methods is the need for large training sets, even when autoencoders are employed in a self-supervised fashion (and no labelling is required).

The method proposed in this article, BIGA (Bateson-inspired grounding algorithm), is primarily intended for the domain of real-world objects, restricted in this initial iteration of the method to multi-attribute one-dimensional entities. Nevertheless, it is envisioned that future work will extend this scope to incorporate higher dimensional and more complex data modalities (such as images, videos, audio, etc.). Examples include the trajectories of particles suspended in a fluid, or, more generally, any one-dimensional geometry that can be derived from, for instance, the spatial contours of material forms. Alternatively, more abstractly, a curve in space with varying width and colour (or any other set of properties) can also be considered. In all of these cases, objects can be fully represented as multivariate time series.

The algorithm employs a straightforward approach to extract symbols from a multivariate stream of numerical data using a basic feature extraction process. This operation occurs continuously as the sensor moves along an object, similar to sensing the shape of an object by swiping a finger along its contour. Thus, the act of sensing is always modelled as a dynamic sequential process, which is an essential aspect of this work. The method draws inspiration from Bateson’s concept that an idea is essentially a difference: ‘a difference that makes a difference’ (Bateson, 1999). Similarly to the approach presented by Cárdenas-García (2022) and Cárdenas-García & Ireland (2020), the proposed sensor generates qualitative information by encoding the quantitative aspects of continuous values through a comparator element. The fundamental features analysed in this method are whether two successive values $(a_{1}, a_{2})$ in the stream are equal $(a_{2} = a_{1})$, or the second is greater than the first $(a_{2} > a_{1})$, or vice versa $(a_{2} < a_{1})$. These elementary comparisons are summarised as $\{=, >, <\}$. For example, two consecutive $x$-coordinate values of a curve ($x_{1} = 2$, $x_{2} = 3$) are converted to the symbolic description ($x >$), which means that comparing $x_{2}$ to $x_{1}$ yields ($x_{2} > x_{1}$). Subsequently, more complex features are recursively extracted by further executing (symbolic) comparisons of the previously obtained symbols.
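The comparator step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the BIGA implementation: the function names and the sample curve are hypothetical, and the second-level rule in `compare_symbols` (marking whether successive symbols repeat or change) is an assumption, since the exact rule for comparing symbols is not specified in this section.

```python
from typing import Dict, List, Sequence


def compare(a1: float, a2: float) -> str:
    """Return the atomic comparator symbol for two successive values."""
    if a2 > a1:
        return ">"
    if a2 < a1:
        return "<"
    return "="


def encode_stream(stream: Dict[str, Sequence[float]]) -> Dict[str, List[str]]:
    """First level: compare each value with its predecessor, per sensor variable."""
    return {
        var: [compare(a1, a2) for a1, a2 in zip(values, values[1:])]
        for var, values in stream.items()
    }


def compare_symbols(symbols: Sequence[str]) -> List[str]:
    """Assumed second level: mark whether successive symbols repeat or change."""
    return ["same" if s2 == s1 else "changed" for s1, s2 in zip(symbols, symbols[1:])]


# Hypothetical two-variable stream sampled along a curve.
curve = {"x": [2, 3, 4, 4, 3], "y": [0, 1, 1, 0, 0]}
level1 = encode_stream(curve)
level2 = {var: compare_symbols(syms) for var, syms in level1.items()}

print(level1)  # {'x': ['>', '>', '=', '<'], 'y': ['>', '=', '<', '=']}
print(level2)  # {'x': ['same', 'changed', 'changed'], 'y': ['changed', 'changed', 'changed']}
```

Note that the first pair of $x$ values, $(2, 3)$, yields the symbol `>` as in the example in the text, and that no magnitude survives the first level: only the direction of change is retained.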

The work presented here hinges on the notion that sensing does not primarily grasp magnitudes, but rather, it discerns differences in magnitude. Humans exhibit remarkable proficiency in detecting variations and proportions, but fall short in assessing absolute magnitudes with precision. This inherent limitation has led to the development of tools for precise measurement. Biological neurons also follow a similar pattern, conveying signals not by their intensity but in an ‘all-or-none’ manner (firing or not firing) (Jr, 1994). While most connectionist machine learning models utilise floating point values as input and propagate their magnitudes through the network, Bateson’s differential ideas offer a promising foundation for constructing human-like cognitive models. The proposed model deviates from the connectionist error optimisation paradigm by simultaneously (i) eliminating the need for training to generate high-level concepts and (ii) adhering to the all-or-none signal transmission nature of biological neurons. This novelty opens up the possibility of developing bio-inspired connectionist systems based on the firing scheme found in nature, capable of constructing rich conceptual representations without the need for training. An early sketch of ideas regarding the design of such a system can be found in de Miguel Rodríguez (2023).

Regarding the SGP, it is not the aim of this study to propose a definitive solution, but to establish a foundation for future research by exploring alternative approaches. It is acknowledged that the method presented here does rely on external information, as it assigns semantic values to the tokens or symbols representing the atomic comparators and the sensor variables. And for this reason, it violates the Z condition introduced earlier. However, it is worth considering the possibility of leaving these semantic values unassigned, essentially rendering them unlabelled or anonymous. This could be achieved in an embodied implementation of the model, where these tokens are mapped to the system’s innate or ‘wired’ discretisation capacity. In this scenario, consensus on the meanings of these unspecified tokens could be established among a group of agents, employing mechanisms similar to those discussed by Taddeo & Floridi (2005). These mechanisms could involve schemes such as Vogt’s ‘guess game’ (Vogt, 2003) or later attempts by Taddeo & Floridi (2007). This approach shows two potential advantages:

Reduced complexity: The algorithm is based on the use of the minimum number of comparators that allow establishing a maximum differentiation between the values obtained by the sensors. For example, in the case of continuous attributes, three comparators $(<, >, =)$ are used, which is the minimum number necessary to obtain a total order in the attribute domain. A similar decision could be made with finite domain attributes where the semantics of the attribute should indicate whether this same set of comparators is required, or, whether it could be further reduced, where the minimal case requires only a single comparator.

Bottom-up emergence: The model possesses an inherent generative ability, stemming from its full symbolic traceability, from the sensor level to the highest levels of generated concepts. This generative capacity would enable producing outputs into the environment, representing aspects of the internal representations of objects. These outputs could then be perceived by other processes (agents), initiating a consensus-building loop of interactions. While the specifics of this multilateral agreement process remain unclear, there is potential for the emergence of a rudimentary language or alphabet without external interference, and it could hypothetically provide a path towards circumventing the SGP in future work.

The proposed method, based on the recursive computation of differences, might seem simplistic and potentially limited in its discriminative power at first glance. However, the results demonstrate surprising potential for common objects, particularly when incorporating multisensory data. For example, when perceiving a curve with a robotic or human arm, not only positional coordinates $(x, y)$ but also angles at the arm’s end and at the hand’s and elbow’s joints can be utilised. The combination of symbolic features extracted from these different sets of coordinates and angles can lead to richly descriptive representations.

Although the method may face limitations in certain domains, it holds promise for achieving satisfactory discrimination in others. Data domains where the patterns rely heavily on subtle signal magnitude structures will encounter limitations, including fields such as financial time-series analysis or machinery fault detection using sound or vibration spectra. Conversely, areas where patterns are better described conceptually or through relative values and proportions are ideal for this method. Examples include the field of design, handwritten text recognition, and the characterisation of spatial trajectories, which may involve speed and acceleration profiles among other attributes. In short, if the accuracy in distinguishing everyday objects proves to be sufficient, the model can leverage its inherent advantages to address several critical aspects of contemporary AI.

As a summary, the following items capture the spirit of the model, which:

Operates on qualitative differences rather than quantitative magnitudes, aligning with human cognition.

Generates rich and composable representations without extensive training.

Exhibits complete traceability, enabling clear links from sensor data to the highest-level concepts.

Supports formal reasoning, allowing for logical manipulation of concepts.

Is naturally equipped for generalisation, enabling the application of concepts to new situations and contexts.

Generates human-like semantic concept structures, capturing the essence of human understanding.

Provides multiple conceptual definitions of the same object, reflecting the flexibility of human cognition.

Is inherently generative, allowing for the creation of new concepts and representations.

Requires minimal external symbolic input, with only three comparator symbols and one symbol per sensor variable. This makes it conducive to future development of bio-inspired implementations and new avenues to tackle the SGP.

Adopts a bottom-up approach, building complex concepts from basic atomic elements, mimicking the process of language acquisition.

Utilises a straightforward algorithm, simplifying its implementation and making it easier to build upon for future challenges and applications.

In essence, the main contribution of the method is that it provides a simple algorithm to bridge perception and semantic conceptual structures seamlessly, without either (i) necessitating combining models from different AI paradigms as is the case of neural-symbolic models, or (ii) limiting the model’s ability to create vast conceptual richness by allowing it to generate its own ‘labels’ in a bottom-up approach.

2. Fundamentals

This work aims to generate conceptual structures from atomic features acquired from sensory data. The term ‘concept’ has many definitions across a wide number of disciplines (Goguen, 2005). In the realm of computer science, these definitions range from the notion of ‘universals’ introduced by the pioneer of cybernetics, Wiener (1948), to that of ‘patterns’ handled by connectionists. However, most approaches to concept learning come from symbolic AI, with a significant emphasis on semantics. An extensive review of these approaches can be found in the work of Goguen (2005), which includes the geometrical conceptual spaces of Gärdenfors (1990), the symbolic conceptual spaces of Fauconnier, the information flow of Barwise and Seligman, the formal concept analysis (FCA) of Wille (1982), the lattice of theories of Sowa (2005), and the conceptual integration of Fauconnier and Turner. While concepts in these theories can still be viewed as a form of ‘pattern’, they possess an intrinsic symbolic structure that facilitates compositionality, generality, semantics, reasoning, explainability, and generative capabilities. Thus, the term concept in AI encompasses not only patterns, but a more fundamental suite of human-like cognition abilities. Due to the strong resemblance between the conceptual structures targeted in this study and those found in FCA, the latter has been selected for the initial iteration of the method.

2.1. Formal Concept Analysis

FCA offers a robust engine for pattern learning, reasoning capabilities, and an organised hierarchy of concept networks represented as formal lattices (Wille, 1982).

FCA relies on the idea of a formal context, which consists of a collection of attributes, objects, and their relationships. Formally, a formal context $K := (G, M, I)$ consists of two sets $G$ (the objects) and $M$ (the attributes), and a relation $I \subseteq G \times M$ . The relation $(g, m) \in I$ can be read also as the object $g$ has the attribute $m$ .

It is usual to represent a formal context through a binary table/matrix $G / M$, where $G / M (g_{i}, m_{j}) = 1$ if and only if the object $g_{i}$ has the attribute $m_{j}$ (see Table 1).

Table 1.
A Simple Formal Context Example With Four Objects and Three Attributes.

$G / M$	$m_{1}$	$m_{2}$	$m_{3}$
$g_{1}$	0	1	1
$g_{2}$	1	0	1
$g_{3}$	1	0	0
$g_{4}$	0	1	0

Given a set $A \subseteq G$, the set of attributes common to the objects in $A$ is denoted by $A'$. Similarly, given $B \subseteq M$, the set of objects which have all attributes in $B$ is denoted by $B'$.
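As an illustration, the two derivation operators can be written directly as set operations over the formal context of Table 1. The following is a minimal sketch; the names `CONTEXT`, `common_attributes`, and `common_objects` are hypothetical and chosen only for this example.

```python
# Formal context of Table 1: each object maps to the set of attributes it has.
CONTEXT = {
    "g1": {"m2", "m3"},
    "g2": {"m1", "m3"},
    "g3": {"m1"},
    "g4": {"m2"},
}
ATTRIBUTES = {"m1", "m2", "m3"}


def common_attributes(objs):
    """A': attributes shared by every object in objs (all of M when objs is empty)."""
    result = set(ATTRIBUTES)
    for g in objs:
        result &= CONTEXT[g]
    return result


def common_objects(attrs):
    """B': objects that have every attribute in attrs (all of G when attrs is empty)."""
    attrs = set(attrs)
    return {g for g, has in CONTEXT.items() if attrs <= has}


print(sorted(common_attributes({"g1", "g2"})))  # ['m3']
print(sorted(common_objects({"m2", "m3"})))     # ['g1']
```

The empty-set cases follow the usual convention: no object constrains the attributes, so the derivation of the empty object set is the whole of $M$, and dually for attributes.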

A formal concept of $K$ is a pair of sets $(A, B)$ with $A \subseteq G$, $B \subseteq M$, verifying $A' = B$ and $B' = A$. The set of objects $A$ is known as the extent, and the set of attributes $B$ as the intent, of the formal concept $(A, B)$.
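For a context as small as Table 1, all formal concepts can be enumerated by brute force: take every subset of objects, compute its common attributes, and then the objects sharing those attributes. The sketch below assumes nothing beyond the definition above; it is exponential in the number of objects and is meant only to make the definition concrete, not to stand in for the dedicated FCA algorithms used in practice. All names are hypothetical.

```python
from itertools import combinations

# Formal context of Table 1 (object -> attributes it has).
CONTEXT = {
    "g1": frozenset({"m2", "m3"}),
    "g2": frozenset({"m1", "m3"}),
    "g3": frozenset({"m1"}),
    "g4": frozenset({"m2"}),
}
ATTRIBUTES = frozenset({"m1", "m2", "m3"})


def intent_of(objs):
    """A': attributes common to all objects in objs."""
    result = set(ATTRIBUTES)
    for g in objs:
        result &= CONTEXT[g]
    return frozenset(result)


def extent_of(attrs):
    """B': objects that have every attribute in attrs."""
    return frozenset(g for g, has in CONTEXT.items() if attrs <= has)


def formal_concepts():
    """Close every subset of G and keep the distinct (extent, intent) pairs."""
    concepts = set()
    objects = sorted(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = intent_of(subset)
            concepts.add((extent_of(intent), intent))
    return concepts


for extent, intent in sorted(formal_concepts(), key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

For Table 1 this yields seven concepts, ranging from the concept with empty extent and intent $\{m_{1}, m_{2}, m_{3}\}$ to the concept with extent $\{g_{1}, g_{2}, g_{3}, g_{4}\}$ and empty intent.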

Given a formal context, FCA allows arriving at a concept lattice, which is a hierarchical network of relationships between the formal concepts obtained from the context (each node in the lattice is a concept): if $(A_{1}, B_{1})$ and $(A_{2}, B_{2})$ are concepts of a formal context, then $(A_{1}, B_{1})$ is called a subconcept of $(A_{2}, B_{2})$ (or $(A_{2}, B_{2})$ is a superconcept of $(A_{1}, B_{1})$ ), denoted as $(A_{1}, B_{1}) \leq (A_{2}, B_{2})$ , provided that $A_{1} \subseteq A_{2}$ (which is equivalent to $B_{2} \subseteq B_{1}$ ). The relation $\leq$ is called the hierarchical order (or simply order) of the concepts. The set of all concepts of the formal context $(G, M, I)$ ordered in this way is denoted by $B (G, M, I)$ , and is called the concept lattice of the context.
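The hierarchical order can likewise be checked mechanically. The sketch below, again with hypothetical names and a brute-force enumeration, recomputes the concepts of Table 1, tests the subconcept relation through extent inclusion, and extracts the covering pairs, that is, the pairs of concepts with no other concept strictly between them, which are the edges usually drawn in a lattice diagram.

```python
from itertools import combinations

# Formal context of Table 1 (object -> attributes it has).
CONTEXT = {
    "g1": frozenset({"m2", "m3"}),
    "g2": frozenset({"m1", "m3"}),
    "g3": frozenset({"m1"}),
    "g4": frozenset({"m2"}),
}
ATTRIBUTES = frozenset({"m1", "m2", "m3"})


def concepts_of(context, attributes):
    """Brute-force enumeration: close every attribute subset."""
    found = set()
    attrs = sorted(attributes)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            extent = frozenset(g for g, has in context.items() if set(subset) <= has)
            if extent:
                intent = frozenset.intersection(*(context[g] for g in extent))
            else:
                intent = frozenset(attributes)
            found.add((extent, intent))
    return found


def is_subconcept(c1, c2):
    """(A1, B1) <= (A2, B2) if and only if A1 is a subset of A2."""
    return c1[0] <= c2[0]


def cover_edges(concepts):
    """Pairs (lower, upper) with lower < upper and no concept strictly in between."""
    edges = []
    for lower in concepts:
        for upper in concepts:
            if lower != upper and is_subconcept(lower, upper):
                between = any(
                    mid not in (lower, upper)
                    and is_subconcept(lower, mid)
                    and is_subconcept(mid, upper)
                    for mid in concepts
                )
                if not between:
                    edges.append((lower, upper))
    return edges


concepts = concepts_of(CONTEXT, ATTRIBUTES)
edges = cover_edges(concepts)
print(len(concepts), "concepts,", len(edges), "covering pairs")
```

The equivalence stated in the text can be verified on the same data: for every pair of concepts, extent inclusion $A_{1} \subseteq A_{2}$ holds exactly when the reverse intent inclusion $B_{2} \subseteq B_{1}$ holds.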

The concept lattice corresponding to the formal context in Table 1 is shown in Figure 1, where, for example, the node $(G_{1})$ is a concept with intent $\{m_{2}, m_{3}\} \subseteq M$ and extent $\{g_{1}\} \subseteq G$.

Figure 1.

A simple formal concept lattice example.

Works such as Borrego-Díaz & Páez (2022) have explored the potential of these models in the field of explainable AI. However, implementing FCA on detailed data (such as raw sensor data) has been a persistent challenge. Due to its combinatorial nature, applying FCA often results in large and complex lattices that are computationally intensive and contain a large number of irrelevant concepts. Many studies have focused on complexity reduction to address this issue. Some of the first approaches in this regard were led by Bělohlávek, exploring constraints based on attribute dependency formulas (Belohlávek & Sklenár, 2005a), attribute equivalence (Belohlávek et al., 2004), hierarchically ordered attributes (Belohlavek et al., 2004), and the reduction of fuzzy lattices using hedges (Belohlávek & Vychodil, 2005). An important contribution to complexity reduction was the introduction of granular computing (Yao, 2000; Zadeh, 1979) to the realm of FCA (Belohlávek & Sklenár, 2005b; Wu et al., 2009), which handles floating-point values by relying on unsupervised machine learning techniques to automatically generate ‘granules’ or clusters according to the desired complexity level. Other strategies include variable threshold models (Zhang et al., 2007), rough concept analysis (Kent, 1996; Saquer & Deogun, 1999), and fuzzy concept analysis (Saquer & Deogun, 2001), which emerged as important contributions to the field and are widely adopted today. Despite these proposals and more recent efforts to improve computing efficiency (Mouakher et al., 2021), enhance flexibility (Min & Kim, 2019), and reduce complexity in concept lattices (Aragón et al., 2021; Hao et al., 2021), research on FCA applied to complex sensor data remains limited, with only a few examples such as the work of Boukhetta et al. (2020), where FCA is used to extract sequential patterns from interval-based sequences.

2.2. The Symbol Grounding Problem

In 1990, Harnad introduced the SGP (Harnad, 1990), expanding on Searle’s earlier ‘Chinese Room Argument’ (Searle, 1980). Harnad writes: ‘ $\dots$ he pointed out that if the Turing test were conducted in Chinese, then he himself, Searle (who does not understand Chinese), could execute the same programme that the computer was executing without knowing what any of the words he was processing meant. So if there’s no meaning going on inside him when he is implementing the programme, there’s no meaning going on inside the computer $\dots$ ’ (Harnad, 2006). This argument reveals that since meanings are assigned to symbols in largely arbitrary manners, a model that manipulates these symbols lacks understanding of the meanings they represent. Symbols can, of course, be grounded or linked to their meanings through direct assignment, meaning that the expert creating the model might hard-code those connections within the programme. However, this option presents some obvious limitations. The question then becomes whether an AI system can independently develop a set of symbols that are inherently linked to their meanings without any external input. In his seminal paper, Harnad does not provide a formal definition of the SGP. Instead, the problem is formulated through examples and the following questions: ‘How can the semantic interpretation of a formal symbol system be made intrinsic to the system, rather than just parasitic on the meanings in our heads? How can the meanings of the meaningless symbol tokens, manipulated solely on the basis of their (arbitrary) shapes, be grounded in anything but other meaningless symbols?’.

Since the formulation of the SGP, there have been a number of attempts to develop models capable of addressing it. Many of these efforts were collected and discussed early on by Taddeo & Floridi (2005), who introduced the notion of the zero semantical commitment condition (or Z condition) as a summary of the ultimate requirement for any valid solution to the SGP. In the words of the authors, (i) ‘no semantic resources (some virtus semantica) should be presupposed as already pre-installed in the artificial agent’ and (ii) ‘no semantic resources should be uploaded from the “outside” by some deus ex machina already semantically-proficient’. The implications of this condition are far reaching: not only do symbols need to emerge from sensory data, but the artificial agent also needs to learn autonomously how to identify those symbols as icons that have been assigned a certain perceptual reality and understand how to operate with them. Otherwise, there is the assumption that semantic capabilities are already present in the agent, thus failing the Z condition. In this context, the SGP might be crucial for comprehending how humans, thousands of years ago, developed a symbolic and semantic system purely from sensory inputs. Consequently, the SGP intersects various fields, including AI, linguistics, cognitive science, and philosophy, among others. Recent studies are available by Nagoev et al. (2023), Dushkin & Stepankov (2023), Li (2022), and Chang et al. (2020).

3. Related Works

Recent advances in neural-symbolic research have witnessed a convergence of the strengths of neural and connectionist models with the benefits of symbolic and logic-based AI. SATNet (Wang et al., 2019), a notable work in this domain, achieves logical reasoning within deep learning models by incorporating a differentiable maximum satisfiability solver (MAXSAT). The model transforms the task of learning a logical structure from data into the acquisition of a MAXSAT solution for a well-defined problem instance. This framework was further refined by Topan et al. (2021), where the task of classifying digits from the MNIST dataset was seamlessly integrated into SATNet. This was carried out without explicit supervision while simultaneously learning the rules of Sudoku and solving Sudoku boards. In doing so, the authors claim to have addressed the SGP, despite the intricate symbolic and logic knowledge required by the model to solve MAXSAT problems. In their study, the digits 0 to 9 are first identified through a clustering algorithm, but without assigning any digit labels to the clusters (no supervision). Subsequently, the rules and solutions of Sudoku are learned simultaneously through examples. During this process, the digit values are revealed as if the model were solving a system of equations. This process introduces an implicit supervision mechanism, as the multiple solutions presented to the network provide sufficient information to decipher the identity of each digit.

A similar approach was adopted by Dai et al. (2019), where the authors proposed a neural network architecture to perceive pixel data from images and a logic-based reasoning engine to extract rules and knowledge from the former. To translate the patterns learned by the network into symbols that can be processed by the reasoning engine, the model assigns pseudo-labels to these patterns and then solves a logic puzzle to determine the correct label assignment. As in the previous work discussed, this approach relies on a specific problem statement with a known solution. This puzzle-solving approach resembles the concept employed by Asai & Fukunaga (2017), where actual puzzles are solved. Another intriguing hybrid approach is presented by Garnelo et al. (2016), drawing inspiration from reinforcement learning. Their method also features a neural perception module and a symbolic reasoning engine. However, in their case, the interface between the two modules is a symbolic ontology tailored to the context of a learning agent. The ontology is explicitly uploaded to the agent based on the characteristics of its environment. This approach is particularly relevant to the work presented in the present article, since the emergent concept structures developed here could potentially serve as the ontology layer proposed in their work, thus removing the need for prior ontology modelling.

In their insightful paper (Evans et al., 2021b), the authors explore the intricacies of ‘sense-making’ from a continuous stream of sensory input. They challenge the prevailing notion that prediction, retrodiction, and imputation of missing values alone are sufficient evidence for sense-making. Instead, they argue that the construction of a symbolic theory that explains the underlying patterns and relationships in the data is the hallmark of genuine sense-making. This perspective is consistent with previous research suggesting that the ability to construct explanatory theories is a crucial element of human common sense. Building on this foundation, the authors explore a possible definition of the underlying mental model that emerges when one makes sense of sensory input, and show how this mental model implicitly enables predictive, retrodictive, and imputational abilities. However, the discrete nature and limited range of sensory data handled by the model in its current form are recognised as significant limitations. The examples provided to demonstrate the model’s capabilities, such as elementary cellular automata, drum rhythms, and other relatively simple data streams, underscore the need for broader applicability. To address this shortcoming, the authors introduce neural networks into their subsequent work (Evans et al., 2021a), extending the reach of the model to more complex and continuous data domains. Although this integration of neural networks requires training, it also promises to enhance the model’s ability to make sense of real-world sensory data streams. This paves the way for more sophisticated AI systems capable of true understanding.

A noteworthy aspect present in the NSMs above is their capacity to learn logic-based rules without extensive training. This constitutes a distinct advantage over traditional pattern recognition tasks that require massive supervision. Additionally, their inherent ability to extract knowledge from data without overfitting is particularly appealing, as it aligns with human learning processes and holds promise for more efficient and generalisable AI systems. The methodology presented in this article further exemplifies this trend, demonstrating how concept structures can be learned from real-world data without the need for vast sets of labelled examples. This facilitates the development of more robust and adaptable AI systems, capable of acquiring knowledge from the world without the burden of extensive labelling.

In a separate line of research, a notable study by Lake et al. (2015) explores two core questions: (i) How do humans acquire new concepts from a handful of examples? And (ii) How do humans develop such abstract, rich, and flexible representations? The first question sheds light on the realm of unsupervised or weakly supervised concept emergence and out-of-distribution learning, while the second addresses the crucial importance of natural compositionality in concept structures. Their method introduces the Bayesian program learning framework, which learns simple stochastic programmes to represent concepts. These programmes function as probabilistic generative submodels, and the model as a whole learns by fitting them to a background set of data using only a few samples per category. The implementation provided utilises the Omniglot dataset as its background set of data, encompassing both pixel and stroke data from handwritten characters. It is essential to note that in this model, the building blocks of concepts are predefined as subparts, parts, and spatial relations. Subparts represent strokes separated by brief pauses in the pen, while parts are defined by pen-down and pen-up events. Despite the general applicability of the methodology, its implementation is highly dependent on the specific nature of the dataset, with these building blocks provided in the form of a library rather than emerging from raw sensor data. Nevertheless, this approach shares several notable similarities with the methods presented here. A particularly interesting observation is that models capable of learning and evolving rich conceptual representations must simultaneously exhibit both generative and learning capabilities in a seamless manner. This bidirectional property is closely related to the central thesis of the present work.

Considerable research on extracting symbolic knowledge from sensory data has also focused specifically on concepts. A notable contribution is Bechberger (2021), which builds upon prior work on Conceptual Spaces (Bechberger & Kühnberger, 2018) as proposed by Gärdenfors. Their approach is supported by logic tensor networks, a type of neural network employing fuzzy membership functions. Conceptual spaces offer a vector representation of conceptual knowledge as regions within a feature space, providing a more natural approach to handle the grounding of sensory data. The combination of logic tensor networks and conceptual spaces results in a model that is capable of deriving concept structures directly from sensory data, while simultaneously leveraging the powerful learning engine provided by neural networks. The approach, then, relies on mapping symbolic knowledge into vector-based spaces. As a consequence, the extracted knowledge becomes somewhat more indirect and less semantically transparent. Finally, Nevens et al. (2020) specifically addressed the challenge of bridging the gap between continuous observations and symbolic concepts in the context of ‘grounded’ learning, within the domain of classifying volumetric primitives from the CLEVR dataset (Johnson et al., 2017). Attributes such as the number of corners of shapes are extracted using computer vision libraries, together with continuous-valued attributes such as colour and other complex attributes such as the ratio between the area of the object region and the area of the rotated bounding box. Ultimately, the model is trained using a weighted schema for concept relevance in a tutor-learner scenario.

Table 2 presents a schematic comparison of these works. It should be noted that this summary is merely a guide from the point of view of the authors, and should be taken rather loosely; most of the aspects indicated in the table are not ‘black or white’, and long debates could be carried out considering the classification assigned to each work.

Table 2.
Comparison of Main Related Works.

	Unsupervised learning	No external knowledge	Sensory data encoding ability	Semantic capacity	Application scope
Wang et al. (2019)	$\circ$	$\circ$	$\circ$	$∙$	$∙ ∙$
Topan et al. (2021)	$\circ$	$∙ ∙$	$∙ ∙$	$∙$	$∙$
Dai et al. (2019)	$\circ$	$∙ ∙$	$∙ ∙$	$∙$	$∙$
Asai & Fukunaga (2017)	$\circ$	$∙ ∙$	$∙ ∙$	$∙$	$∙$
Garnelo et al. (2016)	$\circ$	$\circ$	$∙ ∙$	$∙$	$∙ ∙$
Evans et al. (2021b)	$∙ ∙$	$∙ ∙$	$\circ$	$∙ ∙$	$∙$
Evans et al. (2021a)	$\circ$	$∙ ∙$	$∙ ∙$	$∙$	$∙ ∙$
Lake et al. (2015)	$∙ ∙$	$∙$	$∙$	$\circ$	$∙$
Bechberger (2021)	$\circ$	$\circ$	$∙ ∙$	$∙$	$∙ ∙$
Nevens et al. (2020)	$\circ$	$\circ$	$∙ ∙$	$∙$	$∙ ∙$
BIGA (present method)	$∙ ∙$	$∙$	$∙$	$∙ ∙$	$∙$

$∙ ∙$ proficient/broad; $∙$ moderate/some; $\circ$ weak/null. BIGA: Bateson-inspired grounding algorithm.

The present study aims to extract symbolic features or patterns from multivariate time series without requiring a training process. Thus, it is essential to place it within the wider scope of time-series analysis. Several notable methods have emerged in this domain, including SAX (symbolic aggregate approximation) (Lin et al., 2003), where a combination of Gaussian normalisation and discretisation allows a symbolic representation to be extracted in a predetermined number of symbols. A key contribution of this work lies in the fact that distances in the symbolic space offer a lower-bound guarantee with respect to the original data.

Another notable contribution involves the incorporation of the Fourier transform into the latter approach, which establishes the symbolic Fourier approximation (SFA) method (Schäfer & Högqvist, 2012). The recent literature has focused primarily on variations or enhancements of SAX (Bountrogiannis et al., 2023; Imamura & Nakamura, 2021; Kloska & Rozinajova, 2021; Yu et al., 2023) and SFA (Li & Shen, 2022), including efforts in the field of explainable deep learning (Schwenke & Atzmueller, 2023). Additionally, alternative strategies have also emerged. For example, Nguyen & Ifrim (2023) used random symbolic sequences for efficient and accurate classification tasks. Zunino et al. (2022) developed a novel time-series representation strategy based on the concept of permutation entropy introduced by Bandt & Pompe (2002). While these works predominantly aim at providing efficient machine learning algorithms for time-series analysis, they differ from the central objective of the present study, which is to construct human-like conceptual and semantic structures from the data.

A closely related field of research that does share this same objective can be found in the area of linguistic summarisation of time series. Although perhaps not the most popular, it is a well-established domain of research with origins in the broader field of data summarisation (Yager, 1982). Linguistic summarisation involves the extraction of relatively simple features from data streams (Aoki & Kobayashi, 2016; Aydogan, 2023; Castillo-Ortega et al., 2011; Kacprzyk et al., 2007; Özdogan et al., 2021), so that they can be expressed semantically in human-readable terms. In this respect, it bears significant resemblance to the method described in this article. Nevertheless, there are two key distinctions: (i) in linguistic summarisation, the features extracted necessitate prior understanding of the relationships between variables (Baydogan & Runger, 2014), and (ii) the present model generates hierarchical structures of formal concepts from the features extracted, which is arguably more similar to the ways in which human knowledge is articulated.

4. Method

In subsequent discussions, a segment will be represented as a named tuple in the form $(a_{1} = v_{1}, \dots, a_{k} = v_{k})$, where each $v_{i} \in \mathbb{R}$ represents the value that the attribute $a_{i}$ of the segment can assume. While non-real values (symbolic, vector, complex, etc.) could be considered for more comprehensive representation, this initial approach will solely focus on real-valued attributes, leaving the extension to handle non-real values for future versions.

A curve or trajectory is represented as a sequence of segments sharing the same structure (identical attribute set). For instance, as will be elaborated in the subsequent subsection, a segment could be defined by the structure $(\mathit{width}, \mathit{angle})$, implying that a curve would comprise a succession of segments, each possessing the attributes of width and angle relative to a predefined reference system. In this first formalisation, all segments feature the same length. This length can also be viewed as the distance between two consecutive sensor readings and will be referred to as the discretisation parameter. In curves that are fully continuous and differentiable, this parameter has no effect on the model’s output if its value is above a certain threshold. This threshold value depends solely on the geometric intricacies of the curve itself. In the remainder of this article, the value of the discretisation parameter is always set well above the aforementioned threshold.

In line with the focus on identifying changes in the curve’s progression, three possible comparators are naturally associated with each attribute. The choice of comparators depends on the attribute type; since only numerical attributes are considered in this case, all attributes employ the same set of comparators. Specifically, the comparators ($<$, $>$, $=$) are utilised, consistent with their conventional interpretation when dealing with real numbers. It is worth highlighting that these comparators are disjoint and complete, implying that for each pair of possible values of an attribute, exactly one of the comparators applies.

The curve feature extraction algorithm operates on a curve $\alpha := \{s_{j}\}_{0 \leq j \leq N}$, where each segment $s_{j}$ possesses the structure $(a_{1}, \dots, a_{k})$. The BIGA algorithm comprises the following steps:

Step 1: Preprocessing. For each $j < N$, segment $s_{j}$ is compared with $s_{j + 1}$ on each attribute, and the interval $[j, j + 1]$ is constructed, which carries these comparisons instead of attribute values. For an attribute $a_{i}$, if $s_{j + 1}.a_{i} < s_{j}.a_{i}$, the change or transition is noted as $a_{i} :<$ in the interval $[j, j + 1]$, and analogously for $>$ and $=$ ($s.a$ denotes the value of attribute $a$ in the segment $s$).

The set of intervals with the above attributes will be denoted by $I_{\alpha}$.

Step 2: Compute symbolic differences. Two intervals are contiguous if they are of the form $I_{1} = [a, b]$ and $I_{2} = [b, c]$. For each pair of contiguous intervals, a new interval is created from their union: $I_{1} \cup I_{2} = [a, c]$. Then, for each attribute of the structure, the transition that occurs from comparisons of the first interval to the second is computed and associated with the new union interval. In the case of numerical values, nine possible combinations may occur, which are noted for convenience as shown in Table 3 (it may be observed that an absorption effect occurs when two comparisons coincide):

Table 3.
Symbolic Comparison Scheme of the Nine Possible Basic Transition Combinations.

$I_{1}$	$I_{2}$	$I_{1} \cup I_{2}$
<	<	<
<	$=$	$<=$
<	>	$<>$
$=$	<	$=<$
$=$	$=$	$=$
$=$	>	$=>$
>	<	$><$
>	$=$	$>=$
>	>	>

The set of new intervals generated (unions of contiguous intervals of $I_{\alpha}$) is denoted by $I^{1}_{\alpha}$. The set formed by (i) the original intervals and (ii) the potential unions of these intervals computed in this step is then expressed as $I^{2}_{\alpha} \leftarrow I_{\alpha} \cup I^{1}_{\alpha}$.

Step 3: Remove redundancies. Information redundancy between contained intervals is checked. An interval $I_{1} = [a, b]$ is contained in an interval $I_{2} = [c, d]$ if $c \leq a \leq b \leq d$ holds. In this case, $I_{1}$ is redundant if, for each attribute, the notation of both intervals is the same.

In this step, the intervals that are redundant with respect to another existing interval in the set of intervals are removed from the set. Thus, the set $I^{3}_{\alpha} \leftarrow \{I \in I^{2}_{\alpha} : I \text{ not redundant}\}$ is obtained.

Step 4: Recursion.

If $I^{3}_{\alpha} \neq I_{\alpha}$, then the operation $I_{\alpha} \leftarrow I^{3}_{\alpha}$ is performed, and the process is repeated from Step 2.

If $I^{3}_{\alpha} = I_{\alpha}$, then the algorithm stops and returns $I_{\alpha}$, which is used as a formal context for FCA calculation. Through this calculation, both a concept set $C_{\alpha}$ and a concept lattice $\Gamma_{\alpha}$ are obtained.
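The first three steps above can be sketched in a few lines of Python. This is only an illustrative sketch of a single pass, not the authors' implementation: all function names are hypothetical, attribute values are compared with exact floating-point equality, and `transition` covers only the nine basic combinations of Table 3. How longer notations combine in later iterations of the recursion is not specified by that table and is deliberately left out here.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]
Notation = Dict[str, str]          # attribute name -> comparison symbol(s)


def compare(prev: float, nxt: float) -> str:
    """Step 1 comparator: how the value changes from one segment to the next."""
    return "<" if nxt < prev else ">" if nxt > prev else "="


def preprocess(curve: List[Dict[str, float]]) -> Dict[Interval, Notation]:
    """Step 1: one interval [j, j + 1] per pair of adjacent segments."""
    return {
        (j, j + 1): {k: compare(curve[j][k], curve[j + 1][k]) for k in curve[j]}
        for j in range(len(curve) - 1)
    }


def transition(first: str, second: str) -> str:
    """Step 2 rule for basic comparisons: equal symbols are absorbed."""
    return first if first == second else first + second


def symbolic_differences(intervals: Dict[Interval, Notation]) -> Dict[Interval, Notation]:
    """Step 2: unions of contiguous intervals [a, b], [b, c] with their transitions."""
    unions = {}
    for (a, b), left in intervals.items():
        for (b2, c), right in intervals.items():
            if b == b2:
                unions[(a, c)] = {k: transition(left[k], right[k]) for k in left}
    return unions


def remove_redundant(intervals: Dict[Interval, Notation]) -> Dict[Interval, Notation]:
    """Step 3: drop intervals contained in another interval with the same notation."""
    def redundant(iv: Interval) -> bool:
        a, b = iv
        return any(
            (c, d) != iv and c <= a and b <= d and notation == intervals[iv]
            for (c, d), notation in intervals.items()
        )
    return {iv: n for iv, n in intervals.items() if not redundant(iv)}


# Four segments with attributes angle (a) and width (w); the values are illustrative.
curve = [
    {"a": -65.85, "w": 0.00},
    {"a": -65.85, "w": 0.05},
    {"a": -65.85, "w": 0.10},
    {"a": -46.42, "w": 0.15},
]
base = preprocess(curve)
expanded = {**base, **symbolic_differences(base)}
pruned = remove_redundant(expanded)
print(sorted(pruned))   # [(0, 2), (1, 3), (2, 3)]
print(pruned[(1, 3)])   # {'a': '=>', 'w': '>'}
```

In this run the unit intervals `(0, 1)` and `(1, 2)` are absorbed by `(0, 2)`, which carries the same notation, whereas `(2, 3)` survives because its notation differs from that of the containing interval `(1, 3)`.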

In the following subsection, a complete application example of the algorithm to a specific curve is presented. A flowchart of the method is shown in Figure 2.

Figure 2.

Flowchart of the proposed algorithm.

4.1. Example

In this example, a curve is represented in Figure 3(a), with an axis line and variable thickness as illustrated by the grey shade around it. A sensor reads the curve at regular intervals, as shown in Figure 3(b). For the sake of simplicity, these intervals have been fixed to generate as few segments as possible, hence the poor definition of the curved sections. However, in practice, these intervals would allow for a very smooth definition of the curve.

Figure 3.

(a) Entity selected as example, and (b) breakdown into segments after sensor reading.

A time-series representation of the example curve, taking into account the angles and widths at each segment is provided in Table 4. It should be noted that the model does not operate with the specific values, but rather, it just needs to compare pairs of adjacent values and discriminate whether the values are equal or whether one value is larger or smaller than its peer.

Step 1:

Preprocessing. In the first step, value comparisons of sequential record pairs are performed to obtain symbolic properties. The first pair takes the first two segments ( $s_{0}$ and $s_{1}$ ) and their values are compared as follows:

$s_{0} : (a = -65.85, w = 0.00)$.

$s_{1} : (a = -65.85, w = 0.05)$.

Compare $(s_{1}, s_{0}) = \{a : =, w : >\}$.

Table 4.

Time Series Representation of the Sample Entity.

Segment	Width $(w)$	Angle $(a)$
0	0.00	−65.85
1	0.05	−65.85
2	0.10	−65.85
3	0.15	−46.42
4	0.20	−7.57
5	0.26	31.28
6	0.30	50.71
7	0.36	50.71
8	0.43	50.71
9	0.50	50.71
10	0.50	50.71
11	0.50	50.71
12	0.50	74.56
13	0.50	122.27
14	0.50	169.98
15	0.50	193.84
16	0.50	193.84

Applying this operation to all pairs of contiguous segments results in the formal context or property table shown in Table 5.

Table 5.

Formal Context Table Resulting From Step 1.

Interval	$a : =$	$a : >$	$w : =$	$w : >$
$[0, 1]$	1	0	0	1
$[1, 2]$	1	0	0	1
$[2, 3]$	0	1	0	1
$[3, 4]$	0	1	0	1
$[4, 5]$	0	1	0	1
$[5, 6]$	0	1	0	1
$[6, 7]$	1	0	0	1
$[7, 8]$	1	0	0	1
$[8, 9]$	1	0	0	1
$[9, 10]$	1	0	1	0
$[10, 11]$	1	0	1	0
$[11, 12]$	0	1	1	0
$[12, 13]$	0	1	1	0
$[13, 14]$	0	1	1	0
$[14, 15]$	0	1	1	0
$[15, 16]$	1	0	1	0

Executing FCA in the resulting formal context generates the concept set presented in Table 6. The extent of each concept in the set (except $C_{1}$ and $C_{10}$) is visualised in Figure 4.
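As an illustration of this FCA calculation, the sketch below enumerates the formal concepts of the context in Table 5 by closing every subset of attributes. It is a brute-force approach that is only reasonable for very small contexts, and it is not meant to represent the FCA implementation used in this work. The compact strings `ANGLE` and `WIDTH` and all function names are assumptions introduced for the example; each character encodes the comparison held by one unit interval $[j, j + 1]$.

```python
from itertools import combinations

# Table 5 in compact form: one comparison symbol per unit interval [j, j + 1], j = 0..15.
ANGLE = "==>>>>=====>>>>="
WIDTH = ">>>>>>>>>======="

# Formal context: object j (interval [j, j + 1]) -> set of attributes it has.
context = {j: {f"a:{ANGLE[j]}", f"w:{WIDTH[j]}"} for j in range(16)}
attributes = sorted(set().union(*context.values()))


def extent(intent):
    """Objects that have every attribute in the intent."""
    return frozenset(j for j, attrs in context.items() if intent <= attrs)


def closure(objects):
    """Attributes shared by all given objects (all attributes if there are none)."""
    shared = set(attributes)
    for j in objects:
        shared &= context[j]
    return frozenset(shared)


def formal_concepts():
    """Enumerate (intent, extent) pairs by closing every attribute subset."""
    concepts = set()
    for r in range(len(attributes) + 1):
        for subset in combinations(attributes, r):
            ext = extent(set(subset))
            concepts.add((closure(ext), ext))
    return concepts


def as_intervals(ext):
    """Merge consecutive unit intervals, e.g. {2, 3, 4, 5} -> [(2, 6)]."""
    merged = []
    for j in sorted(ext):
        if merged and merged[-1][1] == j:
            merged[-1] = (merged[-1][0], j + 1)
        else:
            merged.append((j, j + 1))
    return merged


concepts = formal_concepts()
print(len(concepts))                          # 10
print(as_intervals(extent({"a:>"})))          # [(2, 6), (11, 15)]
print(as_intervals(extent({"a:>", "w:="})))   # [(11, 15)]
```

The count of ten concepts includes the top concept with empty intent and the bottom concept with empty extent.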

Step 2:

Compute symbolic differences. Next, all contiguous intervals from the previous step are compared according to their symbolic properties. When these properties are the same (i), there is no transition. Conversely, when there is a difference in properties (ii), the transition is expressed in the form of a new symbolic property. In both cases, a new interval is generated carrying either the preexisting symbolic property or the newly generated one. Below is an example for each case, where the transition from $\{a : =\}$ to $\{a : >\}$ is expressed as $\{a : =>\}$.

Table 6.

Resulting Concept Set After Step 1.

Concept	Intent	Extent
$C_{1}$	$\{\}$	$[0, 16]$ (all)
$C_{2}$	$\{a : >\}$	$[2, 6], [11, 15]$
$C_{3}$	$\{a : >, w : =\}$	$[11, 15]$
$C_{4}$	$\{a : >, w : >\}$	$[2, 6]$
$C_{5}$	$\{a : =, w : =\}$	$[9, 11], [15, 16]$
$C_{6}$	$\{a : =, w : >\}$	$[0, 2], [6, 9]$
$C_{7}$	$\{a : =\}$	$[0, 2], [6, 11], [15, 16]$
$C_{8}$	$\{w : =\}$	$[9, 16]$
$C_{9}$	$\{w : >\}$	$[0, 9]$
$C_{10}$	$\{a : =, a : >, w : =, w : >\}$	$-$

Figure 4.

Visualisation of each concept’s extent after step 1.

(i)

$[0, 1] : \{a : =, w : >\}$.

$[1, 2] : \{a : =, w : >\}$.

Compare $([0, 1], [1, 2]) = [0, 2] : \{a : =, w : >\}$.

(ii)

$[1, 2] : \{a : =, w : >\}$.

$[2, 3] : \{a : >, w : >\}$.

Compare $([1, 2], [2, 3]) = [1, 3] : \{a : =>, w : >\}$.

Repeating this process for all the intervals obtained in Step 1 results in a new formal context that expands on the previous one presented in Table 5. The new expanded context is shown in Table 13 (Appendix), where the data being repeated have been largely omitted.

Step 3:

Remove redundancies. At this point, each interval that is contained in any other interval is checked for redundancy: if the attributes of the shorter interval are the same as the attribute set of the larger interval, then the former is removed from the interval table. When an interval is removed from the table, it is no longer used in the process going forward. For example, the intervals $[0, 1]$ and $[1, 2]$ are contained in the interval $[0, 2]$ and also share the exact same set of attributes: $\{a : =, w : >\}$. Therefore, both the intervals $[0, 1]$ and $[1, 2]$ are removed from the interval table. However, the interval $[15, 16]$, contained in $[14, 16]$, cannot be removed because their sets of attributes are different: $\{a : =, w : =\} \neq \{w : =, a : >=\}$. After executing this process for all the intervals in Table 13 (Appendix), a new formal context is obtained (Table 14, Appendix) where redundant intervals have been removed.

Step 4:

Recursion. Steps 2 and 3 are repeated recursively until the model generates no new information. This occurs in the present example after three iterations, resulting in the final formal context presented in Table 15 (Appendix). From this formal context, 28 concepts are calculated by the FCA algorithm, nine of which have already been obtained in Step 1 ( $C_{1}$ to $C_{9}$ ). In Table 16 (Appendix), the remaining 19 concepts are listed, and their extents are represented in Figures 12 to 14 (Appendix). The resulting concept lattice can be found in Figure 5.

Figure 5.

Resulting formal concept analysis (FCA) lattice of the example curve ( $i . . j$ represents the interval $[i, j]$ ).

5. Experimentation and Results

In this section, the methodology presented above will be assessed in terms of its capability to both discriminate and assimilate among a family of curves. Figure 6 shows the set of curves (or trajectories) that will be considered. All curves feature variable thickness or width and flow from left to right. It can be observed that there are four pairs of visually similar curves $a \sim b, c \sim d, e \sim f$ , and $g \sim h$ .

Figure 6.

Curves setup for experimentation.

The experiments focused on providing answers to three questions: (i) Can the model generate concepts that capture common patterns among the curves? (ii) Is there a concept (or set of concepts) that uniquely characterises each pair of visually similar curves? and (iii) Within each pair, can the model generate a concept that discriminates each individual curve? The answers to these questions are presented in the subsections below.

5.1. Common Patterns Across the Sample-Set

Although this first task is perhaps rather trivial, it allows the features of the model to be showcased further. There are, of course, a great number of concepts that capture common patterns across the curves. However, for the sake of simplicity, only those concepts that are common to all samples are presented (Table 7). A list of concepts has been provided for four different combinations of sensor parameters as follows: $(\mathit{angle})$, $(\mathit{angle}, \mathit{width})$, $(\mathit{angle}, x, y)$, and $(\mathit{angle}, \mathit{width}, x, y)$.

Apart from the concepts provided, this list can be extended to include any subconcepts that can be extracted from them. For example, if the intent of the concept $\{a : >, w : <\}$ is common to all curves, then the subconcepts $\{a : >\}$ and $\{w : <\}$ are also common concepts. Just to offer a visual example, the curve sections that form the extent of one of the concepts listed in Table 7 ($\{a : <, w : ><\}$) are shown in Figure 7.

Figure 7.

Extent of concept intent $\{a : <, w : ><\}$ for curves from Figure 6.

Table 7.

Concepts (Intent) Present Across All Curves in the Sample-Set.

Angle $(a)$	Angle, width $(w)$	$a$, $x$, $y$	$a$, $w$, $x$, $y$
$\{a : >\}$	$\{a : >, w : <\}$	$\{a : >, x : >, y : >\}$	$\{a : >, w : <, x : >, y : >\}$
$\{a : <\}$	$\{a : <, w : <\}$	$\{a : <, x : >, y : >\}$
	$\{a : <, w : >\}$
	$\{a : <, w : ><\}$

5.2. Characterisation of Visually Similar Curves

For each curve, the model provides the corresponding set of concepts according to the methodology presented in this article. Various sets of concepts have been obtained for the following sensor combinations for segments: $(\mathit{angle})$, $(\mathit{angle}, \mathit{width})$, $(\mathit{angle}, x, y)$, $(\mathit{angle}, x)$, and $(\mathit{angle}, y)$. For each combination, the set of concepts of each curve is compared to that of every other curve in the sample set. The set of concepts of curve $\alpha$ will be denoted by $C_{\alpha}$. If two curves $\alpha, \beta$ share the exact set of concepts, then the difference between their corresponding sets of concepts should be empty: $C_{\alpha} - C_{\beta} = C_{\beta} - C_{\alpha} = \emptyset$. When the concept set of one curve ($C_{\alpha}$) is a proper subset of the other’s ($C_{\beta}$), then $C_{\alpha} - C_{\beta} = \emptyset$ and $C_{\beta} - C_{\alpha} \neq \emptyset$. Finally, if two sets of concepts are different and neither is a subset of the other, then $C_{\alpha} - C_{\beta} \neq \emptyset$ and $C_{\beta} - C_{\alpha} \neq \emptyset$. The sizes of the differences between concept sets among all curves in the sample set are presented in matrix form in Tables 8 to 11.
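The pairwise comparison described above amounts to computing set differences in both directions. A minimal sketch follows, under the assumption that a concept is identified across curves by its intent (represented here as a frozen set of attribute strings); the function name and the three concept sets are hypothetical and are not taken from the experiments.

```python
from typing import Dict, FrozenSet, Set, Tuple

Concept = FrozenSet[str]   # a concept identified by its intent, e.g. {"a:>", "x:>"}


def difference_sizes(concept_sets: Dict[str, Set[Concept]]) -> Dict[Tuple[str, str], int]:
    """Return |C_alpha - C_beta| for every ordered pair of distinct curves."""
    return {
        (alpha, beta): len(c_alpha - c_beta)
        for alpha, c_alpha in concept_sets.items()
        for beta, c_beta in concept_sets.items()
        if alpha != beta
    }


# Hypothetical concept sets, not taken from the experiments.
rising = frozenset({"a:>"})
falling = frozenset({"a:<"})
rising_right = frozenset({"a:>", "x:>"})

concept_sets = {
    "p": {rising, falling},
    "q": {rising, falling},                 # same concepts as p
    "r": {rising, falling, rising_right},   # strict superset of p
}
sizes = difference_sizes(concept_sets)
print(sizes[("p", "q")], sizes[("q", "p")])  # 0 0
print(sizes[("p", "r")], sizes[("r", "p")])  # 0 1
```

Two zeros indicate identical concept sets, whereas a zero in only one direction indicates that one set is contained in the other.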

Table 8.
Number of Differing Concepts With Parameters: Angle.

Angle
	a	b	c	d	e	f	g	h
a		0	0	0	1	1	0	0
b	0		0	0	1	1	0	0
c	2	2		0	2	2	2	2
d	2	2	0		2	2	2	2
e	1	1	0	0		0	1	1
f	1	1	0	0	0		1	1
g	0	0	0	0	1	1		0
h	0	0	0	0	1	1	0

Table 9.

Number of Differing Concepts With Parameters: Angle, Width.

Angle, width
	a	b	c	d	e	f	g	h
a		4	6	5	9	9	9	9
b	4		5	6	9	9	11	9
c	14	13		7	12	12	19	15
d	11	12	5		12	12	15	10
e	9	9	4	6		0	15	15
f	9	9	4	6	0		15	15
g	0	2	2	0	6	6		0
h	5	5	3	0	11	11	5

Table 10.

Number of Differing Concepts With Parameters: Angle, $x$, $y$.

Angle, $x$, $y$
	a	b	c	d	e	f	g	h
a		0	11	14	48	48	26	26
b	0		11	14	48	48	26	26
c	62	62		14	95	95	53	53
d	97	97	46		128	128	93	93
e	14	14	10	11		0	5	5
f	14	14	10	11	0		5	5
g	37	37	13	21	50	50		0
h	37	37	13	21	50	50	0

Table 11.

Number of Differing Concepts With Parameters: Angle, $x$.

Angle, $x$
	a	b	c	d	e	f	g	h
a		0	2	2	7	7	2	2
b	0		2	2	7	7	2	2
c	15	15		0	15	15	12	12
d	15	15	0		15	15	12	12
e	7	7	2	2		0	5	5
f	7	7	2	2	0		5	5
g	7	7	4	4	10	10		0
h	7	7	4	4	10	10	0

It should be noted that since the concepts of the curves are also sets in themselves, and not simple elements, the comparison between the obtained lattices can have different interpretations depending on how the concepts of two curves are compared to each other. However, there was no qualitative difference if this comparison was made strictly (two concepts must be exactly the same), or if a more relaxed version that takes into account the containment between them was used. After performing these calculations, the difference matrices remained essentially the same.

As can be observed in Table 11, the number of differing concepts between the visually similar curves $a$ and $b$ is zero. This is true both when considering concepts of $a$ that are not present in $b$, and vice versa (concepts of $b$ that are not found in $a$). The same applies to the remaining three pairs $c \sim d$, $e \sim f$, and $g \sim h$. For clarity, the zeroes that indicate the absence of differing concepts among the target pairs have been highlighted in bold. Any other pair available from the table yields a difference in the number of concepts. A close look at Tables 8 to 10 shows that these two conditions are not met by any of the other sensor combinations presented. Thus, the sensor combination $(\mathit{angle}, x)$ produces a unique (and non-subsumable) set of concepts for each visually similar pair of curves in the sample-set. Consequently, the question at stake in this subsection also renders a positive answer. The concept lattices pertaining to each pair of curves are shown in Figures 15 to 18 (Appendix).
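The two conditions discussed above (zero difference within each target pair, non-zero difference for every other pair) can be checked mechanically. The following sketch transcribes Table 11 and tests both conditions; the function name and the use of `None` for the empty diagonal are choices made for this illustration only.

```python
CURVES = "abcdefgh"

# Table 11 (angle, x); rows and columns follow CURVES, None marks the empty diagonal.
TABLE_11 = [
    [None, 0, 2, 2, 7, 7, 2, 2],
    [0, None, 2, 2, 7, 7, 2, 2],
    [15, 15, None, 0, 15, 15, 12, 12],
    [15, 15, 0, None, 15, 15, 12, 12],
    [7, 7, 2, 2, None, 0, 5, 5],
    [7, 7, 2, 2, 0, None, 5, 5],
    [7, 7, 4, 4, 10, 10, None, 0],
    [7, 7, 4, 4, 10, 10, 0, None],
]
PAIRS = {("a", "b"), ("c", "d"), ("e", "f"), ("g", "h")}


def characterises_pairs(matrix, pairs):
    """True if an entry is zero exactly when its two curves form a target pair."""
    for i, row_curve in enumerate(CURVES):
        for k, col_curve in enumerate(CURVES):
            if i == k:
                continue
            same_pair = (row_curve, col_curve) in pairs or (col_curve, row_curve) in pairs
            if same_pair != (matrix[i][k] == 0):
                return False
    return True


print(characterises_pairs(TABLE_11, PAIRS))  # True
```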

5.3. Individual Discrimination of Curves

For all the curves to be uniquely characterised by a set of concepts that is also not a subset of that of any other curve, the matrix containing the numbers of differing concepts should not contain any zeros. For the sensor combinations presented in the above subsection, there is no matrix that fulfils this condition; there are conceptual differences among most of the curves but not between all of them. For example, with the sensor combination $(\mathit{angle}, \mathit{width})$, the curves $e$ and $f$ share the same conceptual definition because their concept difference is zero: $C_{e} - C_{f} = \emptyset$ and $C_{f} - C_{e} = \emptyset$ (see Table 9). However, after extending the sensor parameters to $(\mathit{angle}, \mathit{width}, x, y)$, a complete discrimination is achieved. As presented in Table 12, for every pair of curves $\alpha, \beta$ with $\alpha \neq \beta$ of the sample-set, it is verified that $C_{\alpha} - C_{\beta} \neq \emptyset$ and $C_{\beta} - C_{\alpha} \neq \emptyset$. Thus, in this final case too, the question at stake renders a positive answer.

Table 12.
Number of Differing Concepts With Parameters: Angle, Width, $x$, $y$.

Angle, width, $x$, $y$
	a	b	c	d	e	f	g	h
a		85	79	74	132	135	104	102
b	94		95	93	136	141	110	113
c	194	201		65	231	238	189	196
d	250	260	126		295	302	249	258
e	61	56	45	48		25	56	51
f	58	55	46	49	19		56	49
g	94	91	64	63	117	123		51
h	125	127	104	105	145	149	84

5.4. Comparison of Results

In order to provide a baseline for comparison, the last two experiments have also been tested with a neural network approach. Because of the similarities of the curves in the sample-set with the ‘curves’ in the MNIST dataset (Deng, 2012), a neural network model that excels at the MNIST challenge has been selected. The specific model has been taken from the resources available in the Keras library (Chollet, 2015). It achieves a test accuracy above 99% when classifying the digits 0 to 9 represented as $28 \times 28$ pixel images. The chosen network architecture is detailed in the pseudo-code below:

Architecture.

1. Input layer: size=28x28

2. Convolutional 2D: depth=32, kernel size=3x3, activation=ReLU

3. MaxPooling 2D: pool size=2x2

4. Convolutional 2D: depth=64, kernel size=3x3, activation=ReLU

5. MaxPooling 2D: pool size=2x2

6. Flatten

7. Dropout: 0.5

8. Dense: size=10, activation=Softmax

Training.

Batch size: 128

Loss function: categorical cross-entropy

Optimizer: Adam
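To give a sense of the size of this baseline, the following sketch traces the tensor shapes and counts the trainable parameters implied by the architecture listed above. It relies on assumptions that the listing does not state explicitly: a single-channel input, stride-1 convolutions without padding, and pooling with a stride equal to the pool size. The helper names are hypothetical and no deep learning library is used.

```python
def conv2d(h, w, c_in, c_out, k):
    """Output shape and parameter count of a k x k convolution (stride 1, no padding)."""
    params = k * k * c_in * c_out + c_out          # weights + biases
    return (h - k + 1, w - k + 1, c_out), params


def max_pool(h, w, c, p):
    """Output shape of a p x p max pooling with stride p (no parameters)."""
    return (h // p, w // p, c)


shape = (28, 28, 1)                                # assumed single-channel input
total = 0

shape, n = conv2d(*shape, c_out=32, k=3)           # (26, 26, 32)
total += n
shape = max_pool(*shape, p=2)                      # (13, 13, 32)
shape, n = conv2d(*shape, c_out=64, k=3)           # (11, 11, 64)
total += n
shape = max_pool(*shape, p=2)                      # (5, 5, 64)

flat = shape[0] * shape[1] * shape[2]              # Flatten; Dropout adds no parameters
total += flat * 10 + 10                            # Dense layer with 10 outputs

print(shape, flat, total)                          # (5, 5, 64) 1600 34826
```

Under these assumptions the network has roughly 35,000 trainable parameters.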

In order to apply this neural model to the sample-set used in this work, two operations are required. The first is to convert the curves to images, and the second is to augment the sample-set of eight curves to a much larger dataset. This augmentation has been carried out through (i) random displacements, and (ii) random horizontal and vertical deformations. To avoid placing the neural model under unfair conditions, no rotations have been involved in the data augmentation process. The total sample space of possible variations enabled by the augmentation scheme is well above 1 million unique instances. Following this procedure, a dataset of 80,000 curve-images has been generated (10,000 variations of each original curve). An example of the instances obtained can be found in Figure 8.

Figure 8.

Example of random variations used for data augmentation (curve $g$ ).

The criterion for comparing the neural model with the model proposed in this article was the number of examples each one required to perform successfully. As already demonstrated, BIGA required only one example per category to correctly discriminate and assimilate the curves. In contrast, the neural model requires a much larger number. According to Figure 9, the accuracy of the network in discriminating single curves decreases dramatically when ‘only’ 800 training samples are provided (singles-$28 \times 28$: below $80\%$). When training the network, batch sizes were always adjusted proportionally to the number of samples given to the model. To rule out the possibility that a resolution of $28 \times 28$ pixels is insufficient to accurately capture the nuances of the curves, another dataset was generated under the same procedure but with a doubled resolution of $56 \times 56$ pixels. For this new input size, the kernel size of the convolutional layers was also doubled ($6 \times 6$), which yielded better results than the original kernel size. In this case, the results improved slightly for smaller datasets (singles-$56 \times 56$). Consequently, this setup was chosen to test the capacity of the model to assimilate the pairs of curves described earlier. This time, the neural network was still able to achieve a test accuracy of around $90\%$ with 800 samples. However, at 400 samples it dipped below $80\%$ again (pairs-$56 \times 56$).

Figure 9.

Test accuracy results for the two neural network models used as the baseline for comparison.

6. Discussion

Although the proposed method is quite straightforward in its current form, the three questions posed in the experimentation section have been successfully addressed. However, these tests were conducted on a limited sample-set and, therefore, these results should be interpreted as a proof of concept rather than as generalisable findings. Further experimentation should be performed with larger datasets, such as Omniglot (Lake et al., 2015), to allow rigorous benchmarking and establish the performance of the model.

The first experiment demonstrated the ability of the model to identify common concepts across the curves, as presented in Table 7. These concepts, expressed in natural language, showcase the model’s pattern recognition capabilities. Figure 7 provides a visual illustration of one such concept (intent: $\{a :<, w :><\}$). This concept, based solely on width and angle variations, exhibits rotational and scaling invariance. The curve segments (extent) that conform to this concept exhibit a decreasing angle alongside an increasing-then-decreasing thickness. Visually, this pattern appears evident for sections $a_{1}$, $b_{1}$, $e_{1}$, $g_{1}$, and $h_{1}$, but not as prominently for sections $c_{1}$, $d_{1}$, $d_{2}$, $f_{1}$, and $h_{2}$. While the model’s capabilities enable the identification of such patterns, it should be emphasised that visual analogies may not perfectly represent the model’s sensing mechanisms. A more appropriate analogy might involve tactile swiping along the curves, emulating human haptic perception. Furthermore, sections $h_{1}$, $c_{1}$, and $d_{2}$ exhibit self-intersections, clearly visible in the figure. However, this first definition of the model lacks the capacity to readily detect such intersections. This highlights the need for further development to enhance the model’s memory capabilities to enable a more comprehensive sensory analysis.

The second experiment aimed to establish unique conceptual definitions for each pair of similar curves. These definitions were identified for the sensor combination $(\mathit{angle}, x)$, as presented above. It should be noted that, due to the inclusion of $x$, these definitions are not rotation-invariant. It is evident that only considering the angle would not be sufficient to distinguish the curves $e$ and $f$ from $c$ and $d$. The angle patterns of the former two curves are a subset of the latter two, making angle alone an inadequate discriminator. Similarly, the thickness does not exhibit a consistent pattern within each pair, further hindering differentiation. In the current methodology, the input data is strictly of numeric type. However, incorporating symbolic input data could broaden the model’s capabilities. This would enable expressions like transitions from no-sensing to sensing and vice versa. Such expressions could potentially provide additional differentiation, particularly in cases where entire curves ($e$ and $f$) represent conceptual subsets of others ($c$ and $d$). Additionally, the combination of sensors $(\mathit{angle}, x, y)$ demonstrates the ability to characterise all pairs except $c$ and $d$. Interestingly, these two curves exhibit a significant conceptual difference: $d$ decreases in $y$ after the loop, while $c$ remains upward after the loop. Finally, the combination $(\mathit{angle}, \mathit{width})$ only produces a unique definition for the pair of curves $e$ and $f$.

Another crucial observation is that the identification of specific sensor combinations that yield unique characterisations for curves has been conducted manually. This process should be further formalised in future endeavours by developing a supervised learning methodology. For example, when dealing with large category sets, the lattice-building process may benefit from graph-based cluster detection techniques and their corresponding optimisation algorithms (Gharehchopogh, 2023).

In the instances presented here, only two samples are provided for each category and a single sensor combination is employed for the entire curve. However, engaging with rich datasets might pose a challenge. To address this, one possibility is to dynamically adapt the combination of sensor parameters for distinct sections of the curves as a training parameter. This approach could potentially provide a viable solution to this challenge, analogous to the strategies employed by Imamura & Nakamura (2021) and Imani & Keogh (2019), where semantic sequences are matched by discarding some portions of information in the middle of these sequences (‘don’t care’ regions). Other options include feature selection methods such as those of Ayar et al. (2022).

Alternatively, BIGA could also be used as a feature extraction engine to guide a later training process within a different model in what is known as knowledge-guided training (Dattani & Bramer, 1996; Díaz-Rodríguez et al., 2022; He et al., 2018). This guidance can improve the performance and explainability of these models while also reducing the size of the required training data. However, it should be noted that in this workflow between the symbolic model and the neural engine, training would imply deviating from the ‘firing scheme found in nature’ (as referred to previously in this text). Overall, regardless of the training strategy, a direct advantage of the approach presented in this paper lies in its ability to handle as few training samples as necessary.

The third experiment was aimed at distinguishing all the curves from each other. This was accomplished using a combination of sensors that included all parameters $(\mathit{angle}, \mathit{width}, x, y)$. Although the experiment was successful with the eight curves in the sample set, a significant limitation exists in this initial version of the model regarding its differentiating capacity. Given two S-shaped curves, for example, one with more pronounced bulges than the other, the model is unlikely to discern between them. The same applies when comparing a very long bulge to a short one, etc. This difficulty in evaluating magnitudes, generating concepts that quantify the change in sensor variables, represents a crucial challenge for this model, as depicted in Figure 10. The primary challenge lies in addressing this issue without introducing symbolic external information into the model. This self-imposed constraint distinguishes the model from the methodologies discussed earlier in the field of linguistic summarisation (Boulanouar et al., 2020; Kaczmarek-Majer & Hryniewicz, 2019; Özdogan et al., 2021). Three potential strategies exist to address the challenge of assessing magnitudes without introducing external symbolic information:

Strategy 1: Arithmetic operations. The model could perform arithmetic operations on the sensor values to extract symbolic properties. For instance, it could aggregate values such as segment count, total length, or average thickness from intervals of a concept’s extent. If a concept contains multiple intervals with distinct aggregated values, they could be unfolded into separate concepts. Alternatively, these aggregated values could be integrated into the model’s standard comparisons, resulting in concept intents like $\{a: <,\ l: =\}$, where $a$ denotes the angle parameter and $l$ represents the length of the compared intervals. Additionally, the model could perform arithmetic operations directly on sensor values. For instance, it could compare the difference in angle between consecutive segments to the difference between the previous two segments, generating a symbolic token related to the magnitude of angle variation. Other operations could involve comparing the difference in value between two segments with the actual value of each.

Strategy 2: Comparison spaces. The model could create comparison spaces between objects and extract new sensory data from their relative differences. When studying the movement of vehicles for example, it could track the distance between two vehicle trajectories over time and use that distance as an additional sensor parameter. In the case of static curves, these could be positioned next to each other in various relative arrangements, similar to how children interact with toy shapes to observe the effects of different spatial configurations. Then, a distance parameter flowing along the curves could be incorporated into the sensor. This strategy would enable the model to generate concepts that directly express differences between initially similar curves. Furthermore, this approach could be extended to compare more than two curves simultaneously.

Strategy 3: Memory. The model could implement a memory to store specific values. These stored values could be compared retrospectively, beyond the scope of pairwise comparisons employed in the current method. However, this approach might deviate slightly from the model’s core principles. Humans excel at grasping proportions and ratios rather than absolute magnitudes. Therefore, approaches that prioritise relative magnitudes over absolute ones might align better with the ideas presented in this article.
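To make the first two strategies more concrete, the following Python sketch shows one possible reading of them. It is not part of BIGA: the function names (`magnitude_tokens`, `distance_channel`), the tolerance `tol`, the convention that `>` denotes an increase relative to the previous value, and the sample data are all assumptions introduced for illustration. The first function compares each step's absolute change with the previous step's change (Strategy 1); the second derives a point-wise distance between two equally sampled curves that could serve as an additional sensor parameter (Strategy 2).

```python
import math


def compare(prev, curr, tol=1e-9):
    """Atomic comparison of curr against prev: '=', '>' or '<' (assumed convention)."""
    if abs(curr - prev) <= tol:
        return "="
    return ">" if curr > prev else "<"


def first_order(values, tol=1e-9):
    """Compare each reading with the one before it."""
    return [compare(a, b, tol) for a, b in zip(values, values[1:])]


def magnitude_tokens(values, tol=1e-9):
    """Strategy 1 (sketch): compare each step's absolute change with the previous step's."""
    deltas = [abs(b - a) for a, b in zip(values, values[1:])]
    return [compare(d0, d1, tol) for d0, d1 in zip(deltas, deltas[1:])]


def distance_channel(curve_a, curve_b):
    """Strategy 2 (sketch): point-wise Euclidean distance between two sampled curves."""
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same number of samples")
    return [math.dist(p, q) for p, q in zip(curve_a, curve_b)]


# Hypothetical angle readings: both curves only ever increase,
# but the second one bends more and more sharply.
gentle = [0, 5, 10, 15]
sharp = [0, 5, 15, 30]
print(first_order(gentle), first_order(sharp))            # identical first-order tokens
print(magnitude_tokens(gentle), magnitude_tokens(sharp))  # different magnitude tokens

# Hypothetical (x, y) samples of two curves placed next to each other.
curve_a = [(0, 0), (1, 0), (2, 0)]
curve_b = [(0, 1), (1, 2), (2, 4)]
print(first_order(distance_channel(curve_a, curve_b)))
```

In this toy case the two angle streams produce the same first-order tokens, so they would be indistinguishable by atomic comparisons alone, whereas the derived magnitude tokens differ. Both functions emit only the comparators already used by the model, which is in keeping with the constraint of not introducing external symbolic information.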

Figure 10.

Two similar curves with tangible differences in angle variation that are not captured by the model.

Another limitation of the model (implicit in the experiments) is its inability to handle nested concepts effectively. For instance, the model fails to recognise a large S-shaped curve that is formed by smaller S-shaped sub-curves (as depicted in Figure 11). This limitation stems from the model’s reliance on purely symbolic representations, which can struggle to capture complex spatial relationships involving multiple scales. To address this issue, more embodied approaches could be explored. Along these lines, a potential solution may involve the design of a sensor system comprising a primary arm and a secondary arm (or hand). The primary arm would operate on a broader scale, providing a global overview of the curve, while the secondary arm or hand would focus on finer details, including noise and subtle patterns. This hierarchical approach could help the model better discern nested concepts and extract meaningful information from both macro- and micro-levels of observation.

Figure 11.

An overarching S-shape composed of smaller nested S-shapes.

The last set of experiments, the comparison of results, aimed to evaluate the performance of the model against a neural network baseline. The tests showed that even a powerful convolutional architecture requires more than 400 and 800 training samples to achieve an accuracy level above 80% when assimilating pairs and discriminating individual curves, respectively. By contrast, the present method is able to perform such tasks with only one or two (in the case of curve pairs) examples per category. Furthermore, the method showed other advantages during the testing phase. For example, neural networks are very sensitive to the configuration of hyperparameters such as kernel size or activation functions, in strong contrast to the approach presented here. Another notable benefit is that, while neural nets suffer a dramatic increase in complexity as the size of the input images is scaled up, the method described in this article is not greatly affected by the size of the input. Finally, one more advantage can be found in the way it can handle differently sized input, fundamentally improving on this more rigid aspect of the neural models that are broadly used today.

An interesting avenue of research, in relation to the comparison with neural networks, would be to use a method from the class activation mapping (CAM) family to identify which regions of the image were decisive for the network’s classification (Chattopadhay et al., 2018; Selvaraju et al., 2019; Zhou et al., 2015). These regions, along with the features extracted by the convolutional layers, could then be compared to the concepts obtained by BIGA. Ultimately, such explorations may lead to relevant results regarding the explainability of AI models.

Another important point suggested by the experimental results is that the proposed model possesses generative capabilities; its radical composability enables the formation of novel concepts or features that have not been directly perceived by the sensor by combining those acquired from prior experience. Moreover, due to the fully tractable path from sensor data to the concepts generated by the model, these unseen new features can be propagated or decoded back to the sensor level. This decoding process would facilitate the generation of the corresponding curves for these unseen concepts. However, since each concept in the model exhibits a high degree of generalisation, there are an infinite number of potential outputs for each new concept. Therefore, a heuristic framework would be necessary to determine the final output (extent) for a given concept.
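The decoding step described above, and the need for a heuristic to choose among infinitely many outputs, can be illustrated with a small Python sketch. It is a toy under stated assumptions rather than the decoding procedure of the model: a concept is reduced to a single string of comparators such as `=>=`, and the heuristic is reduced to three hypothetical parameters (`start`, `run_length`, `step`) that fix the starting value, how long each comparator persists, and the size of each change.

```python
from itertools import groupby


def compare(prev, curr, tol=1e-9):
    """Atomic comparison of curr against prev: '=', '>' or '<' (assumed convention)."""
    if abs(curr - prev) <= tol:
        return "="
    return ">" if curr > prev else "<"


def encode(values, tol=1e-9):
    """Collapse the consecutive comparisons of a value stream into a pattern string."""
    tokens = [compare(a, b, tol) for a, b in zip(values, values[1:])]
    return "".join(token for token, _ in groupby(tokens))


def decode(pattern, start=0.0, run_length=3, step=1.0):
    """Generate ONE value stream matching the pattern; the parameters are the heuristic."""
    change = {"=": 0, ">": step, "<": -step}
    values = [start]
    for token in pattern:
        for _ in range(run_length):
            values.append(values[-1] + change[token])
    return values


# Two different heuristics yield two different streams for the same pattern.
first = decode("=>=", start=10, run_length=2, step=5)
second = decode("=>=", start=0, run_length=1, step=2)
print(first, encode(first))
print(second, encode(second))
```

Both generated streams re-encode to the same pattern, which reflects the point made above: a concept constrains the shape of the output but not its magnitudes, so the final extent has to be selected by some additional criterion.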

Furthermore, the fact that all concepts can be readily expressed in natural language suggests two crucial points:

Semantic encoding of sensor data: The model efficiently encodes sensory data, both static (nouns) and dynamic (verbs), from the real world into human-readable language. This ability bridges the gap between the physical world and natural language representation.

Generative and creative output: By combining existing concepts into new ones through natural language queries, the model can generate novel objects, broadening its repertoire of representations. This opens up the possibility of imagining and creating novel sensory experiences.

Building on these insights, further research could delve into the potential of this model as the foundation for a natural language engine for sensory data, capable of interpreting and generating human-understandable descriptions of the world around us. Such an engine could play a significant role in multi-agent and human-agent ensembles, improving communication and collaboration between intelligent systems and humans (Black et al., 2022; Lemon, 2022). Recent advances in large language models further emphasise the potential of this approach to improve the way machines interact with and comprehend the world through language.

7. Conclusions and Future Work

This article has introduced BIGA, a model that extracts human-relatable concepts from spatial sensor data. Inspired by Bateson’s idea of difference as the fundamental building block of a concept, the model utilises atomic value comparisons ($=, >, <$) to extract basic features from the data. These features are then recursively combined to form more complex concepts by identifying differences between previously detected differences. The model formulation is formal, generalisable and applicable to a wide range of spatial sensor data. However, it is specifically designed for everyday objects, rather than abstract or complex time-series data. This distinction is crucial, as numerous approaches exist for summarising time-series data using natural language. The present work distinguishes itself by prioritising generalisation, composability, flexibility, and simplicity over accuracy in pattern discrimination. These characteristics make it suitable for extracting meaningful concepts from everyday sensory experiences, enabling machines to interact with and understand the world in a more human-like manner.
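As a reminder of what an atomic comparison looks like in practice, the following Python sketch tokenises a short stream of readings from a single sensor parameter. It is a minimal illustration and not the BIGA implementation: the function names, the tolerance `tol`, the convention that `>` marks an increase relative to the previous reading, and the step that merges runs of identical comparators into spans are all assumptions.

```python
from itertools import groupby


def compare(prev, curr, tol=0.0):
    """Atomic comparison of the current reading against the previous one."""
    if abs(curr - prev) <= tol:
        return "="
    return ">" if curr > prev else "<"


def first_order(values, tol=0.0):
    """One comparator per pair of consecutive readings."""
    return [compare(a, b, tol) for a, b in zip(values, values[1:])]


def collapse(tokens):
    """Merge runs of identical comparators into (token, start, end) spans, end exclusive."""
    spans, start = [], 0
    for token, run in groupby(tokens):
        length = len(list(run))
        spans.append((token, start, start + length))
        start += length
    return spans


def pattern(values, tol=0.0):
    """Pattern string of a stream, e.g. a flat, rising, then flat stretch gives '=>='."""
    return "".join(token for token, _, _ in collapse(first_order(values, tol)))


# Hypothetical angle readings along a curve.
angle = [10, 10, 12, 15, 15, 15]
print(first_order(angle))
print(collapse(first_order(angle)))
print(pattern(angle))
```

The spans returned by `collapse` play a role loosely analogous to the intervals listed in the extents of the concept tables, while the pattern string resembles the comparator sequences that appear in the intents; the exact correspondence depends on the formal definitions given in the methodology.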

As outlined in the methodology, the proposed model is inherently designed for generalisation and composability. It also possesses reasoning and potentially generative capabilities, fuelled by formal concept analysis. Three experiments were carried out on a sample set of eight curves of variable thickness. Within the set, every curve is visually similar to one other curve, forming four pairs in total. The first experiment assessed the ability of the model to find common patterns among all curves in the sample set. The model found a total of nine common concepts that were present in all the curves, across the four sensor combinations tested (see Table 7). The second experiment explored the capacity of the model to assimilate objects that are similar, yet different. This was tested by determining whether or not the model was able to generate a unique conceptual representation for each pair of visually similar curves. The goal was achieved by using the sensor combination $(angle, x)$, from which the model was able to build unique concept structures for each of the four curve pairs $(a, b)$, $(c, d)$, $(e, f)$, and $(g, h)$. The third experiment tested the discrimination potential of the method. Given the complete sample set $(a\text{--}h)$, the model was required to produce distinct conceptual structures for each curve. Upon running the algorithm for the sensor combination $(angle, width, x, y)$, the model was able to discriminate each and every curve of the set individually (as shown in Table 12).

The comparison of results showed that a powerful neural model requires more than 400 samples to perform at the same level as what is achieved here with only one or two samples per category. Overall, the results demonstrate the ability of the method to generate ‘fairly’ rich representations of the data. These representations effectively discriminate and assimilate objects as needed. Additionally, the model can elaborate multiple representations of the same object with remarkable ease, which is one of the five pillars of creative behaviour proposed by Rowe & Partridge (1993). Optionally, training processes can be incorporated to further refine these operations. This flexibility enables the model to adapt to various data sets and tasks, making it a versatile tool for extracting meaningful concepts from spatial sensor data.

The proposed model aligns with the aspirations driving the recent surge in neural-symbolic developments. Neural-symbolic research aims to address the shortcomings of purely symbolic models, which excel in generalisation and composability but struggle to extract meaningful representations from complex data. Connectionist approaches, particularly deep learning, have demonstrated remarkable success in this regard. By combining neural and symbolic methods, researchers are creating ML models that exhibit superior generalisation, enhanced out-of-distribution learning capabilities, greater composability than their early deep learning counterparts, and the ability to generate extremely rich representations. However, these models typically require extensive training and the labelling of data samples remains a significant challenge for industry applications. In contrast, the method presented in this article eliminates the need for extensive training. It generates comprehensive conceptual representations from a minimal set of individual features. Moreover, these representations can be readily expressed in natural language without the need for feature labelling. Furthermore, every concept can be traced back to its underlying atomic comparisons, providing full explainability of the model’s reasoning process in natural language.

As discussed earlier, the concepts generated by this method can be static or dynamic, encompassing both object- and movement-related concepts (nouns and verbs). The ability to trace back each concept to its underlying atomic comparisons not only enhances explainability but also suggests the potential to build a basic language from primitive tokens such as sensor parameter identifiers and comparators ($=, >, <$). This bottom-up approach is particularly notable, as the model spontaneously constructs semantic structures of complex concepts without requiring external intervention or label assignment beyond these basic tokens. This capability holds true even when the model senses a new object for the first time. Consequently, Bateson’s principle could provide a novel avenue for exploring the SGP addressed in this article.

In light of these promising aspects, the methodology presented here has the potential to make a significant impact in the field of machine learning. The main contribution is that it offers a straightforward algorithm to seamlessly integrate perception with semantic conceptual structures. This integration is achieved without the need to combine models from different AI paradigms, unlike NSMs, and without limiting the model’s ability to create extensive conceptual richness. While the current implementation may not match the granularity of representations achieved in more advanced neural-symbolic approaches, the experimentation section does demonstrate promising results. However, it is imperative to formally assess the model’s performance through rigorous benchmarking against well-established datasets. With continued research efforts within the scientific community, the gap between the proposed method and current NSMs could be significantly narrowed.

In conclusion, the proposed methodology, based on Bateson’s principle for generating concepts from sensor data, calls for further attention and exploration. If this approach can demonstrate the ability to produce sufficiently rich representations, it may unlock the full potential of symbolic models without relying solely on neural networks for data acquisition and processing.

7.1. Future Work

Throughout this article, a number of suggestions for future work have been made, either in the form of limitations of the model or directly expressed as potential next developments. Although many of them have already been elaborated upon, a brief summary of all the items is listed below for clarity. They have been grouped in two sections: (i) general research avenues and (ii) specific improvements of the model.

1. General research avenues:

Researching the SGP and autonomous language emergence through AI agents and consensus-building.

Developing bio-inspired connectionist implementations of the model.

Expanding the model to higher dimensions beyond curves or trajectories (including pixel-based data such as images and spatio-temporal data such as video).

Exploring the generative capacities of the model.

Integrating the model with other neural-symbolic methods discussed in this article, such as those of Evans et al. (2021a, 2021b) and Garnelo et al. (2016).

Improving the explainability of AI models by exploring comparisons of concepts obtained by the model with the regions and features extracted by methods from the CAM family.

2. Specific improvements of the model:

Training: Developing a training method upon the concept structures extracted by the algorithm is an important next step. Such a method would allow the model’s unsupervised learning capabilities to be leveraged for supervised learning as well.

Benchmarking: It is essential to formally evaluate the model’s performance by conducting rigorous benchmarking against well-established datasets (e.g., Omniglot for curves). If a training method is implemented as indicated in the item above, then it would be possible to carry out this important endeavour.

Handling non-numerical values as input (e.g. symbolic input): This would enable, for instance, expressions of the sort ‘transition from no-sensing to sensing’ and vice versa. Such encodings are important, for example, when handling disconnected curves. This is the case when the objects being sensed comprise multiple curves that are not necessarily sequentially connected to each other (e.g. the letter ‘X’, but not the letter ‘U’).

Magnitudes: As shown in Figure 10, a significant challenge for this model is the generation of concepts that quantify changes in sensor variables. The core difficulty is to achieve this without introducing more external symbolic information into the model. Three strategies have been proposed earlier to address this limitation; they are listed here as the next three items.

Arithmetic operations: The model can be fitted to perform arithmetic operations on sensor values to derive symbolic properties from them. For example, it can aggregate metrics such as segment count, total length, average thickness, and other attributes from intervals within the extent of a concept.

Comparison spaces: The model can generate comparison spaces between objects and derive new sensor data from their relative differences. For instance, when analysing vehicle movement, it can monitor the distance between two vehicle trajectories over time and use this distance as an additional sensor parameter.

Memory: The model could incorporate a memory module to store specific values and concepts, allowing for retrospective comparisons that extend beyond the pairwise comparisons used in the current method.

Noise and overarching concepts: As shown in Figure 11, the model fails to detect overarching concepts in the presence of smaller, nested ones. A viable strategy to address this limitation could be to embrace a more embodied approach, as discussed previously in the paper.

Footnotes

ORCID iD

Fernando Sancho Caparrini

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Appendix A

Table 16.

Concepts $C_{10}$ – $C_{28}$ (All Remaining Concepts).

Concept	Intent	Extent
$C_{10}$	$\{a: =>=,\ w: >=,\ a: =,\ a: >=>=,\ a: >,\ a: =>,\ a: >=>,\ a: =>=>=,\ w: >,\ w: =,\ a: =>=>,\ a: >=\}$	$-$
$C_{11}$	$\{a: >=\}$	$[2, 9], [2, 11], [11, 16]$
$C_{12}$	$\{a: >=,\ w: =\}$	$[11, 16]$
$C_{13}$	$\{a: >=,\ w: >\}$	$[2, 9]$
$C_{14}$	$\{a: >=,\ w: >=\}$	$[2, 11]$
$C_{15}$	$\{a: >=>,\ w: >=\}$	$[2, 15]$
$C_{16}$	$\{a: >=>=,\ w: >=\}$	$[2, 16]$
$C_{17}$	$\{a: =,\ w: >=\}$	$[6, 11]$
$C_{18}$	$\{a: =>\}$	$[0, 6], [6, 15], [9, 15]$
$C_{19}$	$\{a: =>,\ w: =\}$	$[9, 15]$
$C_{20}$	$\{a: =>,\ w: >\}$	$[0, 6]$
$C_{21}$	$\{a: =>,\ w: >=\}$	$[6, 15]$
$C_{22}$	$\{a: =>=\}$	$[0, 9], [0, 11], [6, 16], [9, 16]$
$C_{23}$	$\{a: =>=,\ w: =\}$	$[9, 16]$
$C_{24}$	$\{a: =>=,\ w: >\}$	$[0, 9]$
$C_{25}$	$\{a: =>=,\ w: >=\}$	$[0, 11], [6, 16]$
$C_{26}$	$\{a: =>=>,\ w: >=\}$	$[0, 15]$
$C_{27}$	$\{a: =>=>=,\ w: >=\}$	$[0, 16]$
$C_{28}$	$\{w: >=\}$	$[0, 11], [0, 15], [0, 16], [2, 11], [2, 15], [2, 16], [6, 11], [6, 15], [6, 16]$
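The extents in Table 16 can be cross-checked against a standard property of formal concept analysis: the objects that share the union of two attribute sets are exactly the objects in the intersection of the two corresponding extents. The Python sketch below verifies this for three rows of the table. The intervals are copied from Table 16; the dictionary layout, the attribute strings and the names `EXTENT` and `joint_extent` are illustrative assumptions, not part of the published method.

```python
# Extents copied from Table 16; each interval is a (start, end) pair.
EXTENT = {
    frozenset({"a:>="}): {(2, 9), (2, 11), (11, 16)},                  # C11
    frozenset({"a:=>"}): {(0, 6), (6, 15), (9, 15)},                   # C18
    frozenset({"a:=>="}): {(0, 9), (0, 11), (6, 16), (9, 16)},         # C22
    frozenset({"w:>="}): {(0, 11), (0, 15), (0, 16), (2, 11), (2, 15),
                          (2, 16), (6, 11), (6, 15), (6, 16)},         # C28
    frozenset({"a:>=", "w:>="}): {(2, 11)},                            # C14
    frozenset({"a:=>", "w:>="}): {(6, 15)},                            # C21
    frozenset({"a:=>=", "w:>="}): {(0, 11), (6, 16)},                  # C25
}


def joint_extent(*intents):
    """Intersect the extents of the given intents."""
    result = None
    for intent in intents:
        result = EXTENT[intent] if result is None else result & EXTENT[intent]
    return result


W = frozenset({"w:>="})
for attribute in ("a:>=", "a:=>", "a:=>="):
    single = frozenset({attribute})
    combined = single | W
    # The intersection of the two extents should equal the tabulated extent.
    print(sorted(combined), joint_extent(single, W) == EXTENT[combined])
```

For example, intersecting the extent of $C_{11}$ with that of $C_{28}$ leaves only the interval $[2, 11]$, which is the extent listed for $C_{14}$.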

Figure 16.

Concept lattice representation for curves (c) and (d) considering variables $\{angle, x\}$.

Figure 17.

Concept lattice representation for curves (e) and (f) considering variables $\{angle, x\}$.

Figure 18.

Concept lattice representation for curves (g) and (h) considering variables $\{angle, x\}$.

References

1. Aoki, Kobayashi (2016). Linguistic summarization using a weighted n-gram language model based on the similarity of time-series data. In 2016 IEEE international conference on fuzzy systems, FUZZ-IEEE 2016 (pp. 595–601). https://doi.org/10.1109/FUZZ-IEEE.2016.7737741

2. Aragón R. G., Medina, Ramírez-Poussa (2021). Reducing concept lattices by means of a weaker notion of congruence. Fuzzy Sets and Systems, 418, 153–169. https://doi.org/10.1016/J.FSS.2020.09.013

3. Asai, Fukunaga (2017). Classical planning in deep latent space: Bridging the subsymbolic-symbolic boundary. arXiv.

4. Ayar, Isazadeh, Gharehchopogh F. S., Seyedi (2022). Chaotic-based divide-and-conquer feature selection method and its application in cardiac arrhythmia classification. The Journal of Supercomputing, 78(4), 5856–5882.

5. Aydogan (2023). Interval type-2 fuzzy linguistic summarization using restriction levels. Neural Computing and Applications, 35, 24947–24957. https://doi.org/10.1007/S00521-023-09002-0

6. Bandt, Pompe (2002). Permutation entropy: A natural complexity measure for time series. Physical Review Letters, 88, 4. https://doi.org/10.1103/PHYSREVLETT.88.174102

7. Bateson (1999). Steps to an ecology of mind: Collected essays in anthropology, psychiatry, evolution, and epistemology. University of Chicago Press. https://doi.org/10.7208/CHICAGO/9780226924601.001.0001

8. Baydogan M. G., Runger (2014). Learning a symbolic representation for multivariate time series classification. Data Mining and Knowledge Discovery, 29, 400–422. https://doi.org/10.1007/S10618-014-0349-Y

9. Bechberger (2021). Towards conceptual logic tensor networks. CEUR Workshop Proceedings, 2969. https://osnascholar.ub.uni-osnabrueck.de/handle/unios/18216

10. Bechberger, Kühnberger K. U. (2018). A comprehensive implementation of conceptual spaces. CEUR Workshop Proceedings, 2090, 41–54.

11. Belohlávek, Sklenár (2005a). Formal concept analysis constrained by attribute-dependency formulas. Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science), 3403, 176–191. https://doi.org/10.1007/978-3-540-32262-7_12

12. Belohlávek, Sklenár (2005b). Formal concept analysis over attributes with levels of granularity. Proceedings - International Conference on Computational Intelligence for Modelling, Control and Automation, CIMCA 2005 and International Conference on Intelligent Agents, Web Technologies and Internet, 1, 619–624. https://doi.org/10.1109/cimca.2005.1631332

13. Belohlávek, Sklenár, Zacpal (2004). Concept lattices constrained by equivalence relations. CEUR Workshop Proceedings, 110, 58–66.

14. Belohlavek, Sklenár, Zacpal (2004). Formal concept analysis with hierarchically ordered attributes. International Journal of General Systems, 33, 283–294. https://doi.org/10.1080/03081070410001679715

15. Belohlávek, Vychodil (2005). Reducing the size of fuzzy concept lattices by hedges. In IEEE international conference on fuzzy systems (pp. 663–668).

16. Bengio (2017). The consciousness prior. arXiv 10.48550/arxiv.1709.08568. https://arxiv.org/abs/1709.08568v2

17. Black, Brandão, Cocarascu, Keijzer B. D., Long, Luck, McBurney, Meroño-Peñuela, Miles, Modgil, Moreau, Polukarov, Rodrigues, Ventre (2022). Reasoning and interaction for social artificial intelligence. AI Communications, 35, 309–325. https://doi.org/10.3233/AIC-220133

18. Borrego-Díaz, Páez J. G. (2022). Knowledge representation for explainable artificial intelligence: Modeling foundations from complex systems. Complex and Intelligent Systems, 8, 1579–1601. https://doi.org/10.1007/S40747-021-00613-5

19. Boukhetta S. E., Richard, Demko, Bertet (2020). Interval-based sequence mining using FCA and the next priority concept algorithm. CEUR Workshop Proceedings, 2729, 91–102.

20. Boulanouar, Hadjali, Lagha (2020). A hybrid approach for linguistic summarization of time series. 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy, ICDABI 2020. https://doi.org/10.1109/ICDABI51230.2020.9325701

21. Bountrogiannis, Tzagkarakis, Tsakalides (2023). Distribution agnostic symbolic representations for time series dimensionality reduction and online anomaly detection. IEEE Transactions on Knowledge and Data Engineering, 35, 5752–5766. https://doi.org/10.1109/TKDE.2022.3174630

22. Cárdenas-García J. F. (2022). The central dogma of information. https://doi.org/10.3390/info13080365

23. Cárdenas-García J. F., Ireland (2020). Bateson information revisited: A new paradigm. Proceedings, 47, 5. https://doi.org/10.3390/PROCEEDINGS2020047005. https://www.mdpi.com/2504-3900/47/1/5

24. Castillo-Ortega R. M., Marín, Sánchez (2011). A fuzzy approach to the linguistic summarization of time series. Journal of Multiple-Valued Logic and Soft Computing, 17, 157–182.

25. Chang, Flokas, Lipson, Spranger (2020). Assessing SATNet’s ability to solve the symbol grounding problem. Volume 2020-December.

26. Chattopadhay, Sarkar, Howlader, Balasubramanian V. N. (2018). Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In 2018 IEEE winter conference on applications of computer vision (WACV), IEEE. https://doi.org/10.1109/WACV.2018.00097

27. Chollet et al. (2015). Keras. https://keras.io

28. Cristianini (2014). On the current paradigm in artificial intelligence. AI Communications, 27, 37–43. https://doi.org/10.3233/AIC-130582

29. Dai W.-Z., Zhou Z.-H. (2019). Bridging machine learning and logical reasoning by abductive learning. arXiv.

30. Dattani, Bramer (1996). Utilizing symbol hierarchies and qualitative models for knowledge guided induction. Number 198, pp. 2/1–2/4.

31. de Miguel Rodríguez (2023). Concept emergence from complex sensory data: A connectionist model. https://doi.org/10.31219/osf.io/z47jb

32. Deng (2012). The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Processing Magazine, 29(6), 141–142. https://doi.org/10.1109/MSP.2012.2211477

33. Díaz-Rodríguez, Lamas, Sanchez, Franchi, Donadello, Tabik, Filliat, Cruz, Montes, Herrera (2022). Explainable neural-symbolic learning (X-NeSyL) methodology to fuse deep learning representations with expert knowledge graphs: The MonuMAI cultural heritage use case. Information Fusion, 79, 58–83.

34. Dushkin R. V., Stepankov V. Y. (2023). Principles of solving the symbol grounding problem in the development of the general artificial cognitive agents. Lecture Notes in Networks and Systems, 544 LNNS, 231–245. https://doi.org/10.1007/978-3-031-16075-2_15

35. Evans, Bošnjak, Buesing, Ellis, Pfau, Kohli, Sergot (2021a). Making sense of raw input. Artificial Intelligence, 299, 103521. https://doi.org/10.1016/J.ARTINT.2021.103521

36. Evans, Hernández-Orallo, Welbl, Kohli, Sergot (2021b). Making sense of sensory input. Artificial Intelligence, 293. https://doi.org/10.1016/J.ARTINT.2020.103438

37. Fdez-Riverola, Corchado J. M. (2003). Forecasting red tides using an hybrid neuro-symbolic system. AI Communications, 16, 221–233.

38. Garcez A. D., Besold T. R., Raedt L. D., Foldiak, Hitzler, Icard, Kühnberger K. U., Lamb L. C., Miikkulainen, Silver D. L. (2015). Neural-symbolic learning and reasoning: Contributions and challenges (Report No. SS-15-03/18-21). AAAI Spring Symposium.

39. Gärdenfors (1990). Induction, conceptual spaces and AI. Philosophy of Science, 57, 78–95. https://doi.org/10.1086/289532

40. Garnelo, Arulkumaran, Shanahan (2016). Towards deep symbolic reinforcement learning. arXiv.

41. Gharehchopogh F. S. (2023). An improved Harris hawks optimization algorithm with multi-strategy for community detection in social network. Journal of Bionic Engineering, 20(3), 1175–1197.

42. Goguen (2005). What is a concept? Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3596 LNAI, 52–77. https://doi.org/10.1007/11524564_4

43. Hao, Yang, Guo, Nasridinov, Park D. S. (2021). On invariance of concept stability for attribute reduction in concept lattice. Lecture Notes in Electrical Engineering, 715, 101–106. https://doi.org/10.1007/978-981-15-9343-7_14

44. Harnad (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42, 335–346. https://doi.org/10.1016/0167-2789(90)90087-6

45. Harnad (2006). Symbol-grounding Problem. John Wiley & Sons, Ltd. https://doi.org/10.1002/0470018860.s00025

46. Cao, Yuan, Zhang (2018). Fast deep neural networks with knowledge guided training and predicted regions of interests for real-time video object detection. IEEE Access, 6, 8990–8999. https://doi.org/10.1109/ACCESS.2018.2795798

47. Hitzler, Sarker M. K., Besold T. R., Garcez A. D., Bader, Bowman, Domingos, Hitzler, Kühnberger K. U., Lamb L. C., Lima P. M. H. V., Penning L. D., Pinkas, Poon, Zaverucha (2022). Neural-symbolic learning and reasoning: A survey and interpretation. Frontiers in Artificial Intelligence and Applications, 342, 1–51. https://doi.org/10.3233/FAIA210348

48. Imamura, Nakamura (2021). SPIKELET: An adaptive symbolic approximation for finding higher-level structure in time series. In Proceedings - IEEE international conference on data mining, ICDM (pp. 1120–1125). https://doi.org/10.1109/ICDM51629.2021.00132

49. Imani, Keogh (2019). Matrix profile XIX: Time series semantic motifs: A new primitive for finding higher-level structure in time series. In Industrial conference on data mining (pp. 329–338). https://doi.org/10.1109/ICDM.2019.00043

50. Johnson, Fei-Fei, Hariharan, Zitnick C. L., Maaten L. V. D., Girshick (2017). CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning. In Proceedings - 30th IEEE conference on computer vision and pattern recognition, CVPR 2017 (pp. 1988–1997). https://doi.org/10.1109/CVPR.2017.215

51. R. G. F. (1994). Instruments, nerve action, and the all-or-none principle. Instruments, 9, 208–235. https://www.jstor.org/stable/302006

52. Kacprzyk, Wilbik, Zadrozny (2007). Linguistic summarization of time series under different granulation of describing features. Volume 4585 LNAI, 230–240. Springer Verlag. https://doi.org/10.1007/978-3-540-73451-2_25

53. Kaczmarek-Majer, Hryniewicz (2019). Application of linguistic summarization methods in time series forecasting. Information Sciences, 478, 580–594. https://doi.org/10.1016/J.INS.2018.11.036

54. Kent R. E. (1996). Rough concept analysis: A synthesis of rough sets and formal concept analysis. Fundamenta Informaticae, 27, 169–181. https://doi.org/10.3233/FI-1996-272305

55. Kloska, Rozinajova (2021). Towards symbolic time series representation improved by kernel density estimators. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 12930 LNCS, 25–45. https://doi.org/10.1007/978-3-662-64553-6_2

56. Lake B. M., Salakhutdinov, Tenenbaum J. B. (2015). Human-level concept learning through probabilistic program induction. Science (New York, N.Y.), 350, 1332–1338. https://doi.org/10.1126/SCIENCE.AAB3050

57. Lecun, Bengio, Hinton (2015). Deep learning. Nature, 521, 436–444. https://doi.org/10.1038/NATURE14539

58. Lemon (2022). Conversational AI for multi-agent communication in natural language. AI Communications, 35, 295–308. https://doi.org/10.3233/AIC-220147

59. Mao (2022). The difficulties in symbol grounding problem and the direction for solving it. Philosophies, 7(5), 108. https://doi.org/10.3390/philosophies7050108

60. Shen (2022). A new symbolic representation method for time series. Information Sciences, 609, 276–303. https://doi.org/10.1016/J.INS.2022.07.047

61.

Lin

Keogh

Lonardi

Chiu

(2003). A symbolic representation of time series, with implications for streaming algorithms. Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, DMKD ’03, 2–11. https://doi.org/10.1145/882082.882086.

62. Min, W. K., Kim, Y. K. (2019). Soft concept lattice for formal concept analysis based on soft sets: Theoretical foundations and applications. Soft Computing, 23, 9657–9668. https://doi.org/10.1007/S00500-018-3532-Z

63. Mouakher, Ragobert, Gerin (2021). Conceptual coverage driven by essential concepts: A formal concept analysis approach. Mathematics, 9(21), 2694. https://doi.org/10.3390/MATH9212694

64. Nagoev, Nagoeva, Anchokov, Bzhikhatlov, Kankulov, Enes (2023). The symbol grounding problem in the system of general artificial intelligence based on multi-agent neurocognitive architecture. Cognitive Systems Research, 79, 71–84. https://doi.org/10.1016/j.cogsys.2023.01.002

65. Nevens, Eecke, P. V., Beuls (2020). From continuous observations to symbolic concepts: A discrimination-based strategy for grounded concept learning. Frontiers in Robotics and AI, 7. https://doi.org/10.3389/FROBT.2020.00084

66. Newell (1980). Physical symbol systems. Cognitive Science, 4, 135–183. https://doi.org/10.1016/S0364-0213(80)80015-2

67. Nguyen, T. L., Ifrim (2023). Fast time series classification with random symbolic subsequences. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 13812 LNAI, 50–65. https://doi.org/10.1007/978-3-031-24378-3_4

68. Özdogan, Boran, F. E., Akay (2021). A possibilistic approach for interval type-2 fuzzy linguistic summarization of time series. Artificial Intelligence Review, 54, 3991–4018. https://doi.org/10.1007/S10462-020-09945-Z

69. Pitts, McCulloch, W. S. (1947). How we know universals: The perception of auditory and visual forms. The Bulletin of Mathematical Biophysics, 9(3), 127–147. https://doi.org/10.1007/BF02478291

70. Rowe, Partridge (1993). Creativity: A survey of AI approaches. Artificial Intelligence Review, 7(1), 43–70. https://doi.org/10.1007/BF00849197

71. Saquer, Deogun, J. S. (1999). Formal rough concept analysis. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1711, 91–99. https://doi.org/10.1007/978-3-540-48061-7_13

72. Saquer, Deogun, J. S. (2001). A fuzzy approach for approximating formal concepts. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2005, 269–276. https://doi.org/10.1007/3-540-45554-X_32

73. Sarker, M. K., Zhou, Eberhart, Hitzler (2021). Neuro-symbolic artificial intelligence. AI Communications, 34, 197–209. https://doi.org/10.3233/AIC-210084

74. Schäfer, Högqvist (2012). SFA: A symbolic Fourier approximation and index for similarity search in high dimensional datasets. In ACM International conference proceeding series (pp. 516–527). https://doi.org/10.1145/2247596.2247656

75. Schwenke, Atzmueller (2023). Making time series embeddings more interpretable in deep learning: Extracting higher-level features via symbolic approximation representations. Proceedings of the International Florida Artificial Intelligence Research Society Conference, FLAIRS, 36(1). https://doi.org/10.32473/FLAIRS.36.133107

76. Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417–424. https://doi.org/10.1017/S0140525X00005756

77. Selvaraju, R. R., Cogswell, Das, Vedantam, Parikh, Batra (2019). Grad-CAM: Visual explanations from deep networks via gradient-based localization. International Journal of Computer Vision, 128(2), 336–359. https://doi.org/10.1007/s11263-019-01228-7

78. Simon, H. A. (1995). Artificial intelligence: An empirical science. Artificial Intelligence, 77, 95–127. https://doi.org/10.1016/0004-3702(95)00039-H

79. Sowa, J. F. (2005). Categorization in cognitive computer science (pp. 141–163). Elsevier Ltd. https://doi.org/10.1016/B978-008044612-7/50061-5

80. Taddeo, Floridi (2005). Solving the symbol grounding problem: A critical review of fifteen years of research. Journal of Experimental and Theoretical Artificial Intelligence, 17, 419–445. https://doi.org/10.1080/09528130500284053

81. Taddeo, Floridi (2007). A praxical solution of the symbol grounding problem. Minds and Machines, 17, 369–389. https://doi.org/10.1007/S11023-007-9081-3

82. Topan, Rolnick (2021). Techniques for symbol grounding with SATNet. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, & J. W. Vaughan (Eds.), Advances in neural information processing systems (pp. 20733–20744). Curran Associates, Inc.

83. Vogt (2003). Anchoring symbols to sensorimotor control. https://web-archive.southampton.ac.uk/cogprints.org/3060/

84. Wang, P. W., Donti, P. L., Wilder, Kolter (2019). SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver. In 36th International conference on machine learning, ICML 2019 (pp. 11373–11386).

85. Wiener (1948). Cybernetics or Control and Communication in the Animal and the Machine. Cambridge, MA: MIT Press. ISBN 978-0-262-73009-9.

86. Wille (1982). Restructuring lattice theory: An approach based on hierarchies of concepts. Ordered Sets, 445–470. https://doi.org/10.1007/978-94-009-7798-3_15

87. W. Z., Leung, J. S. (2009). Granular computing and knowledge reduction in formal contexts. IEEE Transactions on Knowledge and Data Engineering, 21, 1461–1474. https://doi.org/10.1109/TKDE.2008.223

88. Xia, Lee, K. Z., Bengio, Bareinboim (2021). The causal-neural connection: Expressiveness, learnability, and inference. Advances in Neural Information Processing Systems, 13, 10823–10836.

89. Yager, R. R. (1982). A new approach to the summarization of data. Information Sciences, 28, 69–86. https://doi.org/10.1016/0020-0255(82)90033-0

90. Yao, Y. Y. (2000). Granular computing: Basic issues and possible solutions. Proceedings of the Joint Conference on Information Sciences, 5, 186–189.

91. Becker, Trinh, L. M., Behrisch (2023). Saxregex: Multivariate time series pattern search with symbolic representation, regular expression, and query expansion. Computers and Graphics (Pergamon), 112, 13–21. https://doi.org/10.1016/J.CAG.2023.03.002

92. Zadeh, L. A. (1979). Fuzzy sets and information granularity. Advances in Fuzzy Set Theory and Applications, 3–18. https://doi.org/10.1142/9789814261302_0022

93. Zhang, W. X., J. M., Fan, S. Q. (2007). Variable threshold concept lattices. Information Sciences, 177, 4883–4892. https://doi.org/10.1016/J.INS.2007.05.031

94. Zhou, Z. H., Jiang, Chen, S. F. (2003). Extracting symbolic rules from trained neural network ensembles. AI Communications, 16(1), 3–15.

95. Zhou, Khosla, Lapedriza, Oliva, Torralba (2015). Learning deep features for discriminative localization. https://arxiv.org/abs/1512.04150

96. Zunino, Olivares, Ribeiro, H. V., Rosso, O. A. (2022). Permutation Jensen-Shannon distance: A versatile and fast symbolic tool for complex time-series analysis. Physical Review E, 105(4), 045310. https://doi.org/10.1103/PHYSREVE.105.045310

	Unsupervised learning	No external knowledge	Sensory data encoding ability	Semantic capacity	Application scope
Wang et al. (2019)	$\circ$	$\circ$	$\circ$	$∙$	$∙ ∙$
Topan et al. (2021)	$\circ$	$∙ ∙$	$∙ ∙$	$∙$	$∙$
Dai et al. (2019)	$\circ$	$∙ ∙$	$∙ ∙$	$∙$	$∙$
Asai & Fukunaga (2017)	$\circ$	$∙ ∙$	$∙ ∙$	$∙$	$∙$
Garnelo et al. (2016)	$\circ$	$\circ$	$∙ ∙$	$∙$	$∙ ∙$
Evans et al. (2021b)	$∙ ∙$	$∙ ∙$	$\circ$	$∙ ∙$	$∙$
Evans et al. (2021a)	$\circ$	$∙ ∙$	$∙ ∙$	$∙$	$∙ ∙$
Lake et al. (2015)	$∙ ∙$	$∙$	$∙$	$\circ$	$∙$
Bechberger (2021)	$\circ$	$\circ$	$∙ ∙$	$∙$	$∙ ∙$
Nevens et al. (2020)	$\circ$	$\circ$	$∙ ∙$	$∙$	$∙ ∙$
BIGA (present method)	$∙ ∙$	$∙$	$∙$	$∙ ∙$	$∙$

$Angle$
	a	b	c	d	e	f	g	h
a		0	0	0	1	1	0	0
b	0		0	0	1	1	0	0
c	2	2		0	2	2	2	2
d	2	2	0		2	2	2	2
e	1	1	0	0		0	1	1
f	1	1	0	0	0		1	1
g	0	0	0	0	1	1		0
h	0	0	0	0	1	1	0

$Angle$, $width$
	a	b	c	d	e	f	g	h
a		4	6	5	9	9	9	9
b	4		5	6	9	9	11	9
c	14	13		7	12	12	19	15
d	11	12	5		12	12	15	10
e	9	9	4	6		0	15	15
f	9	9	4	6	0		15	15
g	0	2	2	0	6	6		0
h	5	5	3	0	11	11	5

$Angle$, $x$, $y$
	a	b	c	d	e	f	g	h
a		0	11	14	48	48	26	26
b	0		11	14	48	48	26	26
c	62	62		14	95	95	53	53
d	97	97	46		128	128	93	93
e	14	14	10	11		0	5	5
f	14	14	10	11	0		5	5
g	37	37	13	21	50	50		0
h	37	37	13	21	50	50	0

$Angle$, $x$
	a	b	c	d	e	f	g	h
a		0	2	2	7	7	2	2
b	0		2	2	7	7	2	2
c	15	15		0	15	15	12	12
d	15	15	0		15	15	12	12
e	7	7	2	2		0	5	5
f	7	7	2	2	0		5	5
g	7	7	4	4	10	10		0
h	7	7	4	4	10	10	0

$Angle$, $width$, $x$, $y$
	a	b	c	d	e	f	g	h
a		85	79	74	132	135	104	102
b	94		95	93	136	141	110	113
c	194	201		65	231	238	189	196
d	250	260	126		295	302	249	258
e	61	56	45	48		25	56	51
f	58	55	46	49	19		56	49
g	94	91	64	63	117	123		51
h	125	127	104	105	145	149	84

A Bateson-Inspired Model for the Generation of Semantic Concepts from Sensory Data

Table 1. A Simple Formal Context Example With Four Objects and Three Attributes.

$G / M$	$m_{1}$	$m_{2}$	$m_{3}$
$g_{1}$	0	1	1
$g_{2}$	1	0	1
$g_{3}$	1	0	0
$g_{4}$	0	1	0

Table 3. Symbolic Comparison Scheme of the Nine Possible Basic Transition Combinations.

Table 8. Number of Differing Concepts With Parameters: $Angle$.

	a	b	c	d	e	f	g	h
a		0	0	0	1	1	0	0
b	0		0	0	1	1	0	0
c	2	2		0	2	2	2	2
d	2	2	0		2	2	2	2
e	1	1	0	0		0	1	1
f	1	1	0	0	0		1	1
g	0	0	0	0	1	1		0
h	0	0	0	0	1	1	0
