Abstract
In this position paper, we examine some of the assumptions held about logic and its relevance to the development of modern artificial intelligence (AI), which is primarily driven by deep learning. The paper aims to address fundamental misunderstandings about logic and ultimately argue for the benefits of symbolic formalisms in modeling uncertain worlds. While it is now recognized that statistical associations learned from data are limited in their ability to understand the world, there is still a great deal of criticism and hesitancy regarding the use of symbolic logic to achieve or support a broader vision for AI. By arguing that symbolic logic is more flexible than nonexperts believe, we make a case for neurosymbolic AI, which offers the best of both worlds.
Introduction
Artificial intelligence (AI) is widely acknowledged as a new kind of science that will bring about (and is already enabling) the next technological revolution. Virtually every week, exciting reports come our way about the use of AI for drug discovery, game playing, stock trading, and law enforcement. And virtually all of these are concerned with a very narrow technological capability: that of predicting future instances based on past instances.
Identifying statistical patterns, correlations, and associations is, without a doubt, extremely useful. In the first instance, it is needed in numerous applications to inspect features and properties of interest in observed data. It serves as the backbone of recommendation systems, for example, and is likely more than sufficient, even with flaws, when gathering context. While searching for “how to raise lambs” in an online bookstore, we might be a little disappointed if it suggests “Silence of the Lambs” by Thomas Harris, and somewhat annoyed if it suggests cookbooks on “how to cook lamb,” but such low-quality results are unlikely to have long-term effects. This type of AI might also be useful but somewhat problematic for, say, fast-tracking the review of job applications, provided these models are adjusted for bias, and a human intervenes to interpret the outcome and determine how to act further. This type of AI was largely believed to be sufficient for vision systems (Thrun et al., 2005), until it was observed that self-driving cars fail stupendously and that state-of-the-art systems can be fooled in strange and unnatural ways (Goodfellow et al., 2014).
Be that as it may, this is a very narrow view of AI capabilities. AI, as understood by both scientists and science fiction writers, is clearly much broader. What distinguishes big-data analysis from AI is the set of capabilities we wish to enable with the latter. We are interested in a general-purpose, autonomous computational entity that, at the very least, has agency. Many of these concerns were widely debated, discussed, and developed during the heyday of good old-fashioned AI (Dennett, 1989; Lakemeyer & Levesque, 2007; Levesque & Lakemeyer, 2001).
However, despite recognizing that data-driven statistical learning is limited in its ability to understand the world and model its knowledge (Marcus & Davis, 2019), there is still a lot of criticism and hesitancy about the use of symbolic logic to accomplish or assist in a broader vision for AI (Darwiche, 2018).
In this position paper, we examine some of the assumptions held about logic and its relevance to the development of modern AI, which is primarily driven by deep learning. The paper aims to address fundamental misunderstandings about logic and ultimately argue for the benefits of symbolic formalisms in modeling uncertain worlds. By arguing that symbolic logic is more flexible than nonexperts and critics believe, we make a case for neurosymbolic AI, which offers the best of both worlds.
Logic is Old-Fashioned
In the first part of this article, we will look at some of the criticisms against using logic. We then turn to a number of positive dimensions to examine the integration of logic and learning.
Neural Approaches and Nothing Else!
Modern AI has moved on, we are told. The idea of using symbolic logic is outdated, and the area of knowledge representation defined over symbolic logic is now affectionately (or perhaps pejoratively) called good old-fashioned AI or GOFAI for short. For example, Bengio et al. (2017) write: “… machine learning is the only viable approach to building AI systems that can operate in complicated real-world environments.”
In the early days of AI, John McCarthy put forward a profound idea to realize AI systems (McCarthy, 1959): he posited that what the system needs to know could be represented in a formal language, and a general-purpose algorithm would then deduce the actions needed to solve the problem at hand. The main advantage is that the representation can be scrutinized and understood by external observers, and the system’s behavior can be improved by making statements to it.
Numerous such languages emerged in the years to follow, but first-order logic remained at the forefront as a general and powerful option (Morgenstern & McIlraith, 2011). Propositional and first-order logic continue to serve as the underlying language for several areas in AI, including constraint satisfaction (Bistarelli et al., 2001), automated planning (Rintanen, 2012), database theory (Libkin, 2004), ontology specification (Konev et al., 2018), verification (Barrett et al., 2009), and knowledge representation (Levesque & Lakemeyer, 2001).
And yet, “modern” AI has decided that these efforts are superfluous, or at least easily replaceable once a training dataset has been created. For example, Sejnowski (2020) writes:
The early goals of machine learning were more modest than those of AI. Rather than aiming directly at general intelligence, machine learning started by attacking practical problems in perception, language, motor control, prediction, and inference using learning from data as the primary tool. In contrast, early attempts in AI were characterized by low-dimensional algorithms that were handcrafted. However, this approach only worked for well-controlled environments. For example, in blocks world, all objects were rectangular solids, identically painted, and in an environment with fixed lighting. These algorithms did not scale up to vision in the real world, where objects have complex shapes, a wide range of reflectances, and lighting conditions are uncontrolled. The real world is high-dimensional and there may not be any low-dimensional model that can be fit to it. Similar problems were encountered with early models of natural languages based on symbols and syntax, which ignored the complexities of semantics. Practical natural language applications became possible once the complexity of deep learning language models approached the complexity of the real world. Models of natural language with millions of parameters and trained with millions of labeled examples are now used routinely.
He goes on to suggest that:
Is there a path from the current state of the art in deep learning to artificial general intelligence? From the perspective of evolution, most animals can solve problems needed to survive in their niches, but general abstract reasoning emerged more recently in the human lineage. However, we are not very good at it and need long training to achieve the ability to reason logically. This is because we are using brain systems to simulate logical steps that have not been optimized for logic. Students in grade school work for years to master simple arithmetic, effectively emulating a digital computer with a 1-s clock. Nonetheless, reasoning in humans is proof of principle that it should be possible to evolve large-scale systems of deep learning networks for rational planning and decision making.
The “theory of everything” approach in science, or perhaps its analog in AI, that of having a single algorithm/architecture/framework for all tasks (Domingos, 2015), is undoubtedly appealing. Some theoretical physicists have hopes pinned on string theory, for example, to come up with a single framework that unifies all observational data, across large and minuscule physical bodies (Dienes, 1997). Likewise, the appeal of a purely neural model is attractive. However, there is a lot to debate here.
Firstly, deep learning models are loosely inspired by the brain but are not (yet) fully accurate representations of it (Mitra, 2014; Swanson, 2012).
Secondly, there is the notion of innateness (Tooby et al., 2005), and how much evolution might help the brain in understanding and processing the world in a structured manner. And thirdly, we must bear in mind that we still lack a complete understanding of how the neurons of a bird (let alone a human) are wired, and how that influences cognitive capabilities. Merely knowing that neural weights enable birds to solve puzzles and recognize faces does not necessarily imply that our implementation of their neurons should resemble or possess similar properties. These concerns had also been debated in the literature in the 1980s (Reeke & Edelman, 1988; Smolensky, 1987).
Lastly, Sejnowski (2020) also offers the social element of learning:
Although the focus today on deep learning was inspired by the cerebral cortex, a much wider range of architectures is needed to control movements and vital functions. Subcortical parts of mammalian brains essential for survival can be found in all vertebrates, including the basal ganglia that are responsible for reinforcement learning and the cerebellum, which provides the brain with forward models of motor commands. Humans are hypersocial, with extensive cortical and subcortical neural circuits to support complex social interactions.
Putting such issues aside, it is also worth noting that proponents of the symbolic approach to AI never explicitly claimed the existence of symbolic representations within our minds (Brachman & Levesque, 2004; Levesque & Lakemeyer, 2001). In essence, the symbolic approach offers a coherent strategy for: (a) executing symbolic expressions, which capture the knowledge of the system about the world and (b) comprehending the (idealized) implications of one’s knowledge, as specified by inference rules in logic.
As argued by Levesque (2012), this is not a novel concept—Leibniz articulated centuries ago that certain types of thinking adhere to symbolic processing. Hence, why not employ an algebraic treatment for cognition? As scientists, we may debate whether it is more useful to have an exact model of computation that approximates the reasoning in the brain (Jaynes, 1988; Smolensky, 1987) or whether we should forego these models altogether and simply be satisfied with informal descriptions of reasoning (Prado et al., 2011), as might emerge from a trained model (Creswell et al., 2022).
We reiterate that the allure of a purely neural approach is understandable, given its simplicity and the sense of a “unified theory” it evokes. However, the arguments regarding the effectiveness of the training process in capturing intricate reasoning (Hoernle et al., 2022) and the potential for incorrect (Valmeekam et al., 2022) and unreliable predictions (Azamfirei et al., 2023) suggest that a purely neural approach may not be sufficiently robust to exploit and capture structure. Indeed, despite Sejnowski (2020) offering that “it should be possible to evolve large-scale systems of deep learning networks for rational planning and decision making,” he admits also that “a hybrid solution might also be possible, similar to neural Turing machines developed by DeepMind for learning how to copy, sort, and navigate.”
More generally, by taking a step back, we realize that until the past few centuries, our understanding of the brain and neurons was limited. Yet, during this time, we were able to calculate, develop number theory, construct calculators, and ultimately build computers (Turing, 1950). Imagine if we had solely dedicated ourselves to constructing elaborate brain replicas in the hopes that they could handle (say) tax calculations for us. Most importantly, we cannot test for a capability without first defining that capability, such as (say) deduction (Prado et al., 2011).
All of this underscores the significance of the symbolic approach, which offers an idealized framework for well-defined (relative to the formal language) forms of reasoning. There is a popular analogy (Brachman & Levesque, 2004) suggesting that we need not build wings and feathers to build airplanes; comprehending the principles of aerodynamics is enough. So, why shouldn’t the development of a theory of artificial cognition be just as relevant for a type of AI that is behaviorally similar to humans in some instances, without necessarily resorting to a brain-like architecture? Or perhaps a combination of the symbolic and the neural, as offered by neurosymbolic AI (Hitzler, 2022)? The mistrust may stem from the misconception that logic and probabilistic learning are fundamentally incompatible or entirely separate domains—a notion we will now challenge.
There is a Dichotomy
A common view held by many in the broader community is that there is an inherent dichotomy between symbolic logic and machine learning—the former for discrete domains, and the latter for continuous ones. The exact boundary between “discreteness” and symbolic logic might be obfuscated even in works that are strong proponents of using logic in machine learning. For example, in one of the most popular representations of probabilistic relational learning—so-called Markov logic networks (MLNs)—the following is said:
First-order logic (with assumptions above) is the special case of MLNs obtained when all weights are equal. Every probability distribution over discrete or finite-precision numeric variables can be represented as an MLN.
The assumptions these statements refer to ensure that the set of constants in the domain of discourse is finite, leading to finitely many possible interpretations, all of which are of finite size. A reader not familiar with logic might incorrectly infer that we are only able to construct discrete probability distributions using first-order logic. Likewise, even in a nuanced survey such as Cartuyvels et al. (2021) on the importance of unifying logic and learning, they write:
The term “discrete representation,” used throughout this paper, denotes a discretely valued variable that represents some concept, which can take on either a limited or a countably infinite number of distinct values. Discrete processing consists of the application of any discrete function to input data. A discrete mathematical function has a domain, and hence a range, consisting only of discrete sets of values. Examples can be found in integer arithmetic, in computer programming languages, and in first-order logic. As deep models are currently trained using gradient descent, it is relevant to note that discrete functions are not differentiable (in any subset of their domain). Modern neural networks are representative of continuous processing. First-order logic or symbolic AI models are representative of discrete processing.
Although they go on to mention fuzzy logic and nonmonotonic logic later, readers might still come away with the impression that symbolic logic is primarily suited for discrete entities. To be clear, first-order logic, interpreted classically over a finite or a countably infinite domain, does not lend itself to differentiability. We will return precisely to this point, but nonlogicians might conclude: (a) symbolic logic as used in AI is focused on discrete symbols and (b) symbolic processing in vector (real-valued) space is a separate topic of study that can be pursued independently of symbolic logic.
What we are seeing here is a narrowing of the use of “logic” simply as classical logic—say, as introduced in Enderton (1972)—defined over Boolean truth values. Moreover, the use of logic is also assumed to be limited to discrete propositional assertions, as seen in ontologies that capture relationships and hierarchies about commonsensical concepts (McCarthy, 1986), as well as in early attempts at logic programming (Kowalski & Sergot, 1986).
We will now discuss the use of non-Boolean truth values and continuous properties in logic, and how these are making an appearance in the area of neurosymbolic AI. The subtlety here is that logical objects are indeed discrete entities, or more precisely, discrete structures, and to compute entailments, we algebraically manipulate symbols. However, these structures can capture continuous properties, either by allowing nonbinary truth values, by using function symbols over the real space, or by placing distributions on the models of formulas. This leads to various paradigms of relational learning and neurosymbolic AI, many of which are differentiable.
Real-Valued Truth Values
To a large extent, it is true that the area of knowledge representation in AI focuses on discrete symbols and a Boolean interpretation (Brachman & Levesque, 2004). But, on the other hand, it has been close to 60 years since fuzzy logic was introduced (Zadeh, 1965), among other languages for nonbinary truth values (Katz, 1981). These allow us to assign a truth value between 0 and 1 to propositions, with the understanding that these values indicate the degree to which the proposition may be true. Fuzzy logic can also be utilized to represent ambiguous concepts, such as stating that a person is tall, without specifying tall as a categorical property.
The use of such values in propositions means that the interpretation of the Boolean connectives also changes. For example, the formula $p \land q$ can be assigned the truth value $\min(v(p), v(q))$, where $v(p)$ and $v(q)$ are the degrees to which $p$ and $q$ hold, and $\neg p$ the value $1 - v(p)$.
By construction, the outputs of neural networks can be mapped to real numbers between 0 and 1. Owing to the nature of truth values in such logics, these outputs can be directly modeled as atoms in logical formulas. This led to an early wave of neurosymbolic AI formalisms (Garcez et al., 2002) and the development of a field that integrates neural outputs in a logical language (Hitzler, 2022). Perhaps the most representative examples in this space are logic tensor networks (Badreddine et al., 2022) and other approaches based on fuzzy logic (van Krieken et al., 2022). The motivation for many of these languages is to logically capture concepts that have been learned from neural networks, in order to reason about these concepts as part of a commonsensical knowledge base. Thus, the agent would be reasoning about hierarchies and relationships, but many of the relations in this knowledge are learned directly using neural networks, presumably from observational data.
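To make this concrete, consider a minimal sketch (our own illustration, not the API of logic tensor networks or any specific system) in which the sigmoid outputs of two concept networks are read as fuzzy truth values and combined with Gödel-style connectives:

```python
import torch

# Sketch: two "concept" networks whose sigmoid outputs are read as degrees
# of truth in [0, 1]; connectives follow the Godel t-norm/t-conorm.
def fuzzy_and(a, b):
    return torch.minimum(a, b)      # truth of "a AND b"

def fuzzy_or(a, b):
    return torch.maximum(a, b)      # truth of "a OR b"

def fuzzy_not(a):
    return 1.0 - a                  # truth of "NOT a"

net_p = torch.nn.Linear(8, 1)       # hypothetical, untrained concept detectors
net_q = torch.nn.Linear(8, 1)
x = torch.randn(4, 8)               # a batch of inputs
p, q = torch.sigmoid(net_p(x)), torch.sigmoid(net_q(x))

# The fuzzy truth of the formula p AND (NOT q), per input in the batch.
print(fuzzy_and(p, fuzzy_not(q)))
```

Because every operation here is differentiable (almost everywhere), such formulas can participate directly in gradient-based training.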
It is worth noting that reasoning about concepts and relations is an ongoing problem with neural networks—see efforts such as capsule networks (Sabour et al., 2017) and module networks (Andreas et al., 2016)—and there are very few general solutions. Neurosymbolic AI is stepping in here; in particular, a general framework for injecting knowledge expressed in a fragment of first-order logic would be very welcome.
From Discrete to Continuous
Capturing the output of neural networks as truth values in a logical formula is one approach to reasoning about vector spaces. However, we can also use logic to reason about continuous properties as formulas.
Although it is common to discuss discrete properties in logical AI, logical formulas need not be restricted to them. Logical formulas are indeed discrete structures, but they can also express properties about countably infinite or even uncountably many objects (Belle, 2020; Belle & Levesque, 2013; Herrmann & Thielscher, 1996; Raman et al., 2013).
Reasoning about real numbers has long been an area of interest in mathematical logic (Jovanović & De Moura, 2013), going back to Tarski, and is a major concern in satisfiability modulo theories (SMTs; Barrett et al., 2009). SMT can be seen as a generalization of satisfiability solving (SAT) for propositional logic and is being used for the verification of timed and hybrid systems that involve both discrete and continuous properties. For example, a formula such as $\forall x\,(0 \le x \land x \le 10 \rightarrow f(x) \ge 0)$ expresses that a function symbol $f$ with one real-valued argument is nonnegative on the interval $[0, 10]$.
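For illustration, constraints of this shape can be handed to an off-the-shelf SMT solver. The following sketch uses Z3’s Python API on a hypothetical constraint of our own choosing:

```python
from z3 import Function, Real, RealSort, ForAll, Implies, And, Solver, unsat

# f is an uninterpreted function over the reals, constrained to be
# nonnegative on the interval [0, 10].
f = Function('f', RealSort(), RealSort())
x = Real('x')
s = Solver()
s.add(ForAll([x], Implies(And(x >= 0, x <= 10), f(x) >= 0)))

# A concrete hypothesis about f that contradicts the constraint:
s.add(f(3) == -1)
print(s.check() == unsat)  # True: no model satisfies both assertions
```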
Therefore, we can use these formulas to represent constraints on geometric spaces. A recent body of work has examined the idea of regularizing neural networks by adding logical constraints to the loss functions. The idea is to train the network such that the loss is calculated against this logical constraint, which is backpropagated. The goal then is to train the network in such a way that predictions always satisfy these logical constraints. There is existing work on propositional constraints (Gajowniczek et al., 2020), real-valued constraints (Hoernle et al., 2022), as well as temporal formulas (Innes & Ramamoorthy, 2020), the latter of which trains the network to dynamically navigate an environment in only the valid geometric space.
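As a toy rendering of this idea (a sketch in the spirit of such constraint-based losses, not a faithful reimplementation of any cited system), consider penalizing the probability mass a network assigns to worlds that violate the constraint $p \rightarrow q$:

```python
import torch

def implication_loss(p, q, eps=1e-9):
    # Under an independence assumption, the probability that "p implies q"
    # is violated is Pr(p and not q) = p * (1 - q); the loss is the negative
    # log-probability that the constraint is satisfied.
    return -torch.log(1.0 - p * (1.0 - q) + eps)

logits_p = torch.randn(4, 1, requires_grad=True)   # stand-ins for network outputs
logits_q = torch.randn(4, 1, requires_grad=True)
p, q = torch.sigmoid(logits_p), torch.sigmoid(logits_q)

loss = implication_loss(p, q).mean()
loss.backward()   # gradients nudge predictions toward constraint-satisfying worlds
print(loss.item(), logits_p.grad is not None)
```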
An interesting observation in almost all of these papers on loss functions is that training the network with such loss functions is much more effective than assuming the constraints are implicitly represented in the data; in particular, it is much more sample efficient (Icarte et al., 2022). Moreover, some of these architectures also allow for the complete satisfaction of the constraints (Hoernle et al., 2022), which is necessary in safety-critical and high-stakes applications.
Logic is Not Good for Probabilistic Uncertainty
Classical quantifiers in logic, as well as the connectives, allow for disjunctive uncertainty, the existence of individuals, and properties applicable to all individuals in the domain. Because the data we collect are often noisy, or we sometimes have to approximate and average over populations, the use of probability theory is essential (Pearl, 1988). Since classical logic traditionally did not represent probabilistic assertions, much of the work on learning and uncertainty in the AI community moved away from logic. We will argue here that the connection between logic and probability is deep, and that there is a vibrant community focused precisely on this agenda (Raedt et al., 2016).
Probabilistic Logical Models
Since the work of Nilsson (1986), the use of logic to capture nontrivial probabilistic spaces and reason logically about events in those spaces has been a major concern in uncertainty quantification in AI (Russell, 2015) and statistical relational learning (Raedt et al., 2016). The key idea here is that it should be possible to assign probabilities to atoms, which would then provide a way to extend these probabilities to complex formulas. That is, if an atom $p$ is accorded a probability, the probability of a complex formula such as $p \lor q$ is obtained by summing the probabilities of the possible worlds that satisfy it.
In recent years, there has been steady progress in designing languages that can not only capture Bayesian networks and factor graphs (Kschischang et al., 2001), but also extend them with a relational and logical syntax. Popular languages for pragmatic specifications of logic and probability include MLNs (Richardson & Domingos, 2006), ProbLog (Raedt et al., 2007), and BLOG (Milch et al., 2005). Many of these not only investigate the representational restrictions that enable the capture of distributions succinctly, but also explore how to reason with the resulting distribution, and in some cases, learn the distributions or representations themselves. (They have to restrict the expressiveness of the language in order to ensure that their representations capture a single distribution; so the above formula may be difficult to express here too.) Consider the following kind of program in ProbLog (Raedt et al., 2007), sketched here in an assumed encoding:
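```prolog
% A sketch of the intended mixture model (an assumed encoding).
0.5::biased.               % with probability 0.5, we toss the biased coin
0.6::heads :- biased.      % the biased coin lands heads with probability 0.6
0.5::heads :- \+biased.    % otherwise a fair coin: heads with probability 0.5
query(heads).              % Pr(heads) = 0.5*0.6 + 0.5*0.5 = 0.55
```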
This allows us to capture a mixture distribution composed of a biased coin toss and an unbiased coin toss, with the biased coin having a 0.6 probability of landing heads.
Interestingly, Bayesian networks can also be modeled as ProbLog programs (Raedt et al., 2016). And what is more interesting is that probabilistic inference in Bayesian networks (Chavira, 2008), ProbLog programs (Fierens et al., 2011), MLNs (Richardson & Domingos, 2006), and factor graphs (Kschischang et al., 2001) can all be shown to be reducible to the same computational task known as weighted model counting (Bacchus et al., 2009). Weighted model counting is an extension of SAT in the sense that each satisfying assignment is assigned a weight. By computing the sum of the weights of all satisfying assignments, we can relate that sum to conditional probabilities and marginals in a Bayesian network. That is, for a propositional language with a weight function $w$ mapping literals to numbers, the weighted model count of a formula $\Delta$ is $\mathrm{WMC}(\Delta, w) = \sum_{M \models \Delta} \prod_{l \in M} w(l)$, where $M$ ranges over the models of $\Delta$ and $l$ over the literals true in $M$; conditional probabilities are then ratios of such counts.
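Operationally, the definition can be realized naively by enumeration. The following sketch (a toy illustration of our own, with made-up weights) computes a conditional probability as a ratio of two weighted model counts:

```python
from itertools import product

def wmc(formula, variables, weight):
    # Sum, over all models of the formula, the product of literal weights.
    total = 0.0
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if formula(model):
            w = 1.0
            for var, val in model.items():
                w *= weight[(var, val)]
            total += w
    return total

# A two-variable theory "rain -> wet", with weights encoding Pr(rain) = 0.2.
variables = ['rain', 'wet']
weight = {('rain', True): 0.2, ('rain', False): 0.8,
          ('wet', True): 1.0, ('wet', False): 1.0}
theory = lambda m: (not m['rain']) or m['wet']
wet_and_theory = lambda m: theory(m) and m['wet']

# Pr(wet | theory) as a ratio of weighted model counts.
print(wmc(wet_and_theory, variables, weight) / wmc(theory, variables, weight))
```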
As argued in Van den Broeck (2013) and Belle (2017), it is not only the case that logical languages allow us to reason about probability distributions over combinatorial spaces, but it is also the case that the syntax of logic can help capture complex relationships that are difficult to model using standard probabilistic languages (Getoor & Taskar, 2007). Moreover, by way of weighted model counting, there is a single generic approach for probabilistic reasoning over discrete, combinatorial spaces that are competitive (Chavira, 2008). It is also amenable to both exact as well as approximate inference schemes (Chakraborty et al., 2014).
Recently, there have also been extensions from discrete combinatorial spaces to continuous ones (Belle et al., 2015; Chistikov et al., 2015), referred to as weighted model integration. Here, the formula may mention real-valued variables in addition to Boolean ones, and the weighted sum over models is generalized to the integration of densities over the regions of the real space that satisfy the formula.
Generalizing the Specification of a Distribution
Going back to the history of the use of logic in AI (Morgenstern & McIlraith, 2011), there has been considerable interest in unifying logic and uncertainty. Note that, through the use of quantifiers, it is possible to express uncertainty that may not always align with a single distribution. For instance, McCarthy and Hayes (1969) were concerned about probabilities in the early years of using first-order logic for knowledge representation. However, they make a very salient point that we need to think carefully about how numbers and first-order sentences fit together. For example, they argue (McCarthy & Hayes, 1969):
It is not clear how to attach probabilities to statements containing quantifiers in a way that corresponds to the amount of conviction people have. The information necessary to assign numerical probabilities is not ordinarily available. Therefore, a formalism that required numerical probabilities would be epistemologically inadequate.
Their point, simply, is that we should not be expected to put probabilities on every formula; sometimes it suffices to say that a formula such as $\exists x\, P(x)$ holds, without attaching a numerical degree of belief to it or to any of its instances.
More generally, probability measures (Gaifman, 1964) on first-order structures and other proposals on logic and uncertainty (Belle & Lakemeyer, 2017; Milch et al., 2005; Raedt et al., 2007; Richardson & Domingos, 2006) allow us to append probabilities and weights in a logical language in different ways, yielding formal frameworks that go beyond and generalize the standard definition for a probability space. There are also approaches (Dubois & Prade, 1988) that are based on possibility theory, which permits a different model for uncertainty that can be powerful when experts disagree or are uncertain about probabilistic assertions.
In the machine-learning literature, it is not uncommon to find syntactical objects, especially well-defined symbolic expressions, such as programs, that are learned without an explicit definition of the semantics (Lake et al., 2015). In such cases, one would need to define only the interpreter and the compiler (Ellis et al., 2022), with an implicit notion that the atomic objects refer to concrete objects in the real world, as obtained by the process of symbol grounding (Tellex et al., 2011).
However, with programs in the program induction literature (Gulwani, 2010), there is (or rather, should be) an implicit logical syntax and semantics that defines: (a) what sort of expressions can be constructed and (b) what they mean and capture. For example, sequential instructions could be understood as conjunctions, and while loops can be captured using second-order quantification (Gulwani, 2010; Levesque et al., 1997; Ternovska & Mitchell, 2009). If we further want to understand what properties are entailed by these programs, then we need to define the semantics comprehensively and analyze what follows from the logical theory corresponding to a program.
Indeed, without a clear specification of how compositions of expressions should be interpreted and evaluated, how are we to know what these programs are yielding (McCarthy, 1959)? There has been a surge of a new family of programming languages that capture intricate machine-learning models. Typically, these languages allow the use of random primitives as well as operators for conditioning and providing evidence. These are referred to as probabilistic programming—see, for example, Church (Goodman et al., 2008), ProbLog (De Raedt & Kimmig, 2015), and the generic construction in Staton et al. (2016). In some cases, they might support combinations of discrete and continuous distributions, and higher-order functions. A general approach to understanding how these programs can be constructed and what sort of distributions they model is through the use of a formal semantical setup, usually in a fragment of first- or second-order logic.
See also works such as Bartha et al. (2021) for discussions on attempting to construct the semantics for one programming language syntax from another. Such a move is especially desirable if we want to check for the internal consistency of an ad hoc programming language. For philosophical arguments on the importance of semantics, see, for example, Crane (1990).
Logic is About Categorical Propositional Assertions
As discussed above, often “logic” is synonymous with (the classical interpretation of) propositional logic.
There are many systems for writing down symbols and interpreting logical symbols and formulas built up from these symbols. Classical approaches include propositional logic (Boolean symbols such as $p$ and $q$, combined using connectives such as $\land$, $\lor$, and $\neg$) and first-order logic (which additionally admits terms, predicates, and the quantifiers $\forall$ and $\exists$).
We might also be interested in entertaining multiple possible truth assignments to model uncertainty about the environment. For example, there is modal logic (Kripke, 1959), which can capture possibilities, beliefs, and intentions (Sardina & Lespérance, 2010). A variant of modal logic with numbers on worlds can lead to probabilistic logics (Halpern, 2003), that allow us to reason about probabilities on formulas (Fagin et al., 1990) as well as beliefs about these formulas (Belle & Lakemeyer, 2017; Fagin & Halpern, 1994).
Beyond these formalisms that map atoms (and by extension, formulas) to binary truth values, there are logics that relax that assumption. Fuzzy logics map Boolean symbols to real numbers, leading to real-valued semantics for nonatomic formulas constructed using connectives. For example, if $v(p) = 0.3$ and $v(q) = 0.8$, the formula $p \land q$ can be assigned the value $\min(0.3, 0.8) = 0.3$.
These are all part and parcel of symbolic logic. The choice of the language and the choice of the semantic rules that we use over the well-defined formulas, along with computational properties such as decidability, are all aspects of a logical framework. Moreover, once a logical framework is considered, we could choose to prove logical entailments either by considering assignments to the variables and seeing if the consequent follows or by applying inference rules established in a proof theory (Halpern & Vardi, 1991). If we choose to add weights (Chavira, 2008), measures (Halpern, 2003), or belief functions (Dubois & Prade, 1988), this then leads to notions such as weighted model counting (Bacchus et al., 2009) and algebraic model counting (Kimmig et al., 2012), defined over the models of a formula (i.e., possible worlds). Ultimately, we could consider theorem proving (Halpern & Vardi, 1991), model checking (Baier & Katoen, 2008), SAT solving (Barrett et al., 2009), or model counting (Gomes et al., 2009), depending on the context and application.
Each of these dimensions is already impacting current inquiries into the properties of machine-learning models. For example, in tasks from knowledge-based completion to reasoning with ontology triples using neural techniques, there has been development on so-called neural theorem provers (Minervini et al., 2018). These are inspired by Prolog’s proof-theoretic backward chaining mechanism (De Raedt & Kimmig, 2015) and the aim of those works is to implement that scheme in an end-to-end learning paradigm. Both SAT solving (Wang et al., 2019) and model counting (Gajowniczek et al., 2020) are important ingredients in state-of-the-art approaches to regularizing neural networks using logical formulas. This is motivated by the need to ensure neural network predictions always satisfy certain domain constraints. Model-checking tools are mainstream for checking the robustness of neural networks (Gros et al., 2023). There is also some work (van Krieken et al., 2022) on studying whether using real-valued fuzzy logics to permit differentiability in neural networks is comparable to differentiability as a result of probabilistic extensions to model counting (Gajowniczek et al., 2020).
In summary, we can explore a variety of logical syntax and semantics, each of which may have interesting interactions with machine-learning properties and capabilities.
Monotonicity
Classical logic is monotonic. That is, if a formula $\alpha$ is entailed by a knowledge base $\Delta$, then $\alpha$ remains entailed by every superset $\Delta \cup \{\beta\}$: adding new information never retracts previously drawn conclusions.
John McCarthy was concerned about the problem of monotonicity and wondered how we might deal with exceptions and abnormality. The problem of monotonicity is so ubiquitous that it even comes up in the formulation of automated planning (Reiter, 2001). For example, imagine that you have an action that paints an object blue and another action that pushes the object. Let us say we paint the object and then push it. When we execute the second action, it is implicit that the color of the object does not change. So we would have to somehow codify not only what the effects of the push action are, but also what the non-effects are. And if we start writing down all the non-effects, there could be exponentially many. Moreover, there are various preconditions that must hold for us to be able to push the object. For instance, we should be strong enough to push it, we must not be holding other objects, we are presumably operating under reasonable gravity assumptions, and so on. And if we start expressing all of them, it again looks like a hopeless task. Yet under some assumptions—so-called causal completeness (Reiter, 2001)—modeling domains is feasible. These assumptions state that the conditions provided are both necessary and sufficient for describing the action. (These concerns arise in causal modeling in machine learning as well (Pearl, 2009), as we need to accurately identify all the parent variables that influence the variable of interest and describe them at the appropriate level of detail.)
If we do not make that assumption, the alternative approach would be to consider a wide range of typical cases, while also accounting for unusual and exceptional cases by incorporating the concept of abnormality. All of this requires notions of nonmonotonicity.
It might be interesting to conceptually contrast this with the machine-learning approach to dealing with anomalies and exceptions. Learning models, when trained on existing data, can identify typical patterns and detect abnormalities within that data (Kocijan et al., 2022; Marcus, 2017). An outlier is viewed as a data point with atypical features and an unusual label. For instance, while most men in their 40s might be categorized as low- or middle-income earners, a data point representing a 40-year-old male banker would likely be classified as a high-income earner. Conversely, a large proportion of high-income earners might be male bankers in their 40s; relative to the general population, this group deviates from the norm, and its members would be treated as outliers.
Be that as it may, there is no universal mechanism to address default concepts in a general way with such approaches. Moreover, nonmonotonic reasoning has given us notions such as stable model semantics (Gelfond & Lifschitz, 1988), which now powers recent approaches to neurosymbolic learning (Yang et al., 2020). Interestingly, nonmonotonic semantics can also allow us to capture cycles in graphs (Denecker et al., 2001), which ordinarily requires recursion using second-order logic (Enderton, 1972). This may be an important aspect as we utilize neural networks for reasoning about large graphs and the web more generally (Niu et al., 2012). Thus, attempting to disregard this area of research seems premature.
Differentiability
Recent approaches to machine learning can be summarized by emphasizing the importance of differentiability as a key concept. However, it is widely held that logic cannot play a role in this. For example, Turing Award winner Yann LeCun quips (LeCun, 2022):
How can machines reason and plan in ways that are compatible with gradient-based learning? Our best approaches to learning rely on estimating and using the gradient of a loss, which can only be performed with differentiable architectures and is difficult to reconcile with logic-based symbolic reasoning.
But as the sections above indicate, this view is mistaken. Probabilities as well as real arithmetic can be mapped onto logical expressions, and this means that both routes—a probabilistic one (Gajowniczek et al., 2020) and a real-valued semantics one (van Krieken et al., 2022)—naturally lead to differentiability. Let us elaborate further below.
There has been a historical understanding that logic and probability are compatible with each other (Belle, 2020; Raedt et al., 2016; Russell, 2015). Relevant topics include 0–1 laws for studying the probability of satisfaction of first-order structures (Fagin, 1976), the use of probability to compare the fit of logical hypotheses against observations (Carnap, 1951), and perhaps most recently, the use of logic-based solvers by means of (weighted) model counting to compute conditional probabilities for Bayesian networks (Chavira, 2008).
At this point, there are plenty of approaches that explicitly use logic for the training of neural networks, especially in the context of regularization and differentiability. This started with the work of UCLA’s Semantic Loss (Gajowniczek et al., 2020) and KU Leuven’s DeepProbLog (Manhaeve et al., 2018), both of which adjust the loss function of the deep learning model based on a logical encoding of the constraints and program, respectively. This is an end-to-end approach in the sense that the predictions of the neural network are corrected using the logical solver and backpropagated to the network so that the trained network predicts outputs that are compatible with the constraints. There are also recent approaches that are based on real-valued variables, such as in Hoernle et al. (2022) and van Krieken et al. (2022). Providing arithmetic constraints to the training of deep learning networks and ensuring consistency with the provided domain knowledge is an important problem for areas such as physics (Stewart & Ermon, 2017) and robotics (Innes & Ramamoorthy, 2020).
However, it would be remiss not to point out that just because differentiability seems to be an important ingredient in the training of machine-learning models, it does not mean that we should expect every scientist in the area of logic to play that game. There is still profound and rigorous work to be done on the integration of logical querying (e.g., the computational effort needed to evaluate queries on large knowledge bases; Liu & Levesque, 2005) and probability (Beame et al., 2015), for example. On the representation side, there are important issues to grapple with, such as languages to reason about logic and probability that permit the domain of quantification to be countably infinite (e.g., natural numbers) or uncountable (e.g., reals) sets (Liu et al., 2023). Moreover, modal logics such as temporal logics and dynamic logics become useful for deep learning-based endeavors as we move toward more open-ended problems in dynamic domains (Levesque et al., 1997). For example, in Icarte et al. (2022), temporal logic formulas are used to train deep reinforcement learning agents. In Sileo and Lernould (2023) and Tang and Belle (2024), large language models (LLMs) are used to reason about dynamic epistemic properties (Belle et al., 2022), including the modeling of theory of mind (Fagin et al., 1995). And in Innes and Ramamoorthy (2020), a temporally extended semantic loss function is considered.
An orthogonal direction of work that has recently been considered is the capturing of neural architectures, such as graph neural networks, using fragments of first-order logic (Barceló et al., 2020). For the purposes of our discussion, it suffices to say that simply focusing on differentiability or differentiable logic does not quite capture the range of questions that one can investigate in the AI landscape. Issues such as expressiveness, computational properties, and the development of hybrid architectures that combine the advantages of logical and uncertain reasoning continue to be valuable areas of research.
It is worth noting that the meta-linguistic applications of logic can be both “external” and “internal.” In this subsection, we largely discussed the external view that the machine-learning system as a whole needs to be understood as a logical theory. This could involve providing a logical semantics with probabilistic programming or providing a logical language for multiple autonomous learning entities, even logically formalizing machine-learning properties such as fairness (Belle, 2023b). However, it is also possible to use logic as a mathematical function inside a machine-learning system—that is, applied internally—which is discussed in a few subsequent subsections. In these cases, for example, a logical formula may act as a constraint that could be incorporated into the loss function of learning paradigms or may serve as an oracle to reason correctly over machine-learning predictions. Thus, logic could be used as a mathematical language to understand the system as a whole or as a mathematical function inside a machine-learning system.
What About “Human-Like” Semantic Definitions?
The most well-studied semantics, or perhaps more accurately, the most widely used semantics in computer science, remains classical (Bradley & Manna, 2007). That is, atoms are accorded values of either 0 or 1, and so formulas become Boolean functions. If modalities are introduced, such as time and actions (Fagin et al., 1995), then we look at sequences of models: either a linear sequence or a tree-like sequence (Reiter, 2001), for example.
But as mentioned above, there are also approaches where a degree of truth is accorded to formulas, either by allowing the atoms themselves to have nonbinary values (Zadeh, 1965) or by according probabilities or other kinds of measures for complex formulas (Dubois & Prade, 1988).
All of these notions are explored by establishing some kind of well-definedness, and logicians explore the implications of those conditions. For example, intuitionistic logic looks to weaken material implication (Dummett, 1975). Nonclassical belief logics control the proof-depth of logical reasoners (Liu et al., 2004). Fuzzy logic (Zadeh, 1965) was initially introduced with the idea that a truth definition needs to be provided for vague notions (Fine, 1997), such as being tall or making water warm.
Be that as it may, there is an informal argument often made that a mathematically rigorous definition of truth is too precise. Perhaps by training neural networks with real-world observations, they might exhibit more human-like reasoning capabilities that eschew a well-defined notion altogether. The evidence for this has not yet been established. Moreover, is such a feature desirable? Let us, for the moment, consider correct reasoning and understand what can be said about deep learning models implicitly inferring logical steps.
Correct Reasoning
There are a number of recent papers looking at the reasoning abilities of LLMs, which are so-called transformer architectures trained on large troves of textual data (Birhane et al., 2023). Despite allowing for a number of different ways to backtrack and infer the correct premise for a query (e.g., so-called “chain-of-thought”), as shown in a number of papers, they seem to reason incorrectly in a number of different ways (Carlini et al., 2021; Creswell et al., 2022; Mirzadeh et al., 2024; Valmeekam et al., 2022; Zhang et al., 2022). For example, they sometimes struggle with symmetry (Pei et al., 2023; Yamamoto et al., 2024). Although newer models are able to recognize an increasing set of patterns and might get logical relationships and connectives right, there is little evidence that they are consistently correct—as Kautz (2024) puts it: “So close, and yet so far!”
Thus, impressive as they are, these models are not reliable (Jang & Lukasiewicz, 2023). There is also a growing body of recent work on the limitations of formal reasoning with LLMs. For example, Tang and Belle (2024) consider how well LLMs perform with theory of mind reasoning, as seen in card games and gossip protocols (Fagin et al., 1995). In Valmeekam et al. (2024), the performance of OpenAI’s latest model for reasoning (so-called “o1”) on automated planning is considered, and generally poor performance is reported. A study from a team at Apple (Mirzadeh et al., 2024) reports that minor variations to reasoning questions can lead to dramatic changes in performance, which is problematic. In Zhang et al. (2022), it is suggested that LLMs learn the statistical properties of logical tests, rather than emulate the correct reasoning function.
In light of these limitations, there is a compelling argument for a neurosymbolic approach. For instance, implementing a logical error checker as a post hoc mechanism could effectively verify the results, predictions, and completions generated by LLMs. For example, a systematic integration of ChatGPT and Wolfram Alpha was recently attempted. More generally, recent approaches seek to incorporate logical solvers as oracles (Persia & Ozaki, 2022) that can validate or disprove the predictions of neural architectures, including LLMs (Miceli-Barone et al., 2023; Pan et al., 2023; Panas et al., 2024; Zhang et al., 2023).
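As a minimal sketch of such an oracle (our own toy example, not the architecture of any cited system), a solver can check whether an LLM-proposed conclusion actually follows from formalized premises, by testing whether the premises together with the negated conclusion are unsatisfiable:

```python
from z3 import Bools, Implies, And, Not, Solver, unsat

p, q, r = Bools('p q r')
premises = And(Implies(p, q), Implies(q, r), p)   # formalized premises
conclusion = r                                    # the LLM's claimed conclusion

s = Solver()
s.add(premises, Not(conclusion))
# Entailment holds iff "premises AND NOT(conclusion)" has no model.
print("verified" if s.check() == unsat else "rejected")
```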
Putting this together, the “native” reasoning capabilities of purely neural models seem clearly limited. It is, of course, plausible that a novel training architecture or new types of datasets might provide the right sort of environment for neural models to perform correct reasoning. But for the moment, validating results and/or improving the training of neural architectures using logical solvers—that is, a neurosymbolic learning pipeline—seems to be the most promising avenue. Kautz makes a stronger claim (Kautz, 2024):
The observation that tools greatly enhance the power of LLMs is not original. Indeed, commercial LLMs already make heavy use of tools—in particular, tools for internet search for the retrieval augmented generation (RAG) paradigm. Kambhampati et al. (2024) recently showed that an LLM can convert planning and verification problems presented in natural language into formal STRIPS notation and solve them using an external planning system. I go farther than most researchers pursuing the tool approach in that I mean the title of this paper, “Tools Are All You Need,” quite literally: a language model augmented with reasoning tools is sufficient to create true artificial intelligence.
He uses “tools” to mean SAT solvers, or other such logical oracles, very much in line with the thrust of this article, and goes on to argue how LLMs “are the only kind of machine learning system that, like humans, can reliably generalize from a single example,” and how that coupled with logical tools may support general-purpose AI.
The Intentional Stance
It is worth noting that, strictly speaking, we do not require that the semantics be given by humans, or that they be hand-written. Symbols can be obtained from low-level data (via symbol grounding), or from closely related languages (Bartha et al., 2021), or from abstract descriptions (De Raedt, 1997) of concepts (Lake et al., 2015). The use of symbols in AI also does not mean that symbolic logic experts assume humans manipulate symbols in their heads. See Levesque and Lakemeyer (2001), for philosophical discussions on this point, which can ultimately be tied to the “intentional stance” (Dennett, 1989). The intuition here is that any capability we attribute to an (artificial or human) agent could be understood in terms of intentions, beliefs, and other mental attitudes, which allow us to characterize what the agent is trying to do. It is a pragmatic perspective rather than a literal representation of the agent’s behavior model.
There is extensive work on trying to characterize natural language utterances (Moot & Retoré, 2019), including connectives (Heinamaki, 1974) and their formal counterparts (van Benthem, 1989). This also involves the use of terms and formulas whose meaning may be built up from context and social environment (van Wijk, 2006). While the search for a logic that accurately characterizes these kinds of observations with humans is still ongoing, it is worth noting that we do not need a logical knowledge base to be consistent either. For example, there is work on paraconsistent logics (Blair & Subrahmanian, 1989).
Ultimately, we have a range of language choices to work with. We may disagree on the semantics, but having a few different systems that can be mathematically studied seems like a good start.
A follow-up question might be to the tune of: does it still make sense to bother with classical semantics? Just as it makes sense to study logic outside the context of differentiability, we would argue the study of classical semantics is also worthwhile in the AI context. Reasons include: (a) it is a well-defined mathematical model, (b) with the use of modalities and/or nonclassical semantics, we can relate different systems, (c) we do not really know which semantics best approximates human reasoning, (d) we may not want mathematical truths that play fast and loose with inevitable conclusions just because we think humans might have some cognitive biases and exhibit inconsistent reasoning, and (e) the science of robust AI is still evolving.
Logic and Learning can be Complementary
As already hinted above, symbolic logic can play an important role in training deep learning models but also in integrating reasoning as a post hoc process or as a metalinguistic paradigm. That is, we can ensure that the distribution of the trained network respects domain constraints (Hoernle et al., 2022). We can extract rules from trained models and reason about them outside the framework of the network (Persia & Ozaki, 2022). Or we can use the outputs of the network as inputs to a computational paradigm such as probabilistic programming (Manhaeve et al., 2018). There is very interesting work on the semantics of programs that inherently support some notion of differentiation (Abadi & Plotkin, 2019). This is an object of intense theoretical study that can have consequences on the types of distributions that are expressible in programming languages (Staton et al., 2016). So, this theory has far-reaching effects on what types of probabilistic models can be modeled effectively.
In the second half of the article, we make the following point: symbols and deep learning need not compete with each other, and can be complementary. Perhaps the most representative example of this is the burgeoning field of neurosymbolic AI (Garcez et al., 2002), which has come to encompass things such as neural program induction (Lake et al., 2015), neural theorem provers, and differentiable logics (Zhang et al., 2023). We consider some other categories below, as usual, with overlap.
Symbolic Logic as Meta-Theory
An argument made previously (Belle, 2021) is that symbolic logic can be used to formalize notions currently out of the purview of standard machine learning. These include things such as the semantics of involved probabilistic programming languages (Staton et al., 2016) and understanding the limits of differentiable logics (van Krieken et al., 2022), but it can also pertain to a range of more exotic topics.
For example, it is very common in AI applications these days to require frameworks for multiagent reasoning (Albrecht & Stone, 2018). In explainable AI (Gunning, 2016), in particular, we might require that the robot holds beliefs about the human agent (Kambhampati, 2020). Modal logics study such phenomena. Thus, there has been a significant amount of recent work on incorporating agent modeling into learning frameworks, with multiagent reinforcement learning being a prominent example (Albrecht & Stone, 2018). Furthermore, incorporating agent modeling for explainable planning (Albrecht et al., 2021) and utilizing user-provided constraints as reward functions in reinforcement learning (Icarte et al., 2022) are topics of study.
Moreover, complex AI systems are not going to be purely based on providing predictions. They will involve search, constraint reasoning, and planning (Russell & Norvig, 2003). This has necessitated new approaches for compositionality (Staton et al., 2016) and modularity (Ternovska & Mitchell, 2009). On a related note, weighted model counting (Gomes et al., 2009), which provides the foundation for mapping Bayesian inference to SAT solvers, can be upgraded to also reason about the maximization and minimization of properties (Kimmig et al., 2012), leading to languages where a number of different AI subareas, such as search and optimization, can be unified (Belle & De Raedt, 2020).
An orthogonal but very interesting line of research in recent years looks at the expressiveness of mainstream neural architectures using logical languages. Primarily, they look at fragments of first-order logic to capture (a simplified version of) neural architectures such as transformers (Vaswani et al., 2017) and graph neural networks (Xu et al., 2018). These investigations have identified that graph neural networks capture fairly limited fragments of first-order logic (Barceló et al., 2020), while attention mechanisms have been shown to be Turing-complete (Pérez et al., 2021). In the case of graph neural networks, the community is still exploring the implications of these results but it is believed that these architectures may fail in tasks involving queries that require more expressiveness than the fragment they correspond to. So, in this sense, using logical tools to understand neural architectures can have serious implications in terms of how these architectures are being used and in which circumstances they could be considered reliable.
High-Level Knowledge
The interplay between reasoning and learning is often compared to Kahneman’s (2011) famous distinction of system 1 versus system 2 type cognition in humans (Rossi, 2024). This is owing to the fact that AI scientists, for a very long time, have been deliberating on the appropriate way to abstract away low-level perception data with high-level concept knowledge, perhaps going back to Shakey (Kuipers et al., 2017). Many “hybrid” formalisms for reasoning with perceptual data attempt to address the interplay between concepts and observations in a systematic way, for example, Kaelbling and Lozano-Pérez (2013) and Nitti et al. (2017).
Providing mechanisms as well as formal semantics for abstraction remains a topic of theoretical interest even today (Beckers & Halpern, 2019; Hofmann & Belle, 2023). Roughly, the idea is that given a representation at one (say, low) level of detail, we seek a mapping to a representation at a higher level, such that reasoning performed at the abstract level remains sound with respect to the original.
In the specific case of deep learning systems, a key agenda point is how to define abstract concepts, whether extracted directly from data or defined externally, in order to coordinate and interoperate with these systems (Belle & Bueff, 2023; Bueff & Belle, 2024; Lake et al., 2015).
More generally, it is widely acknowledged that concepts such as time, abstraction, and causality will play a key role in designing a general-purpose AI (Marcus & Davis, 2019). We would expect such an AI to be capable of reasoning with a rich world model, one that can be interpreted by humans (Brachman & Levesque, 2022). Roughly, the idea is that given some system description, we should be able to query not only what currently holds, but also how the system evolves over time and what would hold under hypothetical interventions.
Although there is some work on providing a causal semantics to deep learning systems (Luo et al., 2020), it is still in its early years and has been studied only in a limited way. In contrast, we have very well-studied models of time (Prior, 1967) and causality with symbolic calculi (Halpern, 2016; Hitchcock, 2001; Reiter, 2001). It seems like a wasted opportunity not to utilize these frameworks simply because they are purely symbolic, and hence deemed “old-fashioned.”
As has been the case for many years now, symbols can be used as abstract identifiers for human-in-the-loop systems (Kambhampati, 2020) and/or interactive machine learning, especially when nonexpert stakeholders engage with predictors trained on high-dimensional data. In particular, there are very concrete examples from the neurosymbolic landscape that highlight the benefits of using symbols. For example, the work on reward machines (Icarte et al., 2022) looks to train deep learning-based reinforcement learning agents by means of high-level, temporally extended specifications, such as formulas expressed in linear temporal logic (Chatterjee et al., 2015). The propositions of the language are abstract descriptions of properties that can be understood by humans. There is also work on reasoning about neural concepts in a logical language. Although there have been prior works on hybrid formalisms that allow for machine-learning constructs to be used in logic (Kaelbling & Lozano-Pérez, 2013), recent neurosymbolic approaches such as DeepProbLog (Manhaeve et al., 2018) allow us not only to include neural concepts as objects in the logical program, but also to reason about this program, with the results serving as signals that can be fed back into the neural network training. This leads to a trained model that provides predictions and learns distributions consistent with the logical specification (Hoernle et al., 2022).
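To give a flavor of the idea, here is a toy simplification of a reward machine (our own sketch, not the implementation of Icarte et al. (2022)): a small automaton over human-readable propositions that issues a reward only when “coffee” is observed before “office”:

```python
# States: u0 (start), u1 (coffee obtained), u2 (delivered; terminal).
# Transitions map (state, observed proposition) to (next state, reward).
TRANSITIONS = {
    ('u0', 'coffee'): ('u1', 0.0),
    ('u1', 'office'): ('u2', 1.0),
}

def step(state, event):
    # Unlisted events leave the state unchanged and give no reward.
    return TRANSITIONS.get((state, event), (state, 0.0))

state, total_reward = 'u0', 0.0
for event in ['mail', 'coffee', 'office']:   # a hypothetical trace of observations
    state, reward = step(state, event)
    total_reward += reward
print(state, total_reward)   # u2 1.0: the temporal specification was met
```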
Symbolic Logic Can Instantiate New Methods of Inference
One observation we emphasized earlier is that, precisely because of the close relationship between logic and probability (Belle, 2017; Carnap, 1951; van Benthem, 2017), it is possible to use logic-based solvers for probabilistic reasoning. This, in turn, means that logic-based solvers can be used in learning modules in probabilistic machine learning (Van den Broeck, 2013), or to reason about the output distributions of neural networks (Gajowniczek et al., 2020).
This is primarily instantiated via weighted model counting (Gomes et al., 2009), which, as discussed above, extends SAT solving by summing the weights of all satisfying assignments (Bacchus et al., 2009). And as mentioned, there is also an extension of this strategy to deal with continuous properties via so-called weighted model integration (Belle et al., 2015). One broader observation here is that because weighted model counting is defined in terms of weights on the possible models of a logical formula, it is possible to use different types of weights. This means a whole range of computational tasks defined over the models of a logical formula can be approached using the same abstract specification. This leads to the notion of algebraic model counting (Kimmig et al., 2012), where, instead of sums over models and products over literal weights, we can consider other pairs of operations, such as maximum and minimum (Bacchus et al., 2009).
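To make the abstraction concrete, here is a naive enumeration-based sketch in Python (illustrative only; the formula and weights are made up, and real solvers avoid explicit enumeration). Swapping the sum for a max turns the weighted model count into the weight of the single best model, which is the essence of the algebraic generalization.

    from itertools import product
    import math

    # Toy CNF: (a or b) and (not a or c); literals are +v / -v over vars 1..3.
    clauses = [{+1, +2}, {-1, +3}]
    weights = {+1: 0.3, -1: 0.7, +2: 0.6, -2: 0.4, +3: 0.5, -3: 0.5}

    def amc(clauses, weights, n, plus=sum, times=math.prod):
        """Algebraic model counting by naive enumeration of assignments."""
        vals = []
        for bits in product([False, True], repeat=n):
            assignment = {v: bits[v - 1] for v in range(1, n + 1)}
            if all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses):
                vals.append(times(weights[v if assignment[v] else -v]
                                  for v in assignment))
        return plus(vals)

    print(amc(clauses, weights, 3))            # weighted model count: 0.57
    print(amc(clauses, weights, 3, plus=max))  # best single model: 0.21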
A notable development in this space is knowledge compilation (Darwiche et al., 2018). This stems from the observation that, given a probabilistic model, we may have to compute conditional queries repeatedly. Therefore, there have been efforts to compile a logical formula into a data structure on which model counting, including for distinct conditional queries, can be computed efficiently (Darwiche et al., 2018). This development can be coupled with the notion of algebraic model counting (Kimmig et al., 2012), but it has also served as a computational backbone for many emerging representations that unify logic and probability, such as relational Bayesian and Markov networks (Van den Broeck et al., 2013), in addition to classical Bayesian networks (Chavira, 2008), and probabilistic logic programming languages such as ProbLog (Fierens et al., 2011).
Circuits provide a new way of doing inference with probabilistic models: one pays a one-time cost for compiling the representation, such as a Bayesian network, into a circuit, after which every query can be answered in time polynomial in the size of the circuit. There is also a broader program of learning such circuits directly (Liang et al., 2017). The goal is to find an alternative to classical machine-learning models with attractive computational properties for inference (Vergari et al., 2021). This is a new and exciting way of doing probabilistic reasoning and has even led to new approaches to inference in probabilistic programming (Holtzen, Van den Broeck, & Millstein, 2020).
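The compile-once, query-often pattern can be sketched as follows, reusing the same toy formula and weights as above (the circuit here is hand-built and already smoothed; in practice a knowledge compiler produces such structures automatically):

    import math

    # Leaves are (variable, polarity) indicators; internal nodes are sums ("+")
    # or products ("*"). This circuit encodes (a or b) and (not a or c).
    L = lambda var, pos: ("leaf", var, pos)
    circuit = ("+", [
        ("*", [L("a", True), L("c", True), ("+", [L("b", True), L("b", False)])]),
        ("*", [L("a", False), L("b", True), ("+", [L("c", True), L("c", False)])]),
    ])

    def query(node, weights, evidence):
        """One bottom-up pass: time linear in the size of the circuit."""
        if node[0] == "leaf":
            _, var, pos = node
            if var in evidence and evidence[var] != pos:
                return 0.0  # indicator switched off by the evidence
            return weights[(var, pos)]
        op, children = node
        vals = [query(c, weights, evidence) for c in children]
        return sum(vals) if op == "+" else math.prod(vals)

    weights = {("a", True): 0.3, ("a", False): 0.7, ("b", True): 0.6,
               ("b", False): 0.4, ("c", True): 0.5, ("c", False): 0.5}
    print(query(circuit, weights, {}))           # weighted model count: 0.57
    print(query(circuit, weights, {"c": True}))  # same circuit, new query: 0.36

The compilation itself can be expensive, but once the circuit exists, each new piece of evidence only requires another linear-time pass.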
Logical Oracles
There is considerable work on verifying neural networks (Shih et al., 2019) for safety properties (Casadio et al., 2022) as well as robustness (Gehr et al., 2018), where we want to ensure that the prediction of a neural network does not change arbitrarily under small perturbations to the input. Along these lines, there is a new direction of work where logical reasoners serve as oracles for machine-learning predictions to ensure that those predictions are consistent.
A representative example here is contrasting the reasoning capabilities of large-scale learned models, such as LLMs, with those of a symbolic oracle. Recent work on Wolfram Alpha (Wolfram, 2023) looks to integrate an arithmetic solver with the output of ChatGPT so that reasoning outputs are consistent and coherent with mathematical principles. Similarly, although there is some work on how chain-of-thought prompting can lead to better reasoning outputs from LLMs, the use of a logical oracle leads to provably correct outputs. The capabilities of ChatGPT, for example, have been directly studied in Frieder et al. (2023) and Jang and Lukasiewicz (2023), and the use of a logical oracle to provide an externally sourced solution to reasoning problems with LLMs is considered in Pan et al. (2023). In Sileo and Lernould (2023) and Tang and Belle (2024), such an approach has been shown to be applicable to complex problems involving the mental states of multiple agents, commonly referred to as the theory of mind (Fagin et al., 1995; Shvo et al., 2020).
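The oracle pattern itself is simple, as the following sketch using the Z3 SMT solver shows (our own illustration; the constraints and candidate answers are made up): the learned model proposes an answer, and the symbolic oracle accepts it only if it is consistent with the formalized problem.

    from z3 import Int, Solver, sat

    x, y = Int("x"), Int("y")
    problem = [x + y == 10, x - y == 4]  # the formalized reasoning problem

    def oracle_accepts(cand_x, cand_y):
        """Accept a (possibly LLM-generated) answer only if it is consistent."""
        s = Solver()
        s.add(problem + [x == cand_x, y == cand_y])
        return s.check() == sat

    print(oracle_accepts(7, 3))  # True: the proposal checks out
    print(oracle_accepts(6, 4))  # False: plausible-sounding but wrong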
Intuitively, the idea here is related in spirit to work on logic-based loss functions (Gajowniczek et al., 2020), because there, too, predictions are expected to conform to logical constraints (Hoernle et al., 2022).
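For small constraints, such a penalty can even be computed exactly. Below is an illustrative sketch (our own, not a specific published implementation) for the constraint "exactly one of k propositions holds":

    import torch

    def exactly_one_loss(p, eps=1e-12):
        # p: tensor of shape (batch, k) with predicted probabilities.
        # Probability that the constraint holds under independent predictions:
        #   sum_i p_i * prod_{j != i} (1 - p_j)
        k = p.shape[-1]
        sat = torch.zeros(p.shape[:-1])
        for i in range(k):
            others = torch.cat([p[..., :i], p[..., i + 1:]], dim=-1)
            sat = sat + p[..., i] * torch.prod(1 - others, dim=-1)
        return -torch.log(sat + eps).mean()  # low when predictions obey the rule

    p = torch.tensor([[0.9, 0.05, 0.05], [0.4, 0.4, 0.2]])
    print(exactly_one_loss(p))  # the second row violates the constraint more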
Logic Benefits From Learning
So far, we have made the case for machine learning benefiting from logical tools and languages. In the other direction, looking back to the early days of logical thought, Aristotle argued for the importance of the process of induction (Belle, 2021). We need mechanisms to learn the general from the particular, generalizing from specific instances to a generic statement that applies to all instances: that is, a quantified formula that entails all the observed atoms. In modern AI, this process is a key source of logical knowledge obtained from data (De Raedt et al., 2015; Sap et al., 2020), in addition to the information provided by experts (Davis, 2014).
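For example (a textbook-style illustration): from the ground atoms $\mathit{human}(\mathit{socrates})$, $\mathit{mortal}(\mathit{socrates})$, $\mathit{human}(\mathit{plato})$, and $\mathit{mortal}(\mathit{plato})$, we may induce

$\forall x\, (\mathit{human}(x) \rightarrow \mathit{mortal}(x))$

which, together with the $\mathit{human}$ facts, entails every observed $\mathit{mortal}$ atom and predicts the mortality of humans not yet observed.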
However, if our logical knowledge is to consist of a combination of expert-provided knowledge and knowledge drawn from examples, there are a number of concerns we need to address. For example, how can we ensure that a hypothesis that is consistent with the background knowledge is extracted from the observations (Muggleton et al., 2012)? What kind of properties should the resulting knowledge base have (Michael, 2007)? How do we deal with observations that might be incorrect or noisy (Bacchus et al., 1999)? How do we ensure that the formula we generalize from the observations captures not only the observations made so far but also the observations we have not yet seen and might encounter in the future (Juba, 2013; Valiant, 1999)?
In recent years, a variety of approaches, ranging from statistical relational learning (Raedt et al., 2016) to probably approximately correct (PAC) semantics (Rader et al., 2021) to neural program induction (Lake et al., 2015) and neural rule induction (Evans & Grefenstette, 2018), have been explored. These approaches utilize state-of-the-art machine-learning tools and theory to learn logical expressions. In some cases, noise in the observations is treated by assuming that the observations are drawn from an unknown distribution. In other cases, the generalization capabilities of neural networks are exploited to learn representations that are empirically robust to this noise.
It is now believed that machine learning will likely impact almost all of computer science because it provides a mechanism to construct models from data (Shapiro et al., 2018). This means that we will continue considering combinations of model-based and data-driven domain knowledge in the future. All of this is even more reason to not entertain notions of a dichotomy between logic and learning.
Concluding Thoughts
In this article, we looked at a few of the misunderstandings that arise when considering the relevance and use of symbolic AI in modern AI systems. We hope the reader is convinced that not only do reasoning and learning have significant overlap (with ideas such as model counting appearing in and linking multiple concerns), but also that recent advances are exploiting state-of-the-art learning for reasoning, and vice versa, and in the process improving on the state of the art.
Whether there might be a future architecture that is very close in spirit to current neural models and makes logical tools redundant remains to be seen. However, as we have argued, it is hard to imagine, from a theoretical standpoint, that logical analysis itself will become redundant, because many of the desired properties being sought are logical in nature. Despite reported advances in the reasoning capabilities of LLMs, currently seen as the culmination of large-scale deep learning models, they still struggle with consistency and correctness in both logical and arithmetic problems.
Other Dimensions
We have not discussed a few key issues that are emerging in the AI landscape. With the growing use of AI systems in financial and industrial applications, issues of trustworthiness and responsibility keep coming up (Marcus & Davis, 2019).
For example, one area where symbolic logic is widely used, including for stochastic systems (Chen et al., 2013), is the verification of safety properties (Shih et al., 2019) and testing for robustness (Casadio et al., 2022). The idea with safety properties is to ensure that certain regions in a geometric space are avoided because they might represent dangerous operational areas. In the case of robustness, we want to ensure that small perturbations to the input do not dramatically change the prediction of the neural network. It should not come as a surprise that ideas from logic-based computer science, including temporal logic (Chatterjee et al., 2015) as well as satisfiability modulo theories (SMT) solvers (Barrett et al., 2009), are the main tools used to formalize and investigate these types of properties.
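For a flavor of the robustness question, here is a deliberately tiny sketch using the Z3 SMT solver (our own illustration; the one-dimensional "classifier" and the numbers are made up): we ask the solver whether any input in a small ball around a nominal input flips the prediction.

    from z3 import Real, Solver, sat

    # Toy classifier f(v) = sign(2v - 1); check robustness around x0.
    x = Real("x")
    x0, eps = 0.8, 0.1
    label0 = 1 if 2 * x0 - 1 >= 0 else -1   # prediction at the nominal input

    s = Solver()
    s.add(x >= x0 - eps, x <= x0 + eps)     # the perturbation ball
    if label0 == 1:
        s.add(2 * x - 1 < 0)                # can the prediction flip to -1?
    else:
        s.add(2 * x - 1 >= 0)               # ...or flip to +1?
    print("counterexample" if s.check() == sat else "robust in the eps-ball")

An unsatisfiable result constitutes a proof that no perturbation within the ball changes the prediction, which is precisely the style of guarantee verification work aims for.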
Another interesting avenue for examining trustworthy and responsible AI is understanding the ethical principles and norms under which AI systems should operate (Dignum, 2019). In this subarea, although mainstream models of concepts such as fairness do not necessarily use logic (Verma & Rubin, 2018), further analysis of how systems could conform to ethical principles is often pursued through symbolic logic (Dennis et al., 2016). For example, notions such as act-deontology (Krarup et al., 2022) or consequentialism can be formalized as properties that the system's execution should obey (Pagnucco et al., 2021; Winfield et al., 2014). There has also been work on using symbolic causal models to understand notions of blameworthiness and degrees of responsibility (Chockler & Halpern, 2004). Finally, there is considerable recent work on explainable planning (Kambhampati, 2020), where a formal model is used to capture the user's intent and contrast it with the system's understanding of the world in which it operates (Shvo et al., 2020). For an overview of how knowledge representation can provide much-needed frameworks for ethical and trustworthy AI, see Belle (2023a).
Neurosymbolic AI
As we discussed, one area where concerns about the use of logic seem to disappear is neurosymbolic AI. Neurosymbolic AI holds a lot of promise because it offers interesting ways to combine symbolic logic and deep learning and build on the successes of both. And, in the spirit of the maxim that "the whole is greater than the sum of its parts," such an integration need not be a mere exchange of outputs between divorced components, but could involve a deeper type of synthesis (Hitzler, 2022). Some approaches have dealt with loss functions, while others have focused on post-hoc logical reasoning or on extracting rules from networks. All of these approaches are interesting in their own right.
There is also a tradeoff, at least as per our current understanding, between the complexity and level of detail of the logical knowledge and how effectively it can integrate with a learning system. For example, papers focusing on loss functions typically deal with small formulas and constraints (Hoernle et al., 2022), while works exploring the integration of learning with knowledge graphs often consider ontologies with more than a hundred or even a thousand nodes (Niu et al., 2012). Some may question at this point whether these examples clearly demonstrate neurosymbolic paradigms exceeding the capabilities of state-of-the-art machine learning. However, this is a somewhat nebulous measure, because state-of-the-art machine learning already encompasses various neurosymbolic notions, even when this is not explicitly acknowledged. Examples range from concept learning (Lake et al., 2015) to Wolfram Alpha-type integrations with LLMs (Wolfram, 2023).
Of course, with such a diversity of solutions, it may be challenging to determine the correct approach. Perhaps there is no one-size-fits-all solution, and the combination of logic and deep learning can vary depending on the application. Regardless of the specific approach, it is clear that we need to understand the principles of logical languages and semantics to ensure that resulting mathematical objects are well-defined with desired properties. This appreciation is essential for both theoretical exploration and practical applications.
It should be noted that there is a case to be made for expressive representations. Some might come away feeling that the best way to approach the future of neurosymbolic AI is to focus on very limited languages, but such a view may not be fruitful in the long term. It is widely understood, for example, that first-order logic is useful for generalized assertions (Levesque, 2012), and modal logics for time and multiagent beliefs (Fagin et al., 1995). In general, the language is critical for capturing the domain correctly. In a statement remarkably similar in spirit, Judea Pearl writes (Pearl & Mackenzie, 2018):
This is why you will find me emphasizing and reemphasizing notation, language, vocabulary and grammar. For example, I obsess over whether we can express a certain claim in a given language and whether one claim follows from others. My emphasis on language also comes from a deep conviction that language shapes our thoughts. You cannot answer a question that you cannot ask, and cannot ask a question that you have no words for.
Much to Learn
To sum up, there is a lot to be gained by relating the mathematical foundations of logic and deep learning. And the benefit is not purely for the logician, but also for the deep learning researcher who wants to think more broadly than prediction with big data.
We should, of course, celebrate successes—it is neither an accident nor misplaced opportunism that logic/programming language folks are interested in learning and are eager to understand the latest and best (Gulwani, 2010). Moreover, what combination of logic and/or learning would be needed for general-purpose AI is not well understood yet. We cannot point to the exact approach or balance of innateness versus tabula rasa we need for general AI, because we simply do not know. We can only loosely articulate requirements (e.g., correct, fair, and safe by design), capabilities (e.g., ability to reason about causality, time, and space models), and corresponding desiderata.
Indeed, although AlphaGo and LLMs represent a major triumph for AI, these achievements inevitably raise questions about generality and correctness. As mentioned earlier, Kautz (2024) argues that a reasoning oracle coupled with a language model might provide the steps toward general-purpose automated intelligence. Conversely, we may want to be wary of "silver bulletism," the notion of a single solution addressing all of AI's concerns and capabilities. As Levesque (2014) puts it:
As a field, I believe that we tend to suffer from what might be called serial silver bulletism, defined as follows: the tendency to believe in a silver bullet for AI, coupled with the belief that previous beliefs about silver bullets were hopelessly naïve.
Silver bulletism also contributes to the hubris and hype of AI. In view of creating general-purpose, safe, and reliable AI, we need to look at the best of all worlds. And in that regard, the unification of logic and learning continues to bear fruit, of which neurosymbolic AI is the latest installment.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The author was supported by the Royal Society University Research Fellowship.
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
