Abstract
Predictive processing, a new research direction in contemporary cognitive science, moves beyond both traditional computational representationalism and embodied cognition and has emerged as a new paradigm for cognitive science research. Predictive processing theory holds that the brain is a hierarchical predictive model based on Bayesian inference, whose purpose is to minimize the difference between the predicted world and the actual world, that is, to minimize prediction error. Predictive processing is therefore essentially a context-dependent model representation: an adaptive representational system that achieves its cognitive goals through the minimization of prediction error.
Introduction
With advances in cognitive science, predictive processing (PP) has emerged as a novel framework for understanding human cognition. According to PP, the brain functions as a hierarchical prediction machine, with cognitive processes centered on minimizing the discrepancy between predicted and actual experiences—essentially, engaging in prediction error minimization (Clark, 2013).
PP not only emphasizes the feedback mechanisms and learning processes in neural networks, but also explains how the brain regulates perception and action by minimizing prediction error.
There are also differences between PP and the embodiment paradigm. Embodied cognition focuses on how the body and environment affect cognition, while PP explains how the interaction with the environment is optimized through internal models. PP does not rely solely on sensory input and physical movements; rather, it proposes that the brain actively regulates perception and behaviour by generating internal models and making predictions and adjustments based on external feedback. Embodied cognition emphasizes the role of the body, but provides fewer details on how cognitive processes are adjusted and optimized through prediction. PP provides a more flexible framework that better explains how the brain adapts to rapidly changing environments through prediction and correction, and integrates cognition, perception and action into a continuous dynamic adjustment system.
The theoretical roots of PP can be traced back to the notion of ‘unconscious inference’ introduced by the nineteenth-century German physicist and physiologist Helmholtz (1860/1962). In his psychophysiological studies of sensory perception, Helmholtz identified a generalization process that occurs independently of conscious thought and characterized perception as a probabilistic, knowledge-driven reasoning process. Building on Helmholtz's ideas, cognitive psychologists such as MacKay, Neisser and Gregory argued that the brain does not merely construct a current model of the world through a bottom-up accumulation of sensory inputs; rather, it predicts current sensory information using the best possible model of its potential causes in a top-down manner (Yuille and Kersten, 2006).
The PP model integrates top–down probabilistic generative models within a multilevel bi-directional framework, employing a core predictive coding strategy for efficient encoding and transmission. Initially applied to perception, this approach has since been extended to action by Friston (2010), providing a cohesive paradigm for cognitive research encompassing perception, action, learning and reasoning by the brain.
This paper aims to elucidate the PP mechanism of human cognition through the lens of adaptive representational methodology, thereby offering a deeper understanding of its intrinsic processes. We posit that both the top–down a priori prediction model and the bottom–up perceptual input model in PP serve as contextual projection mechanisms in a representational sense, with the process of minimizing prediction error resembling Bayesian inference. The kernel of Bayesian inference highlights adaptive representational features.
Adaptation of PP: Minimization of prediction error
According to PP, prediction involves minimizing the discrepancy between the predicted world and the actual world to reduce prediction error. Hohwy (2016) elucidates this process using a rainfall statistical model: the lower the prediction error, the more reliable the model becomes and the richer the information it conveys about the world. Statistical models of rainfall predict environmental states through model parameters, which are adjusted based on actual conditions. This principle applies similarly to cognitive systems.
The brain's neural construction of cognition involves two distinct types of PP: ‘top–down’ and ‘bottom–up’. Top–down predictions continuously interact with bottom–up perceptual inputs, enabling the brain to infer causes of perception and adjust actions accordingly to minimize discrepancies between predictions and actual states. Mechanistically, minimization of prediction error occurs through intra-cortical signalling and inter-cortical information transfer, with various cortical layers processing perceptual inputs across different time scales. Changes in the environment alter perceptual inputs, prompting the brain's a priori prediction model to adjust its forecasts in response.
A long-standing core issue in PP is how to explain consciousness. Since Crick and Koch (1990) proposed the neural correlates of consciousness (NCC), the search for NCC using neuroscientific methods has become the mainstream paradigm for the study of consciousness and marked the beginning of scientific research on consciousness. However, since Chalmers (1995) proposed the hard problem of consciousness, NCC research has inevitably confronted the problem of explaining consciousness—that is, why and how neurophysiological activities produce conscious experience. Regarding the question of how conscious experience is generated, PP, as a new model for understanding human cognition, offers an account of the mechanism of consciousness. Friston (2010) extends the concept of free energy to the behaviour and perception of living organisms, framing prediction error minimization within a ‘surprise-minimization’ context as a goal-driven process. Wiese and Friston (2021) further integrate the free energy theory with neural dynamics to propose a computational explanation of consciousness, seeking to identify the NCC through this framework. However, this reductionist approach, rooted in physicalism, grapples with the challenge of explaining how individual neuronal computations lead to the realization of consciousness. Computational explanations based on PP and the free energy principle do not address this gap.
We believe that, given the complexity of the mechanism of cognitive occurrence and the difficulty in explaining it, explaining PP from the perspective of adaptive representation may be a way out; that is, taking adaptive representation as the internal mechanism and explanatory framework of cognitive occurrence and studying consciousness from the perspective of the generation mechanism, evolutionary mechanism and cognitive mechanism of consciousness.
From an adaptive representation perspective, consciousness emerges from multilevel neural representations during the process of adaptive representation (prediction error minimization) within the brain's hierarchical prediction model. It is a holistic result of the interactions between all neurons, and not reducible to isolated properties. According to Wei (2019), evolutionary biology suggests that evolution implies adaptation, making nearly all human capacities—including mental representations—inherently adaptive. With mental representation comes cognitive ability, which, in turn, fosters mind and intelligence. Thus, adaptive representations embody a non-reductive, holistic view of consciousness grounded in materialist monism, indicating that prediction error minimization fundamentally constitutes an adaptive representation process.
Adaptation refers to the capacity of an organism to adjust to environmental changes, thereby achieving a harmonious interaction with its surroundings. Pezzulo et al. (2022) contend that the brain's PP mechanism is not a recent evolutionary development in humans; rather, it evolved gradually from predictive circuits, such as autonomic and motor reflexes, inherited from our ancestors. This mechanism is essential for addressing the challenges of adaptive conditioning. PP facilitates adaptive processes in the brain through both perceptual inference and active inference.
First, perceptual inference encompasses the ability to deduce sensory stimuli based on predictions formed by internal neural representations that are, in turn, shaped by experience (Aggelopoulos, 2015). Sensory memories derived from experience are stored within the same neuronal networks responsible for perceiving object features. Without this memory storage, these networks may struggle to recognize relevant object characteristics. Thus, stored memories can be likened to the generative models employed in Bayesian statistical inference. Computationally, perception can be framed as empirical Bayesian inference, where prior knowledge is established through experience, development and evolution, and subsequently utilized in the parameters of a hierarchical statistical model to interpret perceptual inputs.
Second, active inference implies that organisms must recognize a ‘minimal uncertainty’ causal model that reflects the probabilistic relationships between relevant events. To survive and reproduce in a constantly changing environment, organisms need to interpret sensory inputs and the potential causes behind them (Constant et al., 2022). This involves modelling the causes of events in the external world and executing possible actions dictated by their physiological capabilities.
In summary, both perceptual and active inference operate in tandem to enable the brain's adaptive processes. Perceptual inference transmits optimal predictions to lower-level models through methods such as gradient descent, expectation maximization or variational Bayes, enhancing the predictive model's adaptability to environmental changes. Concurrently, active inference ensures the precision of predictions by guiding organisms to engage in actions based on a causal model of event probabilities.
It is crucial to recognize the causal and functional relationship between active and perceptual inference. The efficacy of active inference in minimizing prediction errors relies on the accuracy of perceptual inference, while the precision of perceptual inference is contingent upon the generative model's ability to capture the causal-probabilistic structure of the external world with regard to likelihoods, dynamics and a priori knowledge. The more accurate the model, the more precise the hypotheses it generates. Thus, the cognitive system minimizes PP through the interplay of perceptual and active inference, with the process of minimizing prediction error fundamentally embodying the adaptive representation of an organism in its environment. For example, the PP of motor control is a typical adaptive representation process. When you reach out to pick up an object from a table, your brain predicts the position and size of the object, the position of your arm, muscle tension and other factors through mental representation. Through perceptual inputs such as visual and tactile feedback, the brain receives sensory information about the position of your hand as well as the size and shape of the object in real time. If the predicted movement trajectory deviates from the actual movement (for example, your hand touches the table next to the object instead of the object itself), the brain will perceive the error and adapt the action plan based on this feedback—such as by adjusting the angle and speed of the arm—to reduce the error and ensure that the object can be successfully grasped.
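The reaching example can be sketched as a simple error-correcting loop. This is a minimal illustration, not the authors' model: the step size, positions and convergence threshold are assumptions chosen for clarity.

```python
import numpy as np

def reach(target, start, step=0.5, tol=1e-3, max_iters=100):
    """Move a predicted hand position toward a target by repeatedly
    correcting the action plan in proportion to the observed error."""
    position = np.asarray(start, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(max_iters):
        error = target - position        # prediction error from feedback
        if np.linalg.norm(error) < tol:  # error below threshold: grasp succeeds
            break
        position += step * error         # adjust the action plan
    return position

# Reaching for an object at (0.3, 0.8) from a resting hand position
final = reach(target=[0.3, 0.8], start=[0.0, 0.0])
```

Each pass through the loop plays the role of one perception–action cycle: feedback yields an error, and the plan is corrected until the residual error falls below the precision threshold.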
However, it should be pointed out that attributing all cognitive errors to prediction error minimization is controversial. Cognitive errors may not only be the result of prediction errors, but also heuristic decision-making, cognitive biases as well as social and cultural factors. Tversky and Kahneman (1974) proposed that people do not always make decisions based on rational reasoning, but instead rely on simplified cognitive shortcuts (such as heuristics) to make quick judgments, which can lead to systematic cognitive errors. Although these errors are not based on the prediction error minimization mechanism, they are closely related to quick, intuitive decision-making. From the perspective of adaptive representation, PP provides an important perspective for explaining cognitive errors, although it is not the only explanatory framework in this context. PP must be combined with heuristic decision-making, cognitive bias, emotion, as well as social and cultural factors to enhance the adaptive representation ability of the cognitive system.
The representational nature of PP: Manifestation through structural representation and contextual projection
As previously mentioned, PP conceptualizes the external world as a probabilistic model rooted in causal mechanisms. Specifically, whether and how object A causally affects object B is treated as a probabilistic event. The causal-probabilistic laws of the external environment dictate the statistical model of the sensory input system, and the statistical data derived from sensory signals serve as the sole source of the causal-probabilistic structure of the external world. We propose a function that maps the state of the world (the external causal probabilities of sensory signals) to the state of the sensory system, which PP realizes by constructing a generative model. In essence, the human cognitive system develops an internal model of the external world based on the statistical data of sensory input.
The generative model posited by the PP hypothesis operates as a multilevel two-tiered structure. This two-tiered approach consists of hidden variables and sensory inputs, with the hidden level corresponding to the upper tier and the sensory level to the lower tier. The objective of each level is to minimize the prediction error of the level beneath it. We view this process as an adaptive representation, which necessitates an understanding of the concept of ‘representation’.
Representation, as a theoretical construct in cognitive science and the philosophy of artificial intelligence, is pivotal to addressing human cognitive challenges. Thus, it is essential to examine PP theory from a representational standpoint. Gładziejewski (2016) characterizes the PP model as a form of mental representation based on structuralism, suggesting that the model's functional role closely resembles that of a map in the real world. He metaphorically refers to the PP model as a ‘map’ and draws analogies through the four functional attributes of a map to illustrate the adaptive representation process inherent in PP.
The first of these attributes is structural representation. In scientific cognition, a model serves as an accurate representation of certain aspects of the world. According to mathematical structuralism, a scientific theory or model constitutes a minimal understanding of scientific representation, reflecting a process that transitions from description to reproduction of scientific explanation. Structure is indispensable in this process. Methodologically, various forms of structuralism converge on the idea that two objects or systems within the representation relationship possess a ‘shared structure’ (Wei, 2018). In this view, representation is a structural mapping, where the spatial relationships between components of the map correspond to those of the represented terrain. For instance, A’, B’, and C’ on the map correspond to buildings A, B, and C in the physical landscape. From this, it can be inferred that if A’ is closer to C’ than to B’, then building A is closer to building C than to building B. Consequently, the effective functioning of the PP model hinges on its similarity to the causal-probabilistic structure of the world, enabling the generative model to map the state of the world to the state of the sensory system.
The second attribute is operational guidance. Maps provide users with operational guidance to facilitate actions based on the terrain. For example, when navigating an unfamiliar city, maps direct the user on which routes to take, where and when to turn, and which paths to avoid. More critically, maps also guide cognitive behaviour, for instance, when we consult maps to form accurate judgments about relative distances between terrain features or select alternative routes to a specific destination. Similarly, PP models direct human actions, aiming to minimize prediction errors. As noted, organisms utilize two strategies for minimizing prediction errors: updating previous probabilistic mappings of the world—termed perceptual reasoning—and selectively sampling new sensory inputs through actions, known as active inference (Kersten, 2022).
The third attribute is separability. Maps offer operational guidance in an offline manner; for instance, we can study a map of Beijing while in Taiyuan, navigating cognitively without being physically present in Beijing. Likewise, the operation of generative models can be decoupled from the environment. PP theory views generative models as processes of endogenous control and internal prediction. It is thus the generative model—not the external world itself—that guides and regulates the cognitive system. Active inference is informed by internal assumptions regarding the causes of the external world, rather than through direct interaction with the environment. Representation serves as a cognitive construct even when its object is absent.
The fourth attribute involves detection and correction of representation error. Errors in map representations can obstruct our ability to reach a destination. We can assess the inaccuracy of a map by identifying such obstacles and adjust our actions accordingly. Similarly, the generative model in PP can detect representation errors. When the generative model is inaccurate, the active reasoning it informs will result in clear prediction errors; conversely, even an accurate generative model may yield prediction errors due to imprecise sensory input signals. When the error signal exceeds a certain accuracy-weighted threshold, the system prompts re-engagement in perceptual inference to more effectively minimize prediction errors. This process exemplifies representation error detection within the PP model.
We assert that—whether through the probabilistic mapping of generative models (structuralism) or the map metaphor of PP (analogism)—the fundamental process involves the subject ‘projecting A to B’. This represents a contextual projection process grounded in representation. The concept of ‘projection’ encompasses meanings such as reference, substitution, representation, symbolization, deduction, mapping and analogy. It also incorporates contextual factors embedded in the representation tool—such as structure, attributes, semantics, background and situation—onto the target being represented. This contextual projection framework can integrate various theories of representation, leading to the establishment of a unified theory known as ‘contextual projection theory’ (Wei, 2022a).
Given the above, how does PP project representations based on context? We propose that the contextual projection in PP unfolds in two stages: first, the contextual projection stage, where the model designer creates the PP model informed by their own knowledge context—referred to as first-order projection; second, the representational projection stage, where the model designer applies the PP model, with its contextual attributes, to explain the external world—termed second-order projection. Here, contextual projection acts as ‘background radiation’, serving as a semantic field.
The attributes, structures and characteristics of the PP model primarily arise from the interaction between the subject and the target object. Before designing the intermediary object, the subject must repeatedly observe, measure and envision the target object, constructing the intermediary based on the acquired data or attributes. This constitutes the contextualization and re-contextualization processes of representational projection. The top–down a priori predictive model and the bottom–up perceptual input therefore represent the subject's projection process to contextualize the target object through observation, measurement and imagination.
Adaptive representational modes of PP
Our adaptive and representational analysis of PP suggests that the cognitive system possesses the capacity to autonomously represent the target object in response to changes in the environment or context. Consequently, we conceptualize PP as an adaptive representational process.
In the cognitive prediction process, the object, as a natural phenomenon, serves as a first-order presentation, while the subject's description of it using a representational tool, such as language, constitutes a second-order expression. Representation, in this context, is viewed as a ‘reproduction’—a re-expression of the object in question. As an adaptive representation, PP embodies an indirect abstract representation, which utilizes symbols or mathematical equations, relying on imagination for symbolic representation (Wei, 2022b). Bayesian inference and predictive coding are the two primary representations within the PP model, and they work in tandem to facilitate PP in the brain.
First, Bayesian inference rests on Bayes’ rule, which is derived from two basic principles of probability theory—namely, the product rule and the summation rule (Etz and Vandekerckhove, 2018). The product rule is expressed as P(A, B) = P(B) P(A | B) = P(A) P(B | A): the probability that both event A and event B are true equals the product of the probability of event B and the conditional probability of event A given B, which, by symmetry, also equals the product of the probability of event A and the conditional probability of event B given A. If events A and B are mutually independent, the product rule simplifies to P(A, B) = P(A) P(B | A) = P(A) P(B). The summation rule is expressed as P(B) = P(A1, B) + P(A2, B) + … + P(Ak, B), where {A1, A2, …, Ak} is a disjoint and exhaustive set of events: the probability of event B occurring equals the sum of the joint probabilities of B with each event in the set.
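The two rules can be checked numerically. In this small sketch the joint distribution is an arbitrary illustrative assumption, not data from the text:

```python
# Joint probabilities P(Ai, B) over a disjoint, exhaustive set {A1, A2, A3}
joint = {("A1", "B"): 0.10, ("A2", "B"): 0.25, ("A3", "B"): 0.15}

# Summation rule: P(B) equals the sum of the joint probabilities
p_b = sum(joint.values())

# Product rule: P(A2, B) = P(A2) P(B | A2), with an assumed marginal P(A2)
p_a2 = 0.40
p_b_given_a2 = joint[("A2", "B")] / p_a2
assert abs(p_a2 * p_b_given_a2 - joint[("A2", "B")]) < 1e-12

print(round(p_b, 10))
```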
Bayesian inference is scientific reasoning carried out through the application of these two rules (i.e., Bayesian hypothesis testing). Furthermore, the process of hypothesis testing is the process of characterizing the adaptation goal, which can be expressed as follows. Let M denote the hypothesis held by the researcher and ¬M the competing hypothesis; together the two events form a disjoint set {M, ¬M}. If P(M) and P(¬M) denote the prior probabilities of hypothesis M and competing hypothesis ¬M, respectively, and P(X | M) and P(X | ¬M) denote the probability that each hypothesis assigns to a particular experimental outcome X, then Bayes’ rule for hypothesis M can be expressed as P(M | X) = P(M, X) / P(X). According to the product rule, this formula can, in turn, be expressed as P(M | X) = P(M) P(X | M) / P(X), and Bayes’ rule for hypothesis ¬M can be expressed as P(¬M | X) = P(¬M) P(X | ¬M) / P(X), where P(M | X) and P(¬M | X) are referred to as posterior probabilities.
Uncertainty is a commonly encountered challenge in neural processing within the brain. Often, the brain cannot accurately determine the potential cause of incoming sensory stimuli. For instance, when hearing the rustling of leaves at night, it becomes essential to infer whether the cause is a dangerous predator or merely the wind (Tang and Xu, 2022). PP suggests that, to address sensory ambiguity, the brain's perceptual system employs Bayesian inference rules to deduce the source of sensory stimulation (Aitchison and Lengyel, 2017). Thus, Bayesian inference exemplifies a typical adaptive representation.
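The rustling-leaves example can be worked through with Bayes’ rule. The numerical values below are illustrative assumptions: M is the hypothesis ‘predator’, ¬M is ‘wind’, and X is the rustling sound.

```python
# Priors (assumed): predators are rare at night, wind is common
p_predator = 0.01                # P(M)
p_wind = 0.99                    # P(¬M)

# Likelihoods (assumed): how strongly each cause predicts rustling
p_rustle_given_predator = 0.8    # P(X | M)
p_rustle_given_wind = 0.3        # P(X | ¬M)

# Summation rule over the disjoint set {M, ¬M}: P(X) = P(M, X) + P(¬M, X)
p_rustle = (p_predator * p_rustle_given_predator
            + p_wind * p_rustle_given_wind)

# Bayes' rule: posterior = prior x likelihood / evidence
posterior_predator = p_predator * p_rustle_given_predator / p_rustle
posterior_wind = p_wind * p_rustle_given_wind / p_rustle

print(round(posterior_predator, 3))  # → 0.026
```

The sound raises the posterior probability of a predator above its prior (from 0.01 to roughly 0.026) without making it the favoured hypothesis, illustrating how ambiguous input shifts, rather than settles, the inference.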
Second, Brown et al. (2011) integrated the PP generative model with hierarchical predictive coding, depicting top–down multilevel connectivity as a means to predict and fully interpret driving sensory signals, with only prediction errors propagating information within the system. Hohwy (2020) details predictive coding using equations of the following simplified form:

prediction error = sensory input − prediction (Equation 1)

updated prediction = prior prediction + learning rate × prediction error (Equation 2)

learning rate = sensory precision / (prior precision + sensory precision) (Equation 3)

Here, precision (the accuracy, or reliability, of a signal) is itself inferred, so the model's changing estimates of precision enter the denominator of Equation 3 and thereby create a variable learning rate. The internal model is hierarchical, and the weighting of prediction errors at any given level can be modulated by learned regularities. By iterating perceptual inference across levels, the internal model comes to capture the causal structure and dynamic processes of the environment. In hierarchical reasoning, each level sends predictions to the level below and passes its residual prediction errors to the level above, so that errors are progressively explained away across the hierarchy.
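A precision-weighted update of this kind can be sketched numerically. This is a standard textbook formulation rather than Hohwy's exact equations, and the particular precisions and inputs are assumptions:

```python
def update_prediction(mu, x, pi_prior, pi_sensory):
    """One step of precision-weighted prediction error minimization:
    noisier input (lower sensory precision) yields a smaller update."""
    error = x - mu                                        # prediction error
    learning_rate = pi_sensory / (pi_prior + pi_sensory)  # precision weighting
    return mu + learning_rate * error                     # updated prediction

# Repeated sensory input of 1.0 pulls the prediction toward 1.0.
# (For simplicity the precisions are held fixed; in full Bayesian updating
# the prior precision would itself grow with each observation.)
mu = 0.0
for x in [1.0, 1.0, 1.0]:
    mu = update_prediction(mu, x, pi_prior=1.0, pi_sensory=1.0)

print(round(mu, 3))  # → 0.875
```

With equal precisions the learning rate is 0.5, so each step halves the remaining error; lowering the sensory precision would slow this convergence, which is exactly the variable learning rate described above.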
A typical example is motion perception in visual perception, where the brain minimizes prediction error by predicting the position and trajectory of an object. The first step is the initial prediction: suppose you are observing a moving object (for example, a car driving along the road); your brain makes a preliminary prediction based on previous experience and the object's current state of motion (its speed and direction), forecasting the position of the object at the next moment. The second step is perceptual input: your eyes capture the current position of the object and its image on the retina, which undergoes certain changes, such as shifts in the object's position and shape. The third step is calculating the error: your brain compares the predicted position of the object with the actually perceived position; if the two differ (that is, there is a prediction error), your brain registers the error. The fourth step is error-minimizing adjustment: once the prediction error is registered, your brain adjusts the next prediction, using current visual information combined with past experience (such as the object's law of motion) to correct the prediction model and reduce the error of the next prediction; for example, if the deviation is too large, the predicted speed or direction of the object may be adjusted. These steps form a dynamic process that is continuously updated and adjusted: in motion tracking, the brain constantly updates its estimate of the object's state of motion, thereby reducing prediction errors and keeping the perceived position of the object as close to its actual position as possible.
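The four steps above can be sketched as a simple tracking loop. The gains and observations here are illustrative assumptions, not a model of actual neural dynamics:

```python
def track(observations, gain_pos=0.5, gain_vel=0.3):
    """Predict the next position from an estimated velocity, compare it
    with the observed position, and correct both the position estimate
    and the motion model from the prediction error."""
    position, velocity = observations[0], 0.0
    estimates = []
    for observed in observations[1:]:
        predicted = position + velocity           # step 1: initial prediction
        error = observed - predicted              # steps 2-3: input and error
        position = predicted + gain_pos * error   # step 4: correct position...
        velocity = velocity + gain_vel * error    # ...and the motion model
        estimates.append(position)
    return estimates

# A car moving one unit per time step; the estimates converge on its path.
path = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print([round(p, 2) for p in track(path)])
```

Early estimates lag behind the car, but as the velocity estimate is corrected the prediction errors shrink, mirroring the continuous updating described above.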
However, while both predictive coding and Bayesian inference emphasize the integration of bottom–up sensory inputs with top–down a priori predictive modelling, they differ in focus and the types of data they naturally handle. Predictive coding treats prediction error as the sole information propagated within the system, but does not clarify how computational prediction is achieved or how prediction errors are utilized. Conversely, Bayesian inference offers an optimal algorithm for computational prediction, but does not specify the underlying neural representations (Aitchison and Lengyel, 2017). Predictive coding elucidates these neural representations, whereas Bayesian inference outlines the outcomes of the predictive computations. Together, these two approaches converge to form Bayesian predictive coding, a representation of PP that fundamentally embodies an adaptive representation process.
Conclusion
In summary, the PP model represents a cognitive system grounded in Bayesian inference, offering a novel explanation of human cognitive mechanisms through two explanatory levels: top–down generative prediction and bottom–up perceptual input. This model marks a significant breakthrough in cognitive neuroscience (He, 2022). To date, the theory of PP has been primarily explored through two research paths: the connectionist account of predictive processing, which aims to develop a finite sampling prediction model informed by Bayesian principles and active inference through multilevel prediction pathways; and the embodied predictive processing perspective, which emphasizes the substantial impact of bodily actions and proprioception on intracranial Bayesian prediction, aligning with a prediction model that couples the body and brain based on action-driven initiatives (Zhu and Liu, 2022).
We believe that both the connectionist account of PP and embodied PP fundamentally represent an adaptive form of representation, differing only in their modes of expression. Regardless of the form, all representations exhibit adaptive characteristics. The processes of prediction error minimization in PP and the Bayesian predictive coding mechanism demonstrate that the human cognitive system functions as an adaptive representation system that emerges from adaptations to external environments: the mental world originates from, but is not reducible to, the natural world, and interacts with it to generate a realm of knowledge (representation). Thus, the process of knowledge creation is essentially the process of adaptive representation.
Given its considerable explanatory power, PP is not only applicable to basic research in cognitive science, but also has broad prospects for practical application. For example, PP can help design more intelligent autonomous systems, such as self-driving cars and robots, which can actively generate predictions about future environments based on the PP mechanism and adjust their behaviour in real time according to sensory input. PP can also help in understanding and treating mental illnesses, such as autism spectrum disorder, schizophrenia and depression, which are often accompanied by cognitive biases or prediction errors. For example, individuals with autism may exhibit perceptual oversensitivity or delayed responses, suggesting that their predictive models fail to adjust and update effectively. Intervention methods based on PP may help improve these patients' perception and behaviour by training and adjusting their predictive models. In short, with the development of technology, applications based on PP show strong potential in fields ranging from education, sports and medicine to artificial intelligence. Furthermore, the mechanism of PP can help improve existing technologies, optimize human behaviour and yield smarter systems. In the future, as the understanding of PP continues to deepen, its application scenarios are likely to become broader and deeper.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This study was supported by the National Social Science Fund of China's project ‘Philosophical Research on the Challenge of Artificial Cognition to Natural Cognition’ (grant number 21&ZD061).
Author biographies
Zhichao Gong is a PhD candidate at the School of Philosophy, Shanxi University. He is also a visiting student at The Center for Philosophy of Natural and Social Science at the London School of Economics and Political Science. His current research directions are predictive processing and philosophical research on mental issues in the field of artificial intelligence.
Yidong Wei is a second-level professor and doctoral supervisor at the School of Philosophy, Shanxi University. From 1999 to 2002, he studied for a doctorate in philosophy of science and technology at Shanxi University. From March to October 2003, he was a visiting scholar at the Department of History and Philosophy of Science and the Needham Institute of Cambridge University in the United Kingdom. From 2021 to present, he has served as the chief expert of the National Social Science Fund of China's project ‘Philosophical Research on the Challenge of Artificial Cognition to Natural Cognition’ (grant number 21&ZD061).
