Abstract
Embodied cognition—the idea that mental states and processes should be understood in relation to one’s bodily constitution and interactions with the world—remains a controversial topic within cognitive science. Recently, however, increasing interest in predictive processing theories among proponents and critics of embodiment alike has raised hopes of a reconciliation. This article sets out to appraise the unificatory potential of predictive processing, focusing in particular on embodied formulations of active inference. Our analysis suggests that most active-inference accounts invoke weak, potentially trivial conceptions of embodiment; those making stronger claims do so independently of the theoretical commitments of the active-inference framework. We argue that a more compelling version of embodied active inference can be motivated by adopting a diachronic perspective on the way rhythmic physiological activity shapes neural development in utero. According to this visceral afferent training hypothesis, early-emerging physiological processes are essential not only for supporting the biophysical development of neural structures but also for configuring the cognitive architecture those structures entail. Focusing in particular on the cardiovascular system, we propose three candidate mechanisms through which visceral afferent training might operate: (a) activity-dependent neuronal development, (b) periodic signal modeling, and (c) oscillatory network coordination.
Paradigm shifts occur with remarkable frequency in the behavioral and cognitive sciences. In the early 20th century, the nascent field of experimental psychology was revolutionized by a behaviorist program that confined itself to the investigation of observable activity (Lashley, 1923; Watson, 1919). Within a generation, behaviorism was itself the target of a counter-revolution—one that sought to rehabilitate the scientific propriety of mentalistic explanations through formal notions of computation and information processing (Aspray, 1985; Figdor, 2019). The early successes of this “information-processing paradigm” presaged the widespread institutionalization of cognitive science in the decades that followed (Bechtel et al., 1998; Miller, 2003). However, as the century drew toward its close, yet another revolutionary movement was in the offing. This radical new enterprise, which sought to ground cognition in the agent’s physical constitution and bodily activity, set the agenda for an embodied cognitive science (A. Clark, 1999).
Unlike that of its predecessors, the success of this “embodied revolution” is difficult to gauge. Although the concept of embodiment has influenced a wide variety of research programs (e.g., Adolph & Hoch, 2019; Casasanto, 2011; Engel et al., 2013; Lara et al., 2018; Longo et al., 2008; Pulvermüller, 2013; L. Smith & Gasser, 2005; Vallet et al., 2016), many cognitive scientists continue to conceive of cognition in terms of brain-bound representations and computational processes. However, optimism that the transformative potential of the embodied view may yet be realized has recently been reinvigorated by the latest revolutionary movement to sweep through the cognitive sciences. This new paradigm, which has been heralded for its capacity to resolve seemingly intractable differences between embodied and more orthodox species of cognitive science (e.g., Allen & Friston, 2018; A. Clark, 2016), is known most generally as predictive processing.
This article sets out to critically evaluate the prospects of an embodied cognitive science under the predictive processing paradigm. We begin by surveying the variety of ways in which the concept of embodiment has been interpreted in the cognitive sciences (Getting a Grip on Embodied Cognition) before examining how these various conceptions align with influential treatments of embodiment under predictive processing (Predictive Processing and [Embodied] Active Inference). In particular, we focus our discussion on embodied formulations of active inference, one of the most ambitious and formally developed frameworks within the predictive processing literature. Our analysis suggests there is little evidence such accounts motivate anything more than a weak (perhaps even trivial) interpretation of embodiment. This interpretation is consistent with deflationary arguments that conceive of embodied theories of cognition as being easily assimilated within more orthodox accounts of neural representation and information processing.
We then go on to consider how recent active-inference models of interoceptive processing and autonomic regulation may point toward a novel, substantive understanding of embodied cognition. To this end, we introduce the visceral afferent training hypothesis, which posits that the rhythmic physiological dynamics generated by visceral organs, such as the heart, play an instrumental role in early cognitive development (A Diachronic Perspective on Embodiment). More specifically, we propose that rhythmic visceral activity produces temporally structured patterns of stimulation that sculpt the structural and functional organization of emergent brain networks. We evaluate the strength of this hypothesis in light of influential arguments that seek to deflate or deny the role of embodiment in cognition (Causation Versus Constitution Redux). Finally, we offer some reflections on the broader unificatory potential of embodied active inference (Prospects for a Unified Philosophy and Science of Embodiment) before closing with some brief comments on the scope and future development of our hypothesis (Concluding Remarks).
Getting a Grip on Embodied Cognition
The notion that cognition is “embodied” has gained widespread currency among cognitive scientists; however, the precise meaning of this claim remains surprisingly elusive (Aizawa, 2015). In the philosophy of mind, “embodied cognition” has come to stand for a diverse assortment of views, some of which appear to be mutually exclusive. This section provides a brief orientation to this variegated theoretical landscape.
To a first approximation, advocates of embodied cognition are committed to some version of the embodiment thesis. Roughly (and minimally) expressed, this is the idea that the cognitive agent’s bodily constitution “intrinsically constrains, regulates, and shapes the nature of [its] mental activity” (Foglia & Wilson, 2013, p. 319). 1 A corollary of this thesis is that mental phenomena cannot be properly understood without taking account of the mind’s physical realization in a particular kind of body and the way that body interacts with its local environment. Proponents of embodied cognition often position themselves in opposition to the information-processing paradigm at the heart of orthodox cognitive scientific inquiry—a paradigm that has traditionally been applied to study cognitive systems in abstraction from their materiality and situatedness.
Four doctrines of cognitivism
In order to appreciate the motivation behind certain influential versions of the embodiment thesis, it will be useful to characterize the core philosophical commitments typically associated with the information-processing paradigm. These commitments are succinctly captured by the view that “cognition is computation over meaningful representations in the brain”—a conception that remains prevalent among cognitive scientists today (see, e.g., Barack & Krakauer, 2021). This proposition can be decomposed into three basic claims: that (a) cognition involves content-bearing states that refer to (carry information about) some distal state of affairs (representationalism), (b) cognition involves rule-governed operations that manipulate or transform cognitive states (computationalism), and (c) cognition is realized within the confines of the brain (internalism).
Traditionally, cognitive scientists have supplemented this package of claims with an additional commitment to functionalism, which conceives of cognitive states as determined by the causal or functional roles they play in the cognitive system at large. Under functionalism, what individuates state X as a particular mental state (e.g., being in pain) is the fact that state X (a) is reliably occasioned by an appropriate chain of events (e.g., stubbing one’s toe, nociceptive input to brain stem nuclei, etc.) and (b) reliably elicits an appropriate set of mental and/or behavioral outputs (e.g., toe rubbing, profanity).
Functionalism implies that cognitive phenomena may be instantiated in any suitably configured architecture, irrespective of the material composition of that architecture. This entails that identical cognitive states and processes could be implemented in radically different substrates (multiple realizability; Putnam, 1967). Conversely, identical physical substrates could be (re)organized such that they instantiate radically different mental states (Fodor, 1975). The main point here is that functionalism essentially sunders cognitive phenomena from the physical systems that realize them, thereby providing an impetus for studying cognitive processes in abstraction from their implementational details—the details of their physical embodiment.
Much more could be said about each of the claims enumerated in this section (for some classic expositions and objections, see Fodor, 1975, 1981; Haugeland, 1978; Newell, 1980; Pylyshyn, 1980, 1984; Searle, 1980). For present purposes, it suffices to say that representationalism, computationalism, internalism, and functionalism—the core doctrines of cognitivism—laid the philosophical foundations on which the edifice of modern cognitive science was raised. It is some version of these doctrines on which most proponents of embodied cognition have their sights set.
Varieties of embodiment
Although a thoroughgoing examination of the various ways in which the embodiment thesis has been elaborated lies well beyond the scope of this article, it will be useful to sketch out a broad spectrum of views according to the strength of their anticognitivist convictions (for further discussion, see Alsmith & de Vignemont, 2012; Goldman & de Vignemont, 2009; Miłkowski, 2019; Shapiro, 2019a; M. Wilson, 2002).
Embodied cognition is perhaps most commonly associated with a provocative set of ideas that took root in the early 1990s. Seminal work by Varela et al. (1991) and Brooks (1991) set the agenda for “strong” (or “radical”) interpretations of the embodiment thesis, which characteristically advocate the thoroughgoing rejection of (at least some of) the philosophical commitments underpinning mainstream cognitive science. Strong forms of embodied cognition champion the radical dissolution of the boundaries that are conventionally assumed to separate brain, body, and world, ushering in a fundamental decentralization and redistribution of the processes through which consciousness and cognition are produced (Anderson et al., 2012; Chemero, 2009; Favela, 2014; Gallagher, 2017; Haugeland, 1998; Thompson & Varela, 2001).
Although strong varieties of embodied cognition do not necessarily deny the brain’s involvement in the emergence of cognitive activity, the significance of its role is dramatically reduced in comparison to standard, “neurocentric” views. Just as Brooks’s subsumption architecture engendered robots capable of autonomously navigating their environment without recourse to complex algorithms or central representations (see also Braitenberg, 2004), cognitive systems are said to be coupled with their environments in ways that obviate the need for intensive computation over internal mental states (Beer, 2003; van Gelder, 1995). Rather, adaptive behavior emerges in virtue of sensorimotor feedback loops that exploit the structured relations brought forth via agent–environment interactions (Barrett, 2011; Hutto & Myin, 2013; Thelen & Smith, 1994; A. D. Wilson & Golonka, 2013). To the extent that cognitivism lacks the conceptual resources to adequately capture such phenomena, proponents of strong embodiment consider it incapable of delivering a complete cognitive science.
Not all subscribers to the embodiment thesis are so pessimistic, however. Those who endorse a more moderate position do not see embodied cognition as being intrinsically at odds with orthodox cognitivism but consider them eminently compatible with one another. Moderates claim that embodiment makes an important (and perhaps unique) contribution to the cognitive economy at least some of the time (A. Clark, 1997, 2008b). They sympathize with the radical’s analysis of the complex interplay between the agent’s physical constitution and its external milieu but do not take this as warrant for abandoning explanations that appeal (for instance) to representationally rich internal computations. The move to a moderately embodied cognitive science is thus more of a reorientation than a revolution, one that augments the orthodox cognitivist picture without substantially undermining its core theoretical commitments (Goldman, 2012; Kiverstein, 2012).
Moderates are occasionally accused of purveying a rather “weak” or impoverished account of embodiment (see, e.g., Chemero, 2013; Di Paolo, 2018; Gallagher, 2017). Standard-bearers for embodied cognition do not generally aspire to weak versions of the embodiment thesis. In the extreme, weak embodiment amounts to a mere truism: Minds are implemented in physical systems, the precise nature of which plays some role in determining the kinds of states they can occupy. The range of color sensations a particular organism might experience depends in part on the composition and organization of photoreceptor cells in its retinae. Similarly, an individual’s competence in various cognitive tasks will typically degrade as their blood alcohol concentration increases. Adherents of classical cognitivism need not deny that bodily states and structures can constrain or impact mental states and conscious phenomenology; they may simply dismiss such observations as incidental to the core project of understanding the essential workings of the mind.
Causation versus constitution
One way of delineating weaker from more substantive notions of embodiment is to appeal to the distinction between causal and constitutive dependency relations (e.g., Aizawa, 2007; Block, 2005; Prinz, 2009; see also McDowell, 1994). To continue the previous example, neurocognitive functioning is causally dependent on blood chemistry: The composition of circulating blood must be kept within certain homeostatic bounds for normal brain activity to occur. The fact that deviations beyond such bounds lead to predictable impairments of cognitive function may be of interest in certain specialized domains; however, observations of this sort do little to advance the case for a substantive interpretation of the embodiment thesis. Blood chemistry is relevant to cognitive function only inasmuch as having a functional heart (lungs, kidneys, etc.) is relevant—it forms part of the background causal matrix that generates the necessary biophysical conditions for brain activity to occur. 2 Exogenous causal factors that support, perturb, or modulate cognitive dynamics are thus to be differentiated from the dynamics themselves—the cognitive operations or processes that are the central object of cognitive scientific inquiry.
Constitutive (rather than causal) accounts of embodiment, on the other hand, argue that certain extraneural structures are part and parcel of the representational currency and computational processes that make up the mind. A. Clark (2008a, 2008b) offers one such account, whereby he “weaponizes” the substrate neutrality of functionalism to argue that mental states and cognitive operations may supervene on physical structures that extend into the body and external world (cf. Menary, 2010a; R. A. Wilson, 1994). Likewise, Noë’s (2004) enactive account of visual perception eschews the trivial remark that what one sees depends on where one looks (i.e., a merely causal explanation), arguing instead that one’s implicit knowledge or “mastery” of such sensorimotor contingencies is constitutive of perceptual experience (cf. Gibson, 1979; Hurley, 1998; O’Regan & Noë, 2001; see also Seth, 2014). Although the details of such accounts vary considerably, the broader point here is that embodiment is being construed as playing an important and distinctive role in the way cognition is realized—a role that is not susceptible to the sort of deflationary attitude with which orthodox cognitivism dismisses weak conceptions of embodiment as accidental or trivial consequences of physical implementation.
Nevertheless, not all proponents of embodied cognition are committed to a strictly constitutive reading of the embodiment thesis. Goldman and de Vignemont (2009) endorse an explicitly causal account of embodied representation in the domain of social cognition (but see Goldman, 2016). Influential theories of grounded cognition (Barsalou, 2008) and embodied simulation (Gallese & Sinigaglia, 2011) might likewise be construed as offering causal accounts of the way bodies determine (or constrain) cognitive phenomena. 3 For present purposes, it suffices to say that embodiment may be couched as playing a nontrivial causal role within the broader cognitive economy, although views of this sort tend to be more modest and conservative (i.e., more readily assimilated into mainstream cognitive science) than most other species of embodied cognition.
This concludes our brief tour of embodied theories of cognition. Although we shall return to debates about causation and constitution in Causation Versus Constitution Redux, what has been presented thus far should be sufficient to contextualize and evaluate recent attempts to marry the embodiment thesis with predictive processing theories of cognition. The next section briefly introduces the active-inference formulation of predictive processing before going on to examine the strength of its embodied cognitive credentials.
Predictive Processing and (Embodied) Active Inference
Predictive processing theories have come to dominate the philosophical and scientific landscape over the past decade. Much like embodied cognition, “predictive processing” is something of an umbrella term capturing a range of positions that vary in their scope and ambition (for helpful overviews, see Hohwy, 2020; Piekarski, 2021; Wiese & Metzinger, 2017). We focus our attention here on active inference (Parr et al., 2022), a computational framework developed under the free-energy principle (Friston et al., 2006; Friston & Stephan, 2007). Given the abundance of contemporary literature dealing with this framework, we limit ourselves here to a minimal exposition of its key concepts.
Broadly speaking, active inference seeks to provide a formal explanation for the emergence of adaptive processes (e.g., action, perception, learning, and decision-making) in complex biological systems, such as ourselves. Under this scheme, adaptive dynamics are construed as emergent properties of self-organizing systems that conform to a variational principle of least free energy, which states that biological systems must change in ways that decrease their free energy in order to survive (Friston, 2010; Friston et al., 2006). “Free energy” is an information-theoretic quantity that places an upper bound on the negative log probability (i.e., surprise or self-information) of an observation, given a (generative) model encoding beliefs about the way observations are generated. 4 In the predictive processing literature, this quantity is often equated with prediction error (cf. predictive coding; Huang & Rao, 2011; Rao & Ballard, 1999).
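The bounding relation described above can be illustrated numerically. The following is a minimal sketch, assuming a toy discrete generative model with two hidden states and two possible observations; the probabilities and the function names (`surprise`, `free_energy`, `exact_posterior`) are hypothetical and chosen purely for illustration. It relies on the standard variational identity that free energy equals surprise plus the Kullback-Leibler divergence between an approximate belief q(s) and the exact posterior, so that free energy can never fall below surprise.

```python
import math

# Hypothetical generative model (illustrative numbers, not taken from the article).
prior = [0.7, 0.3]            # p(s): prior beliefs over two hidden states
likelihood = [[0.9, 0.1],     # p(o | s=0) for observations o=0 and o=1
              [0.2, 0.8]]     # p(o | s=1) for observations o=0 and o=1


def surprise(o):
    """Negative log model evidence, -ln p(o), for observation o."""
    evidence = sum(prior[s] * likelihood[s][o] for s in range(len(prior)))
    return -math.log(evidence)


def free_energy(q, o):
    """Variational free energy F = E_q[ln q(s) - ln p(o, s)]."""
    return sum(
        q[s] * (math.log(q[s]) - math.log(prior[s] * likelihood[s][o]))
        for s in range(len(prior))
        if q[s] > 0  # terms with q(s) = 0 contribute nothing
    )


def exact_posterior(o):
    """Exact Bayesian posterior p(s | o), used here only as a reference point."""
    joint = [prior[s] * likelihood[s][o] for s in range(len(prior))]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    o = 1
    print("surprise:", surprise(o))
    print("F under a flat belief:", free_energy([0.5, 0.5], o))
    print("F under the exact posterior:", free_energy(exact_posterior(o), o))
```

In this toy setting, any belief other than the exact posterior yields a free energy strictly greater than surprise, and the two quantities coincide when the belief matches the posterior. This is one way of seeing why minimizing free energy is associated with approximately Bayes-optimal inference.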
Two important corollaries of the free-energy principle are that (a) free-energy-minimizing systems infer the causes of their sensory inputs in an approximately Bayes-optimal fashion (cf. the Bayesian brain hypothesis; Knill & Pouget, 2004), such that hidden states in the environment come to be internally represented in the form of probabilistic model parameters, and (b) such systems select those actions most likely to realize their expected sensory states (Friston, 2012). This means that adaptive biological systems must use sensory information to optimize an internal model of external conditions and deploy this model to guide actions that change the environment—thereby harvesting new sensory inputs that stimulate further belief updating (Corcoran et al., 2020). A formal description of the update rules underwriting these dynamics is provided by the process theory implementation of active inference (Friston, FitzGerald, et al., 2017; R. Smith et al., 2022).
At first blush, notions of internal modeling, representation, and Bayesian inference would seem to situate active inference firmly within the cognitivist tradition. This observation notwithstanding, active inference clearly departs from classical cognitivist theorizing on a number of fronts. One notable example is the shift in emphasis from “bottom-up” explanations of feature detection and scene (re)construction (e.g., Marr, 1982) to “top-down” inferences on sensory evidence. In this way, active inference casts perception as fundamentally pragmatic and action oriented in nature: a subjective interpretation of the world conditioned on the agent’s beliefs, feelings, and desires (A. Clark, 2013, 2016; Ramstead et al., 2020; Williams, 2018a, 2018b).
It is not our intention to exhaustively survey the various points of contact between active inference and influential theoretical perspectives on either side of the embodiment debate, or to make any definitive claims about the proper situation of the active-inference framework with respect to cognitivist versus embodied philosophies of mind (but see Allen & Friston, 2018; Nave et al., 2020, for recent discussion). Neither do we dispute claims to the effect that active inference is in some sense “fundamentally embodied” (e.g., Allen & Friston, 2018, p. 2460). This is partly because there appears to be widespread agreement on this point (even those who defend purportedly cognitivist interpretations of predictive processing grant some version of the embodiment thesis; see, e.g., Hohwy, 2016, 2018) but mainly because, as the previous section sought to show, such claims are essentially vacuous in the absence of further qualification. Our goals here, rather, are the following: (a) to examine how the concept of embodiment is commonly deployed under active inference, (b) to ask whether this concept fulfills any special function within the broader scheme of the framework, and (c) to consider whether active inference discloses any substantive new insights into the nature of embodiment (and its attendant controversies).
Embodied models
An obvious place to begin is Friston’s extensive work on the free-energy principle. The notion of embodiment is frequently invoked by Friston, most notably in connection with the concept of the generative model. Models are said to be embodied in cortical hierarchies, brains, and entire organisms (or more abstractly, their phenotypes; e.g., Friston & Stephan, 2007). Biological systems, for instance, are supposed to “distil structural regularities from environmental fluctuations . . . and embody them in their form and internal dynamics,” such that they “become models of causal structure in their local environment” (Friston, 2012, p. 2101). In an intriguing twist, environments are also said to embody their agents “in the sense that the physical states of the agent are part of the environment” (Friston, 2011, p. 89). This formulation leads Friston to some interesting conclusions about the recursive implications of modeling (and being a model of) one’s environment and the deeply existential ramifications of active inference more generally (cf. Hohwy, 2016, 2021).
Whether brains, organisms, and other biological systems really do embody models of their environments, or whether they merely lend themselves to being described in such terms, is a matter of contemporary debate (see, e.g., Andrews, 2021; Baltieri et al., 2020; van Es, 2020). Setting this question aside, the dual implication that constituents of the agent’s (literal or fictive) generative model (a) extend beyond the brain and (b) are intimately bound up with (or “attuned” to) environmental dynamics would seem to comport well with certain strong varieties of embodied cognition (see, e.g., Bruineberg et al., 2018; Bruineberg & Rietveld, 2014; Kirchhoff & Kiverstein, 2020).
Nonetheless, the extent to which the embodiment of generative models speaks to the embodiment of mind is an open question. For instance, one might apply pressure to the implicit assumption that the “organism-level” generative model is the appropriate target for cognitive scientific inquiry: If some partition of this model were found that delimited cognitive processes within the confines of the brain, the relevance of non-neural model components for explanations of cognitive phenomena would require independent motivation. Importantly, however, active inference does not appear to offer any principled means for arbitrating such boundary disputes (cf. Bruineberg et al., 2021; A. Clark, 2017a; Facchin, 2021; Kirchhoff & Kiverstein, 2021; Ramstead et al., 2021). Rather, the framework permits the individuation of multiple nested models ranging from the cellular to the societal scale and beyond. Model definition is thus predicated on prior assumptions (choices) about the nature (and bounds) of the system in question (i.e., the object of study; Parr, 2020).
Generalizing this point: Although it may be true that the active-inference framework permits one to cast biological systems in terms that are consistent with (or at least, reminiscent of) stronger flavors of embodiment (e.g., the view that embodied agents enjoy direct perceptual access to the world and its affordances; for discussion, see Anderson, 2017; Linson et al., 2018; Orlandi, 2016), this fact alone does nothing to compel such views—additional philosophical work is required to demonstrate why such interpretations ought to be favored over more conservative, deflationary alternatives. 5 This raises the very real possibility that, rather than delivering some sort of resolution to the 30 years’ war that proponents of embodied cognition have waged on cognitivism, appeals to active inference have merely shifted the conflict into a new theater (we shall return to this theme in Prospects for a Unified Philosophy and Science of Embodiment).
Embodied feelings
Whereas the formal models posited under active inference might not be sufficient to arbitrate disputes about cognitive embodiment, the biological consequences of their physical instantiation could prove more decisive. The basic idea here is that cognition is embodied in the sense of being fundamentally geared toward the maintenance of homeostasis (Barrett, 2017; Damasio, 2018; Seth, 2015)—the delimited set of physiological states that define the conditions of the organism’s biological viability (Corcoran & Hohwy, 2018). The expected (homeostatic) states entailed by the agent’s phenotype thus constitute the normative criteria against which the agent’s current and future sensory states are evaluated, a notion that resonates with certain enactivist conceptions of embodiment (see, e.g., Colombetti, 2014; Di Paolo, 2005; Kirchhoff & Froese, 2017; Thompson, 2007). This idea sets the stage for a deeply affective conception of mind, whereby bodily and emotional feeling states play a key role in guiding adaptive action (cf. Damasio, 1994, 2010; Panksepp & Northoff, 2009).
Several embodied interpretations of active inference have been developed on the premise that affective experience is rooted in the body’s physiological condition. The first wave of such work provided a predictive coding style treatment of interoception, the (conscious or unconscious) sensory processing of internal bodily states (Pezzulo, 2014; Seth, 2013; for precursors with less explicit emphasis on embodiment, see Gu et al., 2013; Seth et al., 2012). Subsequent elaborations of these interoceptive-inference accounts have sought to explain how aberrant interoceptive processing and autonomic regulation might engender various psychopathologies (Owens et al., 2018; Paulus et al., 2019; Quadt et al., 2018; R. Smith et al., 2020), including depression (Arnaldo et al., 2022; Badcock et al., 2017; Barrett et al., 2016; Seth & Friston, 2016; Stephan et al., 2016) and anxiety-related disorders (J. E. Clark et al., 2018; Gerrans & Murray, 2020; Linson et al., 2020; Peters et al., 2017).
The purported inseparability of cognitive and affective processing (Kiverstein & Miller, 2015; Pessoa, 2008), coupled with the deep linkage between physiological and emotional feeling states (Critchley & Garfinkel, 2017; Gu et al., 2019), makes affect a prime target for embodied theories of active inference. It is noteworthy, then, that some active-inference models of affective experience do not appear to ascribe any special role to the body—rather, affective states are characterized in terms of domain-general computational mechanisms that track the evolution of prediction error dynamics over time (Hesp et al., 2021; Joffily & Coricelli, 2013; Van de Cruys, 2017). Although such models might ultimately prove compatible with their interoceptive-inference counterparts (see Fernandez Velasco & Loev, 2021), their existence would seem to imply that active inference is not necessarily committed to an explicitly embodied account of affective experience.
Embodied selves
Perhaps the most significant development of active inference with respect to embodied cognition is not the putative impact of interoceptive states on affective experience but rather the means by which interoceptive information is incorporated within the cognitive economy at large. Under active inference, interoceptive streams converge with exteroceptive (e.g., visual, auditory) and proprioceptive (motor) modalities at integrative regions of the cortical hierarchy (Gu et al., 2013; Owens et al., 2018; Pezzulo et al., 2015). Such multi- or cross-modal dynamics open the door for interoceptive information to influence a wide variety of cognitive domains, offering a principled explanation of the way internal bodily states may bias or condition perceptual decision-making (see, e.g., Allen et al., 2016; Pezzulo, 2014; Pezzulo et al., 2018; cf. Barrett & Bar, 2009).
Evidential support for the multimodal integration of bodily signals has accrued from studies investigating bodily self-consciousness, a fundamental constituent of self-awareness (Blanke & Metzinger, 2009; Gallagher, 2000, 2005). In experimental paradigms that produce the rubber-hand (Botvinick & Cohen, 1998) and full-body illusions (Ehrsson, 2007; Lenggenhager et al., 2007; Mizumoto & Ishikawa, 2005), simultaneous visuotactile stimulation induces the uncanny experience of tactile sensations that seemingly arise from an artificial limb or body avatar (for discussion, see Apps & Tsakiris, 2014; Aspell et al., 2012; Blanke, 2012; Blanke et al., 2015; Limanowski & Blankenburg, 2013). This effect is so compelling that subjects often report a sense of ownership over the alien limb or virtual body, as if it had been incorporated as part of (or in place of) their own body (Tsakiris, 2010; cf. de Vignemont, 2011; for related discussion of disordered bodily processing, see Fotopoulou, 2015).
Variants of these paradigms in which visual stimulation occurs in conjunction with rhythmic interoceptive events, such as heartbeats (Aspell et al., 2013; Heydrich et al., 2018; Suzuki et al., 2013; see also Sel et al., 2017) or breathing cycles (Adler et al., 2014; Allard et al., 2017; Betka et al., 2020; Monti et al., 2020), reveal that alterations in bodily self-consciousness can be induced via the integration of exteroceptive and interoceptive signals. That is, simply augmenting the visual representation of a virtual body (part) with cardiac- or respiratory-synchronized pulsations is sufficient to modulate experiences of bodily ownership in the absence of concomitant tactile stimulation.
These findings lend credence to the notion that the brain integrates multiple sources of sensory information to infer which parts of the environment are a part of its body (and more abstractly, a part of its self). However, whether such inferences qualify as instances of embodied cognition is debatable; those of a cognitivist persuasion might be inclined to argue that the mere representation of an object as part of oneself (or not) fails to motivate any claims about the embodiment of cognition per se. Indeed, the directionality of these effects would seem to be precisely backward from an embodied cognitive perspective: These illusions showcase the fragility or susceptibility of body-related phenomenology to exteroceptive information (e.g., via the attenuation of somatosensory precision; Zeller et al., 2015, 2016); most proponents of embodied cognition, on the other hand, typically want to argue for the body’s role in shaping or determining one’s perception of the world. 6
Embodied rhythms
Happily, we can eschew this concern by turning our attention to a complementary line of research investigating the cross-modal effects of bodily rhythms on cognition. Rather than presenting exteroceptive stimuli in synchrony with a series of rhythmic interoceptive events, such as the heartbeat (as in interoceptive variants of the rubber-hand and full-body illusions), cycle-timing paradigms investigate how sensorimotor and cognitive processing varies as a function of time within each physiological cycle. Of interest here are a subset of cycle-timing studies that involve emotionally neutral, non-body-related stimuli, which are therefore immune to the worry that these phenomena may be domain-specific quirks of bodily or affective processing (for broader overviews of this literature, see Azzalini et al., 2019; Critchley & Garfinkel, 2018; Skora et al., 2022).
Most cardiac cycle-timing studies compare the difference between stimuli that are presented during a heartbeat (more precisely, during the systolic time window, in which the brain is receiving a burst of interoceptive input caused by the contraction of the heart) and stimuli presented between successive beats (i.e., during the diastolic period, in which the heart is relaxing and refilling with blood). Early reports that sensorimotor processing is facilitated during diastole relative to systole (Birren et al., 1963; Callaway & Layne, 1964; Saari & Pappas, 1976; Sandman et al., 1977; Walker & Sandman, 1982) have been replicated and elaborated in more recent years (Edwards et al., 2007; McIntyre et al., 2007; Quelhas Martins et al., 2014; Stewart et al., 2006; Yang et al., 2017). Furthermore, evidence that certain cognitive functions are enhanced (Fiacconi et al., 2016; Pramme et al., 2014, 2016; Rae et al., 2018), and certain forms of spontaneous action preferentially initiated (Galvez-Pol et al., 2020; Kunzendorf et al., 2019), during systole has also begun to accrue.
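The logic of a cardiac cycle-timing analysis can be illustrated with a minimal sketch that labels each stimulus onset by its latency from the preceding R-peak. The function name, the example timings, and in particular the bounds of the systolic window are illustrative assumptions rather than values drawn from the studies cited above, which differ in how they define these windows.

```python
from bisect import bisect_right

def label_cardiac_phase(stim_times, r_peaks, systole_window=(0.10, 0.40)):
    """Label each stimulus onset (seconds) as 'systole', 'diastole', or None.

    r_peaks must be sorted R-peak times in seconds.
    systole_window is an ASSUMED (start, end) latency after the R-peak;
    it is a placeholder, not a value taken from the literature.
    """
    start, end = systole_window
    labels = []
    for t in stim_times:
        i = bisect_right(r_peaks, t) - 1  # index of the most recent R-peak
        if i < 0:
            labels.append(None)  # stimulus precedes the first recorded beat
            continue
        latency = t - r_peaks[i]
        # Simplification: everything outside the assumed window counts as diastole.
        labels.append("systole" if start <= latency < end else "diastole")
    return labels

# Hypothetical R-peaks at a steady 75 beats per minute.
r_peaks = [0.0, 0.8, 1.6, 2.4]
print(label_cardiac_phase([0.25, 0.70, 1.75, 2.35], r_peaks))
```

Performance measures (e.g., detection rates or reaction times) would then be compared across the two labels; the choice of window is the main analytic degree of freedom in a scheme of this kind.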
How do these findings (and analogous results from respiratory cycle-timing studies; Flexman, 1974; Kluger et al., 2021; Nakamura et al., 2018; Park et al., 2020; Perl et al., 2019; Waselius et al., 2019; Zelano et al., 2016) contribute to debates about embodied cognition? From one perspective, such data might seem to furnish incontrovertible evidence of the mind’s embodiment. After all, this literature attests to the pervasive influence of rhythmic internal dynamics on mental states, insofar as momentary fluctuations in afferent feedback determine whether or not an object enters into conscious awareness (in the case of near-threshold stimulation), when it is perceived (in the case of spontaneous sampling), and the extent to which it is subsequently processed and acted upon. And although the magnitude of such effects is admittedly modest, such fine details may prove highly consequential under certain ecologically valid conditions (Azevedo et al., 2017; see also Fridman et al., 2019).
As with our earlier treatment of multisensory integration in Embodied Selves, however, the implications of such findings are not clear-cut. The cognitivist might be intrigued to learn that visceral organs exert subtle influences over exteroceptive decision-making but need not concede that these effects are any more compelling than (say) the soporific consequences of a heavy lunch. This rebuttal might seem too glib, but phasic fluctuations in perceptual acuity, alertness, reaction speed, and so on are well-established correlates of biological rhythms spanning multiple time scales (see, e.g., Jennings, 1986; Okawa et al., 1984); and yet, such observations are not generally adduced in support of embodied cognition. Instead, these phenomena belong to that class of factors whose influence upon cognitive processing and conscious experience is widely regarded as “merely” causal.
What matters here, then, is not so much evidence that the heartbeat (breathing cycle, gut rhythm, etc.) modulates cognitive function but rather the nature of the mechanism(s) underwriting such effects. These mechanisms are not fully understood, but converging evidence suggests that afferent discharge caused by ventricular contraction exerts a transient inhibitory effect on cortical activity. As such, each incoming volley of cardio-afferent feedback induces a momentary perturbation of the brain’s ongoing dynamics, hindering its capacity to process information from other modalities. An alternative (but not mutually exclusive) explanation cites the physical distention of blood vessels caused by the pulse wave as responsible for disturbing the sites where sensory information is registered, such as the retinae (Allen et al., 2019; see also Macefield, 2003). External stimuli are simply more difficult to detect when they coincide with (and are obscured by) pulsatile motion (but see Grund et al., 2022).
Seen in this light, the cardio-afferent feedback thought to underlie differences in sensorimotor and cognitive performance observed across different phases of the cardiac cycle appears to be a source of noise, a nuisance variable that impedes the brain’s capacity to go about its information-processing business. Although some research findings suggest the afferent signals associated with systolic contraction might benefit neural processing in certain contexts (e.g., Garfinkel & Critchley, 2016; Pramme et al., 2016), the tendency among active-inference-inspired explanations of cycle-timing effects is to view the mechanical consequences of the heartbeat as a form of interference that lowers the precision of exteroceptive sensory input (where precision refers to the brain’s estimate of the reliability or “signal-to-noise ratio” of sensory data). In principle, one could envisage an agent whose cognitive capacities remain perfectly intact, and are perhaps even improved on average, after eliminating the pulsatile “artifacts” caused by the heartbeat (e.g., by replacing the heart with a continuous-flow device). Not only does the role of cardiac interoceptive feedback in consciousness and cognition appear to be more causal than constitutive in nature, but its contribution seems more of a “bug” than a “feature.”
To sum up: Active inference provides a powerful framework for explaining how the brain integrates multiple streams of sensory information to construct perceptual experience and regulate activity. Insofar as such schemes admit internal bodily states as important sources of interoceptive sensory input, they would seem to open the door to a more embodied understanding of the mind. But as we have highlighted, embodiment means different things to different people; although active inference might be hospitable to certain varieties of embodied cognition, the explicitly embodied versions of active inference analyzed here do not compel strong interpretations of the embodiment thesis. Brains constantly track the evolving dynamics of their bodily states and sometimes use this information to inform perception and action in other domains. Yet, on these accounts at least, it is the brain (and only the brain) that does the essential work of modeling the state of the body and the world. Visceral signals are accorded no special role within this inferential hierarchy and appear to be treated no differently than other forms of sensory input (see also Hohwy, 2016; Hohwy & Michael, 2017). Indeed, what appears on first blush to be some of the most promising evidence favoring the pervasive influence of interoceptive feedback on the mind turns out on closer inspection to be just another “noise trajectory” (Allen et al., 2019) that the brain must learn to live with.
A Diachronic Perspective on Embodiment
Thus far, we have evaluated the nature of embodiment under active inference from a broadly “synchronic” perspective; that is, we have analyzed how the active-inference framework might explain particular phenomena that occur over relatively brief time frames within mature cognitive agents. By contrast, relatively few researchers within this field have adopted an explicitly “diachronic” (or developmental) perspective (although this situation is beginning to change; see Atzil et al., 2018; Ciaunica, Constant, et al., 2021; Ciaunica & Crucianelli, 2019; Fabry, 2017, 2018; Fotopoulou & Tsakiris, 2017; Köster et al., 2020; Montirosso & McGlone, 2020; Wozniak, 2019). The key idea we wish to explore in the remainder of this article is that the most novel and interesting insights active inference has to offer the embodied cognition debate derive not from its account of how the adult brain balances converging streams of interoceptive and exteroceptive information but from the story it has to tell about the formative influence of visceral signals on the nascent mind. This idea highlights the critical role of embodiment in early cognitive development—a role that may be obscured from view when analyzing the behavior of mature cognitive systems.
The synchronic perspective encourages one to conceive of the cognitive agent as a fully formed, pregiven entity and attempts to systematically analyze the structure of its behavior in order to discover its internal logic. From this vantage, it is perfectly natural to view the brain as the central controller responsible for governing and coordinating various sorts of bodily activity—a view active inference inherits from mid-20th-century cybernetics (Seth, 2015). Although this perspective is characteristic of traditional cognitivism, it is by no means unique to it—proponents of the embodiment thesis might likewise adopt a synchronic perspective when entering into disputes about the bounds of cognition, the aptness of representational and computational explanations of mental phenomena, and so on. 7
By contrast, the diachronic perspective seeks to examine how cognitive systems emerge and develop over time. Although not necessarily undermining or contradicting conclusions derived from synchronic analysis, this approach may help to reframe or disrupt conventional assumptions about the nature of cognitive systems in ways that bring new insights into view. Recent focus on the embodied interactions between infants and caregivers, for instance, has highlighted how bodily and social exchanges during the early phases of childhood development play a fundamental role in realizing homeostatic regulation, laying the foundations for higher cognitive functions (Fotopoulou et al., 2022; Fotopoulou & Tsakiris, 2017; Seth & Tsakiris, 2018). Analyses of this sort remind us that the brain undertakes a long and circuitous journey on its way to assuming its sovereign status and that the sensory experiences encountered along this trajectory may have profound ramifications for the maturation of the cognitive system at large.
In what follows, we bring a diachronic perspective to bear on the emergence of so-called interoceptive noise trajectories and their regulation under active inference. The basic hypothesis we propose is that the seemingly trivial (or perhaps mildly detrimental) fluctuations in neural activity associated with periodic physiological events such as the heartbeat (Embodied Rhythms) are the residuum of a more fundamental developmental process. More specifically, rhythmic visceral dynamics are posited to play an instrumental role in carving out the basic inferential architecture that enables the mature brain to successfully model and modulate its sensory flows. This idea, which we refer to as the visceral afferent training hypothesis, invites a more substantive interpretation of the contribution of internal bodily activity in the emergence of biological cognition. A key implication of this hypothesis is that the rhythmic visceral dynamics instantiated during the first weeks and months of development lay the foundations for a nested hierarchy of neural models, thereby enabling the brain to encode and predict the ubiquitous spatiotemporal regularities of the environment in which it is embedded.
Learning from within
Granting that the brain modulates its activity in accordance with inferences about unfolding visceral dynamics (see, e.g., Allen et al., 2019; Corcoran et al., 2021; Seth & Tsakiris, 2018), the question arises as to how the brain comes to model such processes in the first instance. One possibility is that the neurophysiological apparatus responsible for monitoring and controlling physiological variables is “hardwired” by natural selection. The notion that brains come prepackaged with certain homeostatic prior expectations is fairly widespread within the active-inference literature (e.g., Allen & Friston, 2018; Allen & Tsakiris, 2018; Burr & Jones, 2016; Sims, 2017), although the details concerning such innate programming are generally left unspecified. However, even if the fetal brain is “initialized” with genetically encoded prior expectations, the developmental and adaptive changes that (at least some) physiological parameters undergo over the life span suggest that such priors can be modified or overwritten in response to prevailing environmental conditions (cf. Yon et al., 2019).
Generalizing this point, it seems plausible that the basic structure of neurally encoded generative models may be inherited in the form of genetically specified “wiring diagrams,” whereas the parameterization of such models may depend on the patterns of stimulation to which their underlying constituents are exposed (cf. Singer, 1986). In statistical parlance, this translates to the process of model fitting; under active inference, the optimization of model parameters corresponds to learning (Friston et al., 2016). We conjecture that prenatal development encompasses both the formation of anatomical structures that support the hierarchical generative modeling of sensory states and the tuning of model parameters in response to the “training data” availed by the internal (fetal) and external (intrauterine and more distal) environments. Although both of these processes clearly persist well beyond birth, we posit that the late embryonic–early fetal stage of prenatal development constitutes a critical period for establishing basic forms of structure learning and predictive regulation that ground more sophisticated forms of model updating in life outside the womb (cf. Friston, 2017).
Our treatment here concentrates on the interoceptive data afforded by the cardiovascular system. This focus accords with the prominent theme of heart–brain interaction in both the recent interoceptive inference literature (surveyed in Predictive Processing and (Embodied) Active Inference) and the broader tradition of psychophysiological research (e.g., Lacey & Lacey, 1978; Obrist, 1981; Porges, 1995). This focus is also motivated by the empirical fact that the heart is the first organ to form and begin functioning within the vertebrate embryo (Yutzey & Kirby, 2002). Even before the embryonic heart tube has acquired its familiar four-chambered morphology, it generates coordinated, rhythmic contractions resulting in electrical signals reminiscent of the adult electrocardiogram (Boullin & Morgan, 2005). Blood circulation is established by the 4th week of embryogenesis (Tan & Lewandowski, 2020), while the neural tube is still closing (Stiles & Jernigan, 2010). The circulatory system is thus fully operational before the nervous system has even begun to acquire its basic organizational structure (see Fig. 1 for a comparative illustration of neural and cardiac development in the human embryo).

Key milestones during the embryonic development of the human cardiovascular and central nervous systems. Top row illustrates the emergence of precursory brain structures, beginning with the formation of the neural tube 20 embryonic days postconception (i.e., E20). Neural tube closure occurs over the course of the 4th week (E22–27). Following closure, the neural tube expands anteriorly, and the primary vesicles (i.e., the prosencephalon, mesencephalon, and rhombencephalon) become discernible. The secondary vesicles (telencephalon and diencephalon deriving from the prosencephalon; metencephalon and myelencephalon deriving from the rhombencephalon) are established by the end of Week 7 (E49). See Stiles and Jernigan (2010) for further details. Middle row depicts the gross morphological development of the human embryo. An illustration of the term fetus and placenta (which forms the interface between the fetal and maternal circulatory systems) is included for comparison. Bottom row highlights key structural and functional milestones in embryonic heart development. The heart tube is formed by the fusion of the endocardial tubes on E21, swiftly followed by the first signs of peristaltic pumping activity (E21–22). Blood begins to flow through the heart tube on E24; active circulation is established by E28. The division of the heart into four chambers and the appearance of the cardiac pacemaker (sinoatrial node) are realized by E35. Cardiac morphology approximates that of the term fetus by Week 7. See Kirby (2007) and Tan and Lewandowski (2020) for further details. Note, diagrams are schematic and not drawn to scale. Brain diagrams adapted from Kolb and Fantie (2009) and Stiles and Jernigan (2010); heart diagrams adapted from Doyle and colleagues (2015) and Srivastava (2006).
Contrary to the standard (synchronic) picture in which the mature brain orchestrates various autonomic activities to manage multiple systemic demands, the embryonic brain must develop the capacity to track (and adaptively regulate) physiological activity within the body. The first step of this journey likely occurs with the self-organization of brain stem nuclei responsible for detecting physiological fluctuations arising from cardiovascular and other organ systems. Once receptive to perturbations generated by the body’s physiological processes, the task then is to infer the likely causes of such sensory inputs. Structurally, this situation is akin to standard predictive-coding and active-inference accounts of exteroceptive perception, whereby the brain is said to invert a generative model in order to infer the hidden causes of its sensory states. 8
It is not easy to pinpoint the precise stage at which the developing nervous system becomes receptive to information about the state of its environment. Nonetheless, it seems reasonable to posit the rhythmic pulsations of the circulatory system as being among the first sensations registered by the nascent brain. Within the lower portion of the brain stem, the subnuclei of the nucleus tracti solitarii (NTS; a major viscerosensory relay center for cardiovascular, respiratory, gastrointestinal, and gustatory input; Saper, 2002) undergo an intensive period of cytoarchitectonic development during the final few weeks of the first trimester (Cheng et al., 2006). Vagal nerve fibers linking this region to the heart and major blood vessels are already established by the time this period of cellular differentiation gets underway (Cheng et al., 2004; Kirby, 2007), with more widespread networks of innervation unfolding over the course of the second trimester (Gordon et al., 1993; Pappano, 1977).
Presumably, the kinds of inferences supported by neuronal populations at the brain stem level are relatively simple; neurons within the mature NTS may, for example, register whether blood pressure is increasing or decreasing on the basis of baroreceptor activity and likely do so without entertaining more complex beliefs about the distal states of affairs driving such fluctuations. Even at this level, however, some rudimentary form of learning (belief updating) may be necessary in order to calibrate prior expectations over cardiovascular parameters. Fetal blood pressure increases linearly over the second half of pregnancy (Struijk et al., 2008); prevailing heart rate decreases most steeply between the 16th and 20th weeks of gestation and more gradually thereafter (Pillai & James, 1990). If neuronal activity within the NTS encodes priors over these parameters, it is plausible that these neurons gradually adapt their response functions in line with accumulating sensory evidence of a sustained shift in their values (see Segar, 1997, for discussion of such “chronic resetting” during pre- and postnatal development).
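One way to picture this kind of gradual recalibration is as a scalar, Kalman-style belief update in which a prior expectation over a slowly drifting parameter is revised by precision-weighted prediction errors. The sketch below is offered under strong simplifying assumptions: the function name, the variance settings, and the heart-rate samples are all hypothetical, and nothing here is claimed to describe how NTS neurons actually implement such updating.

```python
def track_set_point(observations, mean, variance, obs_variance, drift_variance):
    """Scalar Kalman-style filter; returns the belief mean after each sample.

    mean, variance      : prior belief about the parameter (hypothetical values)
    obs_variance        : assumed noise variance of each sensory sample
    drift_variance      : assumed variance of slow change in the true value
    """
    means = []
    for y in observations:
        variance += drift_variance                   # the true value may have drifted
        gain = variance / (variance + obs_variance)  # relative precision of new evidence
        mean += gain * (y - mean)                    # precision-weighted prediction error
        variance *= (1.0 - gain)                     # belief sharpens after each update
        means.append(mean)
    return means

# Hypothetical heart-rate samples (beats per minute) drifting steadily downward.
samples = [170 - 0.5 * k for k in range(40)]
trajectory = track_set_point(samples, mean=170.0, variance=4.0,
                             obs_variance=9.0, drift_variance=1.0)
print(round(trajectory[0], 1), round(trajectory[-1], 1))
```

In this toy setting the belief follows the sustained shift with a lag, which is the qualitative signature of a prior that is reset by accumulating evidence rather than fixed in advance.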
The potential for more sophisticated modeling of cardio-afferent signals increases at higher levels of the interoceptive neural hierarchy, where longer epochs of cardiac activity can be extracted and integrated with other streams of sensory information. In the mature brain, NTS efferents project beyond the brain stem to a variety of higher neural centers (Benarroch, 1993; Loewy, 1981; Saper, 2002; R. Smith et al., 2017; see Fig. 2). Neuroanatomical studies indicate that key limbic components of this network emerge early in human ontogeny; the embryonic hypothalamus is discernible 5 weeks post-fertilization, and the amygdaloid nuclei begin to differentiate between Weeks 6 and 8 (Müller & O’Rahilly, 2006). If subcortical regions such as these are capable of receiving information from barosensitive brain stem circuits during the early postembryonic period, the periodic fluctuations of the cardiac rhythm could represent the first form of sensory patterning extracted by integrative centers within the prenatal brain.

The anatomy of heart–brain communication. Schematic overview of major brain centers involved in cardiovascular interoception and regulation. Baroreceptor afferents embedded in the aortic arch and carotid sinuses convey cardio-afferent feedback (red arrows) to the brain stem. This information is integrated in the nucleus tracti solitarii (NTS), triggering rapid alterations in autonomic outflow (blue arrows) to the sinoatrial (SA) node, atrioventricular (AV) node, and myocardium via the nucleus ambiguus (NA) and paravertebral (sympathetic) ganglia to maintain blood pressure stability (Benarroch, 2008). Cardio-afferent signals are also relayed from the NTS to higher regions within the central autonomic network, including the periaqueductal grey (PAG), parabrachial nuclei (PBN), hypothalamus, amygdalae, insulae, anterior cingulate cortex (ACC), and ventro-medial prefrontal cortex (vmPFC). Please note, arrows are intended to provide a heuristic representation of ascending neural pathways; top-down brain connections have been omitted for clarity. Adapted from Palma and Benarroch (2014), Sizarov and colleagues (2013), and Végh and colleagues (2016).
In sum, the early establishment of autonomous cardiovascular activity in the developing embryo, coupled with the emergence of brain stem and limbic nuclei (and their accompanying white matter tracts; Huang et al., 2009) toward the end of the first trimester, furnishes the necessary substrates for basic interoceptive processing to occur. The propagation of cardio-afferent information through brain stem structures to limbic regions marks the genesis of the brain’s capacity to “model its own dynamic noise trajectories” (Allen et al., 2019, p. 24), although at this early stage, the visceral inputs that inscribe such trajectories are very much signals to be processed rather than noise to be suppressed. This leads us to the hypothesis that rhythmic fluctuations in visceral afferent feedback constitute a salient “training signal” that drives the adaptation of early brain networks and the generative models they entail. The most interesting upshot of this visceral afferent training hypothesis is the possibility that these adaptive processes extend beyond the interoceptive domain, playing a foundational role in the formation and structuring of the cognitive architecture at large. The next three sections develop this idea in relation to three candidate mechanisms through which visceral afferent training might operate: (a) activity-dependent neuronal development, (b) periodic signal modeling, and (c) oscillatory network coordination (Fig. 3).

Overview of proposed visceral afferent training mechanisms. (A) Stimulation of neuronal development and circuit organization. Following the principle of activity-dependent neuronal development, we posit that periodic visceral afferent signals provide patterned input into undifferentiated populations of brain stem nuclei (i), thereby driving the development of functionally organized neural circuits (ii). We further propose that these inputs propagate up the neural hierarchy to carve out channels of communication between integrative nodes within the central autonomic network (iii). In this way, visceral afferent training signals play a key role in sculpting how information is processed and exchanged between network hubs. (B) Modeling of periodic fluctuations as recurrent events. The maturation of hierarchically organized neuronal populations (i) affords the opportunity to integrate the transient sensory responses evoked at lower (e.g., brain stem) levels of the hierarchy (discrete electrocardiogram [ECG] signals; bottom layer) into temporally extended representations (continuous ECG signal; middle layer). We construe the formation of such representations in terms of model building, whereby a series of events in the environment are not only tracked and grouped together as part of a single periodic process but also predicted (forecast) to recur with some degree of temporal precision (hatched ECG; top layer). Once models of rhythmic visceral afferent inputs are established, they may be generalized or adapted to fit other sources of rhythmic sensory input (e.g., vibro-acoustic stimulation). In this way, models of recurrent visceral perturbations may facilitate the decomposition of complex inputs into their constituent components (ii), thereby separating internal (e.g., cardiac) from external (e.g., acoustic) signals and random noise. (C) Organization of large-scale network dynamics. Independent of their role in driving neuronal development and model formation, low-frequency physiological rhythms may also contribute to the coordination of network activity. Specifically, the cardiac rhythm is suggested to serve as a pacemaker that helps to organize communication between disparate networks through the temporal alignment of their oscillatory dynamics (i). This notion provides a compelling developmental perspective on Klimesch’s (2013, 2018) observation that the canonical neural frequency bands (e.g., delta, theta, alpha, beta) may be derived from a binary hierarchy that takes the cardiac rhythm (HR) as its scaling factor (ii).
Visceral afferent training drives activity-dependent neuronal development
Sensory stimulation has long been known to play a crucial role in the structural and functional development of the immature brain (Wiesel & Hubel, 1963a, 1963b). Even before visual and auditory pathways gain access to external stimuli, spontaneous bursts of neuronal activity are responsible for carving out precise patterns of network connectivity (Friauf & Lohmann, 1999; Hanganu-Opatz, 2010; Kandler et al., 2009; Katz & Shatz, 1996; Mooney et al., 1996; Penn & Shatz, 1999). Spontaneous network activity has also been observed within various other regions of the central nervous system, including the spinal cord, brain stem, cerebellum, hippocampus, and neocortex (see Blankenship & Feller, 2010; Feller, 1999; Yuste, 1997).
The visceral afferent training hypothesis posits that interoceptive input generated by the spontaneous activity of peripheral organ systems plays a similarly instrumental role in fine-tuning the organization of neural circuitry within nascent brain stem structures (Fig. 3A). What is particularly interesting about this idea beyond the domain of interoceptive processing per se is the potential for correlated, temporally structured patterns of evoked neural activity to influence the development of integrative centers beyond the brain stem. Given the extensive connectivity among hubs within the central autonomic network (Palma & Benarroch, 2014), and the various interfaces these hubs share with other key brain regions (see, e.g., Kleckner et al., 2017), early exposure to periodic interoceptive stimulation could have profound effects on the way core brain networks are refined and remodeled—and, by extension, on the way information is routed through these structures.
From an active-inference perspective, this notion suggests that the genetic specification of generative models might be relatively sparse, concentrating on the mechanisms governing cellular differentiation and pattern formation (phenomena that have also been explained in terms of self-organization via free-energy minimization; see Friston et al., 2015; Kuchling et al., 2020; Wright & Bourke, 2021). The imprecise, stereotypical circuitry laid down during these early stages of neurogenesis is subsequently refined and elaborated in accordance with Hebbian (and possibly other) learning principles (Goodman & Shatz, 1993; Kirkby et al., 2013; Leighton & Lohmann, 2016). The capacity for such remodeling (both in the physical sense of synaptic stabilization and pruning and in the more abstract sense of model parameterization and reduction) essentially relieves the genome of the burden of mapping out the precise developmental trajectory of specialized neural circuitry, a responsibility borne instead by domain-general learning mechanisms that remain operative throughout the life span (Singer, 1986).
Delegating the fate of neuronal wiring to activity-dependent adaptive processes is an efficient and flexible strategy for fine-tuning network connectivity. However, such an approach is only likely to succeed if the right kind of activity is reliably instantiated at the right stage of development. In the case of visual and auditory pathways, this problem is solved by the induction of spontaneous activity on the basis of genetically programmed priors (Katz & Shatz, 1996). Such an arrangement may not be necessary within interoceptive pathways (or may be superseded much earlier by experience-dependent adaptation), on account of the early availability of viscerosensory input. As outlined previously, the early development of the cardiovascular system in particular means that a continuous source of periodic afferent stimulation is already available by the time the brain stem and higher centers become receptive to such stimuli. Because this input is essentially guaranteed in virtue of the fact that healthy brain development cannot proceed in the absence of a functional circulatory system, it constitutes a highly dependable stimulus for driving activity-dependent neuronal remodeling.
Visceral afferent training inculcates a model of periodic fluctuation
In the healthy human fetus, average heart rate usually increases to a peak of ~170 beats per minute at approximately 9 to 10 weeks (i.e., around the same time NTS nuclei are undergoing an intensive period of maturation; Cheng et al., 2006), gradually decreasing thereafter (Hornberger & Sahn, 2007; Pillai & James, 1990). This translates to a heartbeat every ~350 ms. Leaving aside the potential import of this recurrent form of (self-generated) stimulation as a driver of brain stem network development (Visceral Afferent Training Drives Activity-Dependent Neuronal Development), this signal presents perhaps the first opportunity for higher brain centers to model rhythmic activity originating from beyond the central nervous system (Fig. 3B). Hypothalamic and amygdaloid nuclei may, for instance, adapt to periodic activity conveyed via the brain stem by establishing oscillatory network dynamics that synchronize to this input. Such phase-locked neural oscillations encode predictions about the timing of afferent stimuli, constituting the first step toward the development of more sophisticated neural models of external periodic events (including those necessary to quell physiological noise trajectories of the sort posited by Allen et al., 2019).
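The arithmetic behind the ~350 ms figure, and the sense in which a model locked to this rhythm "predicts" the timing of afferent input, can be made explicit with a short sketch. The function names are hypothetical, and the forecast assumes a perfectly regular rhythm, which real cardiac activity only approximates.

```python
def beat_interval_ms(bpm):
    """Mean interval between heartbeats, in milliseconds, for a given rate."""
    return 60_000.0 / bpm

def forecast_beats(last_beat_ms, bpm, n=3):
    """Predict the times of the next n beats, ASSUMING a perfectly regular rhythm."""
    period = beat_interval_ms(bpm)
    return [last_beat_ms + k * period for k in range(1, n + 1)]

print(round(beat_interval_ms(170)))   # 353, i.e. roughly the ~350 ms quoted above
print(forecast_beats(0.0, 170, n=3))  # predicted onsets of the next three beats
```

The difference between each forecast and the observed arrival time of the corresponding afferent volley would play the role of a timing prediction error in a scheme of this sort.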
In the adult brain, the amygdalae have been implicated (along with other interoceptive hubs, such as the insulae) in the generation of the heartbeat-evoked potential (Park et al., 2018), an electrophysiological response that may constitute a neural correlate of cardiac interoceptive prediction error (Ainley et al., 2016; Petzschner et al., 2019). Recent evidence that this response is sensitive to auditory stimuli presented in synchrony with the heartbeat (Banellis & Cruse, 2020; Pfeiffer & De Lucia, 2017) suggests these networks encode expectations about the timing of recurrent sensory events. From a more general perspective, the perpetual reverberation of baroreceptor feedback through deep subcortical regions may be crucial for establishing the first neural representations of a stable pattern of variation—a generative process or hidden cause—in its distal (bodily) environment. Although it is surely the case that all developing neuronal populations are subject to some form of “external” stimulation (even if the source of such stimulation originates from elsewhere within the nervous system), the crucial point here is that there exists some series of events that can be resolved as a recurrent pattern against the background hum of neural noise and metabolic activity. In carving out periodic cardiac activity as a stable, coherent signal amid the tumult of sensory fluctuations, the brain takes its first step on a lifelong journey of “sensemaking.”
Although the basic concept of repetition need not entail periodicity, the additional quality of regularity conferred by the heartbeat’s periodic nature is also notable. From a modeling perspective, the regularity of the cardiac cycle renders the timing of each afferent impulse highly predictable and thus easier to “tag” as a recurring signal rather than a random sequence of unrelated events. Periodicity essentially confines the signal to a single, narrowband frequency channel to which a receptive neural population can “tune in,” thereby enabling the hidden source of this input to be decorrelated from competing activity (cf. blind source separation; Bell & Sejnowski, 1995; Isomura et al., 2015; Isomura & Friston, 2018). A way of characterizing this scenario in the language of active inference is to say that periodic signals express higher precision (i.e., signal-to-noise ratio) than their more irregular counterparts. As such, periodic stimuli generate salient sensory data that compel model updates in line with sensory evidence. Once a model of the periodic process has been established, evidence in favor of its parameterization is rapidly accrued in virtue of its capacity to generate highly accurate predictions about the timing of successive sensory inputs.
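The narrowband claim can be illustrated with a toy simulation. The sketch below is not drawn from the article: it assumes a 100 Hz sampling rate, a 120 s window, 10 ms of beat-to-beat timing jitter, and a Poisson-like comparison train with the same mean rate. It uses the ratio of peak to mean spectral power below 4 Hz as a crude stand-in for signal-to-noise ratio, which is only loosely related to precision as formally defined in active inference. The function names `event_train` and `spectral_peak` are hypothetical.

```python
import numpy as np

def event_train(times_s, duration_s, fs_hz):
    """Binary train: 1.0 in every sample bin containing at least one event."""
    train = np.zeros(int(duration_s * fs_hz))
    idx = np.floor(np.asarray(times_s) * fs_hz).astype(int)
    idx = idx[(idx >= 0) & (idx < train.size)]
    train[idx] = 1.0
    return train

def spectral_peak(train, fs_hz, fmax_hz=4.0):
    """Return (peak frequency, peak power / mean power) within (0, fmax_hz]."""
    power = np.abs(np.fft.rfft(train - train.mean())) ** 2
    freqs = np.fft.rfftfreq(train.size, d=1.0 / fs_hz)
    band = (freqs > 0) & (freqs <= fmax_hz)
    power, freqs = power[band], freqs[band]
    k = int(np.argmax(power))
    return freqs[k], power[k] / power.mean()

rng = np.random.default_rng(0)
fs_hz, duration_s = 100, 120   # assumed sampling rate (Hz) and window (s)
rate_hz = 170 / 60             # peak fetal heart rate cited in the text
period_s = 1 / rate_hz         # roughly 0.35 s between beats
n_beats = int(duration_s * rate_hz)

# Near-periodic "cardiac" events: regular beats plus 10 ms jitter (assumed).
periodic_times = (np.arange(n_beats) + 1) * period_s + rng.normal(0, 0.01, n_beats)
# Irregular comparison: Poisson-like events with the same mean rate (assumed).
irregular_times = np.cumsum(rng.exponential(period_s, n_beats))

periodic_peak_hz, periodic_ratio = spectral_peak(
    event_train(periodic_times, duration_s, fs_hz), fs_hz)
irregular_peak_hz, irregular_ratio = spectral_peak(
    event_train(irregular_times, duration_s, fs_hz), fs_hz)

print(f"periodic:  peak {periodic_peak_hz:.2f} Hz, peak/mean power {periodic_ratio:.1f}")
print(f"irregular: peak {irregular_peak_hz:.2f} Hz, peak/mean power {irregular_ratio:.1f}")
```

Under these assumptions, the periodic train should concentrate its low-frequency power in a single bin near the beat rate, whereas the irregular train spreads comparable power across the band. This is the sense in which a regular rhythm offers a channel that a receptive population could "tune in" to.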
Assuming the general shape of this story is on the right track, it is worth pausing to consider why heartbeat-related afferent input ought to play such an important role in the training (or fitting) of basic generative models. It is certainly plausible that other physiological processes give rise to periodic neural stimulation, especially if subcortical receptivity to interoceptive signals occurs later than we have assumed (e.g., during the second half of pregnancy). However, cardiovascular feedback seems like a good candidate to focus on for two reasons: first, as mentioned already, the early development of the circulatory system guarantees such feedback is available as soon as the brain is sufficiently mature to detect it (which, moreover, raises the possibility of its active participation in the maturational process itself; see Visceral Afferent Training Drives Activity-Dependent Neuronal Development); second, the frequent and perpetual nature of such feedback renders it a more tractable learning signal than many other potential candidates that occur less regularly and/or over slower time scales (e.g., changes in the chemical composition of the amniotic fluid or circulating blood due to maternal feeding).
Whether or not other sources of recurrent stimulation are easier to resolve and predict once cardio-afferent fluctuations have been captured within a subcortical generative model is unclear. Once in possession of a model of cardiovascular interoceptive dynamics, other kinds of periodic input might become more tractable—easier for the brain to “tap in” or “lock on” to (cf. cardio-auditory synchronization in mature brains; Banellis & Cruse, 2020; Pfeiffer & De Lucia, 2017). This is to say that brain networks may exploit (repurpose or generalize) an established model of rhythmic cardio-afferent dynamics to bootstrap the modeling of other periodic signals emanating from elsewhere within the fetal (or maternal) body. Relatedly, once a model characterizing the periodic nature of heartbeat-evoked neural activity has been established, the brain may begin to discern other patterns of sensory input in relation (or contrast) to this signature (cf. fetal sensitivity to “oddball” deviations in rhythmic sequences; Draganova et al., 2005, 2007; Huotilainen et al., 2005). In this way, the heartbeat might be said to inculcate an ur-concept of repetition or recurrence, in which the basic distinction between discrete states or phases is elevated to the recognition of a certain kind of patterned continuity—a (generative) process—unfolding over time.
Visceral afferent training signals promote coordinated network dynamics
From a dynamical systems perspective, the periodic nature of the heartbeat might also serve a more mechanistic function as a pacemaker for oscillatory neural activity and information processing (Fig. 3C). The idea that the cardiac rhythm acts as a pacemaker for central and peripheral dynamics is not new (see, e.g., Coleman, 1921) and has recently been revived in work linking the spectral architecture of neural and visceral organ systems (Klimesch, 2018; see also Başar, 2008; Corcoran et al., 2018; Kluger & Gross, 2021; Tort et al., 2018). Without digressing into these ideas too deeply, there is an obvious congeniality between such views and the empirical facts of early fetal development as sketched out earlier. For instance, Klimesch (2013, 2018) has proposed that many characteristic oscillatory dynamics observed in the brain and the body can be unified as part of a binary hierarchy that takes the heartbeat as its scaling factor (cf. Rassi et al., 2019). Treating the heartbeat as the fundamental rhythm from which all other members of the frequency hierarchy are derived makes perfect sense when one considers the early functional emergence of the cardiac system and its physical influence on early brain development.
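As a rough illustration of what such a hierarchy might look like, suppose that "binary" means neighboring members differ in frequency by a factor of 2. This is our reading for the purposes of the example, not a formula quoted from the cited work. The symbol f_h (heart rate in Hz) and the value of 1.25 Hz (75 beats per minute) are illustrative assumptions.

```latex
% Assumed doubling rule: each member is a power-of-two multiple of the heart rate.
f_k = 2^{k} f_{\mathrm{h}}, \qquad k \in \mathbb{Z}

% Illustrative values for an assumed f_h = 1.25 Hz (75 beats per minute):
% f_1 = 2.5 Hz, f_2 = 5 Hz, f_3 = 10 Hz, f_4 = 20 Hz, f_5 = 40 Hz
```

On this reading, a change in the fundamental would rescale every derived frequency by the same factor, which is what it would mean for the heartbeat to act as a scaling factor.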
To be clear, we are not claiming that neural populations oscillate solely in response to cardiac or other sources of rhythmic visceral activity. Neuronal oscillations are often described as “intrinsic” and “spontaneous” in character (Buzsáki & Draguhn, 2004; Llinás, 1988; Wang, 2010) and may be conceived of as an emergent property of free-energy-minimizing neuronal ensembles (Palacios et al., 2019). Developing neuronal networks should therefore begin to evince coherent network dynamics as random or intermittent bursts of neural activity converge toward attractor basins of synchrony (Luhmann et al., 2016; Rulkov, 2001; Wright, 2011). The novel idea on offer here is that these emergent, self-organizing dynamics may be guided or driven toward particular regions of phase-space by extrinsic physiological rhythms, such as those generated by the cardiovascular system.
By analogy to the way the infant brain is equipped with the necessary neural machinery to acquire any natural language but requires immersion within a particular linguistic environment in order to realize this potential (Gopnik et al., 2001; Kuhl, 2004), the basic thought here is that periodic visceral stimulation may entrain a particular regime of oscillatory patterning (first in the brain stem and then higher centers) amid a largely disorganized agglomeration of spontaneous neural activity. Such patterning might be crucial for opening up coherent lines of communication between disparate brain networks (Fries, 2005, 2015; Palva & Palva, 2018), paving the way for hierarchical message passing and model updating (Bastos et al., 2020; Chao et al., 2018; Friston, 2008; Friston, Parr, et al., 2017; Michalareas et al., 2016). Although it might be possible for maturing brain regions to settle into oscillatory regimes by dint of the intrinsic properties of their cellular constituents, and perhaps even to cycle through a repertoire of regimes as a consequence of interregional coupling relations that wax and wane over time (Downes et al., 2012; Wright, 2011), rhythmic input from the heart might establish a kind of scaffold around which early (sub)cortical network activity stabilizes and organizes itself.
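The intuition that a periodic input can pull an intrinsically oscillating population onto its own rhythm can be sketched with a standard phase-oscillator abstraction (an Adler-type equation). This model is swapped in purely for illustration and is not proposed in the article. The natural frequency (2.5 Hz), the coupling strengths, and the function name `mean_frequency_hz` are hypothetical; only the 170 beats-per-minute drive comes from the text.

```python
import math

def mean_frequency_hz(natural_hz, drive_hz, coupling, duration_s=60.0, dt=1e-3):
    """Euler-integrate d(theta)/dt = 2*pi*natural_hz + coupling*sin(phi - theta),
    where phi = 2*pi*drive_hz*t is the phase of the periodic drive, and return
    the oscillator's average frequency over the second half of the run."""
    n_steps = int(round(duration_s / dt))
    half = n_steps // 2
    theta = 0.0
    theta_at_half = 0.0
    for i in range(n_steps):
        if i == half:
            theta_at_half = theta  # phase at the start of the averaging window
        drive_phase = 2 * math.pi * drive_hz * i * dt
        theta += dt * (2 * math.pi * natural_hz
                       + coupling * math.sin(drive_phase - theta))
    return (theta - theta_at_half) / (2 * math.pi * (n_steps - half) * dt)

drive_hz = 170 / 60    # cardiac drive at the peak fetal heart rate
natural_hz = 2.5       # assumed intrinsic frequency of the neural oscillator

# Coupling is in rad/s; locking is expected once it exceeds the detuning
# 2*pi*|drive_hz - natural_hz| (about 2.1 rad/s for these values).
for coupling in (0.0, 1.0, 5.0):
    f = mean_frequency_hz(natural_hz, drive_hz, coupling)
    print(f"coupling {coupling:.1f} rad/s -> mean frequency {f:.3f} Hz")
```

In this toy model the oscillator keeps its intrinsic frequency when uncoupled, is only partially pulled toward the drive under weak coupling, and adopts the drive frequency once coupling exceeds the detuning. This mirrors the proposal that spontaneous dynamics are guided, rather than created, by the cardiac rhythm.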
The potential involvement of the cardiac rhythm as a pacemaker (or “control parameter”; see Haken, 1983) in the self-organization of neuronal oscillations is interesting from an embodied cognitive perspective because it shows yet another way in which visceral activity may shape emergent brain dynamics (cf. Van Orden et al., 2012). This notion is also relevant for active-inference accounts that conceive of neuronal oscillations as a means by which predictions and prediction errors are conveyed between neuronal populations (see, e.g., Friston, 2019). Specifically, the temporal structuring of neuronal oscillatory dynamics imposed by the heartbeat may help to organize network communication such that the transmission of information between different brain regions is facilitated or optimized (e.g., via the alignment of disparate oscillatory regimes in relation to a common fundamental frequency). If cardio-afferent feedback does play a role in enabling the emergence of large-scale oscillatory brain dynamics—or in determining the functional profile of such dynamics—it would follow that the cardiovascular system makes an important contribution to cognitive development.
But therein lies the rub: If any of the putative visceral afferent training mechanisms proposed in this section are to have any significant bearing on the embodiment debate, we need to show that these mechanisms are not “merely” part of the background causal matrix on which brain development unfolds but play a decisive role in defining this trajectory. This returns us to the distinction between causal and constitutive dependency relations raised in Causation Versus Constitution. Even if one is suspicious of the view that embodied accounts must be constitutive accounts on pain of triviality, the provision of just another weakly embodied causal account is unlikely to progress existing debates very far. Hence, if active inference is to shed new light on the question of embodiment, it needs to motivate theoretical positions that are not susceptible to such deflationary rebuttals. We address this concern next.
Causation Versus Constitution Redux
Let us be clear on the issue at stake. According to critics like Adams and Aizawa (2008; Aizawa, 2010), there has been widespread conflation within the embodied cognition literature of things that affect cognitive processes and things that are (part of) cognitive processes. Failure to respect this distinction results in what they label the coupling-constitution fallacy, whereby components of the causal chain linking to (i.e., coupled with) some cognitive process are mistakenly classed as constituents (or realizers) of that process. As such, the burden is on the embodiment theorist to (a) show that the cognitive process in question is indeed constitutively dependent on some bodily process, (b) argue that certain species of causal dependency are sufficient for (nontrivial) embodiment, or (c) explain why the causal-constitutive distinction is unfit to adjudicate between cognitive and noncognitive processes.
Prima facie, the visceral afferent training hypothesis would seem to fall short of the constitutive standard stipulated by Adams and Aizawa (2008). We have not claimed that the activity of developing visceral organ systems is intrinsically cognitive in any sense (indeed, we have not even asserted if, when, or what kind of fetal brain activity suffices for cognition); neither have we argued that visceral organ systems constitute an extension of the fetus’s developing cognitive apparatus. By process of elimination, then, our hypothesis must turn on an essentially causal story, whereby visceral signals exert their influence on the brain by virtue of coupling relations. This being the case, our hypothesis would seem unlikely to furnish any substantive advances beyond existing accounts of embodied active inference—it simply speculates that cross-modal interactions between interoceptive and exteroceptive processing streams may be rooted in fetal development.
Constitution through causation?
This conclusion is too hasty, however. One reason for caution is that Adams and Aizawa’s (2008) coupling-constitution fallacy is predicated on an essentially synchronic view of cognition (Menary, 2010b) and may as such be ill suited to the diachronic perspective on offer here (see also Kirchhoff, 2015). Moreover, hard-and-fast distinctions between causation and constitution become challenging in the context of temporally extended processes (Shapiro, 2019a, 2019b), especially when those processes are recursive and/or embedded within complex dynamical systems, such as biological organisms (A. Clark, 2008b; Gallagher, 2017; Kirchhoff, 2015; Leuridan & Lodewyckx, 2021). For example, at least some causal factors appear to be “proper constituents” of the processes they interact with, such that those processes simply could not exist (and perhaps could not be conceived) without them.
Proper treatment of these metaphysical issues lies well beyond the scope of this article; however, we take such objections as provisional justification for the notion that some cases of causal dependency may be deeply and irrevocably enmeshed with the emergence of cognition, at least (or perhaps especially) when dealing with complex, nonstationary processes of the sort encountered in ontogenesis (Mc Manus, 2012). From this perspective, visceral afferent training signals might be viewed as playing an instrumental role in driving the development of a particular kind of cognitive architecture, “configuring” or “formatting” the brain’s generative model such that it supports a particular range of inferences (e.g., about periodic structure embedded in sensory flows) and cognitive operations (e.g., the ability to distinguish internally from externally generated sensations). If this is right, and the causal “inscriptions” of interoceptive inputs help organize the neural dynamics that ultimately constitute cognitive processing, such signals represent a crucial factor in determining the kinds of minds instantiated in animal brains.
A strong reading of the visceral afferent training hypothesis implies that the foundations of mind are laid down and molded in accordance with multidimensional vectors of sensory information availed by the internal bodily environment. Put another way, bodies like ours—or more specifically, the temporally structured physiological dynamics they entail—are necessary conditions for the emergence of minds like ours. This interpretation is reminiscent of (and perhaps continuous with) an early strain of thought in embodied cognition, whereby specific features of one’s morphology shape and constrain the sorts of concepts one can access or form (e.g., Barsalou, 1999; Lakoff & Johnson, 1980, 1999; cf. Shapiro, 2019b). The idea that one’s entire conceptual apparatus might derive from a base set of concepts grounded in one’s physical interactions with the world shares a certain resemblance to the idea that the brain’s transactions with its visceral states cause proximal changes (i.e., in the format of the generative model) that propagate through to more distal, abstract mental representations (e.g., the elaboration of a model of oneself qua agent in the world).
This point of contact with embodied theories of conceptual grounding is useful insofar as it reminds us that, as mentioned in Causation Versus Constitution, such theories can be read as rather weak interpretations of the embodiment thesis. Indeed, the indirect causal link we posit between visceral and cognitive processes might render the visceral afferent training hypothesis susceptible to a similar charge. Moreover, the mediated nature of this causal link betrays another subtle departure from these earlier views: What really matters for our hypothesis is not so much the nature of one’s body but rather that of the afferent signals conveyed to the brain, signals that could in principle be instantiated in various organisms consisting of very different physiological and morphological features. Although manipulating the temporal characteristics of fetal heartbeat dynamics might be expected to have profound ramifications for the development of brain dynamics under our hypothesis, we have provided no reason to think that substituting the heart with a device that periodically stimulates the arterial baroreceptors according to age-matched parameters should make any notable difference to the trajectory of cognitive development.
Embodied mind or envatted brain?
Hypothetical considerations of this sort almost inevitably lead one to the perennial question of envatment, perhaps the acme of anti-embodiment thought experiments. In the classic “brain in a vat” scenario, the disembodied brain floats in a vat of life-sustaining nutrients while communicating with a sophisticated computer by means of wires or transceivers affixed to various nerve endings. The intuition—at least for the cognitivist—is that this setup would be sufficient to replicate the cognitive processes and conscious experiences instantiated in an embodied brain, thus demonstrating that the mind is only contingently (i.e., causally) related to its non-neural body (e.g., Metzinger, 2003). Embodiment theorists might counter that this scenario surreptitiously reintroduces a body and a world via the vat and computer simulation (e.g., Hurley, 2010; Thompson & Cosmelli, 2011; see also Hohwy, 2017); however, unless a strong argument for the body’s unique involvement in the “preprocessing” or “formatting” of afferent input prior to central processing can be successfully mounted, it remains difficult to see how this position could support anything beyond a weak interpretation of the embodiment thesis.
We do not pursue the possibility of peripheral information processing and/or formatting here because we are unaware of any resources within the active-inference framework that would help arbitrate this question. Neither do we believe that a diachronic reimagining of envatment can help defeat the argument; while establishing brain-computer interfaces that keep up with neurogenesis might prove technically challenging, we see no reason in principle why a “developmental” program of stimulation should not run just as well as its “mature” counterpart. Indeed, it might even be the case that a diachronic variant of the brain in a vat would be capable of producing an even more convincing simulation of phenomenal experience, because there would be no room for an uncanny disconnect between one’s intended actions and their expected consequences (consider how laggy or jittery mouse cursor movement disrupts the fluency of one’s actions and degrades one’s sense of agency in the virtual domain). More specifically, there would be no opportunity for any such slippage to occur because the developing brain is calibrated to the temporal structure of the computer-mediated action-perception loop from the get-go.
There is perhaps one unique possibility introduced by the visceral afferent training hypothesis that might argue against the perfect emulation of experience through envatment. This is the possibility that normal cognitive development and functioning is influenced not only by the sensory information generated by peripheral oscillators such as the heart but also by the mechanical influence such oscillatory systems exert over neural tissue (cf. Kim et al., 2016; Mosher et al., 2020). If this were the case, the envatted brain would require special apparatus designed to mimic the periodic physical forces that would ordinarily be generated by a beating heart. Although this arrangement does not entail the necessity of a cardiovascular system per se for normal cognition to arise, the necessity of some additional mechanism designed to emulate the functional profile of such bodily rhythms begins to seriously undermine the supposedly disembodied status of the envatted brain (cf. Thompson & Cosmelli, 2011).
Setting this possibility aside, should the (nomo)logical possibility of a brain-in-a-vat scenario under the visceral afferent training hypothesis raise a red flag to defenders or advocates of embodied cognition (to whom the very concept of envatment is anathema)? Our view is that anyone who endorses active inference must be prepared to entertain such possibilities; as an essentially functionalist framework (Colombo & Wright, 2021; Hohwy, 2016), active inference accommodates envatted brains as a possible solution to the problem of free-energy minimization. As such, the brain in a vat occupies one end of a spectrum of embodied active-inference schemes, and cognitive systems extending beyond the body (or, indeed, incorporating multiple bodies) populate the other; such is the versatility of the active-inference formalism.
Prospects for a Unified Philosophy and Science of Embodiment
Philosophers and cognitive scientists on both sides of the embodiment debate continue to claim active inference as a vindication of their views, in spite of their (supposedly fundamental) disagreements about the nature of cognition. We have suggested that the formal framework availed by active inference may accommodate a wide variety of philosophical positions on the question of embodiment without necessarily helping to arbitrate the merits and shortcomings of these alternatives. Perhaps active inference is silent on many of the details on which such arguments turn; perhaps the framework will eventually reach a level of maturity enabling it to rule out (or severely undermine) many of them. Presently, however, the promise of any unification, fusion, or synthesis of these divergent strands of thought remains largely unfulfilled; indeed, many theorists seem more interested in demonstrating that their favored brand of embodied cognition can be cloaked in the formal garb of active inference, rather than seeking any genuine dialectical engagement with proponents of rival views.
But perhaps this conclusion presents an unduly pessimistic appraisal of the literature. A more optimistic interpretation might appeal to the following line of reasoning: By drawing together a large range of (embodied and cognitivist) theories under a single framework, active inference has helped bring previously unseen (or at least neglected) points of contact between supposedly divergent views to the fore. For example, the ability to reformulate seemingly internalist, inferential, and representation-laden notions of generative modeling and prediction error minimization in the language of “optimal grip” and “attunement” (e.g., Bruineberg & Rietveld, 2014; Kiverstein et al., 2019) shows how the active-inference framework avails both a common ground and a shared vocabulary—the foundations of any productive dialogue. Perhaps, then, there is reason to be hopeful that the active-inference perspective might yet resolve (or dissolve) long-standing disputes between competing factions. Or, if it does not help theorists to work out their differences, it might at least help them to work out which differences really matter.
As foreshadowed at the beginning of Predictive Processing and (Embodied) Active Inference, we did not set out to resolve disagreements about the proper interpretation of embodiment under active inference or to convince the broader philosophical and scientific community that active inference somehow bridges or transcends the schism between cognitivism and embodiment. What we have tried to do, rather, is bring some of these deep-rooted tensions to the fore. On our analysis, much of the conceptual and empirical work conducted under the rubric of embodied active inference lends itself to a fairly deflationary interpretation; indeed, proponents of a more “radically” embodied cognitive science would likely consider it entirely in keeping with mainstream cognitivism (cf. Di Paolo, 2018). This is not to say that embodied active inference is incompatible with the more radical wing of the embodiment spectrum; it is simply to point out that much of the contemporary work in this area has little in common with the strong aspirations and provocative rhetoric historically leveraged in the name of embodied cognition.
How does the visceral afferent training hypothesis fit into this picture? What we have attempted to do here is demonstrate how the adoption of a diachronic perspective on active inference enables one to recast a seemingly deflationary interpretation of embodiment (i.e., visceral feedback as noise trajectory) as part of a more substantive account of the relation between body and mind. Although this hypothesis does not purport to radically overturn cognitivist-friendly interpretations of active inference, it goes beyond existing accounts of the way cognitive processing incorporates inferences about physiological states (interoceptive inference) and their temporal evolution (cycle-timing phenomena). We have argued that these accounts are vulnerable to the charge of being “merely” causal in nature, thus undermining their capacity to shed philosophically interesting light on the question of embodiment. By contrast, our diachronic account highlights the foundational role of interoceptive dynamics in guiding (or driving) the early development of neural structures and the cognitive architectures they implement.
Concluding Remarks
Active inference (and predictive processing more generally) represents one of the most exciting and fertile theoretical frameworks to have emerged since the dawn of the cognitive revolution. Much like the information-processing paradigm before it, active inference supplies a versatile suite of formal tools that can be applied to a remarkable diversity of problems and scenarios. Such scope and versatility explain the widespread appeal of the framework—and in particular, its ability to attract adherents of markedly divergent theoretical views. Although we consider this a virtue of active inference, it remains incumbent on those working within the framework to remain cognizant of its limits. Active inference alone neither settles nor eliminates the difficult philosophical issues at the heart of the embodiment debate.
This is not to say that active inference cannot be leveraged to develop interesting models of embodiment, as we have attempted to do here. Neither is it to denigrate the work on embodied active inference that was surveyed in Predictive Processing and (Embodied) Active Inference. To the contrary, we consider such models theoretically interesting and useful accounts of their various target phenomena. Our intention, rather, was to challenge the claim that these models provide compelling accounts of embodied cognition as the latter has been conceptualized in the philosophy of mind. More positively, we hope the visceral afferent training hypothesis inspires fresh debate about the ways in which the concept of embodiment has been understood under active inference—and about the ways in which it might yet be understood.
Whether the visceral afferent training hypothesis ultimately succeeds in delivering a genuinely embodied account of cognitive development will depend on its ability to mature beyond its current (somewhat embryonic) form. To preempt some potential objections, let us briefly raise a few caveats before closing.
First, the mechanisms we have sketched as putative mediators of visceral afferent training are intended as biologically plausible examples of the ways early-emerging rhythmic visceral dynamics might sculpt cognitive architectures—they do not exhaust the visceral afferent training hypothesis. An important objective for future work will be the incorporation of active regulatory dynamics that emerge in cardiovascular and other peripheral systems as the brain begins to exert control over its autonomic (and humoral) outflows (see Corcoran et al., 2023). Similarly, the contribution of other interoceptive modalities (e.g., signals arising from the gastrointestinal tract; see Décarie-Spain et al., 2023; Rebollo et al., 2021; Shine et al., 2022; R. Smith et al., 2021), the coordinated integration of cross-modal sensorimotor channels, and the consolidation of fetal behavioral states into slowly evolving sleep-wake cycles constitute exciting avenues for future development.
Second, we do not mean to imply that our hypothesis affords a complete account of the way bodily structures and processes may contribute to cognition—visceral afferent training does not exhaust embodied active inference. Indeed, some training mechanisms may go hand in hand with other putatively embodied processes, such as (for instance) the establishment of sensorimotor loops in the tactile and proprioceptive domains (Ciaunica, Safron, et al., 2021; Fagard et al., 2018; Martínez Quintero & De Jaegher, 2020). Our argument that interoceptive rhythms may provide a scaffold or reference frame for other sensory flows clearly depends on the capacity to organize such flows in relation to these rhythms; we do not suggest that interoceptive dynamics alone are sufficient.
Finally, although our account has focused on the role of heart–brain communication during early cognitive development, it represents only a small piece of a much larger ontogenetic and phylogenetic puzzle (Fields & Levin, 2020; Ramstead et al., 2018). Ultimately, our account must be integrated within a much grander narrative about the way self-organizing cellular and embryonic processes lay the foundations for the sorts of systems-level interactions we have articulated here and how these interactions impact individual and interpersonal cognitive dynamics over the course of the life span. And in a beautifully circular twist that is especially befitting of active inference, the personal, social, and environmental dynamics that structure life outside the womb must also be understood in terms of their own influence over cellular and embryonic processes, as propagated through successive generations via genetic, epigenetic, and cultural modes of information transmission.
This final remark returns us to the theme of reproduction and inheritance, reminding us of an influential motif in recent work at the nexus of human development and embodied active inference: that of the “first prior” (Allen & Tsakiris, 2018; Ciaunica, Constant, et al., 2021; Fotopoulou et al., 2022). This notion might be understood in terms of the phenotypic information bound up in the organism’s genetic endowment—or, more abstractly, as a primal form of belief about one’s own existence and the priority accorded to states that are conducive to that existence. The visceral afferent training hypothesis complements this idea with a more explicitly adaptive twist: The body constitutes not only the agent’s first prior but also its first teacher—one that provides reliable interoceptive instruction to guide the elaboration of generative models throughout the prenatal period and beyond. And just like any good teacher, the body equips its student with precisely the right foundations to go out into the world, apply its knowledge to new problems, and acquire deeper levels of understanding along the way—thus becoming a master in its own right.
Acknowledgements
We thank Youjia Lu, Tessel Blom, and Mitch Catterall for their collaboration on the Action-Guided Conscious Experience project. We also thank Jakob Hohwy, Sahib Khalsa, Matthieu Koroma, Michael Levin, members of the Cognition and Philosophy Lab (especially Kevin Berryman, Tom Darling, Manja Engel, Stephen Gadsby, and Niccolò Negro), and two anonymous reviewers for valuable feedback on previous versions of this manuscript.
Transparency
Action Editor: Tina M. Lowrey
Editor: Interim Editorial Panel
