Abstract
Our memories help us plan for the future. In some cases, we use memories to repeat the choices that led to preferable outcomes in the past. The success of these memory-guided decisions depends on close interactions between the hippocampus and medial prefrontal cortex. In other cases, we need to use our memories to deduce hidden connections between present and past situations and to choose the best action based on the expected outcome. Our recent study investigated the neural underpinnings of such inferential decisions by monitoring neural activity in the medial prefrontal cortex and hippocampus in rats. We identified several neural activity patterns indicating awake memory trace reactivation and restructuring of functional connectivity among multiple neurons. We also found that these patterns occurred concurrently with the ongoing hippocampal activity when rats recalled past events but not when they planned new adaptive actions. Here, we discuss how these computational properties might contribute to success in inferential decision-making and propose a working model of how the medial prefrontal cortex changes its interaction with the hippocampus depending on whether it reflects on the past or looks into the future.
Introduction
Decision-making is part of our everyday life, from choosing what to eat to deciding whom to interact with. To make optimal decisions, we often repeat choices that worked in the past, such as taking the same route to work (memory-guided decisions). However, we sometimes encounter unfamiliar situations in which we must infer the potential consequences of new actions. Such inferential decisions are made using new information deduced from prior knowledge built by integrating multiple memories. In this commentary, we discuss the implications of our recently published study 1 revealing the distinct role of the medial prefrontal cortex (mPFC) in flexible action selection using prior knowledge.
Owing to meticulous research in the past decades, we know that memory-guided decision-making relies on close interactions between the hippocampus and mPFC. 2 In contrast, neural underpinnings of inferential decision-making are not completely understood. Human imaging studies found that the hippocampus and mPFC became active while subjects deduced relationships by mentally connecting two separately learned relations.3,4 Moreover, electrophysiological experiments in mice have revealed that hippocampal neurons coding a stimulus in one association are coactivated with those coding a stimulus in another association. 4 Although this finding provides a clue on how the brain discovers hidden relationships across separately learned information, it remains unclear how the brain links the inferred relationships with new actions. Therefore, we recorded neural activity in the mPFC and hippocampus while rats flexibly chose an action based on a learned associative rule in the environment. 1
Prefrontal neuronal ensemble dynamics during inferential decision-making
In our “rule transfer” task, rats must first learn that a conditioned stimulus (CS), a short auditory or visual cue, will be followed, after a brief delay, by an unconditioned electric shock to the eyelid (the US). During this conditioning, rats are confined to two contextually different rooms. In each room, only the auditory or visual stimulus elicits the shock, while the other stimulus is presented alone. The shock-inducing stimulus is reversed between the rooms. Thus, the CS and room combinations define the “rules” of whether the US occurs after the CS. After training was completed, the rats faced a novel rule transfer test. Initially, they were confined to the rooms and received the same rule presentation as in previous days. Halfway through the session, a divider was removed, enabling them to move between the rooms. In this test phase, rats may choose to perform an “escape response” where, upon US delivery, they run to the opposite room to avoid subsequent shocks. Rats may also choose an “explore response” where they move away from the US omission room and receive a shock in the next trial. Crucially, rats were never trained to show either of these responses. Therefore, it is up to them to choose the most adaptive response in this new situation among their innate behavioral patterns.
We observed that after receiving the shock, some rats chose an escape response and avoided further US delivery. Notably, these rats did not move immediately after receiving the US. Instead, they remained still for a few seconds before running directly towards the opposite room. Such extended decision time and efficient execution are signatures of deliberative evaluation that uses prior knowledge to infer the consequences of actions. 5 This view is further supported by the fact that decision time was much shorter in a group of naive rats that did not learn the rules beforehand.
During the rule-transfer test, we recorded the spiking activity of individual neurons in the prelimbic region of the mPFC of trained rats. We identified two key features of mPFC ensemble dynamics during the deliberation process: (1) spontaneous reactivation of prior knowledge and (2) the formation of new multi-neuron co-activity patterns that link prior knowledge to new actions.
Spontaneous reactivation of prior knowledge
While we are actively engaged in a task at hand, many neurons alter their spiking activity in response to external variables, including sensory stimuli and action initiation. Complementary to these engaged periods, there are times when we are stationary or engage in innate behaviors unrelated to the task. During these disengaged periods, our mind often wanders. Our attention drifts away from perceivable features of the present moment to internally generated thoughts about events in the past and future. 6 An emerging strategy to decipher the wandering mind is to apply a decoding-based approach to neural activity during the disengaged period. 7 Specifically, machine learning classifiers are first trained to discriminate task-related neural activity patterns from other patterns during engaged periods. These classifiers can then be applied to the disengaged period to pinpoint precisely when task-related neural patterns spontaneously emerge.
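As an illustration only, the two-step decoding strategy described above can be sketched in Python on synthetic data. This is not the analysis pipeline of the study: the nearest-centroid classifier, the Poisson firing rates, the bin counts, and all function and variable names (`fit_centroid_decoder`, `decode`, `events`) are assumptions chosen to keep the example small.

```python
import numpy as np

def fit_centroid_decoder(X, y):
    """Fit a nearest-centroid classifier on z-scored population vectors.

    X: (n_bins, n_neurons) spike counts from engaged periods.
    y: (n_bins,) labels, 0 = other activity, 1 = task-related pattern.
    """
    mu = X.mean(axis=0)
    sd = X.std(axis=0) + 1e-9  # avoid division by zero for silent neurons
    Z = (X - mu) / sd
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])
    return mu, sd, classes, centroids

def decode(model, X):
    """Assign each time bin to the class with the closest centroid."""
    mu, sd, classes, centroids = model
    Z = (X - mu) / sd
    dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]

# --- Synthetic data (hypothetical rates, for illustration only) ---
rng = np.random.default_rng(0)
n_neurons = 20
base_rate = np.full(n_neurons, 5.0)
rule_rate = base_rate.copy()
rule_rate[:8] += 10.0  # assume 8 neurons fire more during the task pattern

# Step 1: train on engaged periods, where labels are known.
X_other = rng.poisson(base_rate, size=(200, n_neurons))
X_rule = rng.poisson(rule_rate, size=(200, n_neurons))
X_train = np.vstack([X_other, X_rule]).astype(float)
y_train = np.array([0] * 200 + [1] * 200)
model = fit_centroid_decoder(X_train, y_train)

# Step 2: apply to a disengaged period with two planted reactivations.
X_disengaged = rng.poisson(base_rate, size=(100, n_neurons)).astype(float)
X_disengaged[[30, 70]] = rng.poisson(rule_rate, size=(2, n_neurons))
labels = decode(model, X_disengaged)

# Bins decoded as the task-related class are candidate reactivation events.
events = np.flatnonzero(labels == 1).tolist()
print(events)
```

In this toy setting, the candidate reactivation events are simply the disengaged-period bins that the decoder assigns to the task-related class. A real analysis would also need cross-validation and a chance-level control, which are omitted here.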
With this strategy, we detected many time points at which neural representations of previously learned rules became spontaneously reactivated in the mPFC. Reactivation of these rule events was frequently detected before the rats initiated escape and explore responses. Importantly, the frequency of reactivations before escaping, but not exploring, was a predictor of the overall success in avoiding shock; the high-performing rats had more reactivation events before escaping than the low-performing rats. We also observed that this predictive power was stronger for the rule signaling shock omission than the rule signaling shock delivery.
The reactivation of the shock omission rule may reflect rats imagining what would happen if they escaped to the other room. In contrast, the reactivation of the shock delivery rule may reflect rats recalling the CS-US pairing they received a moment ago. The fact that the former was a better predictor for successful escape led us to consider that imagining potential future consequences is more critical for adapting to sudden environmental changes than simply reflecting on past outcomes.
The formation of new multi-neuron co-activity patterns that link prior knowledge to new actions
These spontaneous reactivation events mark the point in time at which the mPFC is engaged in internally generated thoughts about the past and future. To probe how information is being processed during these internal states, we leveraged the Hebbian cell assembly theory stating:
“Any two cells or systems of cells that are repeatedly active at the same time will tend to become ‘associated,’ so that activity in one facilitates activity in the other.” 8
Presuming the cell assembly theory holds true, if we can find temporal patterns where specific mPFC neurons fire together, then those neurons are likely wired together after being repeatedly coactivated. We can assume this coactivation provides a mechanism for internally linking the information coded by these neurons. An established approach to evaluate cell assembly activity is to first extract assembly templates using linear dimensionality reduction. 9 Then, by projecting the ensemble activity onto the templates, we can determine the strength of assembly activation at certain time points.
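The template-and-projection procedure outlined above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used in the study: it uses principal component analysis as the linear dimensionality reduction, keeps components whose eigenvalues exceed the Marchenko-Pastur upper bound, and runs on synthetic spike counts. All names (`assembly_templates`, `activation_strength`, `members`) are hypothetical.

```python
import numpy as np

def zscore(X):
    """Z-score each neuron's binned spike counts (columns)."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)

def assembly_templates(Z):
    """Return PCA components whose eigenvalues exceed the
    Marchenko-Pastur upper bound, sorted by decreasing eigenvalue.

    Z: (n_bins, n_neurons) z-scored activity.
    Returns an array of shape (n_assemblies, n_neurons).
    """
    n_bins, n_neurons = Z.shape
    corr = (Z.T @ Z) / n_bins
    evals, evecs = np.linalg.eigh(corr)  # ascending order
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    lam_max = (1.0 + np.sqrt(n_neurons / n_bins)) ** 2
    return evecs[:, evals > lam_max].T

def activation_strength(Z, template):
    """Project activity onto a template at every time bin.

    The diagonal of the projection matrix is zeroed so that a single
    neuron firing strongly on its own does not count as co-activity.
    """
    P = np.outer(template, template)
    np.fill_diagonal(P, 0.0)
    return np.einsum("ti,ij,tj->t", Z, P, Z)

# --- Synthetic data: 15 neurons, 4 of which co-fire in 150 bins ---
rng = np.random.default_rng(1)
n_bins, n_neurons = 2000, 15
X = rng.poisson(2.0, size=(n_bins, n_neurons)).astype(float)
members = np.array([0, 1, 2, 3])  # hypothetical assembly members
coactive = rng.choice(n_bins, size=150, replace=False)
X[np.ix_(coactive, members)] += 4.0

Z = zscore(X)
templates = assembly_templates(Z)
strength = activation_strength(Z, templates[0])

is_coactive = np.zeros(n_bins, dtype=bool)
is_coactive[coactive] = True
print(templates.shape[0], strength[is_coactive].mean(), strength[~is_coactive].mean())
```

The largest absolute weights of the leading template identify the candidate assembly members, and the activation strength is high in bins where those neurons fire together. Published implementations often add an independent component analysis step after PCA to separate overlapping assemblies; that step is left out here for brevity.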
Indeed, we detected many such cell assemblies during the rule transfer test and found that the activity of several assemblies was elevated around spontaneous rule reactivation events. Each of these “rule assemblies” consisted of three to five neurons that were selective for the CS-US association, the US, or the action. Furthermore, these cell assemblies were more robustly activated after the reactivation events that led to escaping compared to exploring. Therefore, spontaneous rule reactivation may be an essential modulator of rule assembly activity.
We also tested another possibility that the activity of these rule cell assemblies might be controlled by incoming inputs from the hippocampus. In particular, the hippocampus generates high-frequency oscillatory patterns called sharp wave ripples (SWRs), 10 during which hippocampal cells show highly synchronized burst firings. We observed that these hippocampal bursts activated ∼30% of mPFC cells, of which most were strongly selective for the CS, but not for actions or the US. In addition, the activity of rule assemblies was elevated around SWRs only during the first half of the rule-transfer phase while rats were confined to the rooms. When rats were allowed to avoid the US by escaping, the activity of rule cell assemblies was no longer temporally coupled with SWRs.
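One common way to quantify this kind of temporal coupling is an event-triggered average of assembly activation strength around SWR times, computed separately for each task phase. The sketch below illustrates the idea on synthetic data only. The coupling is planted by hand in the first half of the session, and the function name `peri_event_average`, the window size, and the event counts are assumptions, not values from the study.

```python
import numpy as np

def peri_event_average(signal, event_bins, half_window):
    """Average `signal` within +/- half_window bins of each event.

    Events too close to the edges of the recording are dropped.
    Returns (lags, mean_signal_at_each_lag).
    """
    event_bins = np.asarray(event_bins)
    valid = event_bins[(event_bins >= half_window)
                       & (event_bins < len(signal) - half_window)]
    lags = np.arange(-half_window, half_window + 1)
    windows = signal[valid[:, None] + lags[None, :]]
    return lags, windows.mean(axis=0)

# --- Synthetic session (hypothetical numbers) ---
rng = np.random.default_rng(2)
n_bins, half_window = 4000, 20
strength = rng.normal(0.0, 1.0, size=n_bins)  # assembly activation trace
swr_bins = np.sort(rng.choice(np.arange(50, n_bins - 50), size=80, replace=False))

# Plant SWR-locked assembly activation only in the first (confined) half.
confined = swr_bins < n_bins // 2
strength[swr_bins[confined]] += 5.0

lags, peth_confined = peri_event_average(strength, swr_bins[confined], half_window)
_, peth_free = peri_event_average(strength, swr_bins[~confined], half_window)

# Value at lag 0 (the SWR itself) for each phase.
center = half_window
print(peth_confined[center], peth_free[center])
```

A peak at lag 0 in one phase and a flat profile in the other is the signature of phase-dependent coupling. In practice, the peak would be compared against a shuffled or jittered control rather than read off directly.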
We argue that the elevated activation of rule cell assemblies that included rule-coding and action-coding cells reflects a moment of knitting together previously learned rules with novel actions. Notably, the activity of these assemblies was modulated by incoming inputs from the hippocampus during SWRs when rats recall past events in a familiar environment. In stark contrast, when rats engage in future-oriented thinking, hippocampal inputs during SWRs lose control over the activity of rule assemblies. Instead, internally triggered reinstatement of learned rules governs their activity. Thus, we speculate that the mPFC listens to the hippocampus when it uses the memory to repeat the past choice. In contrast, the mPFC disconnects from the hippocampus when it uses that memory to evaluate the potential consequences of various actions.
Working model of hippocampal-mPFC interaction supporting the use of memory for future-oriented cognition
The research presented here builds on the expanding investigations of neuro-computational mechanisms behind inferential decision-making. Although previous work reported neural computations that join separately learned relationships, 11 we believe our study is the first to demonstrate self-organized neural dynamics that link the inferred relations to novel actions. From our findings, we propose that the hippocampus and mPFC support sequentially the two major requirements of inferential decision-making: the discovery of hidden relationships and connecting them with action and outcome.
To discover latent relationships, the hippocampus combines neurons coding for different stimuli using structured reactivations during SWRs. The hippocampus not only joins these neural representations in the same temporal pattern as they occur during experience but also creates imaginary combinations that never occurred before.12,13 The latter ability is essential for the flexible reorganization of learned information to discover their hidden relationships. In addition, hippocampal SWRs also recruit mPFC and other neocortical activity to guide the formation of neocortical representations. The neocortex sorts old and new information based on their commonality and latent relationships, focusing on predicting behaviorally relevant consequences. The accumulation of this process leads to the building of prior knowledge which is the underlying set of facts, assumptions, and rules we have about the environment.
The formation of this knowledge base is essential for the latter process of inferential decision-making, connecting the inferred stimulus relationships with novel actions and outcomes. During this process, mPFC neurons become more sensitive to local dynamics rather than incoming inputs from the hippocampus. In this case, perceivable information serves as a partial cue to reactivate mPFC cells coding matching information stored in the knowledge base (spontaneous rule reactivation). This leads to the recruitment of multiple cell assemblies, including neurons coding the knowledge and neurons coding various innate actions. We speculate that either mental simulation or a few actual action executions will select an assembly coding the optimal action, and the activity of this assembly will strengthen functional connectivity among neurons belonging to that assembly. Then, the selected assembly can guide flexible action selection during a novel experience and subsequently incorporate the action into the mPFC knowledge structure for future use.
Future studies on hippocampal-mPFC interactions supporting inferential decision-making
This two-stage model offers a plausible account that integrates our findings with previous work. To refine this model, we identify three future directions. First, it remains unknown how the influence of hippocampal inputs on mPFC cell activity is flexibly adjusted in a situation that requires inferential decision-making. Novelty in a familiar situation is one factor that makes inferential decisions dominate over memory-guided decisions. Considering that environmental novelty is tracked by neuromodulator systems, such as norepinephrine or dopamine, 14 we speculate that environmental novelty will induce rapid elevation of neuromodulator signals, thereby triggering a drastic network state transition to adjust the balance between externally and internally driven neuronal processes in the mPFC.
Second, it is worth investigating whether and how neurons in the mPFC synchronize their firing with neurons in the hippocampus to form cross-structural assembly patterns. By conducting multisite single-neuron recordings and using improved analytical approaches, 15 we will be able to decipher the coordinated spiking between the hippocampus and mPFC more accurately and to provide a more holistic view of neural ensemble dynamics during inferential reasoning and decision-making.
Third, although we investigated hippocampal-mPFC interaction during SWRs, other rhythmic activities such as theta oscillations also coordinate the activity of individual neurons and cell assemblies. 15 Therefore, it is necessary to examine mixed hippocampal-mPFC assemblies at various timescales and different network states to determine precisely when the hippocampus and mPFC are coordinated or disconnected during the inferential process.
Conclusion
We discussed a study showing that the mPFC interacts with the hippocampus differently depending on whether rats use memories to repeat a past choice or to evaluate new choices. We hope that the model proposed from these observations will prompt further investigation into how the mPFC can predict, beyond recall, the best action for better future adaptation.
Footnotes
Author contributions
KT conceptualized and directed the project. YS and KT wrote the manuscript and interpreted the results.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (grant number RGPIN-2020-04479) to KT.
