Sage Journals: Discover world-class research

Abstract

This survey focuses on the standard assumption in DSGE models: rational expectations (RE) with perfect information (PI) aka full information (FI)—hence FIRE. RE means model consistent expectations—agents be they households, firms, banks or policymakers know your model. PI (or FI) means agents observe or can infer the current and past state variables in your model. RE + PI (or FIRE) is a strong assumption. The purpose of this survey is to examine the literature that relaxes RE or PI or both. This is relevant for DSGE models in general, but particularly so for the efficacy of monetary policy in a New Keynesian environment when the expectation by agents of future policy is of crucial importance.

JEL Classification: C11, C18, C32, E32

Keywords

Behavioural macroeconomics, imperfect information, heterogeneous expectations

Introduction

There have been a number of recent assessments of the ‘state of macro’ and the contribution of dynamic stochastic general equilibrium (DSGE) models—a list that is by no means exhaustive would include: Blanchard (2009, 2016), Blanchard et al. (2010, 2013), Driffill (2011), Pesaran and Smith (2011), Blanchard and Summers (2017), Vines and Wils (2018), Christiano et al. (2018) and Levine (2020).

This survey has a more narrow focus on the standard assumption in DSGE models: rational expectations (RE) with perfect information (PI) aka full information (FI)—hence FIRE. RE means model-consistent expectations—agents be they households, firms, banks or policymakers know your model. PI (or FI) means agents observe or can infer the current and past state variables in your model. RE + PI (or FIRE) is a strong assumption. The purpose of this survey is to examine the literature that relaxes RE or PI or both. This is relevant for DSGE models in general, but particularly so for the efficacy of monetary policy in a New Keynesian (NK) environment when the expectation by agents of future policy is of crucial importance.

We begin with departures from RE and a recent behavioural macroeconomics literature. The ‘Behavioural Macro models’ section sets out the most common equilibrium concepts found in this literature. The third section sets out a standard NK model we use as an application in the rest of the paper. The fourth section moves on to models with heterogeneous agents consisting of both RE and non-RE agents and examines a class of equilibria when the latter can learn from the former through reinforcement learning. The fifth section then moves on to RE models where the PI assumption is relaxed in favour of imperfect information (II). The sixth section reviews important empirical results that assess, first, what we describe as the ‘wilderness’ of departures from RE and, second, the ability of the RE NK model with the II assumption to provide a better data fit than PI. The seventh section concludes the article.

Beyond RE: Equilibrium Concepts

In departures from RE, two sets of equilibrium concepts and related literature need distinguishing. The first is statistical learning, which poses the question: Can agents learn to be rational through econometrics and, in particular, recursive least-squares learning? The second are a class of equilibria which do not converge to RE which we term behavioural macro-models. We consider these in turn.

Statistical Learning

Applications to macroeconomics were pioneered by Evans and Honkapohja (2001). The main idea is to replace RE with statistical forecasts based on knowledge of the structure of the RE solution: a perceived law of motion (PLM) estimated by recursive least squares. A statistical equilibrium is then one where, in a stochastic steady state, the PLM is equal to the actual law of motion (ALM). If the learning process converges in this sense and the PLM = ALM = the RE equilibrium, we have what the literature terms E-stability. This idea has been described as the ‘principle of cognitive consistency’: ‘economic agents should be as smart as (good) economists’ (see the survey by Evans & Honkapohja, 2009). More recent surveys of the statistical learning literature that followed this seminal contribution include Milani (2012) and Eusepi and Preston (2018). It should be noted that these papers adopt different approaches to learning (Euler learning versus anticipated utility, discussed later; see also section 4.4 of Eusepi and Preston, 2016). But either approach assumes agents are good econometricians and use well-specified forecasts of the model RE equilibrium.

To formalize the concept, consider the state-space form of a log-linearized DSGE model:

A_{0} y_{t - 1} + A_{1} y_{t} + A_{2} E_{t} y_{t + 1} + B_{0} w_{t} = 0,

(1)

where y_t is the state vector of endogenous variables in deviation form about a steady state. Matrices are functions of parameters ϑ. The model is driven by exogenous AR(1) processes w_t:

w_{t + 1} = ρ_{w} w_{t} + ε_{t + 1}, \quad ε_{t} \sim i.i.d.

The minimal state variable (MSV) RE solution is

y_{t} = b y_{t - 1} + c w_{t} .

(2)

In the OLS learning equilibrium, agents know the form of the solution (2) and use recursive least squares to estimate

y_{t} = b_{t} y_{t - 1} + c_{t} w_{t},

(3)

where [b_t, c_t] are time-varying parameters. E-stability has a large literature in itself, which includes McCallum (2007) and Ellison and Pearlman (2011).
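The estimation step in (3) can be illustrated with a minimal sketch. It assumes a scalar version of (2) with hypothetical values b = 0.7, c = 0.5 and ρ_w = 0.5, and it holds the ALM fixed, so there is no feedback from beliefs to outcomes; it therefore shows only the recursive least-squares update, not a full E-stability analysis. The function names are illustrative.

```python
import random


def rls_update(theta, P, x, y):
    """One recursive least-squares step for y = x' theta + error (two regressors)."""
    # P x, used for both the gain and the covariance update (P is symmetric).
    Px = [P[0][0] * x[0] + P[0][1] * x[1], P[1][0] * x[0] + P[1][1] * x[1]]
    denom = 1.0 + x[0] * Px[0] + x[1] * Px[1]
    gain = [Px[0] / denom, Px[1] / denom]
    err = y - (theta[0] * x[0] + theta[1] * x[1])  # one-step forecast error
    theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
    P = [[P[i][j] - gain[i] * Px[j] for j in range(2)] for i in range(2)]
    return theta, P


def learn_plm(b=0.7, c=0.5, rho_w=0.5, periods=500, seed=0):
    """Estimate [b_t, c_t] in (3) from data generated by a fixed scalar form of (2)."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]                    # initial beliefs [b_0, c_0] (assumption)
    P = [[1e6, 0.0], [0.0, 1e6]]          # diffuse initial precision (assumption)
    y_prev, w = 0.0, 0.0
    for _ in range(periods):
        w = rho_w * w + rng.gauss(0.0, 1.0)   # AR(1) driving process
        y = b * y_prev + c * w                # fixed law of motion, no belief feedback
        theta, P = rls_update(theta, P, (y_prev, w), y)
        y_prev = y
    return theta


b_hat, c_hat = learn_plm()
```

With a diffuse prior and noise-free data of this kind, the estimates settle on the coefficients of the data-generating process after a handful of observations; in a model with feedback the ALM itself would move with beliefs.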

Behavioural Macro-models

This class of models has one or more of the following features: (a) adaptive expectations in models of individual rationality, (b) heterogeneous expectations and reinforcement learning, (c) cognitive discounting and (d) agent inattention in otherwise rational models. We examine five concepts in turn.

Concept I: Restricted Perception Equilibria (RPE): In an RPE, agents misspecify the law of motion (2). For example, they may not observe w_t and assume a first-order VAR

y_{t} = b_{t} y_{t - 1} + ε_{t}.

(4)

Let y_{i,t} be the ith component of y_t and assume that y_t is observable (the data). Assume a perceived law of motion in the form of simple AR(1) learning rules

y_{i, t} = ρ_{i} y_{i, t - 1} + ε_{i, t}.

(5)

Solving for the actual law of motion, this leads to first-order autocorrelations in the stochastic steady state F(ρ, ϑ), where ρ is the row vector of ρ_i and ϑ are the remaining parameters. Then, given ϑ, the stochastic consistent expectations equilibrium (SCEE) is the solution to the fixed-point problem:

ρ * = F (ρ *, ϑ) \Rightarrow ρ * = ρ * (ϑ) .

(6)

See Hommes and Zhu (2014) and Hommes et al. (2023). It should be stressed that the SCEE is not the RE equilibrium, unlike statistical learning with E-stability.
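A fixed point of the form (6) can be computed by simple iteration. The sketch below assumes a hypothetical scalar model, y_t = β E*_t y_{t+1} + w_t with AR(1) shock persistence ρ_w and an AR(1) perceived law of motion, so that E*_t y_{t+1} = ρ² y_{t−1} and the implied ALM is y_t = βρ² y_{t−1} + w_t. Its first-order autocorrelation is (βρ² + ρ_w)/(1 + βρ² ρ_w), which plays the role of F(ρ, ϑ). The parameter values are illustrative assumptions and are not taken from the papers cited.

```python
def implied_autocorr(rho, beta, rho_w):
    """First-order autocorrelation of the ALM implied by the belief rho (the map F)."""
    phi = beta * rho ** 2  # AR coefficient of the ALM under the AR(1) belief
    return (phi + rho_w) / (1.0 + phi * rho_w)


def solve_scee(beta=0.9, rho_w=0.5, rho0=0.5, tol=1e-12, max_iter=10_000):
    """Iterate rho <- F(rho) until the perceived and implied autocorrelations agree."""
    rho = rho0
    for _ in range(max_iter):
        rho_next = implied_autocorr(rho, beta, rho_w)
        if abs(rho_next - rho) < tol:
            return rho_next
        rho = rho_next
    raise RuntimeError("fixed-point iteration did not converge")


rho_star = solve_scee()
```

In this example the equilibrium persistence ρ* exceeds the persistence of the shock, which is one way in which an SCEE can differ from the RE solution.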

Gobbi and Grazzini (2015) perform OLS on a first-order VAR of the full state, including shock processes. Eusepi and Preston (2011) perform OLS on a finite approximation of an infinite VAR of a subset of the state space. Hommes and Zhu (2015) use a parsimonious first-order VAR to fit mean and persistence of each state variable to data. All these papers assume the solution of an RE model can be approximately expressed as a finite VAR, which in itself can be a strong assumption as shown by Fernandez-Villaverde et al. (2007). All these papers also use the SCEE concept (aka a Bayesian learning equilibrium). This contrasts with k-level thinking of Garcia-Schmidt and Woodford (2019) and Farhi and Werning (2019), where beliefs are updated iteratively with observed temporary and non-stochastic expectations equilibria over n stages.

Concept II: k-Level Thinking: See Garcia-Schmidt and Woodford (2019) and Farhi and Werning (2019). Consider a consumption function (derived later) where consumption (C_t) of a household is a function of the expected present and future interest rate and factor prices (the wage and profits). Write as

C_{t} = f (X_{t}, E [X_{t + 1}, X_{t + 2}, \ldots]).

(7)

With perfect foresight, E[X_{t + i}] = X_{t + i}, so beliefs coincide with outcomes. In a stochastic environment, they coincide on average. k-level thinking with k = 1 proposes a temporary equilibrium such that, given a set of beliefs ${\hat{X}}_{t + i}^{0}$, which can be an initial RE equilibrium or steady state, and given observations of X_t,

C_{t}^{1} = f (X_{t}, {\hat{X}}^{0}_{t + 1}, {\hat{X}}^{0}_{t + 2}, \ldots).

(8)

Similarly, for k = 2 thinking, we have

C_{t}^{2} = f (X_{t}, {\hat{X}}^{1}_{t + 1}, {\hat{X}}^{1}_{t + 2}, \ldots).

(9)

and so on. In the applications of this concept cited, as k → ∞, this iterative process converges to the RE equilibrium; it has also been used to compute the solution of RE models.
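The iteration in (8) and (9) can be illustrated with a deliberately simple sketch. It assumes a hypothetical scalar temporary-equilibrium map x = a + b·belief with |b| < 1, which is not the consumption function (7) itself; the level-0 belief and the values of a and b are illustrative assumptions.

```python
def level_k_outcome(k, a=1.0, b=0.5, belief0=0.0):
    """Outcome under level-k thinking for the hypothetical map x = a + b * belief."""
    belief = belief0
    for _ in range(k):
        # The outcome implied by the current belief becomes the next level's belief.
        belief = a + b * belief
    return belief


re_outcome = 1.0 / (1.0 - 0.5)       # fixed point x = a + b * x for a = 1, b = 0.5
level_1 = level_k_outcome(1)         # uses the level-0 belief directly
level_60 = level_k_outcome(60)       # many rounds of belief revision
```

Because the assumed map is a contraction, successive levels approach the fixed point, mirroring the convergence to RE as k → ∞ described above.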

Concept III: Adaptive Expectations: Adaptive expectations (AE) has a long history in macroeconomics going back to Milton Friedman; see Friedman (1968). The AE rule takes the form:

E_{t}^{*} y_{t + 1} = E_{t - 1}^{*} y_{t} + λ (y_{t} - E_{t - 1}^{*} y_{t}); λ \in [0, 1] .

(10)

By iteration, this can be written as

E_{t}^{*} y_{t + 1} = λ y_{t} + (1 - λ) E_{t - 1}^{*} y_{t} = λ \sum_{i = 0}^{\infty} {(1 - λ)}^{i} y_{t - i} .

(11)

Thus, the expected value is a weighted average of past values of y_t. Gelain et al. (2019) find that such a rule in an estimated NK model fits the data better than RE. It should be noted that, as for k-level thinking, AE is not an SCEE.
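As a numerical check on the equivalence between the recursive form (10) and the weighted-average form (11), the following sketch applies both to a short, made-up history. It assumes a zero initial expectation and zero pre-sample values of y, so that the truncated sum matches the recursion exactly; the data and λ = 0.3 are illustrative.

```python
def adaptive_forecast(history, lam, initial=0.0):
    """Apply the recursive rule (10) over a history ordered from oldest to newest."""
    expectation = initial
    for y in history:
        expectation = expectation + lam * (y - expectation)
    return expectation


def geometric_forecast(history, lam):
    """Weighted-average form (11), truncated at the start of the history."""
    newest_first = reversed(history)
    return lam * sum((1.0 - lam) ** i * y for i, y in enumerate(newest_first))


ys = [1.0, 2.0, 0.5, 3.0, 2.5]          # hypothetical observations of y_t
recursive = adaptive_forecast(ys, 0.3)
weighted = geometric_forecast(ys, 0.3)
```

The two numbers coincide up to rounding, and a larger λ puts more weight on the most recent observations.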

Anufriev et al. (2015) propose a more general adaptive expectations rule:

\begin{array}{l} E_{t}^{*} y_{t + 1} = E_{t - 1}^{*} y_{t} + λ_{1} (y_{t} - E_{t - 1}^{*} y_{t}) \\ + λ_{2} (y_{t} - y_{t - 1}); λ_{2} \in (- 1, 1) . \end{array}

(12)

This conforms with lab experiments, a speciality of Hommes and colleagues.

Concept IV: Euler Learning, Anticipated Utility and Individual Rationality: Throughout the learning literature, a division occurs between the Euler learning (EL) and anticipated utility (AU) approaches. EL is a more straightforward concept: in a linearized RE model featuring forward-looking expectations E_t x_{t + 1}, this expression is replaced with an adaptive expectations rule, usually a special case of (10), for example with λ = 0.

Turning to AU, a closely related literature develops the concept of internal rationality (IR) (see Adam & Marcet, 2011). Under both IR and AU, agents maximize utility under uncertainty, given their constraints and a consistent set of probability beliefs about payoff-relevant variables that are beyond their control or external. Then with IR, beliefs take the form of a well-defined probability measure over a stochastic process (the ‘fully Bayesian’ plan). See Eusepi and Preston (2011) for an RBC BR model with AU, and Preston (2005) and Woodford (2013), who adopt a similar NK framework; Branch and McGough (2018) provide a discussion of EL versus AU. Cogley and Sargent (2008) compare IR with AU and encouragingly find that AU can be seen as a good approximation to IR.

Concept V: Inattention-Cognitive Discounting. Agents in the model perceive reality with some myopia and inattention as in Gabaix (2020). They are otherwise rational. An interesting discussant report is Cochrane (2016). This is related to finite-time horizon optimization as in Woodford (2018). Optimal policy applications are Levin and Sinha (2019) and Benchimol and Bounader (2019). An open economy application is Kolasa et al. (2022).

This subsection has reviewed a number of equilibrium concepts found in the literature that relax the RE assumption. In the rest of the paper, we will compare a standard NK model that assumes RE with a number of behavioural counterparts. In the third and fourth sections, the behavioural model chosen is that with AU learning (concept IV). In the sixth section, the need for robust policy is demonstrated most clearly by comparing the RE model with EL (concept IV) and the inattention-myopia model (concept V). Finally, the section ‘Does Imperfect Information Improve Data Fit?’ reverts to AU in a comparison between RE with perfect and imperfect information.

RE and Bounded Rationality in the NK Model

Ultimately, our application will be conducted in terms of a linear NK RE model, under both perfect and imperfect information, and in a behavioural NK model. But first we step back to the underlying non-linear NK model and introduce the distinction between internal decisions and aggregate macro-variables. We start with the non-linear RE model and proceed from pure RE to pure BR in stages. The complete model set-up and its balanced growth steady state are summarized in Deák et al. (2023).


Households

Household j chooses between consumption and savings and between leisure and labour supply. Let C_t(j) be consumption and H_t(j) be the proportion of time available for work or leisure that is spent at the former. The single-period utility we choose, compatible with a balanced growth steady state, is

U_{t} (j) = U (C_{t} (j), H_{t} (j)) = \log (C_{t} (j)) - \frac{H_{t} {(j)}^{1 + ϕ}}{1 + ϕ}

and the value function of the representative household at time t dependent on its assets B is

V_{t} (j) = V_{t} (B_{t - 1} (j)) = E_{t} \sum_{s = 0}^{\infty} ϐ^{s} U (C_{t + s} (j), H_{t + s} (j)) .

(13)

The household’s problem at time t is to choose paths for consumption {C_t(j)}, labour supply {H_t(j)} and holdings of financial savings to maximize V_t(j) given by (13) given its budget constraint in period t

\begin{array}{l} B_{t} (j) = R_{t} B_{t - 1} (j) + W_{t} H_{t} (j) + Γ_{t} - C_{t} (j) \\ - T_{t} - \frac{ϖ}{2} {(B_{t - 1} (j) - B)}^{2} . \end{array}

(14)

where B_t(j) is the given net stock of real financial assets at the end of period t, W_t is the wage rate, T_t are lump-sum taxes and Γ_t are profits from wholesale and retail firms owned by households. In order to allow for a wealth distribution across the heterogeneous agents introduced later and to achieve a stationary path for bond holdings, we introduce a portfolio adjustment cost.¹ R_t is the real interest rate paid on assets held at the beginning of period t, given by the Fisher equation

R_{t} = \frac{R_{n, t - 1}}{Π_{t}} .

(15)

where R_n,t and Π_t are the nominal interest and inflation rates, respectively. W_t, R_n,t, Π_t and Γ_t are all exogenous to household j. As usual, all real variables are expressed relative to the price of final output. The standard first-order conditions are

\begin{array}{l} E_{t} [Λ_{t, t + 1} (j) R_{t + 1}] = 1 + ϖ (B_{t} (j) - B), \\ \frac{U_{H, t} (j)}{U_{C, t} (j)} = - W_{t} . \end{array}

where $Λ_{t, t + 1} (j) \equiv ϐ \frac{U_{C, t + 1} (j)}{U_{C, t} (j)}$ is the stochastic discount factor for household j over the interval [t, t + 1]. For our choice of utility function, $U_{C, t} = \frac{1}{C_{t}}$ and $U_{H, t} = - H_{t}^{ϕ}$, so these become

ϐ E_{t} \frac{C_{t} (j) R_{t + 1}}{C_{t + 1} (j)} = 1 + ϖ (B_{t} (j) - B),

(16)

C_{t} (j) H_{t} {(j)}^{ϕ} = W_{t} \Rightarrow H_{t} (j) = {(\frac{W_{t}}{C_{t} (j)})}^{\frac{1}{ϕ}} .

(17)

The first-order conditions up to now are suitable for the RE solution. We now express the solution in a form suitable for moving from a RE to a learning equilibrium. We consider the limit as ϖ → 0. Solving (14) forward in time and imposing the transversality condition on debt, we can write

B_{t - 1} (j) = P V_{t} (C_{t} (j)) - P V_{t} (W_{t} H_{t} (j)) - P V_{t} (Γ_{t}) + P V_{t} (T_{t}) .

(18)

where the present (expected) value of a series $X \equiv \{X_{t + i}\}_{i = 0}^{\infty}$ at time t is defined by

P V_{t} (X_{t}) \equiv E_{t} \sum_{i = 0}^{\infty} \frac{X_{t + i}}{R_{t, t + i}} = \frac{X_{t}}{R_{t}} + \frac{1}{R_{t}} P V_{t} (X^{t + 1}),

(19)

writing $R_{t, t + i} \equiv R_{t} R_{t + 1} R_{t + 2} \cdots R_{t + i}$ as the real interest rate over the interval [t − 1, t + i].

The forward-looking budget constraint (18) holds for the representative household. If we allow RE and BR agents to borrow from or lend to one another, we must allow for B_{t − 1} ≠ 0. Then in a symmetric equilibrium with C_t(j) = C_t and H_t(j) = H_t, (18) and (17) become

\begin{array}{l} B_{t - 1} = P V_{t} (C_{t}) - P V_{t} (\frac{W_{t}^{1 + \frac{1}{ϕ}}}{C_{t}^{\frac{1}{ϕ}}}) - P V_{t} (Γ_{t}) + P V_{t} (T_{t}), \\ H_{t} = {(\frac{W_{t}}{C_{t}})}^{\frac{1}{ϕ}} . \end{array}

Solving (16) forward in time and using the law of iterated expectation, we have for i ≥ 1

\frac{1}{C_{t}} = ϐ^{i} E_{t} \frac{R_{t + 1, t + i}}{C_{t + i}}; i \geq 1.

(20)

We now express the solution to the household optimization problem for C_t and H_t as functions of point expectations $\{E_{t} W_{t + i}\}_{i = 1}^{\infty}$, $\{E_{t} R_{t + 1, t + i}\}_{i = 1}^{\infty}$ and $\{E_{t} Γ_{t + i}\}_{i = 0}^{\infty}$, treated as exogenous processes given at time t.² With point expectations, we use (20) to obtain the following optimal decision for C_{t + i} given point expectations E_t R_{t + 1, t + i}:

C_{t + i} = C_{t} ϐ^{i} E_{t} R_{t + 1, t + i}; i \geq 1,

(21)

E_{t} (W_{t + i} H_{t + i}) = \frac{{(E_{t} W_{t + i})}^{1 + \frac{1}{ϕ}}}{C_{t + i}^{\frac{1}{ϕ}}} .

(22)

Substituting (21) and (22) into the forward-looking household budget constraint, using $\sum_{i = 0}^{\infty} ϐ^{i} = \frac{1}{1 - ϐ}$ and $E_{t} R_{t, t + i} = R_{t} E_{t} R_{t + 1, t + i}$ for i ≥ 1, we arrive at

\frac{C_{t}}{(1 - ϐ)} = \frac{1}{C_{t}^{\frac{1}{ϕ}}} (W_{t}^{1 + \frac{1}{ϕ}} + \sum_{i = 1}^{\infty} {(ϐ^{\frac{1}{ϕ}})}^{- i} {(\frac{E_{t} W_{t + i}}{E_{t} R_{t + 1, t + i}})}^{1 + \frac{1}{ϕ}}) + Γ_{t} - T_{t} + \sum_{i = 1}^{\infty} \frac{E_{t} (Γ_{t + i} - T_{t + i})}{E_{t} R_{t + 1, t + i}}

which can be written in recursive form as

\frac{C_{t}}{(1 - ϐ)} = \frac{1}{C_{t}^{\frac{1}{ϕ}}} (W_{t}^{1 + \frac{1}{ϕ}} + Ω_{1, t}) + Γ_{t} - T_{t} + Ω_{2, t}

Ω_{1, t} \equiv \sum_{i = 1}^{\infty} {(ϐ^{\frac{1}{ϕ}})}^{- i} {(\frac{E_{t} W_{t + i}}{E_{t} R_{t + 1, t + i}})}^{1 + \frac{1}{ϕ}} = {(ϐ^{\frac{1}{ϕ}})}^{- 1} {(\frac{E_{t} W_{t + 1}}{E_{t} R_{t + 1, t + 1}})}^{1 + \frac{1}{ϕ}} + \frac{Ω_{1, t + 1}}{ϐ^{\frac{1}{ϕ}} E_{t} R_{t + 1}}

Ω_{2, t} \equiv \sum_{i = 1}^{\infty} \frac{E_{t} (Γ_{t + i} - T_{t + i})}{E_{t} R_{t + 1, t + i}} = \frac{E_{t} (Γ_{t + 1} - T_{t + 1})}{E_{t} R_{t + 1, t + 1}} + \frac{Ω_{2, t + 1}}{E_{t} R_{t + 1}}

(23)

Consumption is then given by (23) assuming point expectations or by the symmetric form of the Euler equation (16) under full rationality (i.e., households know the symmetric nature of the equilibrium with C_t(j) = C_t). C_t is a function of rational point expectations $\{E_{t} W_{t + i}\}_{i = 1}^{\infty}$, $\{E_{t} R_{t, t + i}\}_{i = 1}^{\infty}$ and $\{E_{t} Γ_{t + i}\}_{i = 1}^{\infty}$, which can be treated as exogenous processes given at time t or as rational model-consistent expectations. Since E_t f(X_t) ≈ f(E_t(X_t)) and E_t f(X_t Y_t) ≈ f(E_t(X_t) E_t(Y_t)) up to a first-order Taylor-series expansion, assuming point expectations is equivalent to using a linear approximation (given below), as is usually done in the literature.

Firms, Government Expenditures and Monetary Policy

This section sets out the wholesalers and the retail sector which optimizes using Calvo-pricing contracts. We close the non-linear set-up with resource and balanced government budget constraints, a monetary policy rule and by specifying the structural shocks in the economy. Wholesale firms employ a Cobb–Douglas production function to produce a homogeneous output

Y_{t}^{w} = F (A_{t}, H_{t}) = A_{t} H_{t}^{α},

where A_t is total factor productivity. Profit-maximizing demand for labour results in the first-order condition

W_{t} = \frac{P_{t}^{w}}{P_{t}} F_{H, t} = α \frac{P_{t}^{w} Y_{t}^{w}}{P_{t} H_{t}} .

(24)

The retail sector costlessly converts a homogeneous wholesale good into a basket of differentiated goods for aggregate consumption

C_{t} = {(\int_{0}^{1} C_{t} {(m)}^{(ζ - 1) / ζ} d m)}^{ζ / (ζ - 1)},

(25)

where ζ is the elasticity of substitution. For each m, the consumer chooses C_t(m) at a price P_t(m) to maximize (25) given total expenditure $\int_{0}^{1} P_{t} (m) C_{t} (m) d m$ . Assuming government services are similarly differentiated, this results in a set of demand equations for each differentiated good m with price P_t(m) of the form

Y_{t} (m) = {(\frac{P_{t} (m)}{P_{t}})}^{- ζ} Y_{t},

(26)

where $P_{t} = {[\int_{0}^{1} P_{t} {(m)}^{1 - ζ} d m]}^{\frac{1}{1 - ζ}}$ is the aggregate price index. C_t and P_t are Dixit–Stiglitz aggregates (see Dixit & Stiglitz, 1977).

Following Calvo (1983), we assume that there is a probability of 1 − ξ at each period that the price of each retail good m is set optimally to P_t^o(m). If the price is not re-optimized, then it is held fixed. For each retail producer m, given its real marginal cost $M C_{t} = \frac{P_{t}^{w}}{P_{t}},$ the objective is at time t to choose {P_t^o(m)} to maximize discounted real profits

E_{t} \sum_{k = 0}^{\infty} ξ^{k} \frac{Λ_{t, t + k}}{P_{t + k}} Y_{t + k} (m) [P_{t}^{o} (m) - P_{t + k} M C_{t + k}]

subject to (26), where $Λ_{t, t + k} \equiv ϐ^{k} \frac{U_{C, t + k}}{U_{C, t}}$ is the stochastic discount factor over the interval [t, t + k]. The solution to this is standard and given by

\frac{P_{t}^{o} (m)}{P_{t}} = \frac{ζ E_{t} \sum_{k = 0}^{\infty} ξ^{k} Λ_{t, t + k} {(Π_{t, t + k})}^{ζ} Y_{t + k} M C_{t + k}}{(ζ - 1) E_{t} \sum_{k = 0}^{\infty} ξ^{k} Λ_{t, t + k} {(Π_{t, t + k})}^{ζ} {(Π_{t, t + k})}^{- 1} Y_{t + k}}

Denoting the numerator and denominator by J_t and JJ_t, respectively, and introducing a mark-up shock MS_t to MC_t, from Online Appendix D, we write in recursive form

\frac{P_{t}^{o} (m)}{P_{t}} = \frac{J_{t}}{J J_{t}},

(27)

J_{t} - ξ E_{t} [Λ_{t, t + 1} Π_{t + 1}^{ζ} J_{t + 1}] = \frac{1}{1 - \frac{1}{ζ}} Y_{t} M C_{t} M S_{t},

(28)

J J_{t} - ξ E_{t} [Λ_{t, t + 1} Π_{t + 1}^{ζ - 1} J J_{t + 1}] = Y_{t} .

(29)

Using the fact that all resetting firms will choose the same price, by the law of large numbers, we can find the evolution of inflation given by

1 = ξ {(Π_{t - 1, t})}^{ζ - 1} + (1 - ξ) {(\frac{P_{t}^{o}}{P_{t}})}^{1 - ζ} .

(30)

Price dispersion lowers aggregate output as follows. Market clearing in the labour market gives

H_{t} = \sum_{m = 1}^{n} H_{t} (m) = \sum_{m = 1}^{n} {(\frac{Y_{t} (m)}{A_{t}})}^{\frac{1}{α}} = {(\frac{Y_{t}}{A_{t}})}^{\frac{1}{α}} \sum_{m = 1}^{n} {(\frac{P_{t} (m)}{P_{t}})}^{- \frac{ζ}{α}}

using (26). Hence equilibrium for good m gives $Y_{t} = \frac{Y_{t}^{w}}{Δ_{t}^{α}},$ where price dispersion is defined by

Δ_{t} \equiv \sum_{m = 1}^{n} {(\frac{P_{t} (m)}{P_{t}})}^{- \frac{ζ}{α}} .

Assuming that the number of firms is large, from Online Appendix E we obtain the following dynamic relationship:

Δ_{t} = ξ Π_{t}^{\frac{ζ}{α}} Δ_{t - 1} + (1 - ξ) {(\frac{J_{t}}{J J_{t}})}^{- \frac{ζ}{α}} .
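The link between inflation and price dispersion can be made concrete with a small numerical sketch. It inverts (30) for the relative reset price, uses (27) to replace J_t/JJ_t with that ratio, and then applies the dispersion recursion above. The parameter values (ξ = 0.75, ζ = 7, α = 0.7) are illustrative assumptions, not calibrated values from the paper.

```python
def reset_price_ratio(pi, xi, zeta):
    """Relative reset price P^o_t / P_t implied by (30) for gross inflation pi."""
    return ((1.0 - xi * pi ** (zeta - 1.0)) / (1.0 - xi)) ** (1.0 / (1.0 - zeta))


def next_dispersion(delta_prev, pi, xi, zeta, alpha):
    """One step of the price-dispersion recursion, using J_t / JJ_t = P^o_t / P_t."""
    ratio = reset_price_ratio(pi, xi, zeta)
    return (xi * pi ** (zeta / alpha) * delta_prev
            + (1.0 - xi) * ratio ** (-zeta / alpha))


XI, ZETA, ALPHA = 0.75, 7.0, 0.7     # hypothetical parameter values
ratio_ss = reset_price_ratio(1.0, XI, ZETA)                 # zero net inflation
delta_ss = next_dispersion(1.0, 1.0, XI, ZETA, ALPHA)       # stays at 1
delta_infl = next_dispersion(1.0, 1.01, XI, ZETA, ALPHA)    # 1% inflation
```

With zero net inflation the reset price equals the price level and dispersion stays at one; positive inflation raises dispersion above one, which lowers Y_t relative to Y_t^w.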

To close the model, we first require that total profits from retail and wholesale firms, Γ_t, are remitted to households. This is given in real terms by

Γ_{t} = \underbrace{Y_{t} - \frac{P_{t}^{w}}{P_{t}} Y_{t}^{w}}_{retail} + \underbrace{\frac{P_{t}^{w}}{P_{t}} Y_{t}^{w} - W_{t} H_{t}}_{wholesale} = Y_{t} - α \frac{P_{t}^{w}}{P_{t}} Y_{t}^{w}

using the first-order condition (24). Then to complete closure, we have resource and balanced government budget constraints

Y_{t} = C_{t} + G_{t} = C_{t} + T_{t}

where G_t is an exogenous demand process, and a monetary policy rule for the nominal interest rate given by the following implementable Taylor-type rule

\log (\frac{R_{n, t}}{R_{n}}) = ρ_{r} \log (\frac{R_{n, t - 1}}{R_{n}}) + (1 - ρ_{r}) (θ_{π} \log (\frac{Π_{t}}{Π_{targ, t}}) + θ_{y} \log (\frac{Y_{t}}{Y}) + θ_{dy} \log (\frac{Y_{t}}{Y_{t - 1}})) + ε_{MP, t}

(31)

and ε_MP,t is an i.i.d. shock to monetary policy. Π_targ,t is a time-varying inflation target and together with A_t, G_t and MS_t follows an AR(1) process. This completes the model.
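For readers who prefer code to notation, the rule (31) can be written as a short function. This is a sketch only: the default coefficient values are illustrative assumptions rather than estimates from the paper, and the argument names are hypothetical.

```python
import math


def taylor_rule(r_n_prev, pi, pi_target, y, y_prev, *, r_n_ss, y_ss,
                rho_r=0.8, theta_pi=1.5, theta_y=0.125, theta_dy=0.0,
                eps_mp=0.0):
    """Gross nominal interest rate implied by the Taylor-type rule (31)."""
    response = (theta_pi * math.log(pi / pi_target)      # inflation gap
                + theta_y * math.log(y / y_ss)           # output relative to steady state
                + theta_dy * math.log(y / y_prev))       # output growth
    log_dev = (rho_r * math.log(r_n_prev / r_n_ss)
               + (1.0 - rho_r) * response
               + eps_mp)
    return r_n_ss * math.exp(log_dev)


# At the steady state the rule returns the steady-state rate.
rate_ss = taylor_rule(1.01, 1.0, 1.0, 1.0, 1.0, r_n_ss=1.01, y_ss=1.0)
# Inflation two per cent above target raises the policy rate.
rate_high = taylor_rule(1.01, 1.02, 1.0, 1.0, 1.0, r_n_ss=1.01, y_ss=1.0)
```

The smoothing parameter ρ_r scales down the contemporaneous response, so the full effect of an inflation gap builds up over several periods.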

Recovering the NK Workhorse Model

We now show that the linearized form of the non-linear model about the steady state reduces to the standard workhorse model where rational expectations E_ty_{t + 1} and E_tπ_{t + 1} or non-RE E*_ty_{t + 1} and E*_tπ_{t + 1} can be treated as expectations by individual households and firms, respectively, of aggregate future output and inflation. We consider the linearized form of the above set-up about a zero inflation and growth deterministic steady state. We also ignore lending or borrowing between RE and BR agents. With RE, the household j’s first-order conditions take one of two forms. First, linearizing (23) we have

\begin{array}{l} α_{1} c_{t} (j) = α_{2} w_{t} + α_{3} (ω_{2, t} + r_{t}) + α_{4} ω_{1, t}, \\ ω_{1, t} = α_{5} E_{t} w_{t + 1} - α_{6} E_{t} r_{t + 1} + ϐ E_{t} ω_{1, t + 1}, \\ ω_{2, t} = (1 - ϐ) (γ_{t} - g_{t}) - r_{t} + ϐ E_{t} ω_{2, t + 1}, \\ γ_{t} = \frac{1}{γ_{y}} y_{t} - \frac{α}{γ_{y}} (w_{t} + h_{t}), \end{array}

(32)

where lower-case variables are defined by x_t = log(X_t/X), X is the steady state of X_t, $c_{y} \equiv \frac{C}{Y}, γ_{y} \equiv \frac{Γ}{Y}, g_{y} \equiv \frac{G}{Y}$, and γ_t is exogenous profit per household (a function of aggregate consumption and hours). Positive coefficients are given by $α_{1} \equiv 1 + \frac{α}{ϕ c_{y}}, α_{2} \equiv (1 - ϐ) (1 + \frac{1}{ϕ}) \frac{α}{c_{y}}, α_{3} \equiv \frac{γ_{y}}{c_{y}}, α_{4} \equiv \frac{ϐ α}{c_{y}}, α_{5} \equiv (1 - ϐ) (1 + \frac{1}{ϕ})$ and $α_{6} \equiv (1 + \frac{1}{ϕ})$. Alternatively, from the Euler equation (16):

c_{t} = E_{t} c_{t + 1} - E_{t} r_{t + 1}

(33)

in a symmetric equilibrium. Under RE, (32) or (33) leads to the same equilibrium, but under BR, this is no longer the case.

Linearizing the household supply of hours decision, the resource constraint and the Fisher equation, we have

y_{t} = (1 - g_{y}) c_{t} + g_{y} g_{t},

(34)

r_{t} = r_{n, t - 1} - π_{t},

(35)

h_{t} = \frac{1}{ϕ} (w_{t} - c_{t}),

(36)

which completes the decisions of the household. Substituting out for c_t from (34) in the Euler equation (33), we obtain

y_{t} = E_{t} y_{t + 1} - (1 - g_{y}) E_{t} r_{t + 1} - g_{y} (E_{t} g_{t + 1} - g_{t}) .

(37)

Turning to the supply side, for the wholesale sector

y_{t} = a_{t} + α h_{t},

(38)

m c_{t} = w_{t} - y_{t} + h_{t} .

(39)

For retail firm m, linearizing the pricing dynamics (27)–(29) about a zero net inflation steady state and solving forwards, we have

\begin{array}{l} p_{t}^{o} (m) - p_{t} = ϐ ξ E_{t} [π_{t + 1} + p_{t + 1}^{o} (m) - p_{t + 1}] + (1 - ϐ ξ) (m c_{t} + m s_{t}) \\ = E_{t} \sum_{i = 0}^{\infty} {(ϐ ξ)}^{i} [ϐ ξ π_{t + i + 1} + (1 - ϐ ξ) (m c_{t + i} + m s_{t + i})] . \end{array}

(40)

Then, in a symmetric equilibrium, we have

π_{t} = \frac{(1 - ξ)}{ξ} E_{t} \sum_{i = 0}^{\infty} {(ϐ ξ)}^{i} [ϐ ξ π_{t + i + 1} + (1 - ϐ ξ) (m c_{t + i} + m s_{t + i})],

(41)

where E_t[π_{t + i + 1}] and E_t[mc_{t + i} + ms_{t + i}] are expectations of aggregate inflation and real marginal costs, both variables exogenous to individual price-setters. However, if price-setters know they are identical, they know the aggregate price level over non-optimizing and optimizing firms

p_{t} (m) = ξ p_{t - 1} + (1 - ξ) p_{t}^{o} (m)

(42)

to obtain in a symmetric equilibrium

p_{t}^{o} (m) - p_{t} = p_{t}^{o} - p_{t} = \frac{ξ}{(1 - ξ)} (p_{t} - p_{t - 1}) = \frac{ξ}{(1 - ξ)} π_{t} .

Then, substituting back into (40), we arrive at

π_{t} = \frac{(1 - ξ) (1 - ϐ ξ)}{ξ} E_{t} \sum_{i = 0}^{\infty} ϐ^{i} (m c_{t + i} + m s_{t + i}),

(43)

which omits learning about aggregate inflation. Under RE, (41) and (43) are equivalent. (43) is equivalent to

π_{t} = ϐ E_{t} π_{t + 1} + λ (m c_{t} + m s_{t}),

(44)

where $λ = \frac{(1 - ξ) (1 - ϐ ξ)}{ξ}$, which is the familiar linearized Phillips curve expressed in terms of the real marginal cost mc_t and the mark-up shock ms_t. Substituting for the former from (38) and (39), we arrive at

π_{t} = ϐ E_{t} π_{t + 1} + λ [\frac{1 + ϕ}{α} (y_{t} - a_{t}) - \frac{g_{y}}{1 - g_{y}} g_{t} + m s_{t}],

(45)

where we note that y_t − a_t is the output gap. Equations (37), (45) and the Taylor rule (31) constitute the 3-equation NK RE model in output, inflation and the nominal interest rate given exogenous shock processes for g_t, ms_t and the monetary shock. A simpler form omits government spending g_t so g_y = 0 and replaces the aggregate demand shock in (45) with an exogenous process that can be thought of as a risk premium shock to the Fisher equation (35).
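To give a sense of magnitudes, the slope λ of the Phillips curve (44) can be computed directly from the Calvo parameter and the discount factor. The sketch below uses illustrative values (ξ = 0.75, discount factor 0.99) that are assumptions rather than estimates reported in this survey; the function names are hypothetical.

```python
def phillips_slope(xi, beta):
    """Slope of (44): lambda = (1 - xi)(1 - beta * xi) / xi."""
    return (1.0 - xi) * (1.0 - beta * xi) / xi


def inflation(expected_pi, mc, ms, xi=0.75, beta=0.99):
    """Inflation from (44), given expected next-period inflation and marginal cost."""
    return beta * expected_pi + phillips_slope(xi, beta) * (mc + ms)


slope_baseline = phillips_slope(0.75, 0.99)   # prices reset on average once a year
slope_sticky = phillips_slope(0.90, 0.99)     # stickier prices
slope_flexible = phillips_slope(0.50, 0.99)   # more frequent resetting
```

The slope falls as ξ rises, so stickier prices make inflation less responsive to current marginal cost and relatively more dependent on expected future inflation.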

The form of the Phillips curve (43) is often used in the behavioural NK literature (see, e.g., De Grauwe, 2012b), but as we have shown, this assumes that firms know they are identical. In our BR model with AU learning, we use (32) and (41), which do not make this assumption.

AU Learning and Market-consistent Information

With AU learning, our learning model is one where agents make fully optimal decisions given their individual specification of beliefs but have no macroeconomic model to form expectations of aggregate variables. We draw a clear distinction between aggregate and internal quantities so that identical agents in our model are not aware of this equilibrium property (nor any others).

To close the model, we need to specify the manner in which households and firms form their expectations. To do so, we assume that variables which are local to the agents, in a geographical sense, are observable within the period, whereas variables that are strictly macroeconomic are only observable with a lag. This categorization regarding information about the current state of the economy follows Nimark (2014). He distinguishes between the local information that agents acquire directly through their interactions in markets and statistics that are collected and summarized, usually by governments, and made available to the wider public.³ The policy rate is announced by the central bank, so it is observed without a lag and it is common knowledge. Given this, we assume an adaptive expectations forecasting rule given below by (47) and (48) about variables external to agents’ decisions. Let x_t denote any of r_t, r_{n,t}, π_t, w_t, γ_t and g_t; then household expectations are given by

E_{t}^{*} x_{t + i} = E_{t}^{*} x_{t + 1}; i \geq 1.

(46)

Expressing E_tω_{1,t + 1} and E_tω_{2,t + 1} in (32) as forward-looking summations and using (46), we arrive at the individual learning consumption equation

\begin{array}{l} α_{1} c_{t} = α_{2} w_{t} + α_{3} (ω_{2, t} + r_{t}) + α_{4} ω_{1, t}, \\ ω_{1, t} = \frac{1}{1 - ϐ} α_{5} E_{t}^{*} w_{t + 1} - α_{6} (ϐ E_{t}^{*} r_{n, t + 1} - E_{h, t}^{*} π_{t + 1}) - α_{6} r_{n, t}, \\ ω_{2, t} = (1 - ϐ) (γ_{t} - g_{t}) - r_{t} + \frac{ϐ}{1 - ϐ} ((1 - ϐ) (E_{t}^{*} γ_{t + 1} - E_{t}^{*} g_{t + 1}) - E_{t}^{*} r_{t + 1}), \end{array}

which is now expressed in terms of one-step-ahead forecasts by the standard adaptive expectations rule⁴:

E_{t}^{*} x_{t + 1} = E_{t - 1}^{*} x_{t} + λ_{x} (x_{t - j} - E_{t - 1}^{*} x_{t}); x = w, r_{n}, π, γ - g; j = 0, 1

(47)

Households make inter-temporal decisions for their consumption and hours supplied given adaptive expectations of the wage rate, the nominal interest rate, inflation and profits. These macro-variables may in principle be observed with or without a one-period lag (j = 1, 0), but as stated earlier, we assume j = 0 for market-specific variables w_t, γ_t − g_t, and j = 1 for aggregate inflation π_t. However, we assume the current nominal interest rate, r_{n,t}, is announced and therefore also observed without a lag.

We distinguish household and firm expectations $E_{h, t}^{*} π_{t + 1}$ and $E_{f, t}^{*} π_{t + 1}$. Then for retail firm m:

\begin{array}{l} E_{t}^{*} π_{t + i + 1} = E_{t}^{*} π_{t + 1}; i \geq 0, \\ E_{t}^{*} (m c_{t + i} + m s_{t + i}) = E_{t}^{*} (m c_{t + 1} + m s_{t + 1}); i \geq 1, \\ p_{t}^{o} (m) - p_{t} = \frac{ϐ ξ}{(1 - ϐ)} E_{f, t}^{*} π_{t + 1} + (1 - ϐ ξ) (m c_{t} + m s_{t}) + \frac{ϐ}{(1 - ϐ)} E_{t}^{*} (m c_{t + 1} + m s_{t + 1}), \end{array}

where again one-step-ahead forecasts are given by the adaptive expectations rule:

E_{t}^{*} x_{t + 1} = E_{t - 1}^{*} x_{t} + λ_{x} (x_{t - j} - E_{t - 1}^{*} x_{t}); x = π, (m c + m s); j = 0, 1.

(48)

Retail firms make inter-temporal decisions for their price and output given adaptive expectations of the aggregate inflation rate and of their real marginal cost including the shock term, mc_t + ms_t. As before, these variables may be observed with or without a one-period lag (j = 1, 0); for aggregate inflation we assume j = 1, as for households, but j = 0 for the market-specific variable mc_t. Note that we can in principle distinguish between households’ and firms’ expectations of inflation.

Heterogeneous Expectations and Reinforcement Learning

There is a growing literature within behavioural macro-models based on the Brock and Hommes (1997) framework where agents learn from each other through reinforcement learning. More recently, De Grauwe has applied this framework to the 3-equation linearized workhorse NK model.

RE are then replaced with boundedly rational (BR) expectations based on simple forecasting rules; that is, E_t (RE) is replaced with E^*_t (non-RE). This is the Euler equation learning (EL) approach. There are two types of agents with different forecasting rules. Both can use simple misspecified forecasting rules as in De Grauwe (2012b), or one set can be rational as in Branch and McGough (2010) and Massaro (2013). See also Young (2004), Choi et al. (2009), De Grauwe (2011, 2012a) and Hommes et al. (2019). Jump and Levine (2019) provide a survey. All these papers feature misspecified equilibria which are not SCEE: the PLM is inconsistent with the ALM. There is a major modelling issue: Euler learning versus anticipated utility.

Heterogeneous Expectations with Fixed Proportions of RE and BR Agents

Now we turn to the heterogeneous expectations model with BR(AU) agents alongside RE agents with fixed proportions of each type. We assume all RE agents know the composite model. In addition, we impose informational consistency by assuming they have the same II set as the BR(AU) agents. The latter do not know the model, but do make individually optimal decisions given individual observations of the states and belief formations. The composite RE–BR model then has an equilibrium (in non-linear form)

\begin{array}{l} H_{t}^{d} = n_{h, t} {(H_{t}^{s})}^{R E} + (1 - n_{h, t}) {(H_{t}^{s})}^{B R}, \\ C_{t} = n_{h, t} {(C_{t})}^{R E} + (1 - n_{h, t}) {(C_{t})}^{B R} = Y_{t} - G_{t}, \\ \frac{p_{t}^{o}}{p_{t}} = n_{f, t} {\frac{p_{t}^{o}}{p_{t}}}^{R E} + (1 - n_{f, t}) {\frac{p_{t}^{o}}{p_{t}}}^{B R}, \end{array}

Zero net wealth in aggregate implies that $n_{h, t} B_{t}^{R E} = - (1 - n_{h, t}) B_{t}^{B R}$. We consider the properties of the model with fixed exogenous proportions of RE and BR agents.

Figure 1.

RE vs RE–BR Composite Expectations with n_h = n_f = 0.5, λ_x = 0.25, 1.0; Taylor rule with ρ_r = 0.7, υ_pi = 1.5 and υ_y = 0.3, υ_dy = 0; Monetary Policy Shock.

For our model of BR with AU, Figure 1 plots the impulse response functions (IRFs) with standard parameters for the rule for a shock to monetary policy under fast and slow learning. Not surprisingly, fast learning sees an IRF converge faster to the RE case, but in either case BR introduces more persistence compared with RE. This suggests that this feature should lead to a better fit of the data without relying on other persistence mechanisms (shocks, habit or price indexing). This we examine in the estimation of our model.⁵

Endogenous Proportions of Rational and Non-rational Agents: Reinforcement Learning

Up to now we have assumed that the proportions of rational and non-rational agents n_y,t and n_π,t are exogenous. As in Massaro (2013), in the estimation and main conclusions that follow, we retain this assumption, but in this subsection, we explore the extension that endogenizes these decisions by agents. Following Brock and Hommes (1997) and the reinforcement learning literature in general, these can be chosen as follows:

n_{x, t} = \frac{\exp (- γ Φ_{x, t}^{R E} ({x_{t}}))}{\exp (- γ Φ_{x, t}^{R E} ({x_{t}})) + \exp (- γ Φ_{x, t}^{A E} ({x_{t}}))},

(49)

where $- Φ_{x, t}^{R E} ({x_{t}})$ and $- Φ_{x, t}^{A E} ({x_{t}})$ are ‘fitness’ measures, respectively, of the forecast performance of the rational and non-rational predictor of outcome {x_t} = {y_t}, {π_t} given by a discounted least-squares error predictor

Φ_{x, t}^{R E} ({x_{t}}) = μ_{R E} Φ_{x, t - 1}^{R E} ({x_{t}}) + (1 - μ_{R E}) ({[x_{t} - E_{t - 1} x_{t}]}^{2} + C_{x}),

(50)

Φ_{x, t}^{A E} ({x_{t}}) = μ_{A E} Φ_{x, t - 1}^{A E} ({x_{t}}) + (1 - μ_{A E}) {[x_{t - j} - E_{t - 1 - j}^{*} x_{t - j}]}^{2}; j = 0, 1,

(51)

where μ_RE and μ_AE capture the memory of the agents forming RE and AE (a measure of forgetfulness of past observations). C_x represents the relative costs of being rational in learning about variable x_t. In the steady state forecast errors are zero, so that Φ^RE = C_x and Φ^AE = 0. Thus, the proportion of rational agents in the steady state is given by

n_{x} = \frac{\exp (- γ C_{x})}{\exp (- γ C_{x}) + 1},

which is pinned down by the product γC_x.
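The updating scheme in (49) to (51) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the numerical values of γ, C_x and the forecast error below are arbitrary choices rather than estimates from the article.

```python
import math


def update_fitness(phi_prev, sq_error, mu, cost=0.0):
    """Discounted squared-error recursion in the form of (50) and (51).

    mu is the memory parameter; cost is C_x for the rational predictor
    and zero for the adaptive one.
    """
    return mu * phi_prev + (1.0 - mu) * (sq_error + cost)


def rational_share(phi_re, phi_ae, gamma):
    """Logit share of rational agents in the form of (49)."""
    a, b = -gamma * phi_re, -gamma * phi_ae
    m = max(a, b)  # subtract the largest exponent for numerical stability
    return math.exp(a - m) / (math.exp(a - m) + math.exp(b - m))


def steady_state_share(gamma, cost):
    """Steady-state share: zero forecast errors, so Phi_RE = C_x and Phi_AE = 0."""
    return math.exp(-gamma * cost) / (math.exp(-gamma * cost) + 1.0)


# Illustrative values only (assumptions of this sketch).
gamma, cost = 1.0, 2.27
n_ss = steady_state_share(gamma, cost)

# One period with no memory (mu = 0) in which only the adaptive
# predictor makes a forecast error.
phi_re = update_fitness(0.0, sq_error=0.0, mu=0.0, cost=cost)
phi_ae = update_fitness(0.0, sq_error=1.0, mu=0.0)
n_after_error = rational_share(phi_re, phi_ae, gamma)
```

In this sketch a forecast error by the adaptive predictor lowers its fitness and so raises the share of rational agents above its steady-state value; a larger γ makes the share more sensitive to such differences.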

A complete treatment of the model would require a departure from the linear Kalman filter solution for the II case for which we exploit the closed-form saddlepath solution that Pearlman et al. (1986) show both exists and is unique. We have also exploited the convenience of linear Bayesian estimation. In what follows we confine ourselves to the RE PI case and use the linear estimates obtained up to now.

With reinforcement learning, the proportions of rational households (n_h,t) and firms (n_f,t) are now given by (49). Table 1 provides a third-order perturbation solution of the non-linear NK RE(PI)-BR model. We use the Bayesian estimates of the linear model, in which the proportions n_h,t and n_f,t are fixed. Non-linear estimation would be required to pin down the parameters n_h, n_f in the steady state, and $μ_{h}^{R E, B R}, μ_{f}^{R E, B R}$ and γ in the reinforcement learning process; this goes beyond the scope of this article. So here we impose them as reported in the table. We also scale the estimated standard deviations of the shocks using a parameter σ = 1, 2.

Table 1.

Third-order Solution of the Estimated NK RE(PI)-BR Model; $μ_{h}^{R E} = μ_{h}^{B R} = μ_{f}^{R E} = μ_{f}^{B R} = 0$; γ = 1, 100, 1,000.

Variable	Stochastic Mean	Standard Deviation (%)	Skewness	Kurtosis
$\frac{C_{t}}{C}$	0.9993	2.47	0.2792	0.0371
$\frac{H_{t}}{H}$	1.0002	0.19	0.0192	0.0327
$\frac{w_{t}}{w}$	0.9996	2.15	0.2771	0.0215
$\frac{Π_{t}}{Π}$	0.9999	0.46	0.0159	0.0645
$\frac{R_{n, t}}{R_{n}}$	0.9999	0.46	0.0070	0.0651
$Φ_{h, t}^{R E} - C_{h}$	–0.000065	0.000020	–0.7589	0.9487
$Φ_{h, t}^{A E}$	–0.000084	0.000054	–1.8238	5.7852
$Φ_{f, t}^{R E} - C_{f}$	–0.000011	0.000009	–0.7203	0.7834
$Φ_{f, t}^{A E}$	–0.000069	0.000053	–2.2156	8.8686
n_h,t(γ = 1; σ = 1)	0.093301	0.000004	1.8039	6.0897
n_f,t(γ = 1; σ = 1)	0.098603	0.000004	2.2688	9.2725
n_h,t(γ = 100; σ = 1)	0.094221	0.003634	1.8039	6.0897
n_f,t(γ = 100; σ = 1)	0.101751	0.004303	2.2688	9.2725
n_h,t(γ = 1000; σ = 1)	0.102506	0.036343	1.8039	6.0897
n_f,t(γ = 1000; σ = 1)	0.130105	0.043030	2.2688	9.2725
n_h,t(γ = 1000; σ = 2)	0.129993	0.146939	1.8403	6.6096
n_f,t(γ = 1000; σ = 2)	0.224367	0.174046	2.3668	10.5098

The main results from these simulations are as follows. First, reinforcement learning introduces high kurtosis and skewness⁶ in macro variables. Second, reinforcement learning coupled with higher volatility of exogenous shocks results in the proportions of rational agents increasing from the estimated deterministic steady-state values of 0.093 and 0.099 to 0.13 and 0.22 for households and firms, respectively, in the stochastic steady state. Third, given that bounded rationality is a welfare-reducing friction in these models, it follows that volatility can actually be welfare-increasing in our heterogeneous expectations setting.

Perfect Versus Imperfect Information

The seminal paper on the general solution of linear RE models assuming perfect (aka full) information (the standard assumption) is provided by Blanchard and Kahn (1980) showing existence and conditions for uniqueness.

Perfect information means that at time t, all agents have full information about all the state variables of the system. Conventional estimation is performed under the assumption that agents have perfect information (including shocks), but econometricians do not. Thus there is an inconsistency about information available to agents and econometricians. Here we adopt the informational consistency principle, which states that agents and econometricians have the same imperfect information set. Thus if econometricians do not have current data on technical progress, then it is assumed that agents also do not have this.

Angeletos and Lian (2016) provide an important survey of what they refer to as the incomplete information literature. Here a comment on terminology is called for. Our use of perfect/imperfect information corresponds to the standard use in dynamic game theory when describing the information of the history of play driven by draws by nature from the distributions of exogenous shocks. Complete/incomplete information refers to agents’ beliefs regarding each other’s payoffs and information sets. In our set-up, the latter informational friction is absent.

Minford and Peel (1983) were the first to show the importance of information sets for the IRFs and second moments of RE models. Pearlman et al. (1986) generalized this for the general linear model. Pearlman (1992) extended this to optimal policy for fully optimal and time-consistent rules. Kalman filter ‘learning’ is central; see Hamilton (1994) and Adam and Billi (2006). Pearlman and Sargent (2005) and Levine et al. (2023) extend the representative agent II solution to a heterogeneous agent framework with diffuse information and show that a finite-space solution is available. The solution procedure of Pearlman et al. (1986) is applied in Collard and Dellas (2004, 2006, 2007), Levine et al. (2012a, 2012b) and Cantore et al. (2015). Following on from Pearlman (1992), Svensson and Woodford (2001, 2003) investigate the properties of the optimal solution under II. Ellison and Pearlman (2011) show e-stability (convergence to the RE equilibrium under imperfect information). II, in which information assumptions are imposed, is distinguished from the rational inattention literature, in which the acquisition of information is endogenous. See Sims (2005) and Mackowiak and Wiederholt (2009, 2011).

Why II? Some Empirical Motivation

Evidence from Forecast Surveys, Outcomes and Forecast Errors. See Coibion and Gorodnichenko (2012, 2015), Coibion et al. (2018) and Angeletos et al. (2020). The main finding is an initial under-reaction of beliefs in response to shocks, followed by a delayed overreaction.

Bayesian Estimation of DSGE Models. II improves data fit compared with PI. See Collard et al. (2009) and Levine et al. (2012a).

Real Effects of Monetary Policy

II with the diverse information pricing model predicts highly persistent effects on real activity in contrast to the Phelps–Lucas model. This results in a hierarchy of expectations as seen in beauty contest models (forecasting the forecasts of others). To show this, we consider the following model from Woodford (2003).

In log-linear form, let q_t be an exogenous process for nominal income and y_t be output. Then the Lucas Phillips curve is

\begin{array}{l} y_{t} = ξ (q_{t} - E_{t} q_{t}) \quad \text{with common knowledge} \\ y_{t} = \sum_{k = 1}^{\infty} ξ {(1 - ξ)}^{k - 1} (q_{t} - q_{t}^{(k)}) \quad \text{with diverse information} \end{array}

where $q_{t}^{(k)} ≡ \bar{E}_{t} [q_{t}^{(k - 1)}]$ is the kth-order average expectation, that is, the average expectation of the (k–1)th-order average expectation. These higher-order expectations result in persistent effects of surprises without introducing other features such as Calvo or Rotemberg pricing.
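To get a feel for the diverse-information sum, the Python sketch below evaluates it under a deliberately simple assumption that is not part of the model above: each further order of average expectation moves only a fraction `kappa` of the way towards the truth, so that q_t^(k) = kappa^k q_t. The function names and parameter values are hypothetical.

```python
def output_diverse_information(q, xi, kappa, n_terms=500):
    """Truncated version of the diverse-information sum.

    Assumes (for illustration only) that the k-th order average
    expectation is q_k = kappa**k * q with 0 < kappa < 1.
    """
    total = 0.0
    for k in range(1, n_terms + 1):
        q_k = kappa ** k * q
        total += xi * (1.0 - xi) ** (k - 1) * (q - q_k)
    return total


def output_closed_form(q, xi, kappa):
    """Geometric-series limit of the sum under the same assumption."""
    return q * (1.0 - kappa) / (1.0 - (1.0 - xi) * kappa)


q, kappa = 1.0, 0.5
y_low_xi = output_diverse_information(q, xi=0.15, kappa=kappa)
y_high_xi = output_diverse_information(q, xi=0.60, kappa=kappa)
first_order_gap = q - kappa * q  # the gap if only first-order beliefs mattered
```

Under this assumption, a smaller ξ shifts weight towards higher-order expectations, which lag further behind q_t, so the output response exceeds the first-order gap and grows as ξ falls.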

Empirical Results

The Wilderness of Non-rationality

This section demonstrates the need for robust policy design using a special case of the four models for which in a balanced growth deterministic steady state both net inflation and growth are zero. Then about such a steady state, the linearized models take the form:

Myopia-RE models

x_{t} = M E_{t} [x_{t + 1}] - (r_{n, t} - E_{t} π_{t + 1} - r_{n, t}^{*}) + u_{t}, (IS curve)

(52)

π_{t} = ϐ M^{f} E_{t} [π_{t + 1}] + k x_{t}, (Phillips curve)

(53)

r_{n, t} = r_{n, t}^{*} + ϑ_{π} π_{t} + ϑ_{x} x_{t} (Taylor interest rate rule)

(54)

where x_t is the output gap, π_t is the gross inflation rate, r_n,t is the nominal interest rate given by the original Taylor rule (ϑ_π = 1.5, ϑ_x = 0.2), r^*_n,t is its natural rate, u_t is a demand push shock, and M = M^f = 1 for the RE case and M < 1, M^f < 1 for the myopia case.
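For an AR(1) shock with persistence ρ_u, equations (52) to (54) can be solved by guessing x_t = a u_t and π_t = b u_t, so that E_t x_{t+1} = ρ_u a u_t and E_t π_{t+1} = ρ_u b u_t. The Python sketch below implements this minimal-state-variable guess. It is an illustration under stated assumptions: the values of ϐ, k and the myopia parameters are hypothetical, the function names are invented here, and the guess coincides with the model's solution only where that solution is determinate.

```python
def msv_coefficients(M, Mf, beta, k, theta_pi, theta_x, rho):
    """Return (a, b) with x_t = a*u_t and pi_t = b*u_t for an AR(1) shock u_t.

    Substituting the guess into (53) gives b = k*a / (1 - beta*Mf*rho);
    substituting into (52) with the rule (54) then pins down a.
    """
    pc = 1.0 - beta * Mf * rho
    a = 1.0 / (1.0 - M * rho + theta_x + (theta_pi - rho) * k / pc)
    b = k * a / pc
    return a, b


def impulse_response(a, b, rho, horizon=8):
    """Responses of (x_t, pi_t) to a unit shock at t = 0."""
    return [(a * rho ** t, b * rho ** t) for t in range(horizon)]


# Taylor-rule coefficients as in the text and the shock persistence of 0.75
# used for the figures in this section; beta, k and the myopia values are
# illustrative assumptions.
common = dict(beta=0.99, k=0.1, theta_pi=1.5, theta_x=0.2, rho=0.75)
a_re, b_re = msv_coefficients(M=1.0, Mf=1.0, **common)
a_my, b_my = msv_coefficients(M=0.85, Mf=0.85, **common)
irf_re = impulse_response(a_re, b_re, common["rho"])
irf_my = impulse_response(a_my, b_my, common["rho"])
```

Comparing `irf_re` with `irf_my` gives a quick check of how the discounting terms M and M^f alter the response to the same shock under an unchanged policy rule.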

To formulate possible heuristic rules that encompass those in these papers, we draw upon the general form of adaptive expectations from Anufriev et al. (2015) discussed in the ‘Behavioural Macro models’ section that takes the log-linear general form

E_{t}^{*} (y_{t + 1}) = {[E_{t - 1}^{*} (y_{t})]}^{1 - λ_{y}^{1}} {[y_{t}]}^{λ_{y}^{1} + λ_{y}^{2}} {[y_{t - 1}]}^{- λ_{y}^{2}}, 0 < λ_{y}^{1} < 1, - 1 < λ_{y}^{2} < 1.

(55)

This encompasses simple adaptive expectations ($λ_{y}^{2} = 0$), ‘trend extrapolation’ ($λ_{y}^{1} = 0$), and a ‘fundamentalist’ rule ($λ_{y}^{2} = λ_{y}^{1} = 0$) for which $E_{t}^{*} (y_{t + 1}) = E_{t - 1}^{*} (y_{t})$ = the model’s steady state. In Anufriev et al. (2015), the parameters $λ_{y}^{1}$ and $λ_{y}^{2}$ are modelled as changing over time, as the agents repeatedly fine-tune the rule to adapt to the specific market conditions. In their paper, this learning is embodied as a heuristic optimization with a genetic algorithm procedure and introduces individual heterogeneity into the model. In our paper (as in much of the behavioural macro-literature), we embody the rules with fixed parameters into a representative agent DSGE NK model and allow the data to pin down their values in the estimation of the model.
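Taking logs of (55) may make the nesting of these special cases easier to see. The rearrangement below is a sketch that uses only the symbols already in (55):

```latex
\begin{aligned}
\ln E_{t}^{*} (y_{t + 1}) &= (1 - λ_{y}^{1}) \ln E_{t - 1}^{*} (y_{t}) + (λ_{y}^{1} + λ_{y}^{2}) \ln y_{t} - λ_{y}^{2} \ln y_{t - 1} \\
&= \ln E_{t - 1}^{*} (y_{t}) + λ_{y}^{1} [\ln y_{t} - \ln E_{t - 1}^{*} (y_{t})] + λ_{y}^{2} [\ln y_{t} - \ln y_{t - 1}].
\end{aligned}
```

The second term is the error-correction step of simple adaptive expectations, as in (47) with j = 0, and the third term extrapolates the most recent change in y_t. Setting either coefficient to zero switches off the corresponding term.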

EL models

x_{t} = E_{t}^{*} [x_{t + 1}] - (r_{n, t} - E_{t}^{*} π_{t + 1} - r_{n, t}^{*}), (IS curve)

(56)

π_{t} = ϐ E_{t}^{*} [π_{t + 1}] + K x_{t} + U_{t} (Phillips curve)

(57)

plus (54) as before, where $E_{t}^{*} (x_{t + 1})$ and $E_{t}^{*} (π_{t + 1})$ are given by the general adaptive expectations rule (55) with y = x, π, which reduces to the simple adaptive expectations rule by putting $λ_{y}^{2} = 0$. We refer to these two cases as GAE and SAE, respectively.

In Figures 2 and 3, parameter values are set at their priors used later in the estimation. The demand shock follows an AR(1) process with persistence ρ_u = 0.75. These two graphs clearly illustrate the absence of robustness for the original Taylor rule, both in terms of the impulse responses to the demand shock in Figure 2 and the policy space that gives determinacy and stability in Figure 3. This clearly demonstrates the need for robust policy design across competing models (see Deák et al., 2023).

Does Imperfect Information Improve Data Fit?

We estimate five NK models with different assumptions regarding expectations and information summarized in Table 2. For the RE agents in either the ‘pure’ or composite RE–BR model, we compare the PI or II assumptions.

For each of these five models, Bayesian methods are employed to separately estimate the model parameters using Dynare adapted to handle II.⁷ The sample period is 1984:1–2008:2, a subset of that used in Smets and Wouters (2007), which is also used extensively in the empirical and RBC literature. The observable variables are the log differences of the real GDP (GDP_t) and the GDP deflator (DEF_t), and the federal funds rate (FEDFUNDS_t). All series are seasonally adjusted and taken from the FRED Database available through the Federal Reserve Bank of St. Louis and the US Bureau of Labor Statistics.

We first focus on Pure RE, Pure BR(AU) and Comp RE(PI)–BR(AU) when RE agents have a PI set. We employ the Bayes factor (BF) from the model marginal likelihoods to gauge the relative merits across the three models in Table 3.

Figure 2.

Impulse Responses Comparison Between Four Log-linearized Models to a Demand Shock.

Figure 3.

Note: Green region is determinacy/stability and red region is indeterminacy or instability for EL models

Table 2.

Summary of Estimated Models.

Model	Description
Pure RE(PI)	NK RE model under PI
Pure RE(II)	NK RE model under II
Pure BR(AU)	NK BR model with AU learning
Comp RE(PI)-BR(AU)	Composite model with RE(PI) and BR(AU) learning
Comp RE(II)-BR(AU)	Composite model with RE(II) and BR(AU) learning

Table 3.

Log-likelihood Values and Posterior Model Odds: RE Agents with PI.

Model	Pure RE(PI)	Pure BR(AU)	Comp RE(PI)-BR(AU)
LL	1656	1666	1672
Prob	0.0000	0.0034	0.9966

The BR models, Pure BR(AU) and Comp RE(PI)-BR(AU), both substantially outperform their RE counterpart, which is firmly rejected by the data. Formally, using the Bayesian statistical language of Kass and Raftery (1995), a BF, the quotient of the probabilities reported, greater than 100 (marginal log-likelihood difference over 4.61), offers ‘decisive evidence’. Thus, we have decisive support for pure BR and for some composite behaviour from the US data we observe. The BF difference between the non-RE models is also strong.
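The mapping from marginal log-likelihoods to posterior model probabilities and Bayes factors can be sketched as follows. The code assumes equal prior probabilities across models, which is an assumption of this illustration, and the function names are hypothetical. The inputs are the rounded LL values from Table 3, so the output should be expected to match the reported probabilities only approximately.

```python
import math


def posterior_model_probs(log_marginal_likelihoods, priors=None):
    """Posterior model probabilities from marginal log-likelihoods."""
    names = list(log_marginal_likelihoods)
    if priors is None:
        # Equal prior weight on each model (assumption of this sketch).
        priors = {name: 1.0 / len(names) for name in names}
    # Subtract the maximum before exponentiating to avoid overflow.
    top = max(log_marginal_likelihoods.values())
    weights = {
        name: priors[name] * math.exp(log_marginal_likelihoods[name] - top)
        for name in names
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def log_bayes_factor(ll_a, ll_b):
    """Log Bayes factor of model A against model B."""
    return ll_a - ll_b


# Rounded values as reported in Table 3.
table3 = {
    "Pure RE(PI)": 1656.0,
    "Pure BR(AU)": 1666.0,
    "Comp RE(PI)-BR(AU)": 1672.0,
}
probs = posterior_model_probs(table3)
decisive = log_bayes_factor(table3["Pure BR(AU)"], table3["Pure RE(PI)"]) > math.log(100)
```

Working with differences of log-likelihoods rather than the likelihoods themselves keeps the calculation numerically stable, since exp(1672) cannot be represented directly in floating point.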

Next we assume an II set for the RE agents: $I_{t} = [Y_{s - 1}, Π_{s - 1}, R_{n, s}]$, s ≤ t. An important point to stress is that this is the same information set we assume for BR agents when they come to update their heuristic rule. In this sense, we now have informational consistency across BR and RE agents, and also with the econometrician estimating the model. This feature, we believe, is new for the heterogeneous behavioural NK model literature. The results for the likelihood race are reported in Table 4.

Table 4.

Log-likelihood Values and Posterior Model Odds: RE Agents with II.

Model	Pure RE(II)	Pure BR(AU)	Comp RE(II)–BR(AU)
LL	1692	1666	1708
Prob	0.0000	0.0000	1.0000

A very different picture now emerges when comparing the RE model with the behavioural alternatives. Two results are worth noting. First, RE with imperfect information (Pure RE(II)) wins the likelihood race against both Pure BR(AU) and Pure RE(PI). Again, in formal Bayesian language, the RE(II) model decisively dominates the pure BR-AU learning model and, not surprisingly, decisively dominates RE(PI), a finding that is consistent with that in Levine et al. (2012a). The second interesting result is that, when the composite heterogeneous expectations model is estimated assuming the same II set for everyone (Comp RE(II)–BR(AU)), it generates the highest log-likelihood value and outperforms all the competing models in fitting the data.

These results suggest that persistence can be injected into the NK model to improve data fit in two contrasting ways: bounded rationality with learning through heuristic rules or retaining RE but with II and Kalman-filtering learning.

Concluding Remarks

Our results for the workhorse NK model suggest a new perspective for the macro/NK/learning literature. Avenues for future work could embed the RE–BR composite model into a richer NK model along the lines of Smets and Wouters (2007), extend the linear Kalman filter to accommodate the non-linearity in reinforcement learning and use non-linear estimation methods to identify a number of parameters that cannot be identified using linear Bayesian estimation. The latter two non-linear extensions are major challenges. Future work could also examine optimal monetary policy and follow Geweke and Amisano (2012) and Deák et al. (2023) to address what has been called the ‘wilderness of non-RE’ to design a robust rule across all the BR model variants discussed in the article.

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The authors received no financial support for the research, authorship and/or publication of this article.

Notes

References

Adam, & Billi R. M. (2006). Discretionary monetary policy and the zero lower bound on nominal interest rates. Journal of Monetary Economics. Forthcoming.

Adam, & Marcet (2011). Internal rationality, imperfect market knowledge and asset prices. Journal of Economic Theory, 146(3), 1224–1252.

Angeletos G. M., Huo, & Sastry K. A. (2020). Imperfect macroeconomic expectations: Evidence and theory. Technical report. NBER Macroeconomics Annual.

Angeletos G. M., & Lian (2016). Incomplete information in macroeconomics: accommodating frictions on coordination. Handbook of macroeconomics. Elsevier.

Anufriev, Hommes, & Makarewicz (2015). Simple forecasting heuristics that make us smart: Evidence from different market experiments (Working Paper Series 29). Economics Discipline Group, UTS Business School, University of Technology, Sydney.

Benchimol, & Bounader (2019). Optimal monetary policy under bounded rationality. (Technical report, Working Papers 19/166). IMF.

Blanchard (2009). The state of macro. Annual Review of Economics, 1(1), 209–228.

Blanchard (2016). Do DSGE models have a future? Policy Briefs PB16-11, Peterson Institute for International Economics.

Blanchard, Dell’Ariccia, & Mauro (2010). Rethinking macroeconomic policy. Journal of Money, Credit and Banking, 42(s1), 199–215.

Blanchard, Dell’Ariccia, & Mauro (2013). Rethinking macro policy ii (Staff Discussion Notes 13/03). IMF.

Blanchard O. J., & Kahn C. M. (1980). The solution of linear difference models under rational expectations. Econometrica, 48(5), 1305–1311.

Blanchard, & Summers L. H. (2017). Rethinking stabilization policy: Evolution or revolution? (Working Papers 24179). NBER.

Branch W. A., & McGough (2010). Dynamic predictor selection in a new keynesian model with heterogeneous agents. Journal of Economic Dynamics and Control, 34(8), 1492–1508.

Branch W. A., & McGough (2018). Heterogeneous expectations and micro-foundations in macroeconomics. In Handbook of Computational Economics, volume 4. Elsevier Science.

Brock W. A., & Hommes C. H. (1997). A rational route to randomness. Econometrica, 65, 1059–1095.

Calvo (1983). Staggered prices in a utility-maximizing framework. Journal of Monetary Economics, 12(3), 383–398.

Cantore, Levine, Pearlman, & Yang (2015). CES technology and business cycle fluctuations. Journal of Economic Dynamics and Control, 61(C), 133–151.

Choi J. J., Laibson, Madrian B. C., & Metrick (2009). Reinforcement learning and savings behavior. Journal of Finance, 64(6), 2515–2534.

Christiano L. J., Eichenbaum M. S., & Trabandt (2018). On DSGE models. Journal of Economic Perspectives, 32(3), 113–140.

Cochrane J. H. (2016). Comments on ‘A Behavioral New-Keynesian model’. Technical report, Mimeo, Hoover Institution and Stanford University.

Cogley, & Sargent T. J. (2008). Anticipated utility and rational expectations as approximations of bayesian decision making. International Economic Review, 49(1), 185–221.

Coibion, & Gorodnichenko (2012). What can survey forecasters tell us about informational rigidities. Journal of Political Economy, 120(1), 116–159.

Coibion, & Gorodnichenko (2015). Information rigidity and expectations formation process: A simple framework and new facts. American Economic Review, 105(8), 2644–2678.

Coibion, Gorodnichenko, Kumar, & Ryngaert (2018). Do you know that I know that you know… (Working Paper No. 24987). NBER.

Collard, & Dellas (2004). The new Keynesian model with imperfect information and learning. mimeo, CNRS-GREMAQ.

Collard, & Dellas (2006). Misperceived money and inflation dynamics. mimeo, CNRS-GREMAQ.

Collard, & Dellas (2007). The great inflation of the 1970s. Journal of Money, Credit and Banking, 39, 713–731.

Collard, Dellas, & Smets (2009). Imperfect information and the business cycle. Journal of Monetary Economics, 56(S), 38–56.

De Grauwe (2011). Animal spirits and monetary policy. Economic Theory, 47(2–3), 423–457.

De Grauwe (2012a). Booms and busts in economic activity: A behavioral explanation. Journal of Economic Behavior and Organization, 83(3), 484–501.

De Grauwe (2012b). Lectures on behavioral macroeconomics. Princeton University Press.

Deák, Levine, Mirza, & Pham (2023). Negotiating the wilderness of bounded rationality through robust policy. School of Economics Discussion

Recent Developments in DSGE Modelling: Beyond FIRE
