
Abstract

This paper explores the notion of complementarity in the modeling of social systems. Complementary variables in physics exhibit a duality, and similar dualities are found in the analysis of social, political, cultural, and neurophysiological structures. In this paper, I show that the management of uncertainty in a hierarchical model exhibits certain dualities that closely parallel those observed in social systems. I argue that, due to the necessity of coordination within collectives, agents will mirror each other to a large extent, and will develop heuristics and tactics that are aligned with how each agent is managing uncertainty dualities. I connect sociological literature on social structures with neuroscientific, economic, and psychological developments, and show a simplified derivation of these connections from free energy principles. I then explore how these connections can be used to gain insights into observable socio-cultural processes.

Keywords

Complementarity, duality, Bayesian learning, cultural structure, free energy

“It is of course the ambition of every experimenter […] to make a discovery, to sail safely between the Scylla of intellectual prejudice which makes us reject evidence not really integrated without preconceived notions, and the Charybdis of irrelevance which has swallowed many working days spent in pursuit of instrumental artifice.” (Deutsch, 1958: 98)

Introduction

A collective of autonomous agents has a deeply embedded complementarity: each agent can be viewed as an individual, or as the collective, but not both at the same time. This is simply because one is the inverse of the other: individuals cannot be a collective, otherwise they’d lose their individuality; and a collective cannot be only made of individuals, otherwise it would cease to be a collective. However, autonomous members of a collective (agents) are both, at the same time: individuality and collectivity are inextricable (Breiger, 1974). In order to handle this inextricability and complementarity, an agent can make use of a hierarchical model, which involves a duality or tradeoff between abstract/coarse and concrete/precise representations. Using a concrete representation for itself as individual, and a more abstract representation for the collective, an agent can model both at the same time. However, the agent must carefully trade off these two complementary representations, to avoid being either swallowed by the Charybdis of concrete details and losing sight of the collective, or foundering on the Scylla of abstract social demands and losing sight of itself. The same tradeoff occurs in scientific discovery between abstract prejudices and concrete empirical details, elegantly expressed by Martin Deutsch as quoted in the epigraph.¹

This is the same tradeoff as between Berlin’s hedgehog and fox (Berlin, 1961). The hedgehog uses a single policy (or model) which works under many conditions (it curls up in a ball), while the fox figures out the entire current situation and works out a method for getting his way. The fox is a precise, fragile agent who focuses on the concrete, and the hedgehog is a coarse, robust agent focusing on the abstract. Foxy strategies, however, are favored by a Bayesian learner (MacKay, 2003), simply because the likelihood of data for the fox will be higher as it uses a lower bias (more precise) model. The fox is only able to handle a very limited number of situations, although it does these very well. The hedgehog is able to handle many more situations, but does not perform optimally in any of them. The same tradeoff appears in machine learning as the bias/variance tradeoff (Bishop, 2006): learned patterns in data can be underfit (high bias, abstract, low variance models, hedgehogs who give power to the model), or overfit (low bias, concrete, high variance models, foxes who give power to the data).
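The bias/variance reading of the hedgehog and the fox can be illustrated numerically. The following is a minimal sketch, not part of the model developed in this paper: the target function, the noise level, the polynomial degrees, and the names `fit_and_score`, `hedgehog`, and `fox` are all assumptions chosen for illustration. A constant fit stands in for the hedgehog's single policy, and a degree-5 polynomial stands in for the fox.

```python
import numpy as np

def fit_and_score(degree, n_datasets=200, n_points=15, noise=0.3, seed=0):
    """Estimate (squared bias, variance) of a polynomial fit of a given degree.

    Many noisy datasets are drawn from the same hypothetical target,
    sin(pi * x), and a polynomial of the given degree is fitted to each.
    """
    rng = np.random.default_rng(seed)
    x_train = np.linspace(-1.0, 1.0, n_points)   # fixed design, only noise varies
    x_test = np.linspace(-1.0, 1.0, 50)
    truth = np.sin(np.pi * x_test)
    preds = np.empty((n_datasets, x_test.size))
    for k in range(n_datasets):
        y = np.sin(np.pi * x_train) + rng.normal(0.0, noise, n_points)
        coeffs = np.polyfit(x_train, y, degree)
        preds[k] = np.polyval(coeffs, x_test)
    mean_pred = preds.mean(axis=0)
    bias2 = float(np.mean((mean_pred - truth) ** 2))   # systematic error
    variance = float(np.mean(preds.var(axis=0)))       # sensitivity to the data
    return bias2, variance

hedgehog = fit_and_score(degree=0)  # one coarse policy: high bias, low variance
fox = fit_and_score(degree=5)       # detailed model: low bias, high variance
print("hedgehog (bias^2, variance):", hedgehog)
print("fox      (bias^2, variance):", fox)
```

Under these assumptions the constant model is wrong in the same way on every dataset but barely moves from one dataset to the next, while the flexible model tracks the target closely at the cost of changing with each new sample.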

It is apparent that the same complementarity exists in the brain: abstract and concrete construals are complementary and inextricable from a neuroscientific perspective (Gilead et al., 2020). While concrete construals are complex and precise (e.g., a set of features, preferences, and probability distributions representing a domain ontically grounded in the world), abstraction allows for easier, more general, but less precise action selection (e.g., a general-purpose policy of action or apparently heuristic shortcut). Abstract construals gain importance in social situations due to an inability to estimate risk over concrete construals (ambiguity, Knightian uncertainty). Abstractions of social situations, however, are not descriptions of an objective world, but rather of an intersubjective, cultural, one. Part of the argument I present is that, for these abstractions to be useful, they must be shared among members of a collective, and so their tradeoffs between abstract and concrete construals will be similar.

Bohr (1950) may have been the first to propose that the complementarity between individuality (concrete) and collectivity (abstract) was a fundamental aspect of human social organization. Stephenson (1986a, 1986b) furthered this line of inquiry by describing social complementarity as between social structure and cultural structure, and Martin (2009) describes it succinctly in terms of passing from individual, instrumental, concrete tendencies or heuristics that shape the relationship between persons, to collective, relational, abstract tendencies that shape the relationships between sets of persons. That is, the social structure “allows for a generalization of the principles of action inherent in the structure, and the relationship seems like a ‘thing in itself’ that people can orient to, rather than orienting to one another” (Martin, 2009: 336). Martin (2009: 340) also notes that the “duality of heuristic and structure becomes generalized to the duality of value and position.” While heuristics and structures play a role when focusing on the individual, social evaluation (value), and social identity (position) play the same role when focusing on the collective. Once this shift has been made, an institution is created where individual heuristics become “unmoored” from their original social structure, and by entering the cultural structure, “it is these free-floating heuristics that we call institutions” (Martin, 2009: 339).

Martin (2009) also finds, based on the earlier sociometric work of Feld and Elmore (1982) and White (1992), that these tradeoffs are not binary, but ternary. Through analysis of three heuristics that are used by people to form collectives, Martin reveals that the complementarity between social and cultural structure may be expressed in three different ways in social groups. Martin shows how these three heuristics are mutually exclusive, in that using more of one means using less of the others, a ternary complementarity.² A social group is constrained to place shared weight on each heuristic, leading to a social organization for that particular balance of heuristics. I draw a parallel in this paper between these “styles” of social organization and three notions of freedom or equality arising in political science (Anderson, 2017). As agents adjust their representations to be more collective, higher bias models increase the equality of agents, while reducing their freedom to be individuals. These same higher bias models (hedgehogs) are more secure and robust against changing environments, that is, they have lower variance. Since there are three ways for agents to “be” a collective, there are three associated types of freedom and equality, as agents adjust three, mutually complementary tradeoffs (Hoey, 2021).

I show in this paper that these ternary complementarities are equivalent, and can be modeled with a two-level Bayesian model. Using a free energy based approach, I show how three dimensions of uncertainty (or bias) are revealed as three sets of parameters in the hierarchical model. These parameters represent three capacities of agents for representing information at a concrete or abstract level. Finally, I connect the three parameters of the Bayesian model to the three heuristics of cultural/social structures, and the three freedoms discussed above.

I start by reviewing related material from economics and psychology, and then introduce the basic concept of complementarity, followed by neuroscientific, social psychological, sociological, and economic versions of it, knitted together as expressions of the same thing. I then discuss how three different complementarities lead to three social structures and three notions of political freedom and equality. Following this, I introduce a simplified free energy model that can be used as a computational basis for linking together these various complementarities. Lastly, I go through a set of descriptive examples showing how this computational model could be used to explain some social and political behaviors.

Related work

An epistemic division of labor is implied by the formation of a collective, one that is in constant flux as the world interacts with itself. This division of labor involves a tradeoff, between what is beneficial to the group and what is beneficial to the individual. Many modern economic, game theoretic and artificial intelligence theorists account for this by using a two-factor preference or utility function, with one piece for the individual utility, and a separate piece for the utilities of others in the group, which are assumed to be known (a classic original paper is by Fehr and Schmidt, 1999). However, Adam Smith called such a modification of preference a “feeble spark of benevolence” (Smith, 1759: 159), and pointed to a “stronger power, a more forcible motive, […] reason, principle, conscience” (Smith, 1759: 159), which he associates with a love of honor, and a desire to avoid scorn. I argue here that praise and scorn (and other social emotions like shame and guilt) are in fact ladled out by the person being praised or scorned. He looks at himself through the mirror of society³ and thus “man has […] been rendered the immediate judge of mankind” (Smith, 1759: 151). How should an agent know how to ascribe praise and scorn to his own actions? He is so guided through his learning process, and by emotional signaling from others, to construct exactly that world in which he will feel such praise and scorn for his own actions.

One may ask why should this desire for praise and loathing of scorn not simply be a modification of preference? To make use of such a preference, an individual would need to know which of his actions will elicit praise and which will elicit scorn. To do that, he would need to maintain a model of each other person he interacts with, in essence, a theory of mind (ToM), in which the beliefs, desires and intentions of other agents are inferred from their behaviors. Such approaches arise from a philosophical tradition of methodological individualism: each agent is considered to be an independent, intelligent entity who can believe, desire, and intend independently of any other agents. Intelligent systems based on methodological individualism typically maintain an (incomplete) model of each other agent (Doshi and Gmytrasiewicz, 2009; Sutton, 2022), and use inductive (theoretical) or deductive (simulation) processes to plan the best joint actions to take. In such approaches, agents and their behaviors are measurable, objective elements that are responsive to other agents’ actions, and these actions are completely under the actor’s control. Although the view of the mind as being able to infer a model of other minds has become nearly incontrovertible in psychology (Goldman, 2012), proposals that go beyond theory of mind are being investigated (Gallagher, 2008; Kiverstein, 2011; Overgaard, 2017; Slors, 2012), and the ideas presented here fall into this category. These proposals are for a “minimal theory of mind” in which “we have one system for computationally efficient but inflexible mindreading and another system for flexible but cognitively demanding mindreading” (Slors, 2012: 20). In this minimal view, social emotions such as praise and scorn take center stage as strong forces for group participation.
This view of synchronized diversity in groups means abandoning ToM as an explanation for individual behavior, advocating instead for a theory in which agent A internalizes marginal beliefs about what all other agents in A’s group feel that A should be doing. The marginal belief is an estimate about how other agents feel on average, collectively, and makes salient the possible futures it is consistent with. Individuals can only optimize preferences over these possible futures: “…all rational thought moves within a non-rational framework of beliefs and institutions” (Hayek, 1960: 269). How they do this is based on which sets of heuristics they use: how they manage uncertainty.

I emphasize that this marginal belief is not reducible to globally shared modifications of individual preference. Suppose that people had individual preference functions, each with an additional term representing the group. The additional term (primarily the weight relative to the individual term) needs to be shared among all group members. There is nothing preventing an agent in this situation from modifying this shared weight to her advantage. When she does so, everyone does so, and the shared weight vanishes.

Mechanism design

The modern economic construct most similar to this idea is that of mechanism design in which a game is constructed so that the decision-theoretically rational move of each player includes adjustments for benefiting the group (Myerson, 1991). Each player acts as an individual, but the rules of the game are such as to limit his opportunity to only those actions that may also benefit the group. Most decision-theoretic approaches assume a principal, however, who fixes the mechanism. Further, it is assumed that the mechanism must be incentive compatible, that is, agents must prefer (have incentive) to follow the mechanism, given their individual utility functions. The approach I am presenting does not make either of these assumptions. Instead, the rules of the game are those patterns of neural activity that are brought into play by a given social context. Those not brought into play, those not made salient, and those that do not consume energy, are those which do not carry this group benefit alongside the benefit to the individual. The agent simply never notices these other, non-social, options. No principal is needed (unless one considers the entire society as a principal), and agent preferences may not be aligned with the mechanism imposed on them. For example, many might prefer to go topless in the summer when it’s hot, but would simply not see that as an option, even though there is nothing preventing it. The social stigma associated with certain social prescriptions runs deep, and blinds many to opportunities. A similar process occurs in scientific investigation, in which a paradigm sets the rules brought into play, and the resulting scientific experiments benefit the accepted paradigm (Kuhn, 1962).

Dual-process models

The approach I am presenting in this paper is closely related to Bayesian dual-process models often used in cognitive science. Oaksford and Chater (2001) discuss how logical reasoning is often difficult for humans, especially in defeasible or ambiguous situations, and how probabilistic models and Bayesian reasoning can give more parsimonious explanations for human deviations from rationality. The ideas I am discussing fall very clearly under the same umbrella, and tackle the important question of “the balance of System-1 versus System-2 processes in human reasoning” (Oaksford and Chater, 2001: 356). This balance is well studied in social psychology (Chaiken and Trope, 1999), but many different terms are used to refer to the two levels of processing. “Cognitive” processing is often referred to as deliberative, reflective (Ortony et al., 2005), conscious (Smith et al., 2019) or “System 2” (Stanovich and West, 2000), whereas “emotional” processing is called automatic, routine (Ortony et al., 2005), or “System 1” (Stanovich and West, 2000). Behavioral economists have brushed against computational dual-process models by proposing a variety of mechanisms that explain the experimental evidence of pro-social (e.g., cooperative) behavior in humans. Early work on motivational choice (Messick and McClintock, 1968) proposed a probabilistic relationship between game outcomes (payoffs) and cooperative behavior. This led to the proposition that humans make choices based on a modified utility function that includes some reward for fairness (Rabin, 1993) or penalty for inequity (Fehr and Schmidt, 1999). Modifications to the utility function based on identity have also been proposed (Akerlof and Kranton, 2000). A similar idea was taken up by Hoey et al. (2021), in which the sociological affect control theory (Heise, 2007) is used as a “System 1” process encoding societal norms and prescriptions.
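For readers unfamiliar with the modified utility functions mentioned above, the inequity-penalty form of Fehr and Schmidt (1999) can be sketched in a few lines. This is the kind of two-factor preference the present paper argues against, shown only for reference. The function name `inequity_averse_utility` and the numerical values of `alpha` and `beta` are illustrative assumptions.

```python
def inequity_averse_utility(payoffs, i, alpha, beta):
    """Fehr-Schmidt style utility of player i over material payoffs.

    alpha weights disadvantageous inequity (others earn more than i);
    beta weights advantageous inequity (i earns more than others).
    Both penalties are averaged over the other n - 1 players.
    """
    n = len(payoffs)
    own = payoffs[i]
    others = [x for j, x in enumerate(payoffs) if j != i]
    envy = sum(max(x - own, 0.0) for x in others)
    guilt = sum(max(own - x, 0.0) for x in others)
    return own - alpha * envy / (n - 1) - beta * guilt / (n - 1)

# Two players splitting 14 units unevenly (illustrative weights).
rich = inequity_averse_utility([10.0, 4.0], i=0, alpha=0.8, beta=0.4)
poor = inequity_averse_utility([10.0, 4.0], i=1, alpha=0.8, beta=0.4)
even = inequity_averse_utility([7.0, 7.0], i=0, alpha=0.8, beta=0.4)
print(rich, poor, even)
```

Note that the weights `alpha` and `beta` belong to the individual, and the payoffs of the others are assumed to be known, which is exactly the assumption questioned in the preceding sections.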

Evans (2008) differentiates between parallel-competitive (PC) and default-interventionist (DI) dual process models. In PC, instrumental (System 2) plans of action are, if used sufficiently, hard-coded into an associative network that can later be quickly retrieved by System 1, and which then competes with ongoing System 2 reasoning for a given situation. In DI, the System 1 process sets a context in which the System 2 process can reason. The dual parallel constraint networks of Glöckner and Betsch (2008) are an example of DI in which the model has a “primary” network that rapidly settles to a maximally coherent description of the context and the revelation of an option to take, and a “secondary” network that is called upon if the primary network cannot find consistency.

In cognitive science, there are many examples of hierarchical or “dual process” models. Consider a recent example in Hawkins et al. (2023), which uses a Bayesian hierarchical model of language to show how a number of effects arise, such as the convergence to efficient communication strategies in a repeated reference game. Their model has three levels and three associated parameters, much like the model I discuss in this paper. The results in Hawkins et al. (2023) are applied to a specific case of a reference game, meaning that elements of the model take on more specific meanings and functional forms, but the underlying social uncertainty principle is the same. Some results in Hawkins et al. (2023) are presented over a range of parameter settings, but the main results are shown for one specific setting that highlights the effect they are looking for. The model I am presenting in this paper shows how the parameters are arbitrary, but mutually constraining, and is able to model what effects will occur with different parameter settings.

Basic concepts

In this section, I review complementarity from mathematical, neuroscientific, and social-psychological perspectives.

Complementarity

In physics, conjugate variables are sets of variables that are complementary in that a signal can be represented in one or the other, but not both or neither.⁴ Further, it is often more helpful to frame a situation in terms of one feature or the other. Features f₁ and f₂ of some object are conjugate if (Goldstein et al., 2002):⁵

\begin{align} f_{1} &= \text{derivative of the object’s action with respect to } f_{2} \\ &= \frac{da}{df_{2}} \end{align}

(1)

Features related in this way exhibit complementarity because, as the measurement precision of one feature increases, the precision of the other must decrease, and so there is an associated (non-quantum) uncertainty principle,

Δ f_{1} + Δ f_{2} = σ^{2}

(2)

where Δf_i is the (inverse) precision (or uncertainty, variance) modeled by feature f_i, and the sum of the two uncertainties must be the variance of sensory data, σ².

Neuroscientific complementarity

Gilead et al. (2020) describe a model of the functioning of the human brain as a hierarchical set of levels of abstraction, each of which has a two-fold structure. The abstractum is a general description of the situation at some level, while the concretum is the specifics of the situation at that level. A pair of concreta and abstracta lie in two distributed functional networks in the brain, and are mutually exclusive in the sense that information must be represented in one or the other, but not both. Shapira et al. (2012) review this as an “accuracy/detail” tradeoff that relates to the bias/variance tradeoff in machine learning. As something unknown is represented in greater and greater detail (lower bias model, less abstraction), it will be more variant in the world (higher variance), meaning that training the model on different instances of the same abstract category will result in different models. As the unknown is represented more and more abstractly (higher bias model), it will be less variant in the world. Gilead et al. (2020) use an example of trying to predict what a blind date will look like. A low bias, high variance model would be a complete description of all the characteristics of the date (e.g., 5′11″, blonde hair, blue eyes), but it would often be wrong. A high bias model would be “like a human,” which would be a poor description of any specific date, but would be correct in almost all cases (low variance).

Thus, I conjecture that neuroscientific abstractum and concretum of Gilead et al. (2020) are conjugate:

abstractum = derivative of action with respect to concretum.

That is, the abstraction of a concept with respect to an action is the change of that action with respect to concrete instances of that concept. The action can be physical interaction with the world or, more broadly, changes in mental or emotional state. As previously noted (footnote 5), it is more like a measure of energy flow than a discrete entity.

Consider also that neural processing (projection, posterior calculation) of abstracta and concreta will mutually affect each other, but the order in which they occur is significant. If I know exactly what the situation is (to precise detail, so this rules out most interactions with humans), then the abstractum is not necessary, and so is maximally variant. However, in an ambiguous situation, the concretum is more variant and the abstractum must be more precisely defined to obtain action. Conversely, if the abstractum is perfectly defined, all possible changes in the concretum still need to be accounted for, and so the concretum would become maximally variant (relatively). If the abstractum is loosely defined, a single concretum is precisely specified. What this implies is that the degree of variation in abstractum and concretum must be related by a (non-quantum) uncertainty principle:

Δ A + Δ C = σ^{2},

(3)

where A = abstractum and C = concretum, Δ denotes the variance in the model, and σ² is the actual variance in the agent’s environment (which we take as fixed). Thus, decreasing the variance of one of A or C means the variance of the other must increase to be able to handle the variance in the environment. If σ² changes, then one or both of ΔA or ΔC also has to change.
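One standard identity with the same additive form as equation (3) is the law of total variance: once observations are sorted into categories, the total variance splits exactly into a between-category part and a within-category part. The sketch below is an illustration under an assumed mapping, not a derivation of equation (3): it reads the between-category variance as ΔA and the within-category variance as ΔC. The function name `split_variance`, the random data, and the groupings are hypothetical.

```python
import numpy as np

def split_variance(values, labels):
    """Return (between-category variance, within-category variance).

    Uses population variances weighted by category size, so the two
    parts always sum to the total variance of `values`.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    grand_mean = values.mean()
    between = 0.0
    within = 0.0
    for g in np.unique(labels):
        members = values[labels == g]
        weight = members.size / values.size
        between += weight * (members.mean() - grand_mean) ** 2
        within += weight * members.var()
    return between, within

rng = np.random.default_rng(1)
data = rng.normal(size=12)          # stand-in for sensory data
sigma2 = data.var()                 # fixed environmental variance

# Three ways of carving the same data into abstract categories.
one_category = split_variance(data, np.zeros(12, dtype=int))
three_categories = split_variance(data, np.repeat([0, 1, 2], 4))
every_case_unique = split_variance(data, np.arange(12))

for name, (d_a, d_c) in [("one category", one_category),
                         ("three categories", three_categories),
                         ("every case unique", every_case_unique)]:
    print(f"{name}: dA={d_a:.3f}  dC={d_c:.3f}  sum={d_a + d_c:.3f}")
```

With a single all-purpose category the abstract level carries no variance and the concrete level carries all of it. When every case is its own category the roles reverse. In between, any reduction in one term is matched by an increase in the other, because their sum is pinned to σ².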

Equation (3) may be used to give a simple interpretation to the results in Shapira et al. (2012), in which it is shown that people use a more abstract categorization in situations of increased uncertainty. Indeed, increases in concrete uncertainty lead to ΔC increasing, which according to equation (3) implies that ΔA must be decreasing. As ΔA is associated with modeling uncertainty, this implies that the abstracta will be forced to use simpler models.

Social psychological and sociological complementarity

The social psychology of complementarity originated with Bohr (1950) and was later taken up by Stephenson (1986a, 1986b). Stephenson’s idea was to do a factor analysis (called Q) in the space of people (across variables of interest) rather than one (called R) in the space of variables of interest (across people) (Burt, 1940). In doing so, the complementarity of these two approaches was revealed. In the one, Q, the world is viewed in terms of a cultural structure, that is, the inter-relationships between networks of types of people (e.g., a neighbor is someone who is helpful and friendly). In the other, R, the world is viewed in terms of a social structure, that is, how specific individuals are connected to each other (e.g., I see my neighbor Chad every day) (Wallace, 1983). It is perhaps easiest to think of social structure as a property of individuals (e.g., I am connected to Chad and he is connected to me), while cultural structure is a property of the enclosing nested set of groups to which an agent belongs (e.g., the particular social practices and norms of my neighborhood, enclosed in the western cultural feelings about neighborliness more generally). Factoring people in the space of features (R) gives a subspace spanned by vectors of features. Factoring features in the space of people (Q) gives a subspace spanned by vectors of people. These two subspaces are complementary: one (R) is estimating variance across data in a space of features, while the other (Q) is estimating variance across features in a space of data.
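The relationship between the two factorings can be made concrete with a small numerical sketch. It is a simplification under stated assumptions: it uses raw cross-product matrices of a random people-by-features array rather than the standardized correlation matrices of an actual Q or R analysis, and the sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_features = 5, 40
# Rows are people, columns are features (random placeholder data).
X = rng.normal(size=(n_people, n_features))

# R-style view: relationships among features, taken across people.
r_matrix = X.T @ X          # shape (n_features, n_features)
# Q-style view: relationships among people, taken across features.
q_matrix = X @ X.T          # shape (n_people, n_people)

# Eigenvalues in descending order for each view.
r_eigs = np.sort(np.linalg.eigvalsh(r_matrix))[::-1]
q_eigs = np.sort(np.linalg.eigvalsh(q_matrix))[::-1]

print("R view size:", r_matrix.shape, " Q view size:", q_matrix.shape)
print("leading R eigenvalues:", np.round(r_eigs[:n_people], 3))
print("Q eigenvalues:        ", np.round(q_eigs, 3))
```

In this uncentered setting the two views share the same nonzero eigenvalues, so they carry the same variance expressed in two different bases, one made of feature vectors and one made of person vectors. With proper standardization (by variable for R, by person for Q) the spectra would no longer coincide exactly, so the sketch should be read as intuition for the duality rather than as Stephenson's procedure.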

Now consider the size of these spaces. There are potentially thousands of features one can use to describe a person: race, age, clothing style, eyelash length, etc. In appraising a person in terms of features (R), one gets a very high dimensional vector, leading to a computationally intensive process. In contrast, appraising a person in terms of people (Q), one can use a much lower dimensional space, resulting in less computational overhead. Since cultural and social structures are complementary, I may re-write the definition of complementarity (equation (1)) as:

cultural structure = derivative of action w.r.t. social structure.

The cultural structure defines how my action should change as a function of changes to the social structure. This duality between structure (“objective pattern of relationships”) and culture (“subjective understandings guiding relationship formation”) is discussed at some length by Martin, who points out that the content of a relationship “must be one that makes reference to [the] subjectively understood action imperatives…” (Martin, 2009, all quotes p. 17).

This type of reasoning is well modeled by the social psychological Bayesian Affect Control Theory (BayesAct) (Hoey et al., 2016), which fuses the sociological concepts of identity with a control principle and a computational implementation, and is a model that stresses emotional consistency as a driver of action (i.e., is something humans seek). BayesAct is a sociological model in that it assumes individuals are aligned with their social embeddings (groups they belong to), but allows for individual deviance from the social average. That is, an individual may not share these average social meanings, perhaps because they are from a different cultural or sub-cultural group (e.g., people who distrust Western medicine). I will not go further into details of this method, but I overview these models briefly in the supplemental material.

Three freedoms and three forms of dominance

The search for a mechanism that connects individuals with collectives can be approached from the collective side, or the individual side. In the former (this section), I start by describing different types of freedoms from a political perspective, based in the work of Anderson (2017). In the latter (next section), I start from individual heuristics for connecting with others, based in the work of Martin (2009). In both cases, the complementarities that I find are ternary, not binary. Once I sketch out these ternary complementarities, I will consider a Bayesian model that exhibits the same properties.

Anderson (2017) describes three types of political or personal freedom as positive, negative, and republican. Duals to these freedoms are measures of equality with corresponding labels, positive, negative and republican equalities. Table 1 shows these freedoms/equalities analyzed along various dimensions. The third row shows what freedom means for this type: what exactly is free? The fourth row shows what the equivalent notion of equality is, which requires some form of control: who/what ensures that people remain equal? The first two rows show the advantage that comes with the corresponding freedom (column), and the type of uncertainty experienced when that freedom is obtained. The last two rows show the type of certainty/security obtained when the corresponding freedom is relinquished, and its corollary, the way in which private property is protected. In the following, I give more details on each type of freedom/equality.

Table 1.

Three dimensions of freedom/equality in tabular form. Columns are the three freedoms, while rows are (1) the advantage gained by having that freedom; (2) the type of uncertainty; (3) the type of freedom (Graeber and Wengrow, 2021); (4) the corresponding type of dominance/equality; (5) the advantage gained by losing the freedom; (6) how private property is ensured when the freedom is lost.

	Positive	Republican	Negative
Advantage	Opportunity	Liberty; independence	Agency; deviance; freedom to move
Uncertainty	Uncertain knowledge	Uncertain safety	Existential uncertainty: Who are we? What do we feel?
Freedom	Freedom to explore	Freedom to disobey orders	Freedom to be; freedom to organize society
Control and equality	Control of information: (rational) plan	Control of violence: guns	Control of identity: habits, rituals
Certainty and security	Rational plan; expertise; technology	Monarch; force	Identities, habits, rituals, schismogenesis
Private property	By law	None (monarch owns everything)	By trust/respect: everything is God’s, and property is stewarded, not owned

Positive freedom implies opportunity: there are more options open to agents. Uncertainty is over knowledge, facts and future possibilities, of which there are many more in a state of positive freedom. Conversely, positive equality is maximized when everyone is acting in exactly the same way and the world is predictable and everyone is aligned. In such a state, everyone acts according to a single plan, which could be a rational plan. Typically, this type of positive equality is guaranteed through control of information, and security is ensured by the rational plan that is implicit in what information is provided. That is, people are given information sufficient for them to infer the correct plan, and no more. Private property is also secured by the rational plan. Rational plans can be practically implemented through laws, or through technology. “Order had been sought before, […] in drill, regimentation, inflexible social regulations, the discipline of caste and custom: after the seventeenth century it was sought in a series of external instruments and engines” (Mumford, 1934: 364). It is fairly well accepted now that in environments that present ambiguity or cognitive difficulty, agents rely more on social learning (Henrich, 2020: 64), which means they rely more on others. Thus, increasing positive freedom decreases negative and republican freedoms.

Republican freedom means people are not subject to anyone’s unaccountable will, and is also known as independence or liberty. This freedom to disobey orders (or simply a lack of orders), brings with it uncertainty about safety, as no one is constrained (to not interfere with other people). Conversely, republican equality means everyone is equivalent and dependent, but coercion is needed to ensure that agents follow the same plan: someone controls the coercion, typically with weapons. Security is provided by a prince or despot, such that all independence is removed by subjugation to the despot’s unaccountable will. People are treated equally, but must conform to the despot, and so republican equality implies homogeneity in the population. However, within that homogeneity, a smart and honest despot gives his subjects lots of opportunities (positive freedoms) and lets them have free choices (negative freedoms) but can impose an arbitrary will to ensure everyone is steering in the same direction. “Private” property is ensured because it does not exist: Everything is owned by the despot. Liberty means no one is forced to do anything, so to have a functioning society, order must be provided through a loss of positive freedom (fewer opportunities) or negative freedom (more social constraints).

Negative freedom is defined by an agent’s freedom to choose actions from whatever choices are available. This means a freedom to organize society in whatever way desired, but the result is an existential uncertainty. Negative equality, on the other hand, means everyone is highly constrained (to do only that which everyone else thinks they should be doing). Moving toward negative equality means removing people’s abilities to choose their own actions, but without coercive force. One way to do this is by defining social identities or roles, and then inducing stringent requirements on how actions should be coherent with these identities. If these culturally approved dynamics become institutionalized, they remove negative freedom of individuals to act in whatever way their will directs them. Thus, an increase in coherence between (seemingly self-imposed) actions and behaviors leads to a reduction in the space of actions under consideration, accompanied by a corresponding increase in negative equality in which actions are constrained by social prescriptions. These social prescriptions are usually emotional, for example, shame-inducing if broken, and can be ensured by a highly charismatic leader (or God) who sets the prescriptions. A state of “world closedness,” extracted from a state of “world openness,” is a result (Berger and Luckmann, 1967: 51). Importantly, the negative freedoms removed through cultural dynamics are hidden from the actor in the sense that she does not perceive them at all. In the words of Sartre (1943: 358): “In so far as I am the object of values which come to qualify me without my being able to act on this qualification or even to know it, I am enslaved.” Security is provided through the enactment of these social prescriptions, through habits, rituals and even schismogenesis. Private property is ensured through trust, by sentiment, by mutual respect, in somewhat the same sense as in republican equality. There is not really ownership, but rather stewardship of some communal property. For example, although I don’t own the house I live in (I rent it), everyone walking by would say “that’s Jesse’s house.” Negative freedom means everyone can be who they want to be, and social order must be ensured through reductions in liberty or opportunity.

The word “negative” sometimes causes confusion in this context, so I will clarify. Positive freedom means there are things you can do if you want: there is opportunity. Negative freedom means there are not things that you cannot do: you have agency. Often negative freedom is reduced by the values of the society an agent is embedded in. For example, everyone has the positive freedom to go topless in Canada: dress is not restricted by law, the intent of which is to increase equality between genders. However, many Canadians do not have sufficient negative freedom to go topless: the consequent social shaming or aggression may reduce their agency to do so. Republican freedom is easier to distinguish: it is a lack of coercion. The same Canadian is not coerced to not go topless in Canada, coherently with the law. However, in some countries, the same person may be coerced to cover up by threats (e.g., of imprisonment).

The distinction between negative and republican equality is examined in depth by Durkheim (2014/1893), who labels them as organic and mechanical solidarity. Durkheim’s analysis is primarily about the social changes occurring during the industrial revolution, where the collectivity shifted from mechanical solidarity (solidarity through similarity, homogeneity of the population, rule by a prince) to organic solidarity (solidarity through division of labor, heterogeneity of the population, specific and complementary roles for individuals). Further, Durkheim (2014/1893) points to a difference in how the law is applied in each society. While mechanical solidarity requires punishment or retribution to enforce conformity, organic solidarity requires restitutive legal measures to ensure sufficient equality. I return to these different forms when discussing state emergence.

To summarize, the three freedoms are as follows:

• Positive freedom is opportunity—as positive freedom is increased, more opportunities for action are added. Opportunities for action are usually social opportunities—for example, other people. It is positive because as it increases, opportunities increase (added, positive).

• Republican freedom is freedom from violence, or having no one coercing anyone else to do something (usually with threats of force). It is also negative in the sense that it is increased by the relaxation of constraints (e.g., threats). However, these are concrete threats to a person or their property or their family, while negative freedoms are abstract constraints on the individual.

• Negative freedom is freedom from constraints from others (or the “freedom to build new social worlds” (Graeber and Wengrow, 2021)). It is negative because as it increases, agency increases, and constraints decrease (subtracted, negative).

Heuristics for forming collectives

Martin (2009) describes three mutually exclusive heuristics that are observed to occur in human collective formation, based on sociometric studies and reflecting the work of Feld and Elmore (1982) and White (1992). These three heuristics are used by humans when dealing with ambiguity, and I can connect them with the three freedoms discussed in the previous section as follows.

• Choose those who are chosen by others. This heuristic operates at the individual level and implies preferential attachment, meaning a reduction of positive freedom (there are fewer opportunities with only one patron).

• Choose alter if alter chooses ego. This heuristic operates at the level of dyads and implies reciprocity, meaning a reduction in republican freedom (to be really equivalent, the dyad must be controlled by a third party as neither can take control unilaterally).

• Form a collective or small world. This heuristic operates at the group level and implies transitivity, meaning a reduction of negative freedom (one must be labeled as a group member).

Martin’s analysis starts by noting that observable characteristics, if stable and easy to estimate, are the source of preferential attachment (e.g., obey the bigger agent). However, if such characteristics are unstable or difficult to estimate (as in ambiguous social situations), then more interaction is needed. Thus, a complementarity is revealed between how precisely definable the environment is, and how much interaction with others is needed. To complicate matters, one can interact with others in two different ways, based on dyadic relationships of equality or on collective relationships of equality. Once again, a ternary structure is revealed between three complementarities. The apices of this ternary structure can also be related to the three notions of freedom discussed above.

I note these also appear related to the set of three evolutionary heuristics proposed by Nowak (2006). Kin selection involves choosing those whom others (your kin) choose, evolutionarily. Reciprocity maps to Martin’s second heuristic, while Nowak’s indirect altruism encourages transitivity (A’s action on B is observed by C, such that C expects the same from A, depending on C’s relationship with B), as Martin’s third heuristic.
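Each of the three heuristics above leaves a measurable trace in a directed network of choices. The following is a minimal sketch, assuming a hypothetical list of (chooser, chosen) pairs; the function names and the particular measures (in-degree, share of reciprocated ties, share of closed two-step paths) are illustrative assumptions, not Martin’s own operationalization.

```python
def in_degree(edges):
    """Count how often each agent is chosen (trace of 'choose those chosen by others')."""
    counts = {}
    for chooser, chosen in edges:
        counts[chosen] = counts.get(chosen, 0) + 1
        counts.setdefault(chooser, 0)
    return counts


def reciprocity(edges):
    """Share of directed ties that are returned (trace of 'choose alter if alter chooses ego')."""
    edge_set = set(edges)
    mutual = sum(1 for a, b in edge_set if (b, a) in edge_set)
    return mutual / len(edge_set)


def transitivity(edges):
    """Share of two-step paths a -> b -> c that are closed by a -> c (trace of group formation)."""
    edge_set = set(edges)
    paths = 0
    closed = 0
    for a, b in edge_set:
        for b2, c in edge_set:
            if b2 == b and c != a:
                paths += 1
                if (a, c) in edge_set:
                    closed += 1
    return closed / paths if paths else 0.0


# Hypothetical sociometric data: each pair reads "chooser chooses chosen".
choices = [
    ("ann", "bob"), ("bob", "ann"), ("cat", "bob"), ("dan", "bob"),
    ("ann", "cat"), ("cat", "dan"), ("ann", "dan"),
]

popularity = in_degree(choices)
mutual_share = reciprocity(choices)
closure_share = transitivity(choices)
```

In this toy network one agent collects most of the choices, few ties are returned, and half of the two-step paths are closed, so the three traces can be read off separately from the same data.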

Uncertainty

The biggest hurdle faced by an agent attempting to optimize success by modeling their social world is uncertainty. I take uncertainty to be grounded in the physical world, of which one component is the agent doing the optimization. Taking a Bayesian viewpoint, uncertainty is a degree of belief. However, it is also a property of the world, as in I am uncertain if the tree bough will break in this great wind. The tree bough’s structural integrity is not easy to predict.

Uncertainty is handled by people in three complementary ways, which correspond to three things at play: the individual, the group, and the connection between the individual and the group. Another way of saying this is the objective (external, the group), the subjective (internal, the self), and the connective (membership in the group). The representations of the social context in an agent’s brain or mind pervade reason and thought, and the way in which each agent in each context trades off the social and individual contexts will be defined by, and will define, the social order and thus reality: “the relationship between the individual and the objective social world is like an ongoing balancing act” (Berger and Luckmann, 1967: 134).

Returning to Table 1, the second row (“Uncertainty”) shows the different types of uncertainty, and the fifth row shows the corresponding certainty/security that comes with a reduction in that uncertainty/freedom. Positive freedom corresponds to uncertain knowledge: The external states are highly sensitive to the ecological niche, including the actor himself. Each agent has many different models, or ways of seeing the world, and can exercise discretion over which one is used in a given situation. The actor may be in a state with great uncertainty (e.g., waiting for a test result), or in one of great certainty (where he is an expert, or where he is following a rational plan designed by an expert). Republican freedom corresponds to uncertain safety: without a coercive force enforcing a plan, agents’ individual plans are likely to cross paths, possibly leading to conflict and danger. As this freedom is reduced, safety is increased through the protection of the prince’s use of force. Finally, negative freedom corresponds to existential uncertainty, in the model of the “self” (which may include a model of the world as well). High confidence is usually a hallmark of a lot of negative certainty, and may be increased through rituals and schismogenesis.

Thus, different uncertainties in a two-level model lead to different freedoms and certainties in a social group. As I show in the next section, these three dimensions are a basic property of hierarchical Bayesian models, but it is my construction to map these three dimensions to three types of freedom. Since these three dimensions are properties of a Bayesian hierarchical model, they are highly likely to be at play in each individual’s brain, which means they’re highly likely to be at play in how cultural and social structures develop. I further hypothesize that these same mechanisms are at play collectively. That is, both group and individual must be doing this in the same way.

Free energy

In this section, I derive a simple Bayesian model used as a minimal working example that ties these different complementarities together. The key idea is that it is the management of uncertainty across a set of (at least two) complementary levels of abstraction that defines how a social system connects individuals with the collective. I approach from an information theoretic standpoint, starting with a discussion of free energy.

The complexity of an agent’s environment, defined as the total number of configurations accessible to the agent, is related to the free energy (MacKay, 2003). A configuration is one potential arrangement of the state of the world in the future. The accessible configurations are those that the agent can achieve as a function of its actions (and possibly exogenous events including the actions of other agents). The accessible configurations are defined by (and define) the agent’s econiche (Bruineberg and Rietveld, 2019). This is the expected set of temporally dynamic configurations of other agents and objects which are accessible to the agent in its current environment. The dynamic stability of these configurations is crucial to the establishment of a long-lasting equilibrium or steady state. I will say an agent is aligned with his econiche (and thus to all other agents and objects in it), if the agent is at an equilibrium in which the number of expected configurations is minimized (or at least the probability over them is maximized, and the free energy is minimized). An agent wishing to assess the free energy and to use it as a guide for action needs to model the relative probabilities of all the future configurations of the world. Whether these configurations are known or not, a successful agent must predict how likely each one is. Importantly, it must include itself within this prediction.

As an agent’s world becomes more complex (e.g., with the addition of other social agents), this true free energy becomes intractable to model within an agent’s resource bound, and so the agent must approximate. The agent, primarily interested in evaluating how surprising outcomes will be, will need to average over all possible future configurations, something that it cannot reasonably do. It can, however, construct something that is always bigger (more surprising) than the surprise of the environment, known as the variational free energy. An agent’s variational free energy is an internal measure of how well its (approximate) model fits the real world, combined with the true degree of surprise (the true free energy). A very well matched agent will not feel surprised very often, whereas a mis-aligned agent will often feel surprised. As surprise is a key factor in survival, this concept is supported by an intuitive evolutionary argument. In fact, even evolution is thought to follow a similar optimization of surprise, leading to species that are better and better adapted to their econiches (Badcock et al., 2019). Minimizing this variational free energy is the same as moving toward a state of true free energy which is minimal, and to a model that is the best possible abstraction of the real world given the resource bound, and is consistent with the econiche in which it is embedded (Bruineberg and Rietveld, 2019; Friston, 2010). Typically the variational free energy is optimized in an iterative approach where expectations over the future given a set of models are combined with expectations over the models (including the agent itself and its goals).

The variational free energy can be written by considering that, to survive, an agent can be viewed as constructing a generative model of its sensory readings, o, known as the evidence and written p(o). Agents want to maximize the evidence, because they do not want to be surprised by o. Agents can maximize this by minimizing −log p(o), which is the definition of surprise and of the true free energy: the degree of predictability and expressiveness of the future, from an absolutely collective perspective. To accomplish this, the agent can infer a set of states, s, which constitute its belief about the true state of the world, s*, and sum out these states before taking the negative logarithm,

- \log \sum_{s} p (o, s) .

(4)

Sometimes (as in the next section) this is conditioned on model parameters, −log ∑_s p(o, s|θ). This expression will be difficult to compute due to the summation, but we can use a variational method to approximate it. This involves multiplying through by a function q(s), and then using Jensen’s Inequality to derive the most commonly used expression for (what is now) the Variational Free Energy:

\mathrm{KL} (q (s) \,\|\, p (s)) - \sum_{s} q (s) \log p (o | s)

(5)

The first term is the KL-divergence, or difference, between the variational approximation of the posterior, q(s), and the prior expectation p(s). This is the complexity given by the number of bits needed in the model to predict the future from the prior. Futures that are less predictable (from p(s)) require more complex models. The second term is the accuracy: the expectation of the outcomes given the state. Notice the same complementarity here between complex models that are highly accurate but fragile, and simple models that are not accurate, but robust: both optimize the free energy in different ways. The next challenge is that an agent cannot estimate p(o|s) in a predictive model simply because it has not observed o in the future, and must therefore do so in expectation given its current model.
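The bound expressed by equation (5) can be checked on a small discrete example. The sketch below is a minimal illustration, assuming three hypothetical hidden states and made-up values for p(s) and p(o|s); it computes the surprise of equation (4) and the variational free energy of equation (5) for two choices of q(s).

```python
import math


def surprise(prior, likelihood):
    """Equation (4): -log sum_s p(o, s), with p(o, s) = p(o | s) p(s) for one fixed o."""
    return -math.log(sum(p * l for p, l in zip(prior, likelihood)))


def variational_free_energy(q, prior, likelihood):
    """Equation (5): KL(q(s) || p(s)) - sum_s q(s) log p(o | s)."""
    complexity = sum(qs * math.log(qs / ps) for qs, ps in zip(q, prior) if qs > 0)
    accuracy = sum(qs * math.log(ls) for qs, ls in zip(q, likelihood) if qs > 0)
    return complexity - accuracy


# Hypothetical numbers, chosen only for illustration.
prior = [0.5, 0.3, 0.2]        # p(s)
likelihood = [0.9, 0.2, 0.1]   # p(o | s) for the single observed o

evidence = sum(p * l for p, l in zip(prior, likelihood))
exact_posterior = [p * l / evidence for p, l in zip(prior, likelihood)]
uniform_q = [1 / 3, 1 / 3, 1 / 3]

true_fe = surprise(prior, likelihood)
vfe_exact = variational_free_energy(exact_posterior, prior, likelihood)
vfe_uniform = variational_free_energy(uniform_q, prior, likelihood)
```

With q(s) set to the exact posterior the two quantities coincide, while any other q(s), such as the uniform one here, gives a strictly larger variational free energy.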

Policies of action (mappings from sensor readings to motor commands) can be computed using this variational free energy approach, and yield a quite general class of normative model of intelligence as based on active inference (Ramstead et al., 2019). This class includes solution techniques that can be used on the type of deep Bayesian model I am introducing in this paper. The computation of a policy of action resolves to inferences about model parameters, coupled with inferences about expected futures. The optimization is usually termed the free energy principle, which states that agents try to maximize control of their perceptions. That is, agents use their output to try to control their input, and only indirectly have effects on the world itself. This is fundamentally different than the traditional approach in artificial intelligence and engineering of mapping from input to output. In the free energy approach, an agent’s primary goal is to build a sufficiently flexible internal model of the world in which he is embedded, and of his actions in that world and their effects. The agent can do this in two ways: by constructing a better internal model, or by modifying the outside world directly, both aimed at making the environment less surprising. One way to modify the internal model is through a set of parameters that governs how precise/sensitive the model is with respect to the incoming data distribution, roughly corresponding to the flexibility/security tradeoff I have been discussing in this paper. To ensure the model has sufficient flexibility and accuracy, a Bayesian learning approach is followed, as described in the following section.

Bayesian learning

A Bayesian learning problem is one of computing a distribution P(m|d) over models, m ∈ M, given a dataset, d. I can compute this using a prior over models P(m) and a likelihood function giving the expected data given each model, P(d|m). The Bayesian learning problem is then:

P (m | d) = \frac{P (d | m) P (m)}{\sum_{m \in M} P (d | m) P (m)} .

(6)

This learning problem is difficult because of the denominator on the right side, also known as the partition function, the negative logarithm of which is the free energy, G. This summation is over a potentially very large set of models, M. Each agent’s concern is to keep this summation as small as possible, but big enough to handle the potential variability of the complex system around it. An agent using a fixed model m (a non-Bayesian) would not need to do this summation, but would struggle if P(d|m) was constant across many possible datasets d, a condition signaling that the model is uninformative with respect to the domain: the weight of evidence for m is not there, and it is challenging to predict future data. The Bayesian, on the other hand, would simultaneously have another model, m′ ∈ M, for which there may be sufficient variability in P(d|m′) to allow the agent to learn a function that predicts future data. This (instantiated) m′ is a preferred model for its informational or other utility purposes, and for the Bayesian would become more salient (more likely), although the Bayesian agent still holds onto all other m ∈ M.
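A minimal sketch of equation (6) follows, assuming a hypothetical two-model set M: a "flat" coin model that assigns the same probability to every sequence of a given length (and is therefore uninformative in the sense just described), and a "biased" model m′. The model names, the data and the priors are illustrative assumptions.

```python
import math


def sequence_likelihood(p_heads, data):
    """P(d | m) for a sequence of coin flips (1 = heads) under a fixed heads probability."""
    heads = sum(data)
    tails = len(data) - heads
    return p_heads ** heads * (1 - p_heads) ** tails


def model_posterior(prior, likelihoods):
    """Equation (6): returns P(m | d) and the free energy G = -log(partition function)."""
    joint = {m: likelihoods[m] * prior[m] for m in prior}
    partition = sum(joint.values())  # denominator of equation (6)
    posterior = {m: joint[m] / partition for m in joint}
    return posterior, -math.log(partition)


# Hypothetical dataset: 8 heads in 10 flips.
data = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]

prior = {"flat": 0.5, "biased": 0.5}
likelihoods = {
    "flat": sequence_likelihood(0.5, data),    # constant across all length-10 datasets
    "biased": sequence_likelihood(0.8, data),  # varies with the dataset
}

posterior, free_energy = model_posterior(prior, likelihoods)
```

The Bayesian agent ends up weighting the biased model more heavily for this dataset but keeps a nonzero weight on the flat one, which is the sense in which it holds onto all models in M.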

Simplified model

As a minimal working example, consider the graphical model shown in Figure 1, in which o is the observations: evidence coming from the boundary states or sensorium, interpreted denotatively (i.e., translatable into object labels that denote real objects in the world). While X is the concrete denotative state, related to the denotative evidence o, Y is a more abstract, connotative state (i.e., related to abstract “feelings”). Action by the agent requires some direct connection to the outside world, through some actuator, and so is part of X, but may be mirrored in the connotative state Y. This minimal model will allow a mathematical formula for the tradeoff to be derived, and will serve as a basis for the following sections.

Figure 1.

Graphical model used as a basis.

Parameters of this model are α, γ, and δ, which I consider here to be simple variances. That is, higher values of these parameters mean models that can handle more uncertainty, in the sense of more diverse inputs from the level below. In Figure 1, I have drawn the parameters (which parameterize the links pointed to) in squares, and the links between X, Y and O as generative. This is not necessary: the arrows can be removed without changing the analysis substantially. I am deliberately avoiding introducing temporal thickness, which, although important for action selection in general, does not substantially modify the analysis I present here.

If using this model, an agent’s primary interest is to compute P(X = x|O = o, θ), or the probability of each of his states (including actions), x, given the evidence, o, and the model parameters, θ ≡ {α, δ, γ}. To do so, the agent must compute:

\begin{align} P (x | o, θ) & = \int_{y} P (x, y | o, θ) \, d y \\ & = \frac{1}{P (o | θ)} \int_{y} P (o | x, δ) P (x | y, γ) P (y | α) \, d y, \end{align}

(7)

where P(⋅|θ) simply means this probability distribution is parameterized by θ. Using the axiom ∫_xP(x|o, θ) ≡ 1, I find that the normalizing factor is

P (o | θ) = \int_{x} \int_{y} P (o | x, δ) P (x | y, γ) P (y | α) d y d x .

(8)

This is also the evidence, partition function, or e^−G, where G is the free energy, as described in the last section (since G = −logP (o|θ)).

To better understand the mechanics of this model, I show a simple example of equation (8) using uni-modal Gaussian distributions over continuous valued spaces. That is, I will assume the following form for the partition function:

P (o | θ) \propto \int_{x} e^{- \frac{{(o - M_{o} (x))}^{2}}{2 δ^{2}}} [\int_{y} e^{- \frac{{(x - M_{x} (y))}^{2}}{2 γ^{2}}} e^{- \frac{y^{2}}{2 α^{2}}} d y] d x,

(9)

where M_o and M_x are some unknown functions that map between o ↔ x, x ↔ y, respectively.

I now carry out the integrals in equation (9) in the simplest possible case where o, x and y all live in the same vector space. This is a highly unlikely latent space to discover, but any linear functions M would yield the same uncertainty relationships that I uncover with M = I, the identity function. I return to this question in the supplemental material, but for now, using M = I, equation (9) becomes

P (o | θ) \propto \int_{x} e^{- \frac{{(o - x)}^{2}}{2 δ^{2}}} [\int_{y} e^{- \frac{{(x - y)}^{2}}{2 γ^{2}}} e^{- \frac{y^{2}}{2 α^{2}}} d y] d x,

(10)

Each integral can be done by simply completing the squares (see supplemental material), yielding:

P (o | θ) = \frac{1}{\sqrt{2 π σ^{2}}} e^{- \frac{1}{2} \frac{o^{2}}{σ^{2}}}

(11)

where

σ^{2} = α^{2} + γ^{2} + δ^{2} .

(12)

This is a measure of an agent’s individual free energy. I assume that an individual lives in a society that defines a free energy minimum, C. This minimum is the one at which agents in the society “fit” together best (or the minimum they have found so far). Thus, the individual agent is tasked with keeping this quantity minimal:

- \log (P (o | θ)) = \frac{1}{2} \log (2 π) + \log (σ) + \frac{1}{2} \frac{o^{2}}{σ^{2}} .

(13)

Returning to equation (5), this quantity is that over which the expectation is taken for the accuracy term in the agent’s approximation to the free energy (the variational free energy). Thus, decreasing σ to increase accuracy will result in an increase in complexity, which works in an opposing direction in terms of minimizing the variational free energy. This implies that the true variance in o must be matched by σ² in order to minimize the entire free energy, that σ² = α² + γ² + δ² must be constant (at least as constant as the agent’s econiche), and thus that X, Y and O must be jointly complementary at equilibrium: each pair of variables is complementary and induces a duality, an uncertainty principle, such that all three together form a ternary complementarity. In the restricted case where all M = I, this means the parameters must lie on a simplex, as shown in Figure 2.

Figure 2.

Simplex on the three parameter dimensions of $\hat{α}, \hat{δ}, \hat{γ}$ . Also shown are the dimensions of freedom and equality, and poles of perfect freedom ■ and perfect equality •. The star is in the most central position possible for a group.
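Equations (12) and (13) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: α, γ and δ are read as standard deviations (so that their squares add, as in equation (12)), M = I, and the particular values, sample size and search grid are hypothetical. It samples o through the chain y → x → o, compares the sample variance with α² + γ² + δ², and then searches for the σ that minimizes the average of equation (13).

```python
import math
import random


def sample_observation(alpha, gamma, delta, rng):
    """Draw o from the two-level Gaussian chain of equation (10) with M = I."""
    y = rng.gauss(0.0, alpha)   # abstract state, prior width alpha
    x = rng.gauss(y, gamma)     # concrete state given y, width gamma
    return rng.gauss(x, delta)  # observation given x, width delta


def mean_surprise(mean_square, sigma):
    """Equation (13) averaged over observations whose average o^2 is mean_square."""
    return 0.5 * math.log(2 * math.pi) + math.log(sigma) + 0.5 * mean_square / sigma ** 2


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
alpha, gamma, delta = 1.0, 0.5, 2.0    # hypothetical standard deviations

observations = [sample_observation(alpha, gamma, delta, rng) for _ in range(100_000)]

predicted_var = alpha ** 2 + gamma ** 2 + delta ** 2                 # equation (12)
sample_var = sum(o * o for o in observations) / len(observations)    # o has zero mean by construction

# Coarse grid search over sigma; the grid itself is an arbitrary choice.
candidates = [0.5 + 0.05 * k for k in range(80)]
best_sigma = min(candidates, key=lambda s: mean_surprise(sample_var, s))
```

The average surprise is lowest where σ² is close to the variance of the sampled observations, and any other triple (α, γ, δ) with the same sum of squares would give the same value, which is the tradeoff the simplex expresses.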

Any linear M would also work the same (see supplemental material), but nonlinear M would be yet another story. Temporal depth could be added to this analysis, only encumbering the equations with additional (historical) values for x and y (and possibly θ). Once this is added, the model becomes a hierarchical partially observable Markov decision process, or h-POMDP, which has long been studied in operations research, decision science, and artificial intelligence (Åström, 1965). POMDPs are foundational models for artificial intelligence, and come with a large array of machine learning methods for their learning, solution and usage (Boutilier et al., 1999; Kaelbling et al., 1998).

One interesting thing to note is that while the parameter θ lies on a simplex if one is attempting to keep free energy constant, the variance of the posterior P (x|o, θ) given by equation (7) must lie on a simplex, but in the “inverse” space of $\hat{θ} = \{\hat{α}, \hat{γ}, \hat{δ}\}$, where $\hat{θ} = 1 / θ$ and similarly for its component parameters $(\hat{α} = 1 / α, \dots)$. This is simply because the variable x is not marginalized over and so the (inverse) variances sum in quadrature. The two spaces are closely related, however, and there is a manifold in both that may not be planar in any case given our assumptions about M.

This overly simplistic model of information processing is meant only to show the three complementarities in the processing of information by a hierarchical Bayesian model. In the following, I will discuss examples of how this management of uncertainty maps to the dimensions of freedom and equality previously discussed.

Uncertainties and freedoms

I can now return to the discussion of heuristics and freedoms and draw a parallel by noting that, as one of an agent’s uncertainty parameters increases, the chances it will be using a similar model to others in its groups decreases, implying that the freedom of individuals (to be different from others) has increased. Conversely, if uncertainty decreases, then coordinated agents will be much less free to be individual, and will become more homogeneous, more equal. In this sense, freedom and equality are opposites. Increasing equality means decreasing freedom, and vice-versa. Returning to the hedgehog and fox, the fox represents freedom, individuality, flexibility and heterogeneity, while the hedgehog is equality, sociality, security and homogeneity. This pits the individual against the group, the “figure it all out alone” against the “do what everyone else does” strategies. While the fox sails close to the Charybdis of irrelevance, the hedgehog edges toward the Scylla of prejudice.

Now consider the simple Bayesian model from the previous section from an individual agent’s perspective. If $\hat{δ}$ decreases, this means P (o|x, δ) is more dispersed: more concrete situations can be handled by the model, opportunities and positive freedom are increased. If $\hat{δ}$ increases, it is the opposite, concreta are precisely modeled, and due to the sharing requirement, agents in a collective will need to align their models to ensure collaboration, leading to much more similar agents with fewer opportunities, more positive equality. Therefore, I conjecture that the lower level parameter δ corresponds to positive freedom (and so $\hat{δ}$ corresponds to positive equality).

Consider now if $\hat{α}$ decreases, this means P (y|α) is more dispersed: agents are more free to interpret situations as they please, to “be” who they want, and negative freedom is increased. Similarly to $\hat{δ}$ and positive equality, $\hat{α}$ increasing means increasing negative equality as people are required to interpret the world in more similar ways. Therefore, I conjecture that the higher level parameter α corresponds to negative freedom (and so $\hat{α}$ corresponds to negative equality).

Finally, if $\hat{γ}$ decreases, this means P (x|y, γ) is more dispersed: agents are more free to link concrete situations to abstract construals in any way they want. This corresponds to republican freedom as there is no one specifically dictating what this connection should be. Similarly, $\hat{γ}$ increasing means an increase in republican equality, as everyone needs the same interpretation of events, which must be specified by a dictator. Therefore, I conjecture that the mid-level parameter γ corresponds to republican freedom (and so $\hat{γ}$ corresponds to republican equality).

Individuals, groups, and social movement

This section connects the individual and group using the Bayesian model introduced above, and discusses some ideas of how social movement and change may be represented. In the following, I will simplify and draw the simplex as shown in Figure 3(a), in which I use dashed arrows to denote movement through this cognitive space, but off of the simplex and toward the origin. Solid arrows are the opposite (on the simplex, or off the simplex toward ∞). I will use the notation $\hat{θ} ↑$ or $\hat{θ} ↓$ to denote increases or decreases in parameter $\hat{θ}$ (where $\hat{θ}$ may be any of $\hat{α}, \hat{δ}$ or $\hat{γ}$ ).

Figure 3.

Different configurations of a group’s movements: (a) movement of a social group across the simplex—dashed arrows are off the simplex toward the origin, while solid arrows are off the simplex toward ∞, (b) a small group of individuals moves coherently, and (c) incoherently. $\hat{α}$ = negative equality, $\hat{γ}$ = republican equality, and $\hat{δ}$ = positive equality.

Agents have incentive to “copy” their neighbors to a certain degree, as this is a known route to survival (Deutsch, 2011; Henrich, 2016). Thus, I can draw (Figure 3(b)) a small group of people on the simplex, all moving in a coherent direction, or (c) I can imagine another small group moving in a rather more incoherent fashion. I can also summarize with a single arrow showing the whole group moving as in Figure 3(a).

I can now inquire as to what relationship these parameters and the simplex have to social organization and human intelligence, or how individual processing of information can lead to coordinated group behavior. To gain insight into this, I first look to James (1890) and Peirce (1955), founders of the pragmatic approach, who each discussed the nature of meaning as it relates to action. The pragmatic maxim holds that what is represented in the brain, language in particular, is not the objective world outside of us, but rather a complete description of the world outside and inside, and the relationship between the two (Peirce, 1955). For example, my concept of friend includes everything about a friendship, including what my actions might be within it. Once I (abstractly) decide someone is not a friend, I cross a very rapid gap in my comprehension of the world. This gap is what James referred to as the “flights and perchings” of thought (James, 1890: 243), and what underlies binocular rivalry (Hohwy et al., 2008). The gaps may also be the basis of transfer learning and mental travel, as I can cross the gap metaphorically, constructing potential solutions without having ever encountered the situation (Holyoak and Thagard, 1996; Lakoff and Johnson, 1980). Similarly, much phenomenological thinking revolved around the relationship between the world and an embodied agent’s perceptual and motor systems (Gallagher, 2020; Heidegger, 1927). The Gibsonian view, for example, was direct perception of the affordance of something, or how it is related to the perceiver’s actions (Gibson, 1986). Glenberg (1997) examines the role of memory and embodiment in making this connection, and in allowing for effective suppression of environmental noise, thereby enabling embodied action.

The analysis above shows that a two-level Bayesian model has three variance parameters, and these three parameters are co-dependent given the goal of estimating free energy or a posterior probability distribution over agent actions (e.g., settings of actuators) given evidence (e.g., sensor readings). Their co-dependence means that increasing one leads necessarily to a decrease in at least one of the others, in order to maintain the same variance in the posterior. This negative or inverse co-variation can be represented with a simplex as described by equation (12). If members of a group were processing information differently, they would be making different tradeoffs between individual and group. While some degree of variation will be unavoidable, group members not “fitting the mold” (e.g., free riders) will be ostracized from the group. I therefore argue that group members will also share these model parameters, and will be located at roughly the same location on the simplex in their model space. In fact, the location of any society on the simplex is more like a cloud of points that, on average, sits roughly on the simplex, but may be overall more or less equal, or more or less equal on any dimension. One could then picture the global political world as a constellation of these clouds of points, each of which represents a particular group or subgroup (and there may be hierarchical structure within the clouds).

A group of agents who are processing information using parameters $\hat{α}, \hat{δ}, \hat{γ}$ ,⁶ can be located in a certain region on the simplex. Will these agents remain at this parameter setting? If not, which direction will they move? Recall that the “movement” of an individual (as part of a group) across the simplex means that the individual is changing how they are processing information. In the following, I discuss four ways that a group may move in the parameter space $\hat{α}, \hat{δ}, \hat{γ}$ . The first is the random process in which all individuals are processing information slightly differently, and being subjected to individual conditions. Thus, all members of the group are “muddling” through (Lindblom, 1959), yet remaining somewhat together. In particular, any group will have members both on and off any simplex, at least because of random exogenous events, and because the absolute scale of the parameters will be different for each individual. When an individual is off the simplex (relative to other group members), there will be a free energy pressure to return to it, which will be followed by the individual, thereby hopefully reducing the free energy of the group (although each individual cannot do this unilaterally). If one were to measure the average location of all members in the parameter space, movement of this average is possible given the underlying randomness, in a Brownian type of motion.
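The first mechanism can be illustrated with a minimal simulation sketch. It assumes, for illustration only, that the simplex is the plane $\hat{α} + \hat{δ} + \hat{γ} = 1$, that individual shocks are Gaussian, and that the free energy pressure acts as a partial pull back toward that plane. The function names and the values of `noise` and `pull` are hypothetical and are not derived from the model.

```python
import random

def nearest_point_on_plane(theta, total=1.0):
    """Orthogonal projection of theta onto the plane sum(theta) == total.
    Positivity of the coordinates is NOT enforced in this sketch."""
    shift = (total - sum(theta)) / len(theta)
    return [t + shift for t in theta]

def simulate_group(n_agents=50, n_steps=100, noise=0.01, pull=0.5, seed=0):
    """Each agent holds (alpha, delta, gamma). At every step an agent
    receives an individual Gaussian shock and then moves a fraction
    `pull` of the way back toward the assumed simplex plane."""
    rng = random.Random(seed)
    agents = [[1 / 3, 1 / 3, 1 / 3] for _ in range(n_agents)]
    means = []  # group-average location after each step
    for _ in range(n_steps):
        for i, theta in enumerate(agents):
            shocked = [t + rng.gauss(0.0, noise) for t in theta]
            target = nearest_point_on_plane(shocked)
            agents[i] = [s + pull * (t - s) for s, t in zip(shocked, target)]
        means.append([sum(a[k] for a in agents) / n_agents for k in range(3)])
    return agents, means

if __name__ == "__main__":
    agents, means = simulate_group()
    print("final group mean:", [round(m, 3) for m in means[-1]])
    print("distance of mean from plane:", abs(sum(means[-1]) - 1.0))
```

Under these assumptions, individual agents scatter on and off the plane, while the group average stays close to it and wanders along it, which is the Brownian type of motion described above.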

The second mechanism for social movement is larger-scale exogenous events which affect many group members at once. For example, a natural disaster throws whole groups of people off the simplex by massively decreasing $\hat{δ}$ : the stable, certain world is suddenly replaced by chaos. Sudden decreases in $\hat{γ}$ and $\hat{α}$ are also possible through sudden increases in coercive force (a military coup), or sudden decreases in public trust (a lockdown). A society so displaced must regroup, but will do so organically (each agent individually), and will arrive back on the simplex in a rag-tag group with an increase in variation. One way to smooth out this variation may be through intensive emotional sharing of information, perhaps in effervescent ritual (Graeber and Wengrow, 2021). The result of this smoothing is then a unified position for the group. This view of history has no implicit tie to any changes in environmental or social/institutional ecologies. It is equally applicable to all eras of human society, requiring only a somewhat similar degree of computing power (e.g., restricted to Homo sapiens).

There may also be physical, structural, or emotional constraints on a group that cannot be circumvented, and which reduce the “operational” region of the simplex in a third mechanism for social movement. For example, widely dispersed hunting and farming lands might implicitly prevent any authority developing (and thus prevent $\hat{γ} ↑$ ). Similarly, randomness in the natural world may limit any system’s tendency toward $\hat{δ} ↑$ . Structural forms at play in the society (i.e., the social structure, or who is connected to whom) may play a similar role in restricting the operational space. For example, certain social structures implicitly limit the amount of interaction that is possible between members. Since these structural forms arise from the parameters $\hat{θ}$ being used, there is a feedback process in which individual models create cultural and social structures, and these structures then modify the model parameters due to a change in interaction patterns. As pointed out by Martin (2009), some social structures even have a tendency to coalesce into other structures, moving across the simplex as they do so. Finally, genetic diversity might limit a group’s ability to increase some parameter in $\hat{θ}$ indefinitely.

In the fourth and last mechanism, different forms of inference may be more salient in different regions of the simplex and take more or less important roles at different junctures in history, harkening back to Hegel’s dialectic (Hegel, 1807). Such ideas have been explored in a free energy formalism (Beni and Pietarinen, 2021). I explore this idea further in Hoey (2022), but have mostly left it for future work.

Therefore, the structures at play in the society, formed by different processes of uncertainty management, as well as ecological and environmental pressures, combine to locate a group and to shape its trajectory. What humans do next is attempt to form larger and larger groups. Why do they do this? One reason is inequality between groups, leading to some wanting to submit their neighbors to their will, possibly due to a perceived need to differentiate their group from their neighbors (“we are different than them”). This “schismogenesis” (Graeber and Wengrow, 2021) often relies on “structures of refusal” and is there specifically to enhance group membership and motivated cooperative action.

Descriptive examples

Harkening back to Plato, the abstract idea of a generic object (say, a bed) has an existence of its own, and is superior in some way to all such objects (e.g., all individual, real, beds). The seventeenth century disengagement from this restriction allowed for a more sophisticated analysis of the nature of knowledge, resulting in its attribution to three factors proposed by Locke (1690: 343): intuition, agreement of ideas, and sensation. This is further reflected in Locke’s political philosophy in which he discusses three freedoms defined by three constraints: permission from others, the will of others, and the laws of nature. While the permission from others is held intuitively (I have the negative freedom to act how I see fit without explicitly asking permission, but social prescriptions restrict the space of actions I can reasonably take), the actions of the self must cohere with the will of others, restricting republican freedom (increasing homogeneity and republican equality). Finally, the laws of nature are unavoidable and must be adhered to, restricting positive freedom to carry out whatever strategy we desire (e.g., the laws of nature prevent me from doing anything that involves walking on water).

My goal in this essay has been to show how these three elements of the nature of knowledge (and freedom) are closely related to different ways of managing uncertainty in the human mind, shared across a group. Uncertainty may arise from the external world, leading to a lack of “sensation” knowledge, or from the internal (mental) world, leading to a lack of “intuitive” knowledge, or from the agreement between the two, leading to a lack of “agreement of ideas.” I have attempted to show that these three are mutually exclusive, but the exact tradeoff used between the three is somewhat arbitrary (conditional on the environment): a group may function as a society using any of the infinitude of possible settings of these three uncertainty management elements (subject to remaining on the simplex), so long as the whole group is using the same settings. If we accept this, there is no longer any right and wrong setting, simply a need to understand, accept, and adapt to groups using different modes of operation. In this final section, I will discuss how these settings result in different structures in human social networks, in particular: how states emerge, the dynamics of public policy, the function of a separation of powers, the causes of political polarization, and the categorization and legibility of rules. I close each sub-section with a summary showing relationships to freedoms, free energy, and the simplex. My aim is to show how the two-level Bayesian approach I have proposed can be used to model a wide variety of human social and political phenomena, demonstrating increased generality. However, as I mentioned previously, the model is an h-POMDP which has a suite of well-developed techniques for machine learning of the parameters. Thus, confirmatory or falsifying evidence for each example could be generated by learning the model from different datasets and exploring the resulting parameters.

State emergence

One important aspect of classical political philosophy centers around the idea of a “state of nature,” from which modern society is thought to have arisen. It is hypothesized that security is the prize for giving up freedom, and that increasing security leads to state formation. Nozick (1974) discusses state emergence by pointing out that a Lockean state of nature differs from a Hobbesian one in that it allows for an immutable law of nature. In Hobbes’ original state, it is a war of all against all, corresponding to pure freedom on all three dimensions: no one forces, nor even suggests what a person should do, and everyone is free to do anything at all, including murdering those who have something enviable (Hobbes, 1651). In Locke’s state of nature, however, there are certain immutable laws (such as the right of everyone to stay alive) which must be respected. However, these laws can never be complete, and so there must be some “invisible hand” that guides the resulting society (Locke, 1690).

Indeed, if we let $\hat{θ} \to 0$ , everyone will be following different policies and making different predictions, and as there is no one even suggesting to anyone to do anything in particular, these policies will likely conflict (e.g., over scarce supply of sustenance). One need not go all the way to the origin, as the same conflicts arise when a group “falls off” the simplex (due to an exogenous event, say), and then must “return” to it. Typically, being “off” the simplex results in battle that ends in some kind of resolution, of which there are three possibilities (Nozick, 1974: 16). In the following, I have labeled these possibilities using the corresponding type of freedom lost in each case, and treat the problem as one between two homogeneous groups that differ from each other but start from a Hobbesian state of nature $(\hat{θ} \to 0)$ .

Positive freedom lost

The two sides realize that the battle is unresolvable or at least inefficient, and create a set of laws and a bureaucracy to implement the laws. It must then be a common agreement that everyone follows the laws, and each individual loses opportunities as they must follow a pre-set path. As the paths are pre-set, $\hat{δ} ↑$ . However, ecological uncertainty comes into play, meaning that $\hat{δ}$ can only rise so far (as the world cannot be perfectly accurately predicted), and there always remains an ecological uncertainty gap and corresponding ecological uncertainty boundary shown in Figure 4(a), which must be resolved using one of the other two methods (i.e., by sacrificing republican freedom and having a retributive force, or sacrificing negative freedom and having restitutive institutions), or any admixture of the two (see discussion of Durkheim (2014/1893) above).

Figure 4.

Uncertainty boundaries crossed in state emergence: (a) ecological (unresolved conflict, laws and rules); (b) force (winner takes all, loser conforms); (c) social (schismogenesis, separation). $\hat{α}$ = negative equality, $\hat{γ}$ = republican equality, and $\hat{δ}$ = positive equality.

Republican freedom lost

A winner takes over and forces the loser to conform $(\hat{γ} ↑)$ . Rebels are handled retributively (i.e., with punishment). However, there is a force uncertainty gap and associated force uncertainty boundary, shown in Figure 4(b), which is the limit beyond which violence cannot go further (because it would destroy the entire population). Thus, the population must give up negative and/or positive freedom. There either must be some degree of identity and trust between people, or there must be some laws to which they adhere.

Negative freedom lost

Warring sides agree to disagree, and occupy different territories. The inhabitants of each territory (or clients of each protective association) are not free to redefine themselves as inhabitants of the other. They have lost this negative freedom of definition, and a schismogenesis occurs through structures of refusal: each party makes sure they do things to differentiate them from the other group (Graeber and Wengrow, 2021). The complementary identities that so form essentially modify the models used by each group, but they do so by making identities more precisely defined, such that $\hat{α} ↑$ . There is no authority needed in either territory (so $\hat{γ} ↓$ ) and within each territory everyone has all options normally available to them (so $\hat{δ} ↓$ ). Mercier (2020) discusses a generalization of this idea based on mutually cooperative information vigilance and sharing. However, there is a social uncertainty gap and corresponding social uncertainty boundary, shown in Figure 4(c), because $\hat{α}$ cannot get too high or else people completely lose touch with reality (and each person is isolated in their own cave). They must give up republican and/or positive freedom. Someone must be nominated to punish the transgressors, or else the two sides must agree to some rules to make up for their disagreements.

Nozick (1974) also considers in some detail how “meaningful” work and self-esteem are related to equality and freedom. An example is selling “green” products that are more expensive, but give the buyer and seller a sense of integrity. Consumers may band together to pay more for those products produced by the more “meaningful” workers. This is $\hat{α} ↑$ , since it must be a group decision to become “people who support meaningful workers” or even “people with integrity,” a more precise specification of what people can and cannot be. This is also known as self-esteem: being sure about one’s place in society (MacKinnon, 2015). In such situations, self-esteem is driven by differentiation between people, such that pure equality seems to imply a lack of self-esteem. However, it is not necessary to go this far, because people don’t have to all achieve self-esteem in the same way. Thus, “The most promising ways for a society to avoid widespread differences in self-esteem would be to have […] a diversity of different lists of dimensions and weightings” (Nozick, 1974: 245). What this means is that different people can evaluate themselves on different dimensions. The existence of these different dimensions implies heterogeneity increases, $(\hat{γ} ↓)$ , but people can more reliably and easily establish a social identity with some degree of certainty $(\hat{α} ↑)$ .

Further, Nozick (1974: 247–252) discusses how meaningful work (as a proxy for self-esteem, essentially) may be achievable in three ways. The first way is described above, by increasing $\hat{α}$ and providing self-esteem through social reinforcement. Second, the workers themselves may choose to take less pay such that their more “meaningful” work distribution yields products at prices competitive with those firms who do not have such “meaningful” work. This is the same as $\hat{δ} ↑$ : the workers are deliberately hobbling themselves by restricting their options to only the meaningful ones. Finally, the society may be forced $(\hat{γ} ↑)$ to support meaningful workers, for example, with a new, enforceable, tax. In all cases, we have an increase in equality (one dimension has gone up).

Summary (state emergence):

Freedom	Different freedoms are restricted following conflict during state emergence
Free energy	Resolution of conflict leads to more predictable worlds, a decrease in free energy
Simplex	A group that falls off the simplex must return to it by sacrificing one or more freedoms

Public policy

The standard model of policy changes is one in which policymakers have static (unchanging) preferences (also assumed in many economic theories emanating from the seminal work of Arrow (1951)), and in which elections are the process by which the people insert different preference functions into the legislature. However, Jones and Baumgartner (2012) note severe deficiencies in this model based on the leptokurtic (with fatter tails than normal) shape of the distribution in budgetary commitments. Instead, they describe a theory of government information processing that has many parallels with the approach I am presenting. They focus on a bounded rationality model of information processing that puts emphasis on attention and salience rather than on rationality. The resulting “stick-slip” dynamics of punctuated equilibria has many echoes in the work of Taleb (2001) on unimaginable events (Black Swans) arising out of the fat tails of complexity, and is analogous to the scientific revolutions arising from a new paradigm of investigation: “scientific investigation [is] a succession of tradition-bound periods punctuated by non-cumulative breaks” (Kuhn, 1962: 207). Similar dynamics have long been noted in other fields including literature, music, and the arts (Kuhn, 1962: 207).

The idea is that agents have a bound on the amount of information processing they can do, as proposed by Simon (1967). These agents would like to be able to figure everything out rationally, but simply cannot. Whatever is left behind by them (whatever overflows from the cup of their mind) may then possibly accumulate in institutional practices, eventually emerging in a catastrophic or sudden and unexpected event in which policy changes dramatically. These sudden changes arise from the institution itself, but look like they arise randomly. Indeed, “In a not-unfamiliar story line, a problem festers ‘below the radar’ until a scandal or crisis erupts; policymakers then often claim ‘nobody could have known’ about the ‘surprise’ intervention of exogenous forces, and then scramble to address the issue” (Jones and Baumgartner, 2012: 7). In general, the individual agents will not keep up with the institutional change, as it is the institutions they participate in that are suffering from this skewness. The punctuated equilibrium model of Jones and Baumgartner (2012) gives an account of these stick-slip dynamics with a “panic” button that participants can hit. The panic button essentially overweights by a great deal one of the possible indicators that are being attended to. It is like a super magnifying glass that is focused on one policy (Jones and Baumgartner, 2012: 135). Such a panic button can be accounted for in the model I am presenting as follows. The accumulation of misalignment typically triggers a disruptive event that arises from within, for example, from a group member “misbehaving” and being civilly disobedient. Once the offset becomes too great, individual agents will start to defect, and once this emigration hits a tipping point, another model (another location on the simplex) takes over and re-balances the society, in what may be a sudden and drastic change. 
Following the trigger event, there are individuals making determinations about the events that occurred. In many cases, individuals may suffer high misalignment (movement off the simplex), which in turn causes change in parameters for at least some participants. Once these changes occur, they are difficult to reverse. If many people make the same shifts, then an overall shift occurs in the group, precisely because it is made up of the members making that shift. Such common shifts are likely given the shared nature of the model: all members will be shifting so as to reduce the overall free energy of the group, which has been displaced by the trigger. This can occur at the level of any group, including one of policymakers. These panics then induce events that overweight one indicator, and thus are the source of the leptokurtic distributions.
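The stick-slip account can be illustrated with a toy friction model. This is a sketch under stated assumptions, and is neither the model of Jones and Baumgartner (2012) nor the h-POMDP discussed earlier: misalignment accumulates as a Gaussian random walk, the institution corrects only a small fraction (`friction`) of it in each period, and a hypothetical panic `threshold` triggers a full correction. All names and parameter values are illustrative.

```python
import random

def excess_kurtosis(xs):
    """Sample excess kurtosis (approximately 0 for Gaussian data)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 ** 2) - 3.0

def simulate_policy_changes(n_steps=5000, threshold=3.0, friction=0.05, seed=1):
    """Return the sequence of per-period policy changes.

    `gap` is the accumulated misalignment between policy and conditions.
    Below the threshold only a small fraction of the gap is corrected;
    once the gap exceeds the threshold, the whole gap is corrected at
    once (the "panic button")."""
    rng = random.Random(seed)
    gap = 0.0
    changes = []
    for _ in range(n_steps):
        gap += rng.gauss(0.0, 1.0)      # new, unattended information
        if abs(gap) > threshold:
            change = gap                # sudden, drastic reset
        else:
            change = friction * gap     # incremental adjustment
        gap -= change
        changes.append(change)
    return changes

if __name__ == "__main__":
    changes = simulate_policy_changes()
    print("excess kurtosis of policy changes:", excess_kurtosis(changes))
```

Although the inputs are Gaussian, the rare full corrections give the distribution of policy changes fatter tails than normal in this sketch, which is the qualitative signature discussed above.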

The model proposed by Jones and Baumgartner (2012) is compelling in the effects it predicts and the stick-slip dynamics seem to fit the evidence. What I am proposing fits well with their model, but the underlying reasons for the bound on rationality are different. In their Simonesque bounded rationality approach, each agent is struggling to rationally figure everything out, and is only capable of doing so much, meaning his attention must focus on one or the other option or model. The emphasis is on human inability to process information due to overload. In the more sociologically oriented approach I am presenting, the individual is not even trying to figure things out rationally; instead, he places the emphasis on the group, such that the bound is not an involuntary weakness of individuals, but rather a voluntary (but partly group-oriented) choice to let the society in which an individual is embedded do the heavy cognitive lifting. Notice the paradox created here because the individual is the group.

In the view I am proposing, individuals can no longer arrogantly claim responsibility for the intelligence of the group, and must instead acknowledge the powerful role of socialization in creating and maintaining the longer-term momentum of evolutionary wisdom contained in the species as a whole. Indeed, presuming that individuals are responsible for consciously steering society where it ends up going, “reveals only the narrowness of an outlook uninformed by humility” (Polanyi, 1951: 199).

Summary (public policy):

Freedom	As a society becomes more and more skewed toward one type of freedom, eventually a reset occurs which sends them back toward the middle.
Free energy	Misalignment of agents causes increases in free energy (the accumulation noted above), and this in turn leads to tipping points at which it resets (decreases).
Simplex	Misalignment moves a society off the simplex, and the reset sends them back.

Separation of powers

The liberal democratic ideal of a separation of powers can be framed as a three-way tension across the simplex of parameters I have been discussing. This tension was first described by Montesquieu (1748/1989: 63): “In order to form a moderate government, one must combine powers, regulate them, temper them, make them act; one must give one power a ballast, so to speak, to put it in a position to resist another.” Here, I describe how this ballast can be viewed as uncertainty management.

While the democratically elected legislative body determines laws that reflect the understandings of people $(\hat{α} ↑)$ , the judiciary and courts implement specific cases of these laws $(\hat{δ} ↑)$ , and the executive or head of state enforces a proper application of the law to the specific instances, $(\hat{γ} ↑)$ . However, any one of these institutions may go too far, requiring a damping mechanism in the opposite direction, which is provided by the other two. These three damping mechanisms correspond to the minimum requirements for coercion pointed out by Hayek: “the prevention of violence and fraud, the protection of property and the enforcement of contracts, and the recognition of equal rights of individuals to produce […] and sell […]” (Hayek, 1960: 338).

• Prevention of violence and fraud: If the executive raises $\hat{γ}$ too high by unilaterally seizing power and enforcing certain values, the force uncertainty boundary is crossed with excessive violence, leading to a totalitarian regime where all property is the ruler’s. The law and legislature can try to dampen the ruler’s power by introducing new laws (raising $\hat{δ}$ ) and by making the will of subgroups of people more salient and public (raising $\hat{α}$ ), respectively.

• Protection of property and enforcement of contracts: If the legislature raises $\hat{α}$ too high by attempting to represent too much diversity of opinion, then the social uncertainty boundary is crossed and infighting between groups to set laws and elect rulers becomes untenable. The ruler and courts can try to dampen this effect, by ignoring certain subgroups (raising $\hat{γ}$ ) and by forcing subgroups of people to unite (raising $\hat{δ}$ ), respectively.

• Recognition of equal rights to produce and sell: If the judiciary raises $\hat{δ}$ too high by setting too many rules, then the ecological uncertainty boundary is crossed and exogenous events disrupt the system. The executive and the legislature then dampen this excess by disobeying the law by force (raising $\hat{γ}$ ) and pointing out subgroups of citizens being unfairly targeted by it (raising $\hat{α}$ ), respectively.

One can visualize these effects on the simplex by showing a force toward one apex countered by two forces toward the opposite apices, as shown in Figure 5(a)–(c), corresponding to the three damping mechanisms (Montesquieu’s “ballast”), respectively (Hoey and Schröder, 2023).

Figure 5.

(a) Excessive increase in $\hat{γ}$ by the head of state is countered by damping from the judiciary and legislature ( $\hat{δ}$ and $\hat{α}$ , resp.); (b) excessive increase in $\hat{α}$ by the people is countered by damping from the judiciary and executive ( $\hat{δ}$ and $\hat{γ}$ , resp.); (c) excessive increase in $\hat{δ}$ by the courts is countered by damping from the legislature and executive ( $\hat{α}$ and $\hat{γ}$ , respectively). Damping forces are shown as spring symbols and uncertainty boundaries are shown in red (see Figure 4).

Summary (separation of powers):

Freedom	As freedoms are eroded by one element of a democratic trinity, the other two can step in to restore them.
Free energy	As an uncertainty boundary is crossed due to the actions of one branch, groups are forced into higher free energy states, which can be resolved using the other two branches.
Simplex	As an uncertainty boundary is crossed due to the actions of one branch, groups are forced off the simplex and must find a way to return to it, often using the other two branches as tools.

Political polarization

A contradictory finding is that both non-exposure and exposure to opposing viewpoints increase political polarization (Bail et al., 2018; Facciani, 2020), but this may be partly due to experimental conditions making connotative interpretations more or less salient. Facciani (2020) argues that “including the variable with whom the participant discusses ‘important matters’ is crucial for accounting for these inconsistencies.” That is, whether the experiments ask for information about connotatively meaningful persons or not (i.e., friends or strangers) is important for gauging the resulting effect. These different polarization mechanisms are also related to different forms of social capital (Putnam, 2000). While “bonding” social capital refers to like-minded people in homogeneous organizations becoming deeply connected to one another, “bridging” social capital implies trust of out-group members (Putnam, 2000: 358). In both cases, what distinguishes the individual in these two situations is precisely their connection to the people spreading misinformation. When the connection is more denotative (strangers), a bonding mechanism excludes these out-group others, resulting in polarization due to a confirmation effect (echo chamber). On the other hand, when the connection is more connotative (friends), then without a bridging mechanism to include out-group others, polarization results due to a conformity effect.

This same difference is noted in work on the political science of polarization, in which it is increasingly well known that there are two types of polarization being measured (Mason, 2013). While the first relates to the substantive issues being considered such as a position on social welfare or government size (a denotative, confirmation effect leading also to polarization), the second relates to partisan furor (a connotative conformity effect leading to polarization). While some have argued that the former causes the latter (Webster and Abramowitz, 2017), others have argued for the opposite (Mason, 2013, 2015). In particular, Iyengar et al. (2012) discuss using social identity theory (Tajfel and Turner, 1986) as a basis for interpreting partisanship and polarization, arguing that affective identity is an important aspect of polarized beliefs, as echoed by Facciani (2020).

The two effects are also found in research on threat and uncertainty, and are shown to be somewhat separable (Haas and Cunningham, 2014). While uncertainty decreases political tolerance in the presence of threat (conformity with the group leads to ignoring evidence from strangers), it increases political tolerance in low threat conditions (confirming evidence from friends is accepted). What Haas and Cunningham (2014) show is that if people feel safe (e.g., because they are surrounded by similar others and their values are not being confronted), they will be less politically tolerant when only non-conflicting information is presented, as in an echo chamber, but more so when conflicting evidence is presented. When people feel unsafe (they are surrounded by strangers and exposed to alternative values), then they will be less politically tolerant and will discard conflicting information and stick to their guns.

The effects of incongruence are also noted by Marietta and Barker (2019: Chap. 11), who showed that people are more willing to work with someone who shows congruent beliefs. This is because when people confront incongruence, they rely more on affective meanings, which would label this incongruent person as part of the out-group, and therefore not someone good to work with. The focus on affective meanings in this case trumps the potential benefits of working with this person, reflecting the notion that people opt for acceptance in a group over accuracy of evidence. However, Pennycook et al. (2021) have noted that while people will act according to this maxim (acceptance over accuracy), they actually prefer accuracy over inaccuracy. The key is that in order to engage in exploratory behaviors that would reduce inaccuracy (uncertainty), one needs to be in a socially and emotionally coherent situation. Once this coherence is disrupted (e.g., by the presentation of incongruent information from strangers), then focus returns to exploitation of existing structures, reliance on affective meanings and stereotypes, and epistemological laissez-faire (uncertainty is ignored instead of reduced).

Conflicting evidence increases uncertainty both connotatively and denotatively, while non-conflicting evidence does the opposite. That is, if an agent encounters evidence that contradicts her beliefs, she will feel less certain about the world (denotative) and less certain about herself and her group (connotative). Her free energy will increase and she will be knocked off the simplex. If the conflicting evidence is coming from a stranger, the agent can resolve the extra uncertainty by discounting the information (as it is from a non-trusted other), leading to a conformity polarization. On the other hand, if conflicting evidence is coming from a friend, then connotative uncertainty remains low while denotative uncertainty increases. The agent can only “double-down” on her friendship in this case, accepting the conflicting evidence, increasing political tolerance, and strengthening her friendship (the alternative is to disregard the information and forgo the friendship). Conversely, if non-conflicting evidence is coming from a friend, uncertainty is decreased both connotatively and denotatively, leading to increased polarization through confirmation (echo chamber).

To take a simple example, consider an agent who believes the earth is flat. Suppose this agent is presented with evidence from a stranger that the earth is round. The agent is likely to disregard this conflict, and conform to his in-group of flat-earthers, reinforcing his beliefs. On the other hand, the agent would easily accept evidence of the world being flat coming from a friend, confirming his beliefs and affirming his friendship (a.k.a. group membership). Conversely, evidence that the earth is flat from a stranger may decrease connotative polarization (maybe these strangers are really friends?), and evidence that the earth is round from a friend may decrease denotative polarization (maybe the earth really is round!). If I replace “the earth is flat” with any public policy decision (e.g., control guns, legalize drugs), the same effects are expected to hold.
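The denotative side of these cases can be illustrated with a minimal numerical sketch. The code below is not the hierarchical model developed in this paper: it assumes a single binary belief, represents trust in the source as a hypothetical exponent `trust` in [0, 1] that tempers the likelihood ratio of the evidence, and uses Shannon entropy as a stand-in for denotative uncertainty. Connotative uncertainty is not modeled, and the function names and all numerical values are illustrative assumptions.

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a Bernoulli belief p = P(H)."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def update(prior, likelihood_ratio, trust):
    """Posterior P(H) after one piece of evidence.

    likelihood_ratio is P(e | H) / P(e | not H); trust in [0, 1] is an
    assumed discount: 0 ignores the evidence, 1 takes it at face value.
    """
    odds = prior / (1 - prior) * likelihood_ratio ** trust
    return odds / (1 + odds)

# Hypothetical agent who is fairly sure of H ("the earth is flat").
p0 = 0.9
stranger_conflict = update(p0, 0.25, trust=0.1)  # conflicting, low trust
friend_conflict = update(p0, 0.25, trust=0.9)    # conflicting, high trust
friend_confirm = update(p0, 4.0, trust=0.9)      # confirming, high trust

for label, p in [("prior", p0),
                 ("stranger, conflicting", stranger_conflict),
                 ("friend, conflicting", friend_conflict),
                 ("friend, confirming", friend_confirm)]:
    print(f"{label:22s} P(H)={p:.3f}  entropy={entropy(p):.3f} bits")
```

Under these assumptions, conflicting evidence from a low-trust source barely moves the belief, conflicting evidence from a high-trust source raises entropy substantially, and confirming evidence from a high-trust source lowers it.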

Such considerations are also taken up in the study of how groups integrate conflicting evidence, often becoming more firmly entrenched in their original beliefs in a “backfire”-like effect. Hahn (2024) uses a naïve Bayes model to implement the integration of evidence, some of which may be conflicting, and shows how this can account for belief polarization in groups. Such a model falls under the umbrella of the type of hierarchical model I am considering in this paper, but does away with one parameter, leaving only two. Hahn et al. (2020) perform a similar analysis but look at network structures and how these affect the flow of information. In an agent-based simulation, they create agents with parameter sets (credences) drawn from a narrow distribution around uninformative (high entropy) distributions. Essentially, in terms of my model, they start all agents off from the same point on the simplex. This is restrictive in the sense that human groups may be far more diverse than this. The model I am proposing considers the non-homogeneous case, where parameters may not be shared, or may be significantly different in different groups. This could be used to extend these analyses, for example, by considering multiple groups with different parameter settings, and how this can lead to conflict due to different interpretations of a situation (see preliminary examples in Hoey and Schröder, 2023). Falandays and Smaldino (2021) use a model similar to that of Hahn (2024) (a “mixture of Gaussians,” which is a naïve Bayes model with continuous outputs), implemented in a network of agents. They investigate a range of parameter values, which the model I am proposing considers as a whole (while adding a second, more abstract level of representation).
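The contrast between homogeneous and non-homogeneous parameter settings can be made concrete with a small sketch. It is not the model of Hahn (2024), Hahn et al. (2020), or Falandays and Smaldino (2021), nor the hierarchical model of this paper: it assumes a single binary hypothesis, binary evidence items that are conditionally independent given the hypothesis (the naïve Bayes assumption), and hypothetical likelihood values chosen only for illustration.

```python
def posterior(prior, evidence, p_e_given_h, p_e_given_not_h):
    """Naive Bayes posterior P(H | evidence) for binary evidence items.

    Each item e (1 or 0) is assumed conditionally independent given H.
    """
    odds = prior / (1 - prior)
    for e in evidence:
        like_h = p_e_given_h if e else 1 - p_e_given_h
        like_not_h = p_e_given_not_h if e else 1 - p_e_given_not_h
        odds *= like_h / like_not_h
    return odds / (1 + odds)

# One shared stream of observations, seen by every agent.
evidence = [1, 1, 0, 1, 1, 0, 1, 1]

# Homogeneous case: identical likelihoods, priors in a narrow band near 0.5.
homogeneous = [posterior(p, evidence, 0.7, 0.3) for p in (0.45, 0.50, 0.55)]

# Non-homogeneous case: group B interprets the same signal in the opposite
# way (its likelihood parameters are swapped relative to group A).
group_a = posterior(0.5, evidence, 0.7, 0.3)
group_b = posterior(0.5, evidence, 0.3, 0.7)

print("homogeneous agents:", [round(p, 3) for p in homogeneous])
print("group A:", round(group_a, 3), " group B:", round(group_b, 3))
```

In this toy setting the homogeneous agents end up with nearly identical beliefs, whereas two groups that interpret the same evidence through different parameters move to opposite extremes, which is one simple way in which differing interpretations of a shared situation could produce conflict.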

In a similar vein, Connor Desai et al. (2020) examine a situation of informational conflict over time, resulting in the “continued influence effect” (CIE) in which misinformation is observed to persist even after corrections. The explanation given is based on the Bayesian probabilistic inertia of the perceiver’s hypothesis about whether some evidence from a source is real or not, given estimates of the reliability of the source. Again, three parameters surface in this model, which may be estimated as averages across all participants for any participant pool. However, for a different participant pool, and especially one from a different cultural background (there is no information on this given in Connor Desai et al., 2020), these parameters may be quite different, as may the predictions of the model. The model I am proposing is therefore, in some sense, at a higher level of generality than the specific model used by Connor Desai et al. (2020) to study CIE. While their model can be used to study the effects of misinformation on a specific group of participants, the model I am proposing generalizes this to any group of participants, with the important caveat that the parameters at play are not arbitrarily settable, but must remain on the simplex. Further, the model I am presenting can be used to model situations where people do not share these parameter settings.
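A minimal sketch shows how a continued influence effect can arise from ordinary Bayesian updating, and how it depends on the parameter values. This is not the model of Connor Desai et al. (2020): it assumes a binary claim, and assumes that each source reports the true state of the claim with a fixed probability (its reliability). The three numbers used here (a prior and two reliabilities) and the function names are illustrative assumptions only.

```python
def report_update(belief, says_true, reliability):
    """Update P(claim is true) after a source asserts the claim true or false.

    The source is assumed to report the actual state with probability
    `reliability`, regardless of what that state is.
    """
    like_if_true = reliability if says_true else 1 - reliability
    like_if_false = 1 - reliability if says_true else reliability
    numerator = belief * like_if_true
    return numerator / (numerator + (1 - belief) * like_if_false)

def after_correction(prior, r_misinformation, r_correction):
    """Belief after an initial (mis)report followed by a retraction."""
    belief = report_update(prior, True, r_misinformation)
    return report_update(belief, False, r_correction)

# Pool 1: the original source is seen as more reliable than the correction.
pool_1 = after_correction(0.5, r_misinformation=0.8, r_correction=0.7)
# Pool 2: a hypothetical pool that trusts the correcting source more.
pool_2 = after_correction(0.5, r_misinformation=0.6, r_correction=0.9)

print(f"pool 1 belief after correction: {pool_1:.3f}")
print(f"pool 2 belief after correction: {pool_2:.3f}")
```

With the first parameter setting the belief stays above the prior after the correction (misinformation persists), while with the second it falls well below the prior. Parameters estimated as averages over one participant pool therefore need not transfer to another.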

Summary (political polarization):

Freedom	Conflicting evidence increases agent’s denotative uncertainty and positive freedom, which may lead to a compensatory decrease in negative freedom.
Free energy	Conflicting evidence increases free energy, which can be countered through polarization (reducing surprise from out-group members).
Simplex	Conflicting evidence may force agents off the simplex, and returning to it may involve increases in political polarization.

Categorization and legibility

As noted by Page (2007: 183), the “greater the manipulation envisaged [by the state of the people], the greater the legibility required to effect it.” Some of the biggest attempts at social engineering driven by high-modernist, rationalist ideals were founded on the ability to categorize and count people. In order to make large-scale social changes as envisioned by high modernist thinkers, all people and all their activities need to be defined and represented in a way that makes the implementation of a rational plan possible. That is, the legibility of a people is intimately tied to the possibility of making the plan work. This state of positive equality requires a despot, but, as stated by Le Corbusier (1964), the “despot is not a man. It is the Plan.” (quoted in Scott, 1998: 112). However, such excessive bureaucracy results in an “arbitrary, myopic layer of officials presiding over a dispirited workforce putting in a bad-faith day on the factory floor” (Page, 2007: 177); that is, the state pretends to tax the people, and the people pretend to work.

Two things are at work here, both as a direct result of attempting to implement a state of perfect positive equality (absolute lack of positive freedom). First, legibility is required as it increases the precision of the denotative state. In doing so, it allows for a rational plan to be developed that has a higher chance of success. Without legibility, the plan would still be possible, but the outcomes would be more uncertain, and negative externalities that are in the tails of the imprecise distributions can more easily knock the whole plan off kilter. Second, the lack of positive freedom leads to a lack of innovation. The state attempting these plans then wonders why the society stagnates. As with Le Corbusier’s high modernist architecture, the rational plan, although it may even be provably optimal, requires behaviors from individuals that may run aground on their social and emotional perspectives and values. Thus, the people will refuse, or will simply hide their activities, creating a significant principal-agent problem and a “dark twin” (Page, 2007). Regardless, the state ends up with even less information about its people than it started with, and the economy is likely to stagnate. Le Corbusier could not understand why people didn’t like his great master plan: it was so obviously a better way of life! The answer is that people don’t like it because it removes too much of their positive freedoms. To get to this high modernist state, a people would need to be shifted across the freedom simplex so drastically that they will refuse, perhaps only pretending to follow the plan.

Another way to say this is that legibility creates a “dark twin” that arises to “perform many of the various needs that the planned institution fails to fulfill” (Scott, 1998: 261), and that makes the institutions “parasitic on informal processes” (Scott, 1998: 310). The dark twin exists in the unmodeled space of the institution, and will therefore be the source of truly unexpected and surprising events, or Black Swans. The more legible something is made (the model used is simpler and the environment has to be “forced” to fit that simpler model), the more is left out, and the more Black Swans are created as negative externalities. However, legibility leads to control, at least in the short term and as long as no Black Swans come along. The reason is that the state that is “forcing” the social world is reducing the variability in the population, reducing their positive freedoms and moving up the $\hat{\delta}$ axis beyond the ecological uncertainty boundary.

One of the major purposes of state simplification […] is to strip down reality to the bare bones so that the rules will in fact explain more of the situation and provide a better guide to behavior. […] If the environment can be simplified down to the point where the rules do explain a great deal, those who formulate the rules and techniques have also greatly expanded their power. They have, correspondingly, diminished the power of those who do not. (Page, 2007: 303)
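The link between a simplified (legible) model and surprising events can be illustrated with a short sketch. It is only an analogy under stated assumptions, not the model of this paper: social activity is reduced to a hypothetical categorical distribution `activities`, the state's legible model keeps only the `k` most common categories and assigns a small assumed probability `eps` to everything else, and expected surprisal (cross-entropy) stands in for the surprise the state experiences.

```python
import math

def cross_entropy(p, q):
    """Expected surprisal (bits) of events drawn from p under model q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def legible_model(p, k, eps=1e-3):
    """Simplified model: keep the k most probable categories of p and
    give each remaining category only a tiny probability eps."""
    order = sorted(range(len(p)), key=lambda i: p[i], reverse=True)
    kept = set(order[:k])
    kept_mass = sum(p[i] for i in kept)
    scale = (1 - eps * (len(p) - k)) / kept_mass  # renormalize kept mass
    return [p[i] * scale if i in kept else eps for i in range(len(p))]

# Hypothetical distribution over five kinds of activity in a population.
activities = [0.40, 0.30, 0.15, 0.10, 0.05]

for k in (5, 3, 2):
    q = legible_model(activities, k)
    print(f"k={k}: expected surprisal = {cross_entropy(activities, q):.2f} bits")
```

In this toy example, the fewer categories the simplified model retains, the higher the expected surprisal, because the activities left out of the model still occur and are then highly unexpected. If the population were instead forced to fit the simplified model, that surprisal would fall, which corresponds to the short-term control described above.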

Summary (categorization and legibility):

Freedom	Positive freedoms may be severely restricted in a high-modernist state, often resulting in a revolution (action of the people, restricting negative freedom), or a coup (action of the military, restricting republican freedom).
Free energy	As the ecological uncertainty boundary is crossed, free energy increases, which may be reduced in the same way as freedoms are.
Simplex	As the ecological uncertainty boundary is crossed, a group falls off the simplex and must return to it.

Conclusion

I have argued in this paper that the foundations of human collectivity and human intelligence are based on the sharing of information within groups, and on how individuals manage that information. A similar argument is made by Deutsch (2011), who shows how successful human societies are dynamic and reliant on the transmission and growth of knowledge, which in turn requires both memory (belief, storage of the knowledge by individuals) and behavior (transmission of knowledge to others). In this paper, I have deliberately been agnostic about how the sharing is accomplished, focusing instead on its expected effects. One candidate mechanism for this sharing is emotional signaling. Bayesian Affect Control Theory (BayesAct) is a dual-process model based on the concepts above, but where the parameter space is an emotional identity space that is shared and communicable (Hoey et al., 2016, 2021; Schröder et al., 2016). In this view, strong and persistent ties in human networks are relational rather than transactional (Lawler et al., 2009). Rationality exists at the level of groups of agents, not of individuals. Intelligence is defined, in part, by a social order that exists in a group and is internalized by each member through affective dynamical structures of social identities. Members seek out other members of the group that play complementary roles, enact a joint behavior for their chosen relationship, and seek to establish boundaries around the group in the social identity space (e.g., definitions of what it means to be a group member). Small-scale breakdowns are handled through a restorative set of multi-modal communicative cues that are displayed in the voice, face, gestures, and body, and are commonly referred to as emotion signals. Larger-scale breakdowns are handled by cognitive skill in creating new structures that are reified and internalized by group members (Berger and Luckmann, 1967; Ostrom, 2015). The dynamics of the role relationships, coupled with the human ability to cognitively explore, in a time- and energy-bounded fashion, using reason and rationality, allow the entire group to build, maintain, enact, and transform a social order (Goffman, 1963) that is jointly optimal for survival.

This view is not based on individual preference functions, because preferences depend on the context and on the group, including how the holder of the preferences perceives the group’s members. Instead, the core notion is predictability or avoidance of surprise (Friston, 2010). Since each group member’s attempts to minimize surprise will make up the group’s attempts, the individual is the group, and what are called her preferences are post-hoc explanations for her behavior that can be assembled only after the effects of her actions are known (Mercier and Sperber, 2017). In the heat of the moment, her actions will be based primarily on achieving stability in her current context, the current set of groups she is interacting with. Minimizing surprise for the group is the same as minimizing the free energy of the group, but this must be done, in the end, by individual action. This boils down to the individual minimizing surprise based simultaneously on two contexts for her: as a group member, and as an individual (Hoey, 2021). It also implies that individuals must be able to recognize group membership, which they do through multimodal signals, in particular emotional or status/power signals (Ridgeway, 2019).

I close by noting that uncertainty principles imply some kind of non-determinism. Take the position-momentum duality in physics. In theory, if one knew the exact position and velocity of a particle in a vacuum, one could compute its position and velocity at all future times using Newton’s laws. However, because of the need to make the measurement, one cannot do this exactly and therefore one’s predictions are non-deterministic. In a similar way, the social uncertainty principle I have been discussing is based on non-determinism in social processes, while still maintaining a functional view of humans’ role in the past and future. Individuals and societies are not simple products of their historical environment, as they themselves define that environment while also adapting to it, and the way in which they co-adapt with their environment is based on how they manage uncertainty.

Acknowledgments

I wish to thank John Levi Martin, Tobias Schröder, Neil MacKinnon, Angie Mercer, and David R. Heise for comments on early drafts. I thank the editors and reviewers for their invaluable assistance in refining this work.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: I gratefully acknowledge research funding from the Natural Science and Engineering Research Council of Canada (NSERC).

ORCID iD

Jesse Hoey

Author biography

Dr. Jesse Hoey (ORCID: 0000-0001-5340-2204) is a professor in the David R. Cheriton School of Computer Science at the University of Waterloo. Dr. Hoey holds a Ph.D. degree (2004) in computer science from the University of British Columbia. He has published over one hundred peer-reviewed scientific papers. His 2017 paper in the American Sociological Review on “Modeling dynamic identities and uncertainty in social interactions” won the Outstanding Recent Contribution in Social Psychology Paper Award, and the Outstanding Article Award from the American Sociological Association Section on Mathematical Sociology. Dr. Hoey is Editor-in-Chief for the IEEE Transactions on Affective Computing (2023–2027) and a Senior Programme Committee member for IJCAI 2025.

References

1.

Akerlof

Kranton

(2000) Economics and identity. Quarterly Journal of Economics 115(3): 715–753. DOI: 10.1162/003355300554881.

2.

Anderson

(2017) Private Government. Princeton, NJ: Princeton University Press.

3.

Arrow

(1951) Social Choice and Individual Values. New York, NY: John Wiley and Sons.

4.

Asghar

Hoey

(2015) Monte-Carlo planning for socially aligned agents using Bayesian affect control theory. In: Proc. Uncertainty in Artificial Intelligence (UAI), Amsterdam, The Netherlands, 12–16 July 2015, pp. 72–81.

5.

Åström

(1965) Optimal control of Markov processes with incomplete state information. Journal of Mathematical Analysis and Applications 10: 174–205.

6.

Badcock

Friston

Ramstead

(2019) The hierarchically mechanistic mind: a free-energy formulation of the human psyche. Physics of Life Reviews 31: 104–121. DOI: 10.1016/j.plrev.2018.10.002.

7.

Bail

Argyle

Brown

, et al. (2018) Exposure to opposing views on social media can increase political polarization. Proceedings of the National Academy of Sciences 115(37): 9216–9221. DOI: 10.1073/pnas.1804840115.

8.

Beni

Pietarinen

(2021) Aligning the free-energy principle with Peirce’s logic of science and economy of research. European Journal for the Philosophy of Science 11(3): 94.

9.

Berger

Luckmann

(1967) The Social Construction of Reality. New York, NY: Doubleday: Anchor Books.

10.

Bergman

(2015) An Invitation to General Algebra and Universal Constructions. 2nd edition. Berlin, Germany: Springer.

11.

Berlin

(1961) The Hedgehog and the Fox. London, UK: Weidenfeld and Nicolson.

12.

Berlin

(2002) Historical inevitability. In: Liberty. Oxford, UK: Oxford University Press, 94–165.

13.

Bishop

(2006) Pattern Recognition and Machine Learning. Berlin, Germany: Springer.

14.

Bohr

(1950) On the notions of causality and complementarity. Science 111: 51–54.

15.

Boutilier

Dean

Hanks

(1999) Decision theoretic planning: structural assumptions and computational leverage. Journal of Artificial Intelligence Research 11: 1–94.

16.

Breiger

(1974) The duality of persons and groups. Social Forces 53(2): 181–190.

17.

Bruineberg

Rietveld

(2019) What’s inside your head once you’ve figured out what your head’s inside of. Ecological Psychology 31(3): 198–217. DOI: 10.1080/10407413.2019.1615204.

18.

Brush

Krakauer

Flack

(2018) Conflicts of interest improve collective computation of adaptive social structures. Science Advances 4(1): e1603311. DOI: 10.1126/sciadv.1603311.

19.

Burt

(1940) The Factors of the Mind: An Introduction to Factor Analysis in Psychology. London, UK: University of London Press.

20.

Carr

(1961) What Is History? New York, NY: Penguin Books.

21.

Chaiken

Trope

(1999) Dual-Process Theories in Social Psychology. New York, NY: Guilford.

22.

Connor Desai

Pilditch

Madsen

(2020) The rational continued influence of misinformation. Cognition 205: 104453. DOI: 10.1016/j.cognition.2020.104453.

23.

Cooley

(1902) Human Nature and the Social Order. New York, NY: Scribner’s.

24.

Dawes

(1994) Quantum neurodynamics. IFAC Proceedings Volumes 27(1): 491–495. DOI: 10.1016/S1474-6670(17)46315-4.

25.

Deutsch

(1958) Evidence and inference in nuclear research. Dædalus 87(4): 88–98.

26.

Deutsch

(1999) Quantum theory of probability and decisions. Proceedings of the Royal Society of London A 455: 3129–3137.

27.

Deutsch

(2011) The Beginning of Infinity. New York, NY: Penguin.

28.

Doshi

Gmytrasiewicz

(2009) Monte-Carlo sampling methods for approximating interactive POMDPs. Journal of Artificial Intelligence Research 34: 297–337.

29.

Dundes

(1980) Interpreting Folklore. Bloomington, IN: Indiana University Press.

30.

Durkheim

(2014/1893) The Division of Labor in Society. New York, NY: Free Press.

31.

Evans

(2008) Dual-processing accounts of reasoning, judgment and social cognition. Annual Review of Psychology 59: 255–278.

32.

Facciani

(2020) How Self-Sentiments and Personal Networks Impact Political Polarization. PhD Thesis. Columbia, SC: University of South Carolina.

33.

Falandays

Smaldino

(2021) The emergence of cultural attractors: an agent-based model of collective perceptual alignment. Proceedings of the Annual Meeting of the Cognitive Science Society 43: 548–554.

34.

Fehr

Schmidt

(1999) A theory of fairness, competition, and cooperation. Quarterly Journal of Economics 114(3): 817–868.

35.

Feld

Elmore

(1982) Patterns of sociometric choices: transitivity reconsidered. Social Psychology Quarterly 45(2): 77–85.

36.

Francis

Heise

(2006) Mean affective ratings of 1,500 concepts by Indiana University undergraduates in 2002-3. Computer file, distributed at affect control theory website, program interact bayesact.ca.

37.

Friston

(2010) The free-energy principle: a unified brain theory? Nature Reviews Neuroscience 11(2): 127–138.

38.

Gallagher

(2008) Direct perception in the intersubjective context. Consciousness and Cognition 17(2): 535–543.

39.

Gallagher

(2020) Action and Interaction. Oxford, UK: Oxford University Press.

40.

Gibson

(1986) The Ecological Approach to Visual Perception. Oxford, UK: Taylor and Francis.

41.

Gilead

Trope

Liberman

(2020) Above and beyond the concrete: the diverse representational substrates of the predictive brain. Behavioral and Brain Sciences 43: e121. DOI: 10.1017/S0140525X19002000.

42.

Glenberg

(1997) What memory is for. Behavioral and Brain Sciences 20(1): 1–55. DOI: 10.1017/S0140525X97000010.

43.

Glöckner

Betsch

(2008) Modeling option and strategy choices with connectionist networks: towards an integrative model of automatic and deliberate decision making. Judgment and Decision Making 3(3): 215–228.

44.

Goffman

(1963) Behavior in Public Places. New York, NY: The Free Press.

45.

Goldman

(2012) Theory of mind. In: Margolis

Samuels

Stich

(eds) The Oxford Handbook of Philosophy of Cognitive Science. Oxford, UK: Oxford University Press, 402–424.

46.

Goldstein

Poole

Safko

(2002) Classical Mechanics. Boston, MA: Addison Wesley.

47.

Gollob

(1974) The subject-verb-object approach to social cognition. Psychological Review 81: 286–321.

48.

Graeber

Wengrow

(2021) The Dawn of Everything: A New History of Humanity. Toronto, ON: McClelland & Stewart.

49.

Haas

Cunningham

(2014) The uncertainty paradox: perceived threat moderates the effect of uncertainty on political tolerance. Political Psychology 35: 291–302.

50.

Hahn

(2024) Individuals, collectives, and individuals in collectives: the ineliminable role of dependence. Perspectives on Psychological Science 19(2): 418–431.

51.

Hahn

Hansen

Olsson

(2020) Truth tracking performance of social networks: how connectivity and clustering can make groups less competent. Synthese 197: 1511–1541.

52.

Hawkins

Franke

Frank

, et al. (2023) From partners to populations: a hierarchical Bayesian account of coordination and convention. Psychological Review 130(4): 977–1016. DOI: 10.1037/rev0000348.

53.

Hayek

(1960) The Constitution of Liberty. 2011 edition. Chicago, IL: University of Chicago Press.

54.

Hegel

GWF

(1807) Phenomenology of Spirit. Oxford, UK: Oxford University Press.

55.

Heidegger

(1927) Being and Time. Albany, NY: SUNY Press.

56.

Heise

(2007) Expressive Order: Confirming Sentiments in Social Actions. Berlin, Germany: Springer.

57.

Heise

(2010) Surveying Cultures: Discovering Shared Conceptions and Sentiments. New York, NY: Wiley.

58.

Henrich

(2016) The Secret of Our Success. Princeton, NJ: Princeton University Press.

59.

Henrich

(2020) The WEIRDest People in the World. New York, NY: Farrar, Straus and Giroux.

60.

Hobbes

(1651) Leviathan. Oxford, UK: Oxford University Press.

61.

Hoey

(2021) Equality and freedom as uncertainty in groups. Entropy 23(11): 1384. DOI: 10.3390/e23111384.

62.

Hoey

(2022) A social uncertainty principle with application to principal agent problems. https://osf.io/9m6uq/.

63.

Hoey

Schröder

(2023) Disruption of social orders in societal transitions as affective control of uncertainty. American Behavioral Scientist 67(2): 311–331. DOI: 10.1177/00027642211066055.

64.

Hoey

Schröder

Alhothali

(2016) Affect control processes: intelligent affective interaction using a partially observable Markov decision process. Artificial Intelligence 230: 134–172.

65.

Hoey

MacKinnon

Schröder

(2021) Denotative and connotative management of uncertainty: a computational dual-process model. Judgment and Decision Making 16(2): 505–550.

66.

Hohwy

(2013) The Predictive Mind. Oxford, UK: Oxford University Press.

67.

Hohwy

Roepstorff

Friston

(2008) Predictive coding explains binocular rivalry: an epistemological review. Cognition 108(3): 687–701.

68.

Holyoak

Thagard

(1996) Mental Leaps: Analogy in Creative Thought. Reading, UK: Garnett.

69.

Hume

(1739/1911) A Treatise of Human Nature. London, UK: J. M. Dent and Sons.

70.

Iyengar

Sood

Lelkes

(2012) Affect, not ideology: a social identity perspective on polarization. Public Opinion Quarterly 76(3): 405–431.

71.

James

(1890) Principles of Psychology. New York, NY: Holt.

72.

Jones

Baumgartner

(2012) From there to here: punctuated equilibrium to the general punctuation thesis to a theory of government information processing. Policy Studies Journal 40(1): 1–20.

73.

Kaelbling

Littman

Cassandra

(1998) Planning and acting in partially observable stochastic domains. Artificial Intelligence 101: 99–134.

74.

Kiverstein

(2011) Social understanding without mentalizing. Philosophical Topics 39(1): 41–65.

75.

Kuhn

(1962) The Structure of Scientific Revolutions. Chicago, IL: University of Chicago Press.

76.

Lakoff

Johnson

(1980) Metaphors We Live By. Chicago, IL: University of Chicago Press.

77.

Lawler

Thye

Yoon

(2009) Social Commitments in a Depersonalized World. Manhattan, NY: Russell Sage Foundation.

78.

Le Corbusier

(1964) The Radiant City. New York, NY: Orion Press.

79.

Lindblom

(1959) The science of muddling through. Public Administration Review 19(2): 79–88.

80.

Litt

Eliasmith

Kroon

, et al. (2006) Is the brain a quantum computer? Cognitive Science 30(3): 593–603. DOI: 10.1207/s15516709cog0000_59.

81.

Locke

(1690) Essay Concerning Human Understanding. Oxford, UK: Oxford University Press.

82.

MacKay

DJC

(2003) Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press.

83.

MacKinnon

(2015) Self Esteem and Beyond. New York, NY: Palgrave MacMillan.

84.

MacKinnon

(1994) Symbolic Interactionism as Affect Control. Albany, NY: State University of New York Press.

85.

MacKinnon

Heise

(2010) Self, Identity and Social Institutions. New York, NY: Palgrave and Macmillan.

86.

MacKinnon

Hoey

(2021) Operationalizing the relation between affect and cognition with the somatic transform. Emotion Review 13(3): 245–256. DOI: 10.1177/17540739211014946.

87.

Marietta

Barker

(2019) One Nation, Two Realities: Dueling Facts in American Democracy. Oxford, UK: Oxford University Press.

88.

Martin

(2009) Social Structures. Princeton, NJ: Princeton University Press.

89.

Mason

(2013) The rise of uncivil agreement: issue versus behavioral polarization in the American electorate. American Behavioral Scientist 57(1): 140–159.

90.

Mason

(2015) “I disrespectfully agree”: the differential effects of partisan sorting on social and issue polarization. American Journal of Political Science 59(1): 128–145.

91.

Mead

(1934) Mind, Self and Society. Chicago, IL: University of Chicago Press.

92.

Mercier

(2020) Not Born Yesterday: The Science of Who We Trust and what We Believe. Princeton, NJ: Princeton University Press.

93.

Mercier

Sperber

(2017) The Enigma of Reason. Cambridge, MA: Harvard University Press.

94.

Messick

McClintock

(1968) Motivational bases of choice in experimental games. Journal of Experimental Social Psychology 4: 1–25.

95.

Montesquieu

(1748/1989) The Spirit of the Laws. Cambridge, UK: Cambridge University Press.

96.

Mumford

(1934) Technics and Civilization. Chicago, IL: University of Chicago Press.

97.

Myerson

(1991) Game Theory: Analysis of Conflict. Cambridge, MA: Harvard University Press.

98.

Nowak

(2006) Five rules for the evolution of cooperation. Science 314: 1560–1563.

99.

Nozick

(1974) Anarchy, State and Utopia. New York, NY: Basic Books.

100.

Oaksford

Chater

(2001) The probabilistic approach to human reasoning. Trends in Cognitive Sciences 5(8): 349–357.

101.

Ortony

Norman

Revelle

(2005) Affect and proto-affect in effective functioning. In: Fellous

Arbib

(eds) Who Needs Emotions: The Brain Meets the Machine. Oxford, UK: Oxford University Press, 173–202.

102.

Osgood

Suci

Tannenbaum

(1957) The Measurement of Meaning. Urbana, IL: University of Illinois Press.

103.

Ostrom

(2015) Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge, UK: Cambridge University Press. DOI: 10.1017/CBO9781316423936.

104.

Overgaard

(2017) Merleau-Ponty and Wittgenstein on mindreading: exposing the myth of the given mind. In: Romdenh-Romluc

(ed) Wittgenstein and Merleau-Ponty. London, UK: Routledge, 49–65.

105.

Page

(2007) The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools and Societies. Princeton, NJ: Princeton University Press.

106.

Peirce

(1955) Philosophical Writings of Peirce. New York, NY: Dover Publications, Inc. Selected and edited by Justus Buchler.

107.

Pennycook

Epstein

Mosleh

, et al. (2021) Shifting attention to accuracy can reduce misinformation online. Nature 592(7855): 590–595.

108.

Polanyi

(1951) The Logic of Liberty: Reflections and Rejoinders. London, UK: Routledge and Kegan Paul.

109.

Pothos

Busemeyer

(2022) Quantum cognition. Annual Review of Psychology 73(1): 749–778. DOI: 10.1146/annurev-psych-033020-123501.

110.

Powers

(1973) Behavior: The Control of Perception. Chicago, IL: Aldine Publishing Co.

111.

Putnam

(2000) Bowling Alone: The Collapse and Revival of American Community. New York, NY: Simon and Schuster.

112.

Rabin

(1993) Incorporating fairness into game theory and economics. The American Economic Review 83(5): 1281–1302.

113.

Ramstead

Kirchhoff

Friston

(2019) A tale of two densities: active inference is enactive inference. Adaptive Behavior 28(4): 225–239. DOI: 10.1177/1059712319862774.

114.

Ridgeway

(2019) Status: Why Is It Everywhere? Why Does It Matter? Troy, NY: Russell Sage.

115.

Sartre

(1943) Being and Nothingness. New York, NY: Washington Square Press.

116.

Schröder

Hoey

Rogers

(2016) Modeling dynamic identities and uncertainty in social interactions: Bayesian affect control theory. American Sociological Review 81(4): 828–855. DOI: 10.1177/0003122416650963.

117.

Scott

(1998) Seeing like a State. New Haven, CT: Yale University Press.

118.

Shapira

Liberman

Trope

, et al. (2012) Levels of mental construal. In: Sage Handbook of Social Cognition. Thousand Oaks, CA: Sage, 229–250.

119.

Simon

(1967) Motivational and emotional controls of cognition. Psychological Review 74: 29–39.

120.

Slors

(2012) The model-model of the theory-theory. Inquiry 55(5): 521–542. DOI: 10.1080/0020174X.2012.716205.

121.

Smith

(1759) The Theory of Moral Sentiments. London, UK: W. Strahan.

122.

Smith

Parr

Friston

(2019) Simulating emotions: an active inference model of emotional state inference and emotion concept learning. Frontiers in Psychology 10. DOI: 10.3389/fpsyg.2019.02844.

123.

Stanovich

West

(2000) Individual differences in reasoning: implications for the rationality debate? Behavioral and Brain Sciences 23: 645–665.

124.

Stephenson

(1986a) William James, Niels Bohr, and complementarity: II—pragmatics of a thought. Psychological Record 36: 529–543.

125.

Stephenson

(1986b) William James, Niels Bohr, and complementarity: I—concepts. Psychological Record 36: 519–527.

126.

Sutton

(2022) The quest for a common model of the intelligent decision maker. ArXiv Preprint arXiv:2202.13252.

127.

Tajfel

Turner

(1986) The social identity theory of intergroup behavior. In: Worchel

Austin

(eds) Psychology of Intergroup Relations. Chicago, IL: Nelson-Hall, 276–293.

128.

Taleb

(2001) Fooled by Randomness. New York, NY: Random House.

129.

Wallace

(1983) Principles of Scientific Sociology. New York, NY: Aldine Publishing.

130.

Webster

Abramowitz

(2017) The ideological foundations of affective polarization in the U.S. electorate. American Politics Research 45(1): 621–647.

131.

White

(1992) Identity and Control. 1st edition. Princeton, NJ: Princeton University Press.

Social organization as the collective management of uncertainty
