Abstract
Despite its success in financial markets and other domains, collective intelligence seems to fall short in many critical contexts, including infrequent but repeated financial crises, political polarization and deadlock, and various forms of bias and discrimination. We propose an evolutionary framework that provides fundamental insights into the role of heterogeneity and feedback loops in contributing to failures of collective intelligence. The framework is based on a binary choice model of behavior that affects fitness; hence, behavior is shaped by evolutionary dynamics and stochastic changes in environmental conditions. We derive collective intelligence as an emergent property of evolution in this framework, and also specify conditions under which it fails. We find that political polarization emerges in stochastic environments with reproductive risks that are correlated across individuals. Bias and discrimination emerge when individuals incorrectly attribute random adverse events to observable features that may have nothing to do with those events. In addition, path dependence and negative feedback in evolution may lead to even stronger biases and levels of discrimination, which are locally evolutionarily stable strategies. These results suggest potential policy interventions to prevent such failures by nudging the “madness of mobs” towards the “wisdom of crowds” through targeted shifts in the environment.
Significance statement
Collective intelligence refers to the group knowledge and wisdom that emerges from the collaboration and competition among many individuals. Despite its ubiquity and significance in financial markets and other domains, collective intelligence is not easy to achieve and can also fail dramatically under certain conditions. Examples include infrequent but repeated financial crises, political polarization and deadlock, and various forms of bias and discrimination. We propose an evolutionary framework that provides fundamental insights into the failure of collective intelligence by answering the following questions: In what environments are polarization and discrimination likely to emerge? What are the drivers behind these phenomena? And more importantly, how can we avoid “collective ignorance” and promote collective intelligence instead? We derive collective intelligence as an emergent property of evolution and specify conditions under which it fails. Political polarization emerges in stochastic environments with reproductive risks that are correlated across individuals. Bias and discrimination emerge when individuals incorrectly attribute random adverse events to observable features that may have nothing to do with those events. Moreover, path dependence and negative feedback in evolution may lead to even stronger levels of discrimination. These results suggest potential policy interventions to prevent such failures by nudging the “madness of mobs” towards the “wisdom of crowds” through targeted shifts in the environment, which is likely to be more effective than attempting to outlaw undesirable behaviors. As long as the environmental factors giving rise to these behaviors are still in force, the banned behaviors will re-emerge in one form or another.
Introduction
Collective intelligence, a term for shared or group knowledge and wisdom that emerges from the collaboration and competition of many individuals, has been studied for decades in many disciplines, ranging from the cognitive neurosciences to evolutionary biology to economics and sociology to engineering and computer science. However, despite its ubiquity and importance, collective intelligence is not easy to achieve and can also fail, sometimes repeatedly. One such example is the prevalence of bubbles and crashes in financial markets (Lo, 2013), such as the dot-com bubble in the 1990s, the financial crisis of 2007–2008, and most recently, the financial turmoil during the first few months of the COVID-19 pandemic. No matter how different the latest financial frenzy or crisis appears to be, there are usually similarities to past experience (Reinhart and Rogoff, 2009).
Two of the most hotly debated issues today, political polarization and discrimination, are also examples of the failure of collective intelligence. Since the 2010s, we have witnessed the rise of populism and nationalism as part of a reaction against the global policies of the last 30 years in Western democracies and beyond, not to mention gender, religious, and other types of bias. These examples raise the natural question of why collective intelligence falters in these cases but succeeds so well in so many other contexts.
In this article, we propose a formal mathematical model of the evolution of behavior to understand failures of collective intelligence by answering the following questions: In what environments will polarization and discrimination likely emerge? What are the key drivers behind these phenomena? And, most importantly, how can we avoid “collective ignorance” 1 and promote collective intelligence instead?
We start by introducing our modeling framework, which builds upon the binary choice model of Brennan and Lo (2011) and Zhang et al. (2014a). We then apply this framework to study the rise of extreme political views, after which we turn our attention to discrimination. We conclude by discussing the broad applicability as well as the limitations of our framework, and provide several practical policy implications for reducing or preventing failures of collective intelligence. Given the breadth of engagement in our chosen topic, we also provide a review of the several distinct literatures related to our work in the Supplementary Material.
Modeling framework
When any behavior has consequences for fitness, evolutionary principles apply. The actions underlying polarization and bias, namely which political views to adopt and whether to discriminate against a particular group, yield different economic (or, in an evolutionary context, reproductive) consequences for individuals in different environments. In addition, the nature of risks in the environment also affects what behavior will emerge, and these behaviors may not always agree with individual rationality (Zhang et al., 2014a; 2014b).
Our framework consists of an initial population of hypothetical individuals (not necessarily human) that live for one period of unspecified length, and engage in a single binary decision that has consequences for the random number of offspring they will generate asexually. To the extent that their behavior is linked to fecundity, only the most reproductively successful behaviors will flourish, due to the dynamics of evolution. 2 Although obvious from an evolutionary biologist’s perspective, this observation yields surprisingly specific implications regarding the types of behavior that are sustainable over time, behaviors that are likely to be innate to most living organisms due to the simplicity and generality of the binary choice framework. The evolved behavior will be collectively intelligent to the extent that it maximizes the population growth rate, but it may also generate other undesirable consequences in certain environments.
To illustrate the basic intuition behind this approach, we first present a simple numerical example before turning to the formal model. 3 Consider a population of individuals, each facing a binary choice between one of two possible actions, a and b. Environmental conditions will be positive 70% of the time, and action a will lead to reproductive success, generating 3 offspring for the individual. Environmental conditions will be negative 30% of the time, and action a will lead to 0 offspring. Action b has exactly the opposite outcomes—whenever a yields 3 offspring, b yields 0, and whenever a yields 0, b yields 3. From the individual’s perspective, always choosing a, which has the higher probability of reproductive success, will lead to more offspring on average. However, if all individuals in the population behaved in this “rational” manner, the first time that a negative environmental condition occurs, the entire population would become extinct. Assuming that offspring behave identically to their parents, the “always choose a” behavior cannot survive over time. For the same reason, “always choose b” is also unsustainable.
In fact, in this special case, the behavior with the highest fitness over time is for each individual to choose a 70% of the time, and b 30% of the time, matching the probabilities of reproductive success and failure. The group of individuals exhibiting this probability-matching behavior will achieve the maximum possible growth rate, and eventually, this behavior will dominate the entire population. As a result, it appears as though selection operates at the group level, and that this group—all individuals who randomize their actions with 70% probability—is the fittest from the perspective of reproductive success. 4
This simple but abstract example illustrates the principle that a given behavior may seem irrational, but when viewed in the broader context of a given environment, can come to dominate the population because individuals engaging in such behavior will reproduce more quickly in that environment than those with other behaviors. To alter such behavior, we must look to the environment that gave rise to this adaptation and change that environment, otherwise the behavior will persist.
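The arithmetic behind this example can be checked with a short Python sketch. It assumes a large population of a single type, so that the fraction choosing a in each generation equals p and the type's population is multiplied by 3p in a positive environment and by 3(1 − p) in a negative one. The function name `log_growth_rate` and the grid search are illustrative choices, not part of the original model.

```python
import math

def log_growth_rate(p, prob_good=0.7, offspring=3):
    """Long-run log growth rate of a type that chooses action a with probability p."""
    if p <= 0.0 or p >= 1.0:
        # A deterministic type is wiped out the first time its action fails.
        return float("-inf")
    good = offspring * p          # growth factor when the environment is positive
    bad = offspring * (1 - p)     # growth factor when the environment is negative
    return prob_good * math.log(good) + (1 - prob_good) * math.log(bad)

# Search a grid of behaviors for the one with the highest growth rate.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=log_growth_rate)
print(best_p, log_growth_rate(best_p))
```

Under these assumptions the grid search selects p = 0.7, and both deterministic behaviors (p = 0 and p = 1) have a growth rate of negative infinity, reflecting eventual extinction.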
Table 1. Model parameters and constraints.
Formal model
We begin with a population of individuals that live for one period, produce a random number of offspring asexually and only once, and then die. During their lives, individuals make only one decision: they choose from two actions, a and b, and this results in one of two corresponding random numbers of offspring, $x_a$ and $x_b$. Note that $x_a$ and $x_b$ can be correlated, and their joint distribution represents the entirety of the implications of an individual’s actions for fitness.
We impose a factor structure for $x_a$ and $x_b$, that is, suppose there are two independent environmental factors, $\lambda_1$ and $\lambda_2$, that determine fitness, and $x_a$ and $x_b$ are both linear combinations of these two factors, subject to the following assumptions:
(A1) $\lambda_1$ and $\lambda_2$ are independent random variables with some well-behaved distribution functions, such that $(x_a, x_b)$ and $\log(p x_a + (1 - p) x_b)$ have finite mean and variance for all p ∈ [0, 1], $\beta_a \in [0, 1]$, and $\beta_b \in [0, 1]$; and
(A2) $(\lambda_1, \lambda_2)$ is independent and identically distributed (IID) over time and identical for all individuals in a given generation.
We shall henceforth refer to $(\beta_a, \beta_b)$ as an individual’s characteristics. For each action, individuals’ fitness involves a tradeoff between exposure to these two factors.
We give two examples of such factor structure to provide intuition for the key idea of the model. In the context of the evolution of hypothetical animals, $\lambda_1$ might represent weather conditions and $\lambda_2$ might represent the topography of the terrain. An animal can choose to hunt on the mountain (action a) or in the forest (action b). The success of hunting on the mountain is highly dependent on the weather, corresponding to a high value of $\beta_a$. On the other hand, because the forest provides shelter against extreme weather, the success of hunting in the forest depends mostly on its topography, corresponding to a low value of $\beta_b$.
In the context of social evolution in humans, $\lambda_1$ might represent the degree of globalization in a society, and $\lambda_2$ might represent the amount of natural resources available locally, such as crude oil. An individual then faces the choice of opening a manufacturing facility (action a) or an oil refinery (action b). The success of the manufacturing facility depends on the degree of globalization, which provides access to cheap labor globally, corresponding to a high value of $\beta_a$. However, the success of the oil refinery obviously depends on the availability of crude oil locally, corresponding to a low value of $\beta_b$.
Our framework is general, in the sense that we embed in $x_a$ and $x_b$ (or equivalently, in factors and individual characteristics) the entire biological machinery that is fundamental to evolution, that is, genetics, but which is of less direct interest to social scientists than the link between behavior and fitness. If action a leads to higher fecundity than action b for individuals in a given population, the particular set of genes that predispose individuals to select a over b will be favored by natural selection, in which case these genes will survive and flourish, implying that the behavior “choose a over b” will flourish as well.
Using this framework, we show below that the degree of globalization as a factor can affect the emergence of extreme political views, and that the crime rate of racially categorized groups is another factor that can affect the emergence of discriminatory behaviors.
Individual behavior
Suppose each individual chooses action a with some probability p ∈ [0, 1] and action b with probability 1 − p. Denoting this choice by the Bernoulli random variable $I_p$, which equals 1 with probability p and 0 otherwise, the number of offspring of an individual is given by the random variable $I_p x_a + (1 - I_p) x_b$.
In this framework, an individual is completely characterized by its behavior p and characteristics $(\beta_a, \beta_b)$. We shall henceforth refer to $f \equiv (p, \beta_a, \beta_b)$ as an individual’s type. To complete the specification of our model, we assume that offspring behave in a manner identical to their parent, that is, they have the same characteristics $(\beta_a, \beta_b)$, and choose between a and b according to the same p; hence, the population may be viewed as comprising many different types, each indexed by the triplet f. The assumption that offspring from a type-f parent are also of the same type f implies perfect genetic transmission of behavior from one generation to the next (that is, once a type f, always a type f).
Although clearly unrealistic from a biological perspective, this simplification highlights and clarifies the impact of evolutionary dynamics on behavior, allowing us to derive the growth-optimal behavior explicitly. 6 However, Brennan et al. (2018) have extended this model to allow for mutation, which we shall also consider in our framework below.
In summary, an individual i of type $f = (p, \beta_a, \beta_b)$ produces a random number of offspring
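As a rough illustration, the following Python sketch draws the offspring of individuals of a given type f. It assumes, as one reading of the linear factor structure with characteristics in [0, 1], that each action's payoff is a convex combination of the two factors, x_a = β_a λ1 + (1 − β_a) λ2 and x_b = β_b λ1 + (1 − β_b) λ2. This functional form, the function names, and the example factor values are assumptions made for illustration only.

```python
import random

def action_payoffs(beta_a, beta_b, lam1, lam2):
    """Offspring counts (x_a, x_b) under the assumed convex factor structure."""
    x_a = beta_a * lam1 + (1 - beta_a) * lam2
    x_b = beta_b * lam1 + (1 - beta_b) * lam2
    return x_a, x_b

def offspring(p, beta_a, beta_b, lam1, lam2, rng):
    """Offspring of one individual of type f = (p, beta_a, beta_b).

    I_p is a Bernoulli(p) draw: 1 means action a is chosen, 0 means action b.
    """
    x_a, x_b = action_payoffs(beta_a, beta_b, lam1, lam2)
    i_p = 1 if rng.random() < p else 0
    return i_p * x_a + (1 - i_p) * x_b

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical factor realizations for one generation, shared by all individuals (A2).
lam1, lam2 = 3.0, 1.0
draws = [offspring(0.7, 1.0, 0.0, lam1, lam2, rng) for _ in range(10_000)]
share_choosing_a = sum(d == 3.0 for d in draws) / len(draws)
print(share_choosing_a)
```

Because the factors are common to all individuals in a generation while each Bernoulli draw is individual-specific, the share of a type choosing action a is close to p in a large population.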
Population dynamics
Now consider an initial population of individuals that contains an equal number of all types, which we normalize to be 1 each without loss of generality. Suppose the total number of type $f = (p, \beta_a, \beta_b)$ individuals in generation T is
Over time, because the population grows exponentially, individuals with the largest growth rate will dominate the population at a geometric rate, as specified in the following result: 9
Under assumptions (A1) and (A2), the optimal factor loading,

Furthermore, based on (7), the growth-optimal type,

The three possible scenarios in (7) reflect the relative fitness of the two factors. The growth-optimal characteristics and associated optimal behaviors in Table 2 show that, when

The results in Table 2 also highlight the fact that evolution can lead to multiple coexisting types of individuals. It is mathematically possible that types with different characteristics ($\beta_a$ and $\beta_b$) and different behaviors (p) will lead to the same factor loading
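To make the notion of a factor loading concrete, the sketch below evaluates a type's growth rate, taken here to be the expected value of log(p x_a + (1 − p) x_b) from (A1), by enumerating a small discrete environment. It assumes the convex form x_a = β_a λ1 + (1 − β_a) λ2 (and similarly for x_b), under which the population-level payoff depends on the type only through its loading pβ_a + (1 − p)β_b on the first factor. The two-point factor distributions and all function names are hypothetical.

```python
import math
from itertools import product

# Hypothetical two-point distributions for the factors: (value, probability) pairs.
LAMBDA_1 = [(3.0, 0.5), (0.1, 0.5)]
LAMBDA_2 = [(3.0, 0.5), (0.1, 0.5)]

def factor_loading(p, beta_a, beta_b):
    """Population-level exposure of a type to the first factor (assumed form)."""
    return p * beta_a + (1 - p) * beta_b

def growth_rate(p, beta_a, beta_b):
    """E[log(p*x_a + (1-p)*x_b)] with independent factors, as in (A1)."""
    total = 0.0
    for (l1, w1), (l2, w2) in product(LAMBDA_1, LAMBDA_2):
        x_a = beta_a * l1 + (1 - beta_a) * l2
        x_b = beta_b * l1 + (1 - beta_b) * l2
        total += w1 * w2 * math.log(p * x_a + (1 - p) * x_b)
    return total

# Two different types that share the same loading of 0.5 on the first factor.
type_1 = (0.5, 1.0, 0.0)  # randomizes between two undiversified actions
type_2 = (1.0, 0.5, 0.5)  # always picks a, which is itself diversified
same_growth = math.isclose(growth_rate(*type_1), growth_rate(*type_2))

# Growth-optimal behavior for fixed characteristics (beta_a, beta_b) = (1, 0).
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=lambda p: growth_rate(p, 1.0, 0.0))
print(same_growth, best_p)
```

In this symmetric toy environment the two types grow at the same rate, which is one way to see how distinct types could coexist when they imply the same factor loading.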
Table 2. Growth-optimal type.
Binary choice model of political polarization
We first apply our framework to explain the emergence of coordinated groups, groups whose individual members appear to act with a single purpose, such as unions, military alliances, and patient advocacy groups, among others. Here, we focus on extreme political views as an example to illustrate the emergence of political polarization.
The key lies in the fact that the fitness of different individuals depends on several common factors. The consequences of this one feature, which is the evolutionary instantiation of the adage “the enemy of my enemy is my friend,” are enormous, giving rise to seemingly coordinated behavior among subsets of individuals, or groups, purely through evolutionary dynamics.
Consider a hypothetical island isolated from the rest of the world. There are two factors that determine the fitness of any individual on this island. The first factor, $\lambda_{\text{glob}}$, represents the degree of globalization where, without loss of generality, we assume that larger values represent higher degrees of globalization. 11 The second factor, $\lambda_{\text{other}}$, represents everything else that may be relevant to an individual’s fitness. This is obviously an oversimplification, but more general specifications will become obvious once we present the analysis for this simpler setting. 12
A simple example
To develop intuition about the model, we first consider the special case in which the factors are specified by the following Bernoulli distribution
An individual on this island lives for one period, has one opportunity to choose one of two political attitudes (actions), pro-globalization or anti-globalization, that determines its fitness, and then dies immediately after reproduction. The number of offspring is given by $x_{\text{anti}}$ if the individual chooses to be anti-globalization, and $x_{\text{pro}}$ if the individual chooses to be pro-globalization.
On the other hand, for those who are harmed by globalization, choosing to be anti and supporting policies that limit globalization can promote their fitness when the level of globalization is high. Therefore, they have a positive characteristic
To summarize, we use the superscript “benefit” or “harm” to represent these two types of individuals, and their fitness is determined by
Under assumptions (A1) and (A2) and the environment specified by (9) and (10), the population growth rate in (4) can be evaluated explicitly as

This example illustrates a primitive form of polarization. When the average degree of globalization is either too low or too high, two distinct groups of individuals emerge. They coexist through the evolutionary process, but within each group, individuals share the same characteristics. A particular behavior must be paired with a particular set of characteristics to achieve the optimal growth rate. Note that the individuals in (13) and (14) are optimal only in the group sense. In fact, from any individual’s perspective, the survival-maximizing behavior is to always choose the action with higher average fitness (p = 0 or 1). The continuous spectrum of growth-optimal behaviors in Figure 1 only emerges because a group possesses survival benefits above and beyond an individual. In our framework, these benefits arise purely from stochastic environments with systematic risk. 14
The usual conception of group selection in the evolutionary biology literature is that natural selection acts at the level of the group, instead of at the more conventional level of the individual (or the gene), and that interaction between members within each group is much more frequent than interaction among individuals across groups. In this case, similar individuals are usually clustered geographically. However, in our model, individuals do not interact at all. Nevertheless, the fact that individuals with the same behavior generate offspring with like behavior makes them more likely to cluster geographically and appear as a “group.”

In reality, the environment is generally nonstationary. Factor distributions change over time, and old factors fade while new factors emerge. In fact, the change in the environment can itself be a consequence of previous adaptations. We see this in the history of globalization itself. From the Silk Road dating back to the 2nd century BCE, to the World Trade Organization established in 1995, the course of globalization has always been fueled by a number of historical factors, such as the desire to trade local goods for exotic products, or to gain access to cheap labor.

Imagine that the environment $(\lambda_{\text{glob}}, \lambda_{\text{other}})$ experiences a sudden shift. To an outside observer, behaviors among individuals in this population will become increasingly similar after the shift, creating the appearance, but not necessarily the reality, of intentional coordination, communication, and synchronization. If the reproductive cycle is sufficiently short, this change in population-wide behavior may seem highly responsive to environmental changes, giving the impression that individuals are learning about their environment. This is indeed a form of learning, but it occurs at the population level (a form of collective learning), not at the individual level, and not within an individual’s lifespan.

Figure 1. Growth-optimal behavior $p^{\text{benefit}}$ for individuals who benefit from globalization, and $p^{\text{harm}}$ for individuals who are harmed by globalization. The horizontal axis shows the probability q in (9). The vertical axis and the color bar show the growth-optimal behavior, p*, in different environments parameterized by q. Blue indicates the “pro-globalization” action, while dark red indicates the “anti-globalization” action.
The general case
The factor distribution in (9) can be easily generalized to an arbitrary number of offspring

Figure 2. Growth-optimal behaviors for both the “Benefit” group and the “Harm” group, $f^{\text{benefit}}$ and $f^{\text{harm}}$, as functions of environmental parameters. (2a): moderate globalization with q = 0.5. (2b): high globalization with q = 0.9. The first row shows $f^{\text{benefit}}$; the second row shows $f^{\text{harm}}$; the last row shows the absolute difference, that is, polarization: $|f^{\text{benefit}} - f^{\text{harm}}|$.

Figure 2(a) shows the case with a moderate level of globalization over time (q = 0.5). The plot in the first row shows the growth-optimal behavior for those who benefit from globalization ($f^{\text{benefit}}$). As the fitness for the globalization factor (C1) increases, individuals tend to be pro (blue), but as the fitness for the other factor (C2) increases, individuals tend to be anti (dark red). The plot in the second row shows the growth-optimal behavior for those who are harmed by globalization ($f^{\text{harm}}$), which is the opposite of the behavior for the “Benefit” group, in the sense that $f^{\text{harm}} = 1 - f^{\text{benefit}}$. The plot in the last row shows the absolute difference between the growth-optimal behaviors of the two groups of individuals, $|f^{\text{benefit}} - f^{\text{harm}}|$, which is a simple measure of polarization. When the “Benefit” group and the “Harm” group show opposing behaviors, the level of polarization is high (dark blue).
Figure 2(b) shows the same set of growth-optimal behaviors when the average level of globalization is high (q = 0.9). Compared to the behaviors in Figure 2(a), when the average globalization shifts toward a higher level, behaviors shift accordingly as well. As a result, the same environmental conditions (the region of the (C1, C2)-plane) that generated unity before may lead to polarization in this environment.
The simple example here considers two groups of individuals: those who benefit from globalization
Binary choice model of bias and discrimination
Our framework can also be used to understand the emergence of bias and discrimination, as well as to determine their underlying causes and what can be done to counteract these causes. We use racial discrimination as the main example of bias in this section, but the same principles apply more broadly to other kinds of bias and discrimination, including gender, sexual orientation, religion, socioeconomic strata, and so on.
A simple example
We consider a hypothetical world with a population composed of two racial groups: a majority group which we refer to as the “Andorians,” and a minority group which we refer to as the “Tellarians.” Group membership is unambiguous, mutually exclusive (an individual is a member of one and only one group), immutable, and observable by all. 15
There are two factors that determine each individual’s fitness: $\lambda_A$ and $\lambda_T$. They represent social interactions with Andorian and Tellarian individuals, respectively. An individual who interacts with Andorian individuals is subject to the Andorian factor, $\lambda_A$, whereas an individual who interacts with Tellarian individuals is subject to the Tellarian factor, $\lambda_T$. $\lambda_A$ and $\lambda_T$ are independent random variables with the following distributions
Historically, the Tellarian community has been politically underrepresented, with less access to education and economic opportunity. As a result, this greater inequality has led to a higher crime rate for the Tellarian community compared to the average population. Note that the higher crime rate is not because of race, but the result of a complicated set of determinants, including less access to resources historically. However, in this model, individuals observe only each other’s race, modeled here as group membership, which they use as a marker in the absence of any other information. The true underlying causes of higher crime rates, such as a lack of educational opportunity or socioeconomic status, are assumed to be unobservable, a key assumption.
We now focus on the perspective of an Andorian, who faces a decision between one of two actions (whether or not to discriminate against a Tellarian), which determines their fitness. We assume that an Andorian’s number of offspring is given by $x_{\text{discriminate}}$ if the individual chooses to discriminate, and $x_{\text{not discriminate}}$ if the individual chooses not to discriminate
For a particular behavior p (the probability to discriminate against a Tellarian), the population growth rate in (4) is a function of the environment (that is, the adverse probabilities, q and r) and the characteristic (β). In this simple case, as in the example of political polarization in the previous section, we can characterize the growth-optimal behavior explicitly:
Under assumptions (A1) and (A2) and the environment specified by (16) and (17), the population growth rate can be evaluated explicitly as
and the behavior (that is, the value of p) that maximizes this growth rate is
Equation (19) gives the behavior that yields the highest growth rate and therefore characterizes the behavior favored by natural selection. Recall that p* = 1 corresponds to fully discriminatory behavior. We plot p* in Figure 3 with two different population group percentages. Figure 3(a) shows a world with an equal number of Andorian and Tellarian individuals (β = 0.5), and Figure 3(b) shows a world with only 20% Tellarians (β = 0.2). In both cases, when the adverse probability associated with Tellarians (r) is low compared to the adverse probability associated with Andorians (q), no discrimination emerges. As r increases relative to the adverse probability for Andorians (q), discrimination emerges, that is, p* increases from 0 to 1. This is because individuals who choose to avoid interactions with Tellarians gain an evolutionary advantage by reducing their exposure to the factor $\lambda_T$ and the higher adverse probability r on average. This effect emerges from the fact that in our model, race is the only observable marker of the individuals in the population and the true underlying causes of the higher adverse probability are not observable. This phenomenon is also referred to as statistical discrimination (Phelps, 1972; Arrow, 1973).

In addition, we can observe from the first case of (19) that the environment leading to full discrimination (p* = 1) does not depend on the percentage of Tellarians in the population (β). It is only a function of the adverse probabilities, q and r. This is also clear by comparing Figure 3(a) and (b). In both cases, when the adverse probability associated with Tellarians is high compared to that for Andorians

On the other hand, when Tellarians are the minority (β = 0.2), the region where individuals have partially discriminatory behavior shrinks (given by the middle case in (19), where p* is strictly between 0 and 1). This implies that when the group in consideration consists of a small fraction of the entire population, the boundary of the environmental conditions leading to no discrimination and full discrimination is sharper.

In our simple example, the key to the emergence of discrimination is the fact that race is the only observable feature of individuals. However, these implications will likely remain true even if other attributes of the individuals are partially observable, given the insight of the memory/prediction framework by Hawkins and Blakeslee (2004), who argue that we store memory patterns and use them to predict what will happen in the future. When individuals experience a random adverse event in association with a Tellarian, they tend to attribute it to the Tellarian’s race because it is the most easily observable marker, leading to discrimination against Tellarians. Based on a similar hypothesis, Bordalo et al. (2016) develop a model of stereotyping based on the representativeness heuristic (Tversky and Kahneman, 1983): agents overweight the prevalence of a trait in a group when that trait appears to be highly representative of the group in question. This is, however, not the root cause of the adverse event. In other words, it is much too easy to confuse correlation with causation.

We have seen that the difference in relative adverse probabilities, q and r, can lead to serious biases and discriminatory practices. Next, we strengthen these results by showing that even when the two groups have equal probabilities of adverse events, or even in certain cases when Tellarian individuals have a lower probability of adverse events than their Andorian counterparts, discrimination can still emerge.
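The qualitative pattern described here can be explored with a small Python sketch under a stylized specification that is assumed for illustration. In it, discriminating exposes an individual only to the Andorian factor, while not discriminating yields a mix with weight 1 − β on the Andorian factor and β on the Tellarian factor. Each factor takes a low value with its adverse probability (q or r) and a high value otherwise. The payoff values `GOOD` and `BAD`, the grid search, and the function names are hypothetical, so the numbers need not match Figure 3.

```python
import math

GOOD, BAD = 2.0, 1.0  # hypothetical offspring counts in normal / adverse states

def growth_rate(p, q, r, beta):
    """Expected log offspring for a type that discriminates with probability p.

    Assumed fitness: discriminating yields lambda_A; not discriminating yields
    (1 - beta) * lambda_A + beta * lambda_T.
    """
    w = (1 - p) * beta  # population-level exposure to the Tellarian factor
    total = 0.0
    for lam_a, prob_a in ((BAD, q), (GOOD, 1 - q)):
        for lam_t, prob_t in ((BAD, r), (GOOD, 1 - r)):
            total += prob_a * prob_t * math.log((1 - w) * lam_a + w * lam_t)
    return total

def optimal_p(q, r, beta, steps=1000):
    """Grid approximation of the growth-optimal probability of discriminating."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: growth_rate(p, q, r, beta))

p_equal_risk = optimal_p(q=0.2, r=0.2, beta=0.5)
p_high_r_half = optimal_p(q=0.1, r=0.8, beta=0.5)
p_high_r_minority = optimal_p(q=0.1, r=0.8, beta=0.2)
p_partial = optimal_p(q=0.2, r=0.3, beta=0.5)
print(p_equal_risk, p_high_r_half, p_high_r_minority, p_partial)
```

In this toy setting, equal adverse probabilities give p* = 0, a much higher r gives p* = 1 for both values of β, and an intermediate r gives a partially discriminatory behavior, which is the same ordering of regimes described in the text.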

Figure 3. Growth-optimal behaviors, p*, as a function of environmental parameters. (3a): percentage of Tellarians in the population β = 0.5. (3b): percentage of Tellarians in the population β = 0.2.
Feedback loops
Discrimination against Tellarians in the general population affects the Tellarian community adversely. For example, those individuals who participate in discriminatory behavior against Tellarians may contact law enforcement more often, leading to a higher incidence of false accusations against the Tellarian community. They may develop more hostile behaviors toward the Tellarian community, reducing educational and economic opportunities for the Tellarian community, which further increases the probability of an adverse event associated with Tellarians.
Another less obvious type of feedback comes from the increasing popularity and prevalence of engagement-based recommender systems on news and social media platforms. When presented with new information (which may be a news broadcast or a social media post), humans tend to anchor towards what they originally believe (Tversky and Kahneman, 1974). As a result, even a small initial bias acquired randomly can be reinforced and amplified through feedback based on a recommender algorithm.
To incorporate this feedback loop into our model, we make the following assumption: (A3) The distribution of factor $\lambda_T$ in generation T is given by
When the level of bias is higher in the population (that is, when
Note that the factors in (16) are identically distributed over time. In other words, they depend neither on time nor on past realizations of the evolutionary process. In contrast, the factor in (20) introduces path dependency into the evolutionary process, because it depends on the past realizations of population behavior. As a result, $\lambda_T$ is no longer stationary over time. This simple change generates a surprisingly rich set of new implications.
We first use simulation methods to develop an intuition for the effect of different intensities of negative feedback. We consider a world that starts from an equal number of individuals in the population with 11 different behaviors: p ∈ {0, 1/10, 2/10, …, 1}. Figure 4 shows the evolution of the relative frequency of these behaviors over 10,000 generations, given different environmental conditions.

Figure 4. The evolution of 11 behaviors, p ∈ {0, 1/10, 2/10, …, 1}, over 10,000 generations. The vertical axis represents the relative frequency of each behavior, and the horizontal axis represents time. (4a): equal adverse probability (q = r = 0.2), no feedback (τ = 0); (4b): equal adverse probability (q = r = 0.2), mild feedback (τ = 0.6); (4c): equal adverse probability (q = r = 0.2), more feedback (τ = 1); (4d): lower Tellarian adverse probability (q = 0.2, r = 0.15), even more feedback (τ = 2).
Figure 4(a)–(c) depict simulations of an environment with equal adverse probabilities for Tellarians and Andorians (q = r = 0.2), with the feedback intensity, τ, increasing from 0 (no feedback) to 1 (the adverse probability is doubled with full discrimination in the population). Figure 4(a) corresponds to an environment with no feedback, and the behavior p* = 0 (no discrimination) quickly dominates the population. This also corresponds to the growth-optimal behavior in the upper right corner of Figure 3(a). As the feedback intensity increases to τ = 0.6, as shown in Figure 4(b), positive p* (partial discrimination) emerges. Finally, as the feedback intensity increases to τ = 1, as shown in Figure 4(c), p* = 1 (full discrimination) quickly dominates the population.
In addition, Figure 4(d) illustrates an environment in which Tellarians have a lower probability of an adverse event than Andorians (r < q). Given conditions of strong feedback (τ = 2), fully discriminatory behavior (p* = 1) still dominates the population. This is because the feedback intensity is so high that discrimination quickly worsens the adverse probability for the Tellarian population, leading to severe discrimination against them, despite the fact that the Tellarian population starts with a more favorable adverse probability.
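The effect of the feedback intensity in the four panels of Figure 4 can be illustrated with a small numerical sketch. It assumes a linear feedback rule in which the Tellarian adverse probability r is scaled by (1 + τ × average bias in the population) and capped at 1. This functional form is an assumption, chosen only because it matches the description that τ = 1 doubles the adverse probability under full discrimination. The function name and the cap are hypothetical and are not taken from the model's equations.

```python
def feedback_adjusted_adverse_prob(r: float, tau: float, mean_bias: float) -> float:
    """Hypothetical linear feedback rule (an assumption, not the paper's equation).

    r         : baseline adverse probability for Tellarians
    tau       : feedback intensity (tau = 0 means no feedback)
    mean_bias : average discriminatory behavior in the population, in [0, 1]
    """
    if not 0.0 <= mean_bias <= 1.0:
        raise ValueError("mean_bias must lie in [0, 1]")
    # Scale r by the amount of bias in the population and cap at probability 1.
    return min(1.0, r * (1.0 + tau * mean_bias))


# Parameter settings of Figure 4, evaluated under full discrimination (mean_bias = 1).
panels = {
    "4a": (0.2, 0.2, 0.0),
    "4b": (0.2, 0.2, 0.6),
    "4c": (0.2, 0.2, 1.0),
    "4d": (0.2, 0.15, 2.0),
}
for name, (q, r, tau) in panels.items():
    r_full = feedback_adjusted_adverse_prob(r, tau, mean_bias=1.0)
    print(f"{name}: q = {q}, baseline r = {r}, r under full discrimination = {r_full:.2f}")
```

Under this assumed rule, the setting of Figure 4(d) starts with r = 0.15 < q = 0.2, but full discrimination raises the Tellarian adverse probability to about 0.45, well above q. This is consistent with the explanation that strong feedback can overturn an initially favorable adverse probability.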
More generally, despite the challenging complexities of a nonstationary and path-dependent environment created by the feedback mechanism, we can analytically quantify the growth-optimal behavior, p*, implicitly. The factor with feedback in (20) is mathematically equivalent to the simple environment we considered in (16), except that the adverse probability associated with Tellarians, r, is replaced by the feedback-adjusted adverse probability,
Under assumptions (A1)–(A3) and the environment specified by (17), the growth-optimal behavior, p*, with feedback must satisfy the following fixed-point condition (21).

Equation (21) is a necessary, but not sufficient, condition for any behavior to survive in the long run. Due to its nonlinearity, the growth-optimal behavior, p*, implied by (21) may not be unique for some environments. However, without intervention, only one behavior is stable and able to persist in each environment. To characterize it, we need to define the new notion of a locally evolutionarily stable strategy.
Locally evolutionarily stable strategies
An evolutionarily stable strategy (ESS), first introduced by Maynard Smith and Price (1973), is a strategy that is impermeable to other strategies when adopted by a population in adaptation to a specific environment. In other words, it cannot be displaced by an alternative strategy, which may be novel or initially rare. In game-theoretical terms, an ESS is a refinement of the Nash equilibrium concept: it is a Nash equilibrium that is also “evolutionarily stable.” Once fixed in a population, natural selection alone is sufficient to prevent alternative (or mutant) strategies from replacing it.
We define a locally evolutionarily stable strategy (L-ESS) to be one that is stable locally. In other words, it is a strategy that cannot be displaced by any local perturbation of that strategy.
A In other words, when randomness in the environment causes the average behavior of the population,

Figure 5 shows an environment with strong feedback intensity (τ = 2). Recall that the nonlinearity of the fixed-point condition (21) can lead to multiple solutions of p*, and we compare the L-ESS (Figure 5(a)) and non-L-ESS behaviors (Figure 5(b)). The dashed triangular regions represent the set of environments where the fixed-point condition (21) leads to one L-ESS and one non-L-ESS behavior. In this region, the non-L-ESS behaviors are less discriminatory, and the strong feedback intensity nudges the population to evolve towards fully discriminatory behaviors. In addition, Figure 5(c) shows the differences in population growth rates between the L-ESS behavior and non-L-ESS behavior. We refer to this as the “L-ESS excess growth rate,” defined in (22).

Figure 5. Comparison of L-ESS and non-L-ESS behaviors for an environment with strong feedback intensity (τ = 2). (5a): L-ESS growth-optimal behaviors implied by the fixed-point equation (21); (5b): non-L-ESS growth-optimal behaviors if the fixed-point equation (21) yields multiple solutions; otherwise, we plot the unique solution from (21), which is L-ESS; (5c): L-ESS excess growth rate as defined in (22), which is the difference in growth rates between the L-ESS behavior and the non-L-ESS behavior.
Under assumptions (A1)–(A3) and the environment specified by (17), if (21) yields multiple solutions where the L-ESS behavior is more discriminatory than the non-L-ESS behavior

This example demonstrates that path dependency can lead to evolutionary outcomes with slower growth rates than otherwise achievable, and the population ends up with a suboptimal growth rate compared to a world without feedback. In the context of our model, L-ESS behavior implies that the Andorian individual will always avoid any interaction with the Tellarian individual, a state of “collective ignorance” that could otherwise be improved with greater diversity in the population.
Feedback can lead to greater bias
With our understanding of L-ESS behavior, we can now finally show the variation in growth-optimal behavior in environments with different feedback intensities. We have the following intuitive but important result:
Proposition 6. Under assumptions (A1)–(A3) and the environment specified by (17), as the feedback intensity, τ, increases, discriminatory behaviors are more likely to emerge, in the sense that they dominate the population for increasingly larger regions of environmental conditions, as parameterized by the adverse probabilities, q and r, for the Andorian and the Tellarian groups, respectively.

Figure 6 shows L-ESS behaviors for different levels of feedback intensity and demonstrates Proposition 6. As τ increases from 0 (Figure 6(a)) to 2 (Figure 6(d)), discriminatory behaviors dominate the population for increasingly larger regions of environmental conditions. When feedback is absent from these evolutionary dynamics (Figure 6(a)), discrimination only emerges when Tellarians have a higher probability of adverse events than Andorians. However, when the feedback intensity is high (Figures 6(c) and (d)), full discrimination prevails, even in environments where the adverse probability for Tellarians is lower than that for Andorians.

These results emphasize the central role feedback plays in the emergence of bias and discrimination. By combining the observation that individuals tend to attribute the occurrence of random adverse events to the only observable characteristic, race (Hawkins and Blakeslee, 2004), with the negative feedback from those random adverse events, our model has demonstrated the power of these forces in generating widespread bias and discrimination in the population. These results shed light on the evolutionary dynamics behind the emergence of biases not only toward the Tellarian community (which is of course fictional), but also other forms of bias and discrimination.

From the policy perspective, these results emphasize the importance of preventing the effects of negative feedback in the greater population. One example is to proactively provide more educational and economic opportunities to disadvantaged groups. This does not directly eliminate the negative feedback, but will indirectly help to reduce its impact by elevating their socioeconomic status and reducing their adverse probabilities. Another example is to enforce regulations that cut through such (sometimes unconscious) negative feedback mechanisms. These actions together will create more favorable environments for collective intelligence to emerge rather than allowing collective ignorance to propagate, and can potentially reduce, and eventually reverse, selection pressure behind the emergence of bias and discrimination.

Figure 6. L-ESS behaviors, p*, as a function of environmental parameters, when there is feedback. The feedback intensity, τ, increases from 0 in (6a) to 2 in (6d).
Path-dependent evolution and initial conditions
When feedback loops exist in the environment, evolution may become path dependent. Therefore, the dominant behavior that emerges in a given population will sometimes depend on the initial composition of that population. We consider evolution in populations that begin with non-uniform initial distributions of behaviors in this section.
Figure 7 demonstrates that two realizations of an evolutionary system under the same environment can lead to different growth-optimal behaviors, and different initial populations can also lead to different growth-optimal behaviors. Like the simulations illustrated in Figure 4, we simulate the evolution of 11 behaviors, p ∈ {0, 1/10, 2/10, …, 1}, for an environment with equal adverse probabilities for the Tellarian and Andorian populations (q = r = 0.2), and with a feedback intensity τ = 1.

Figure 7. Path dependency of evolution in an environment with equal adverse probability (q = r = 0.2) and feedback τ = 1. We show the evolution of 11 behaviors, p ∈ {0, 1/10, 2/10, …, 1}, over time, with different starting populations. The vertical axis represents the relative frequency of each behavior, and the horizontal axis represents time. (7a): the initial population has low discrimination, n0 = (0.8, 0.02, 0.02, …, 0.02); (7b): a different simulation run with the same conditions as in (7a); (7c): the initial population has high discrimination, n0 = (0.02, 0.02, …, 0.02, 0.8); (7d): the initial population has low discrimination, n0 = (0.8, 0.02, 0.02, …, 0.02), and behaviors have a 0.1% mutation rate.
We use n0 to denote the frequency of different behaviors in the initial population. Figure 7(a) and (b) show two simulation runs of the evolution for an initial population with little bias: n0 = (0.8, 0.02, 0.02, …, 0.02). In other words, 80% of the initial population starts with no discrimination (p = 0). After 2000 generations, p = 0 dominates the population in the first case, whereas p = 1 dominates in the second case.
In contrast, Figure 7(c) shows the evolution for an initial population with a substantial amount of bias: n0 = (0.02, 0.02, …, 0.02, 0.8), hence 80% of the initial population starts with fully discriminatory behavior (p = 1). Not surprisingly, p = 1 dominates.
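The structure of such a path-dependent simulation can be sketched as follows. This is a toy illustration under stated assumptions and is not the model defined by (16) and (20): the fitness function, the payoffs `good` and `bad`, the exposure weights, and the linear feedback rule r(1 + τ × average bias) are all hypothetical choices made only so that the loop is runnable. What the sketch does share with the description above is the set of 11 behaviors, the initial frequency vectors n0, and the fact that each generation's environment depends on the population's past average behavior.

```python
import numpy as np


def simulate(n0, q, r, tau, generations, seed=0, good=2.0, bad=0.5):
    """Toy path-dependent evolution of behavior frequencies (hypothetical fitness).

    n0          : initial frequencies of the behaviors p = 0, 1/10, ..., 1
    q, r        : baseline adverse probabilities for Andorians and Tellarians
    tau         : feedback intensity
    good, bad   : assumed payoffs without / with an adverse event
    Returns the behavior grid and an array of frequencies per generation.
    """
    rng = np.random.default_rng(seed)
    p = np.linspace(0.0, 1.0, len(n0))
    freq = np.asarray(n0, dtype=float)
    freq = freq / freq.sum()
    history = np.empty((generations + 1, p.size))
    history[0] = freq
    for t in range(generations):
        # Feedback: the Tellarian adverse probability depends on current average bias.
        mean_bias = float(freq @ p)
        r_eff = min(1.0, r * (1.0 + tau * mean_bias))
        # One shared (systematic) draw per group and generation.
        y_andorian = bad if rng.random() < q else good
        y_tellarian = bad if rng.random() < r_eff else good
        # Assumed exposure: p = 0 splits interactions evenly, p = 1 avoids Tellarians.
        fitness = 0.5 * (1.0 + p) * y_andorian + 0.5 * (1.0 - p) * y_tellarian
        freq = freq * fitness
        freq = freq / freq.sum()
        history[t + 1] = freq
    return p, history


n0_low = [0.8] + [0.02] * 10   # little initial bias, as in Figure 7(a) and (b)
n0_high = [0.02] * 10 + [0.8]  # substantial initial bias, as in Figure 7(c)
for label, n0, seed in [("low, run 1", n0_low, 1), ("low, run 2", n0_low, 2), ("high", n0_high, 1)]:
    p, hist = simulate(n0, q=0.2, r=0.2, tau=1.0, generations=2000, seed=seed)
    print(label, "-> most frequent behavior after 2000 generations: p =", p[hist[-1].argmax()])
```

Because the toy fitness function is an assumption, the printed outcomes should not be expected to reproduce Figure 7. The point is only that the same code path, started from different n0 or different random seeds, feeds different histories of average bias back into the environment.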
When the initial population is non-uniform, some behaviors may quickly become extinct before they have a chance to spread. In fact, if we allow a small amount of mutation in each generation (modeled as in Brennan et al. (2018): with some small probability, for example 0.1%, the offspring of type-p parents will in fact be of type p′ ≠ p, where p′ is uniformly distributed in [0, 1] − {p}), discrimination will again dominate, even if the initial population begins with very little bias. Figure 7(d) shows such an example.
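The mutation step can be sketched on the vector of behavior frequencies. The version below is a simplification under two assumptions: mutation is applied in expectation to frequencies rather than to individual offspring, and a mutant is assigned uniformly to one of the other behaviors on the 11-point grid rather than to a continuum of types. The function name `mutate` is hypothetical.

```python
import numpy as np


def mutate(freq, rate=0.001):
    """Expected frequencies after mutation (simplified, grid-based assumption).

    A fraction `rate` of each type's offspring switches to one of the other
    k - 1 types, chosen uniformly; the remaining fraction keeps its type.
    """
    freq = np.asarray(freq, dtype=float)
    k = freq.size
    # Mass arriving at each type: an equal share of the mutants from all other types.
    incoming = rate * (freq.sum() - freq) / (k - 1)
    return (1.0 - rate) * freq + incoming


# A behavior that has gone extinct is reintroduced at a small positive frequency.
only_unbiased = [1.0] + [0.0] * 10
print(mutate(only_unbiased))
```

Total population mass is preserved by construction, and every behavior has positive frequency after one step. This is why a behavior that died out early can still spread later when mutation is allowed.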
This result underscores the fact that public policy may be able to guide a society towards different outcomes by purposefully imposing a strong prior belief onto the population. This may be achievable by encouraging fairer beliefs through early education, and by providing more accurate portrayals of other cultures to counteract inaccurate stereotypes. From the perspective of our binary choice model, these policies would nudge the initial population such that its subsequent evolution may lead to a less discriminatory society collectively.
Discussion
We present an evolutionary framework based on a binary choice model subject to evolutionary dynamics and stochastic environments that affect the fitness of a differentiated population. This framework yields collective intelligence in the form of sophisticated rational behaviors that emerge out of an initial population in which all possible behaviors are equally represented (Brennan and Lo, 2011, 2012). Within the same model, we can also specify conditions under which this collective intelligence breaks down, especially under conditions where agents face correlated fitness, or in the presence of path-dependent feedback. This offers one explanation of the emergence of political polarization, bias, and racial discrimination.
The root cause of these failures is complexity, particularly with respect to population heterogeneity, stochastic environments, and feedback mechanisms. Yet it is precisely in such complex environments that we are in most need of collective intelligence. Our results show that it is the complexity within the evolutionary process, not the complexity of the task (the task in our model is a simple binary choice), that can undermine collective intelligence. This is a far more subtle and challenging problem.
Of course, our model has several limitations and is by no means a complete description of reality. Even a partial description would involve the interplay between sophisticated human behavior and highly complex nonstationary environments with multiple unknown factors. However, our approach offers a starting point for describing and understanding the fundamental principles behind the emergence of these failures of collective intelligence. A natural next step for future research is to develop more realistic models and conditions under which such failures can be expected.
Some of the biggest challenges facing humanity can only be solved through a collective and global effort. They include not only dealing with political polarization and discrimination, but also climate change, various life-threatening diseases, economic and social inequality, and the spread of disinformation. Extensions of our framework may help to explain the spread of disinformation and belief polarization, another example of the failure of collective intelligence (Haghtalab et al., 2021). This is closely related to political polarization and racial discrimination because the spread of disinformation facilitates the formation of these biases. With the advent and popularity of engagement-based recommender systems on news and social media platforms, disinformation has a much greater chance of propagating across the population. One of the great insights of Tversky and Kahneman (1974) is that humans tend to anchor towards their original beliefs. When new information is first presented, whether through a news service or simply a Twitter post, and regardless of its authenticity, there will always be a group of people who happen to share a similar belief, even if that belief is false. However small this initial group is, its beliefs can be amplified rapidly throughout the population through engagement-based recommendations. This effect, in turn, will cause recommender algorithms to serve up similar information more frequently, reinforcing these false beliefs in a vicious cycle.
So how can we prevent failures of collective intelligence? Our evolutionary perspective suggests that the key is to foster environments under which the desired behavior—collective intelligence—will emerge naturally through evolutionary dynamics, instead of simply regulating against the undesired outcome which could create selective pressures that make matters worse. In our example of globalization, the fundamental cause of the emergence of polarization is the sharp difference in personal outcomes that comes with global integration: some individuals benefit, while others suffer. Constructing the right tools for those who are harmed by the polarizing factor—options such as extended education and providing employment opportunities in the new industrial landscape—is likely to be more effective than simply “shutting down” globalization.
More generally, proactively providing educational, social, and economic opportunities to counteract negative feedback loops, encouraging more accurate beliefs among current and future generations through early exposure, and shaping the environment to favor collective intelligence are likely to be more successful policies than attempting to outlaw undesirable behaviors. As long as the environmental factors giving rise to these behaviors are still in force, the banned behaviors will re-emerge in one form or another.
Continuing with our example of Andorians and Tellarians, if bias and discrimination already exist against the Tellarians, an obvious policy may be to simply criminalize such discrimination. This can lead to more forced interactions between the Andorians and the Tellarians, which in turn gives everyone a higher factor exposure to the Tellarians. However, because bias already exists in the population (the Tellarians will have a higher probability of adverse events, either initially or through negative feedback loops), this will lead to more Andorians experiencing adverse events from their interactions with Tellarians, inevitably leading to even stronger negative feedback (and even higher adverse probabilities) for the Tellarians, a cognitive tendency that is difficult to change (Hawkins and Blakeslee, 2004). As a result, direct attempts to outlaw bias and discrimination against the Tellarians may actually make matters worse. In this sense, our society needs not only more integration among different groups (Anderson, 2010) but, more importantly, measures to ensure that negative feedback does not reinforce itself after the integration.
These simple examples illustrate how seemingly well-intended policies can create more selective pressure for collective ignorance to emerge. The fundamental reason is that they are addressing the symptoms, not the root cause, of these failures of collective intelligence. We do not model the objective function that policy makers should use for managing societal issues such as polarization and discrimination, but implicit in our framework is the fitness of different types of individuals that determines their survival. Therefore, as representatives of a given group of constituents, policy makers can reasonably be expected to focus on what improves the long-term fitness (in the economic sense) of those constituents. Our evolutionary framework provides a lens through which the underlying causes—the environment in which these failures emerge—can be identified so as to construct more productive policies.
Using history as a mirror, these implications are even more relevant now as we experience the Artificial Intelligence (AI) revolution (Makridakis, 2017; Diamandis and Kotler, 2020). Just as in the Industrial Revolution 200 years ago, and modern globalization over the past 50 years, the AI revolution will increase aggregate productivity while inevitably leading to another major shift in the industrial landscape and composition of the labor market. In this process, some individuals will benefit while others may be harmed. The policy suggestions outlined in this article, including extended education for those whose jobs have been replaced by AI, and providing children with equal access to education, particularly in STEM and AI-related subjects, are more pressing than ever.
Supplemental Material
Supplemental Material for The wisdom of crowds versus the madness of mobs: An evolutionary model of bias, polarization, and other challenges to collective intelligence by Andrew W Lo and Ruixun Zhang in Collective Intelligence
Footnotes
Acknowledgements
Research support from the MIT Laboratory for Financial Engineering is gratefully acknowledged. We thank Zach Church, Jessica Flack (editor), Steven A. Frank, Wendy Liu, David C. Schmittlein, Harriet A. Zuckerman, and an anonymous reviewer for helpful comments and discussion, and Jayna Cummings for editorial assistance. The views and opinions expressed in this article are those of the authors only, and do not necessarily represent the views and opinions of any institution or agency, any of their affiliates or employees, or any of the individuals acknowledged above.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
