Abstract
How individuals value their own and others’ outcomes is conceptualized as their Social Value Orientation (SVO). Research demonstrates that SVO is a valid predictor of cooperative behavior across various empirical settings. However, once individuals interact repeatedly, the relative strength and stability of the SVO–behavior link are less clear cut. We postulate that learning mechanisms have a bearing on cooperative behavior and potentially override the influence of SVO. In an experiment with a Step-Level Public Goods design with impact asymmetry (N = 120), participants were randomly assigned to groups of five and interacted for six rounds. SVO was measured with the 9-item Triple-Dominance Measure. Using a multi-level Bayesian approach, we corroborate that SVO is predictive of behavior at the onset of interaction. Yet, after the first interaction, the relationship between SVO and behavior virtually disappears.
Introduction
In collective action initiatives focused on the production of public goods, a group of individuals with aligned interests undertakes joint action to further these interests (Olson, 2012). A public good is something that, by definition, a single individual cannot produce relying exclusively on their own means (Offerman, 2013). Its achievement is instead only possible through the contribution of sufficiently impactful people (Baldassarri, 2017). Individuals participating in the joint action are said to invest in the (production of the) public good. Many instances of collective action focused on public good production have a ‘critical mass’ structure, meaning that collective action is only successful if a sufficient number of group members participate (Marwell and Oliver, 1993). Examples include social movements, grassroots associations, and renewable energy initiatives (Goedkoop et al., 2022; McAdam et al., 2003). Here, social dilemmas typically arise in identifying and motivating the coalition of individuals who should invest (Flache and Dijkstra, 2015).
How individuals value their own and others’ outcomes in situations of interdependence is conceptualized as their social preference or Social Value Orientation (SVO) (Messick and McClintock, 1968; Van Lange et al., 1997). The majority of SVO measures classify individuals as either prosocial (altruistic and cooperative individuals) or proself (individualistic or competitive), and in our hypotheses we contrast prosocials and proselfs. Within the category of prosocials, a distinction is frequently made between social preferences for joint gain maximization and inequality aversion; in the current paper, however, this distinction plays no role (but see the Discussion section). A prosocial SVO is correlated with personality traits such as generosity and trustworthiness (Liu et al., 2024; Thielmann and Hilbig, 2014). Furthermore, individuals scoring high on the Honesty-Humility factor of the HEXACO personality inventory tend to behave fairly in cooperation with others, even in the presence of high risks of being exploited (Ashton and Lee, 2007:156). Proselfs are motivated to produce outcomes beneficial for themselves (with individualists focused on personal gain and competitors on maximizing positive payoff differences with others).
SVO is one of the most frequently studied psychological traits in research on behavior in public good problems, and ample research demonstrates that SVO is predictive of cooperative behavior in public goods games (Ackermann and Murphy, 2019; Balliet et al., 2009; Bilancini et al., 2022; Bogaert et al., 2008; Cartwright et al., 2024; Kanagaretnam et al., 2009; Pletzer et al., 2018; Thielmann et al., 2020; Van Lange et al., 2013). That being said, most of these studies have looked at the working of SVO in experiments in which individuals interact in one-shot games (i.e., interacting only once).
In repeated encounters, mechanisms such as trust and reciprocity start to play a role as individuals get into a conditionally cooperative mindset if they know they will face each other again (Balliet et al., 2009; Boone et al., 2010; Croson et al., 2005; Dreber et al., 2014; Fischbacher et al., 2001; Nax et al., 2015; Thielmann et al., 2020; Van Lange et al., 2013). This partially explains why literature shows mixed results concerning the SVO–behavior link in repeated settings (Ackermann et al., 2016, 2019; Andreoni, 1995; Fiedler et al., 2013; Fischbacher and Gächter, 2010; Murphy and Ackermann, 2015; Nax et al., 2015, 2016, 2017; Nax and Perc, 2015; Neugebauer et al., 2009; Nunner, 2022; Offerman, 2013; Offerman et al., 1996; Parks, 1994; Przepiorka et al., 2021; Pulford et al., 2017). A number of studies find that the SVO–behavior link remains stable and significant in repeated settings (Balliet et al., 2009). Other studies report that the strength of the SVO–behavior link diminishes over repetitions, with a tendency to bounce back at the very end (Ackermann and Murphy, 2019). Yet other studies find weak or no clear evidence for the SVO–behavior link, showing instead that structural characteristics of the public good game determine investment behavior. It seems that these mixed findings could be due to at least the following factors.
First, partner matching matters. When interaction partners change after each round (randomized groups), participants cannot learn about specific others, rendering personal preferences (such as SVO) a more useful guideline for behavior (Balliet et al., 2009; Fiedler et al., 2013; Pulford et al., 2017). In contrast, when group composition remains constant and learning becomes possible, SVO effects are less pronounced (Przepiorka et al., 2021). Second, the structure of the public good game affects decision interdependence. In linear public goods games, contributions are additive, while in critical mass games, outcomes depend on whether a contribution threshold is reached, making players’ decisions more reliant on expectations of others’ behavior (Abele et al., 2010; Ledyard, 1995). Results from repeated linear public goods games often demonstrate a reliable SVO–behavior link (Ackermann et al., 2016; Nax and Perc, 2015; Offerman et al., 1996).
Studies of repeated games with a critical mass structure find that the SVO–behavior link is not as convincing (Przepiorka et al., 2021). For instance, in the study of Przepiorka et al. (2021), fixed groups of participants repeatedly play a volunteer’s dilemma game in which the critical mass is reached if one member of the group invests, while additional investments do not increase the public good outcomes (Weesie and Franzen, 1998). They find that SVO is not predictive of investment behavior. It is rather the structural properties of the interaction situation, and in particular payoff asymmetry between group members, that determines behavior.
Our paper contributes to the literature by taking another look at the SVO–behavior link in repeated games with a critical mass structure, fixed groups, and asymmetry between group members. Rather than introducing payoff asymmetry (like Przepiorka et al., 2021), we consider asymmetry in terms of the impact investments have. Impact asymmetry is a prominent feature of many real-life critical mass problems. Consider the case in which a group of community members seeks to establish a community energy initiative (CEI, e.g., Nientimp et al., 2024), which can be anything from a local group giving energy advice to a fully citizen-run energy company. To start such an initiative, the input of various types of resources is required, such as technical skills and knowledge, legal and organizational expertise, communication skills and knowledge, etc. In addition, sheer time and effort also count. Importantly, setting up a CEI has a critical mass structure (e.g., Goedkoop et al., 2022). Now consider the potential contribution of a community member with a lot of knowledge concerning renewable energy technology, and compare it to the potential contribution of someone whose main asset is free time, but who lacks any relevant knowledge and skills. All other things equal, the contribution of the former is likely to carry a much larger impact on the chances of successfully starting the initiative than the contribution of the latter. And this is true quite apart from any payoff asymmetries existing between them. Thus, impact asymmetry should be considered as a separate factor affecting behavior in critical mass problems, on par with payoff asymmetry. Moreover, a similar account can be given for any number of collective action and social movement endeavors: Apart from the existence of payoff asymmetries, impact asymmetries are ubiquitous. It is important to note that our focus is explicitly on the SVO–behavior link in repeated interactions with a model that harbors impact asymmetry.
However, our theoretical model and hypotheses focus solely on the SVO–behavior link in repeated interactions. Bringing together lessons on the SVO–behavior link in critical mass problems, repeated encounters, fixed group matching, and structural impact asymmetry, the overarching research question of our paper becomes: How does the SVO–behavior link change over the course of repeated interactions in cooperation problems having a critical mass structure in which the impact of contributions is heterogeneous?
To answer the research question, we design a theoretical framework to be tested empirically with an experiment following a Step-Level Public Good design (SPG) (Van de Kragt et al., 1983). In an SPG, the public good is produced if and only if a sufficiently large and resourceful subgroup of group members contributes to its production. Over and above this threshold, additional contributions have no effect on the level of the public good. Likewise, when total contributions fall short of the threshold, they have no impact and the public good is not produced. If enough impactful individuals invest, the public good is produced and a beneficial surplus (payoff) is obtained by all individuals in the group. Since it is a public good characterized by jointness of supply, all group members obtain the benefits regardless of whether they contributed towards its production (Dijkstra and Bakker, 2017). Investing is a costly action, however, and investors reap a lower net gain than non-investors, provided the public good is produced. Moreover, if the public good is not successfully produced, any investor experiences a net loss whereas non-investors remain unaffected. In our design, the costs and benefits of investment are identical across group members. Hence, there are no payoff asymmetries. However, the impact of investments on the production of the public good is heterogeneous across group members. With this model we thus capture critical mass and impact asymmetry.
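The payoff structure described in this paragraph can be summarized in a single expression. This is only a sketch: the symbols are introduced here for illustration and are not taken from the original design. Here x_i indicates whether player i invests, w_j is the impact of player j, T is the production threshold, e is the endowment, c is the cost of investing, and b is the benefit each member receives when the good is produced, with b > c > 0 assumed.

```latex
\pi_i \;=\; e \;-\; c\,x_i \;+\; b\,\mathbf{1}\!\Big[\textstyle\sum_{j} w_j x_j \ge T\Big],
\qquad x_i \in \{0,1\}, \quad b > c > 0 .

% Resulting payoff ordering over the four possible outcomes for player i:
% (not invested, produced) > (invested, produced) > (not invested, not produced) > (invested, not produced)
e + b \;>\; e + b - c \;>\; e \;>\; e - c .
```

Because c and b carry no player index, the expression contains no payoff asymmetry; heterogeneity enters only through the impacts w_j inside the indicator.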
We postulate that the SVO–behavior link in repeated interactions works as follows. At the onset of interaction, without prior exposure to and information about the behavior of their interaction partners, people make decisions in accordance with their SVO. Subsequently, when people meet again and have thus been exposed to the actual behavior of their counterparts, they learn what behavior to expect and choose their own behavior based on these expectations. Consequently, people do not fully rely on personal preferences after the first few interaction episodes. This reasoning is in line with literature on learning in repeated interactions, which shows that repetition strongly affects cooperation (Axelrod, 1986; Bó, 2005). Two key mechanisms shape cooperative behavior in repeated interactions: learning from past experiences and control based on expectations of future interactions (Blake et al., 2015; Flache and Macy, 2002; Macy and Flache, 2002). Individuals adjust their behavior based on prior exposure to partners’ actions (Diekmann and Przepiorka, 2016; Przepiorka et al., 2021). For example, repeated exposure to non-cooperative partners can lead even cooperative individuals to defect in response (Rapoport et al., 2015). Building on this, we derive a counterfactual reinforcement-learning model: Players choose actions that would have been optimal in the previous round based on what they learned about others’ actions. Our main and most significant contribution is thus that we take a closer look at how the mechanisms of learning and control in repeated play might override the effects of SVO in the SPG. We elaborate on the details of this model extensively in the Theory and Hypotheses section.
To answer our research question and test our theoretical model empirically, we use a dataset of experimental research conducted by Dijkstra and Bakker (2017). This dataset of N = 120 participants comprises experimental decisions and measures of SVO; it has been neither published nor analyzed before and fits the aims of our paper well. The experiment was computer-mediated and programmed using Z-tree (Fischbacher, 2007). It consisted of two elements: (1) two variations of SPG experiments in which fixed groups of five participants interacted repeatedly for six consecutive rounds, and (2) measurements of SVO elicited by the Triple-Dominance Measure SVO questionnaire. To account for the relatively small sample size and the multilevel structure of the data (decisions nested in individuals, who are nested in groups), we use a Bayesian approach (Baldwin and Fellingham, 2013; Gelman et al., 2014; Gelman and Hill, 2007). We use independent, non-informative priors on the probability scale for all parameters, implying that we assume no prior knowledge but rather inform our parameter estimates with data. While such an approach does not obviate a power analysis, it does inform us of the extent to which the data contain information that reduces our uncertainty about effects.
The structure of our paper is as follows. In the Theory and Hypotheses section, we describe the theoretical framework and hypotheses. In the Experimental Design and Data section, we describe the setup and methods of our research. In the Results section, we report a descriptive overview of the data and show results of hypotheses tests. In the Conclusion and Discussion section, we answer the research question, discuss our findings and their implications, elaborate on limitations, and propose directions for future research.
Theory & hypotheses
SVO and cooperative behavior
Ample research demonstrates that SVO is predictive of cooperative behavior in collective action situations (Ackermann et al., 2016; Andreoni, 1995; Au and Kwong, 2004; Balliet et al., 2009; De Cremer and Van Lange, 2001; Dijkstra and Bakker, 2017; Fiedler et al., 2013; Fischbacher and Gächter, 2010; Kanagaretnam et al., 2009; Murphy and Ackermann, 2015; Murphy et al., 2011; Nax et al., 2015; Nax and Perc, 2015; Neugebauer et al., 2009; Offerman, 2013; Offerman et al., 1996; Parks, 1994; Przepiorka et al., 2021; Pulford et al., 2017; Thielmann et al., 2020; Van Lange et al., 2013). The theoretical mechanism underlying this effect is that in collective action and social dilemmas generally, individual actions carry externalities for other group members. In fact, individual contribution decisions (co-)determine payoff distributions across self and other. Hence, in any rational or purposeful form of decision-making, preferences over such payoff distributions are pertinent to the decision. In Step-Level Public Goods in particular, individual contributions raise the likelihood that the public good will be successfully produced, generating positive externalities for others in addition to higher payoffs for self. Moreover, since contributing carries a cost whereas receiving the positive externality is a windfall, especially individuals with a prosocial value orientation are expected to be motivated to contribute. We expect that SVO is crucial at the onset of repeated interactions. In particular, in the first (few) encounter(s) of a series, we expect individuals to behave in congruence with their SVO. From this we derive the following hypothesis:
H1: SVO predicts individual behavior at the onset of repeated interactions in the SPG. In particular, prosocial individuals have a higher investment probability than others.
SVO and behavior in repeated SPGs
The structure of an SPG creates at least two problems for groups to overcome. The first problem is the free-rider problem: Though for each player (following the parlance of game theory, we will be using the terms “player” and “individual” interchangeably) the value of the public good exceeds the costs of contributing to it in case of successful production, things would be even better from a selfish perspective if others invested and one could benefit from the good without getting one’s hands dirty (Lumsden et al., 2012; Rapoport, 1988). However, if all players follow this reasoning, the good is not produced.
The second problem is designating the coalition of investors, as the threshold structure of the game also creates a coordination problem of who should invest. On the one hand, too few investments lead to a failure in public good production, while on the other hand, investments above the threshold will not have any additional effects, leading to a loss of costly efforts either way. This threshold structure and the resultant coordination problem have an important implication for the subjectively perceived consequences of individual behavior and hence for the decisions players make. If a player “pessimistically” believes her potential contribution will add little or nothing to the probability that the SPG is produced, the consequence is that she believes that her decision will not produce any positive externalities for others at all. Hence, the conduit for SVO to affect choice behavior in the SPG is cut off. Such “pessimistic” beliefs about the effects of one’s contribution easily arise in the SPG; they result whenever a player believes the contributions of others put the group either well below or well above the threshold for success. In either case, any additional individual contribution is subjectively inefficacious. Hence, due to the threshold property of the SPG, individual contributions cannot be said to unconditionally carry positive externalities for other group members, contrary to what is the case in, for instance, a linear public good (Abele et al., 2010; Ackermann et al., 2016; Zelmer, 2003).
In the study of Przepiorka et al. (2021), the coordination problem is effectively solved by creating payoff asymmetries between players, such that the onus of contributing naturally falls on those benefitting most from the public good. In such a situation, SVO has little effect on behavior, as it is overwhelmed by payoff considerations. Moreover, by the sheer force of the focal point payoff asymmetries provide, participants very quickly learn the best response and coordinate on that tacitly, even in the symmetric treatment, where groups quickly learn that taking turns is an optimal response. As we argued above, however, impact asymmetry is distinct from payoff asymmetry, and empirically at least as important. Moreover, in the absence of payoff asymmetry, impact asymmetry by itself does not necessarily provide an overwhelmingly potent focal point. Thus, while embodying an empirically relevant element of social structure, impact asymmetry cannot a priori be expected to wash out either SVO or learning. Hence, situations of impact asymmetry may provide interesting contexts for testing the relative predictive force of SVO and learning.
In a repeated interactions context, we posit that beliefs about the efficacy of one’s contribution are dependent upon the history of the interaction. Research demonstrates that repetition has a major influence on an individual’s tendency to cooperate (Axelrod, 1986; Bó, 2005; Dijkstra and Oude Mulders, 2014; Kerr, 1992). When individuals share a cooperative future, two key features affect individuals’ considerations for cooperative behavior: past experiences (shadow of the past) and expected future interactions (shadow of the future) (Blake et al., 2015). These mechanisms are known as ‘learning’ and ‘control’, respectively (Flache and Macy, 2002; Macy and Flache, 2002). Individuals learn from past interactions, form beliefs or expectations about their partners’ future actions, and adjust their behavior as a control mechanism for outcomes in the future (Diekmann and Przepiorka, 2016; Przepiorka et al., 2021). For instance, if individuals in groups learn by experience that their counterparts repeatedly defect or do not contribute, they become more likely to also defect and retaliate, even though they might individually have cooperative preferences (Rapoport et al., 2015).
Given the threshold structure of the SPG and the learning and control mechanisms that emerge in repeated play, we expect that the influence of SVO is overridden when groups interact repeatedly. In particular, through repeated play, a player is confronted with the fact that her behavior does not unconditionally produce positive externalities for others. Rather, the extent to which a player estimates her potential contribution to be efficacious (i.e., to have a discernible impact on the probability that the public good is produced) depends on her expectations concerning the contributions of others. Said expectations are in turn determined by the history of group interaction. The extent to which a player expects to be able to affect the payoff distribution between self and others at all is crucially dependent upon the history of group interaction. Therefore, the impact of dispositional traits such as SVO should wane over time as contextual structure (i.e., the threshold structure) and learning mechanisms (i.e., forming expectations about how far removed from the threshold the group’s contributions likely are) gain a more profound influence. To investigate this proposition, we derive the following hypothesis:
H2: The impact of SVO on investment in the SPG decreases over multiple interactions.
Investment probability and learning
Here we take a closer look at how the mechanisms of learning and control in repeated play might override the effects of SVO in the SPG. Beliefs, learning and control behavior can develop both through rational learning and reinforcement learning. Rational learning, also known as Bayesian learning, means that individuals form expectations of other players’ future behavior by rationally integrating new information and prior beliefs (Flache and Dijkstra, 2015). In the case of reinforcement learning, individuals learn by associating behavior with favorable or unfavorable outcomes. Actions linked to favorable outcomes will be repeated with greater likelihood whereas actions linked to unfavorable outcomes will more likely be shunned (Flache and Dijkstra, 2015; Flache and Macy, 2002). This way of learning links to psychological theories of classical and operant conditioning (Hughes et al., 2018).
In order to theorize learning processes in the experimental SPG we implement, we must first specify two of its key properties. First, in the present context we are considering a finitely repeated SPG (Croson, 2010). This feature is common knowledge among all group members, as is the number of rounds to be played. Second, after each round of interaction (in which all players simultaneously and in ignorance of what others are doing choose whether or not to invest), all players learn their own payoff and whether the public good was produced. Hence, players do not learn which of their fellow group members invested or not. Given these general features of the situation, we can theorize the learning process.
When individuals have participated in the first round of interaction and are presented with a certain outcome in terms of their own payoffs, and whether the public good is produced or not, they form expectations about their group mates’ behavior and chances of success for the next round. Based on these expectations, they decide whether or not to adjust their behavior in order to control the group outcome and their own payoffs in the next round of social interaction. Current learning models such as reinforcement learning (Camerer and Weber, 2012), belief learning (Gächter and Renner, 2010) and experience-weighted attraction learning (Camerer et al., 2003) show that outcomes in previous rounds serve as guidelines for actions in future rounds (Camerer and Fehr, 2006). In the present paper, we choose a simple “counterfactual reinforcement-learning” principle as the basis of our theoretical reasoning: In the current round, players choose the behavior that would have been the best response in the previous round, given their beliefs about what happened in the previous round. This learning principle is akin to rational learning in that beliefs are formed based on evidence from previous interactions. It falls short of being fully rational because not the entire history of interaction, but rather just the latest round, is used to form beliefs. Apart from the fact that players do not know exactly what every other player did in the previous round, our learning principle is very close to Cournot best-response behavior (Ho et al., 1988).
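SPG previous round histories. From the perspective of a focal player, the previous round ends in one of four histories: the free-rider history (the player did not invest and the public good was produced), the reward history (the player invested and the public good was produced), the punishment history (the player did not invest and the public good was not produced), and the sucker history (the player invested and the public good was not produced).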
In terms of payoffs for self, the free-rider history ranks highest, followed by the reward history, the punishment history, and the sucker history. In terms of learning, the free-rider history provides definite information that enough others invested to produce the public good. Likewise, the sucker history is proof of the fact that an insufficient number of others pulled their weight. Thus, in both cases best responding to last round’s outcomes leads to non-contribution in the current round. Moreover, this best response behavior is optimal regardless of the player’s SVO: Given both histories, contribution would not have (free-rider history) or cannot have (sucker history) affected the payoffs of others and would only have (free-rider history) or only has (sucker history) depressed the payoff of self. Hence, irrespective of a player’s SVO, the free-rider and sucker histories are predicted to lead to a decreasing probability of contribution in the current round. Early in the game, by the very nature of SVO, prosocials are more likely to experience the sucker history, while proselfs more likely experience the free-rider history. But either way, both SVO-types quickly learn their contribution does not matter.
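The informational content of the four histories can be illustrated with a minimal sketch. The mapping from (own investment, public good outcome) to history names is inferred from the descriptions in the text, and the function names are hypothetical. The second function returns False when the previous round proves that the player's own investment made no difference to the outcome, and None when the round leaves this open.

```python
from typing import Optional


def classify_history(invested: bool, produced: bool) -> str:
    """Name the previous-round history from the focal player's point of view.

    The mapping is an interpretation of the text, not code from the study.
    """
    if produced:
        return "reward" if invested else "free-rider"
    return "sucker" if invested else "punishment"


def own_investment_mattered(invested: bool, produced: bool) -> Optional[bool]:
    """Return False if the outcome proves own investment changed nothing,
    and None if the player cannot tell from the feedback received."""
    history = classify_history(invested, produced)
    if history in ("free-rider", "sucker"):
        # Free-rider: the good was produced without the player.
        # Sucker: the good was not produced despite the player's investment.
        return False
    # Reward and punishment: the player may or may not have been pivotal.
    return None


# Walk through all four combinations of own action and group outcome.
for invested in (False, True):
    for produced in (False, True):
        print(invested, produced,
              classify_history(invested, produced),
              own_investment_mattered(invested, produced))
```

Under the counterfactual principle, a False result corresponds to the two histories in which not contributing would have been the best response regardless of SVO; a None result marks the two histories in which beliefs about one's own efficacy remain open.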
The reward and punishment histories are more ambiguous from a learning perspective. In the former, the player cannot be sure whether withholding her contribution would not have yielded a higher individual payoff. In the latter, the player cannot know for sure whether her contribution would have pushed the group over the threshold. Thus, these histories leave room for believing one’s contribution will be impactful in terms of payoffs for self and other. Therefore, we predict SVO to have an effect after these histories have played out. In particular, we predict prosocial players to have higher probabilities of investing following these histories than other SVO types. Again, early on, prosocials are relatively more likely to experience reward histories than punishment histories, but either way they should be more cooperative. Based on this logical analysis, we present our third set of hypotheses:
H3: A player’s investment probability in any round but the first is affected by the outcome of the previous round. In particular, for all SVO types the investment probability is decreased after an occurrence of the free-rider history (H3a) and the sucker history (H3b). Following the reward history (H3c) and punishment history (H3d), prosocials’ probability of investing is increased compared to other SVO types.
In the next section, we outline the design and methods with which we test our hypotheses.
Experimental design & data
Experimental design
To test our hypotheses, we use a secondary dataset (N = 120 participants) from study 1 reported in Dijkstra and Bakker (2017). Dijkstra and Bakker (2017) use only the one-shot data they collected, and we are the first to analyze the repeated data from that experiment. The one-shot SPG data is not part of this paper. Below we first outline the experimental design of the study and then we elaborate on the details of the dataset.
The experiment is computer-mediated and programmed using the Z-tree software (Fischbacher, 2007) and consists of three separate parts presented in a balanced order: (i) an SVO questionnaire, (ii) a one-shot SPG and (iii) a repeated SPG. The experimental design features two key manipulations across treatments. The first manipulation is impact asymmetry, that is, the effect of a participant’s investment on the production of the public good depends on the impact assigned to that participant. Impacts are randomly assigned to participants. The second manipulation concerns whether or not the participants were fully cognizant of the differential impacts (complete or incomplete information). In our statistical analyses, we control for these manipulations.
The design consists of a 2 × 2 × 2 × 2 structure in which ‘SVO-first’ (SVO questionnaire first), ‘repeated-first’ (repeated game first) and ‘information’ (complete/incomplete) are between-subject factors, and ‘iteration’ (repeated/one-shot) is a within-subject factor. This yields a total of eight between-subject treatments. The manipulation of the between-subjects information treatment is not under investigation in this paper and is discussed at length in Dijkstra and Bakker (2017). However, in our statistical analyses, we control for these manipulations. Participants were randomly assigned to a group before playing the first SPG game (either one-shot or repeated depending on the treatment) and were randomly reassigned to new groups before the start of the second SPG game. This procedure was common knowledge. Below we describe the SVO and SPG parts in detail.
The SVO part
Participants’ SVO was elicited using the nine-item Triple-Dominance Measure (TDM; Van Lange et al., 1997). We are aware that other measures, such as the SVO slider measure, are currently more commonly used. However, since we rely on secondary data, we cannot change the measurement. Nevertheless, research shows that the TDM is robust to participant behavior that could bias observations, such as random answers or lapses in attention (Bakker and Dijkstra, 2021; De Matos Fernandes et al., 2022; Lui et al., 2024).
SVO item example.
The SPG part
Payoffs of individual in SPG depending on own behavior and behavior of others.
In addition to the endowment of 10 points, each player is randomly assigned an impact, which represents the relative weight of that player’s investment in producing the public good, i.e., it operationalizes the efficacy of a player. The impact distribution in this game is heterogeneous (varies between players) and is as follows: One player has an impact of 50, one player has an impact of 2, and three players have an impact of 16. Players always know their own impact, but do not always know which impact other players have. It is common knowledge that the impacts of all 5 players sum to 100. In half of the sessions, players have complete information on the distribution of impacts. In the other half of the sessions, players have incomplete information, knowing only their own impact and the fact that the impacts sum to 100. The public good is produced if and only if the impacts of the investors sum to at least 51. Players have the same impacts throughout the entire experiment. After all group members have made their decisions, participants are informed whether the public good is produced. Participants are not told which of their fellow group members invested. The SPG is repeated for six rounds with the same group of participants, and this fact is common knowledge. Points that participants earned in the SPG games were converted to money at the rate of 10 eurocents per point, and participants received this amount in a closed envelope after the experimental session was finished. The payment procedures were common knowledge, and final pay ranged between 5 and 10 euros for a 40-min session.
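To make the production rule concrete, the following sketch applies the impacts and the threshold stated above and enumerates the minimal sets of investors that produce the public good. The player labels A to E and the function names are hypothetical; only the impact values (50, 16, 16, 16, 2) and the threshold of 51 come from the design.

```python
from itertools import combinations

# Impacts and threshold as stated in the text; the labels A-E are hypothetical.
IMPACTS = {"A": 50, "B": 16, "C": 16, "D": 16, "E": 2}
THRESHOLD = 51


def is_produced(investors):
    """Return True if the summed impact of the investors reaches the threshold."""
    return sum(IMPACTS[p] for p in investors) >= THRESHOLD


def minimal_winning_coalitions():
    """List investor sets that produce the good but fail if any member withdraws."""
    players = sorted(IMPACTS)
    result = []
    for size in range(1, len(players) + 1):
        for coalition in combinations(players, size):
            if not is_produced(coalition):
                continue
            # Minimal means that removing any single member breaks production.
            still_wins_without_someone = any(
                is_produced([p for p in coalition if p != q]) for q in coalition
            )
            if not still_wins_without_someone:
                result.append(coalition)
    return result


print(minimal_winning_coalitions())
print(is_produced(["B", "C", "D", "E"]))  # all players except the impact-50 player
```

With these numbers, the four players other than the impact-50 player sum to exactly 50 and therefore cannot reach the threshold on their own, whereas the impact-50 player needs only one additional investor, even the player with impact 2.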
Dataset
Number of participants, broken down by information condition and impacts.
Results
Descriptives
Based on the nine-item TDM for SVO in our sample of N = 120, 2 subjects were classified as competitive, 62 subjects were classified as prosocial, and 39 subjects were classified as individualist. The remaining 17 subjects were unclassified, since they did not meet the TDM’s consistency check. Since our hypotheses focus on the effects of a prosocial SVO on investment behavior compared to all others, we choose not to exclude the unclassified participants.
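As an illustration of how such a classification and the resulting prosocial dummy might be coded, consider the following sketch. It assumes the commonly used TDM criterion of at least six consistent choices out of nine; the text above does not state the threshold that was applied, and the label coding and function names are hypothetical.

```python
from collections import Counter

# Assumption: a subject is classified when at least 6 of the 9 choices are
# consistent with a single orientation (the commonly used TDM criterion).
MIN_CONSISTENT = 6


def classify_tdm(choices, min_consistent=MIN_CONSISTENT):
    """Classify one subject from nine item-level choices.

    `choices` is a list of labels, one per item, each being "prosocial",
    "individualist", or "competitive" (hypothetical coding).
    """
    label, count = Counter(choices).most_common(1)[0]
    return label if count >= min_consistent else "unclassified"


def is_prosocial(choices):
    """Dummy used for the prosocial-versus-all-others contrast."""
    return int(classify_tdm(choices) == "prosocial")
```

Because unclassified subjects receive a 0 on the dummy, they stay in the comparison group, in line with the decision not to exclude them.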
Distributions of SVO types of half samples who took nine-item TDM before SPG (“SVO-first”) and after SPG (“SVO-last”); N = 120.
Note: Fisher’s exact test for difference between distributions: p = 1.

Number of investments in SPG per round, per group.
Number of subjects (with % per row) investing in 0, 1, 2, …, 6 rounds out of 6, by SVO type.
Hypotheses tests
To test our hypotheses, we estimate a series of multilevel logistic regressions with investment (0/1) as the dependent variable and random intercept terms for groups and subjects.
We use a Bayesian approach and code and estimate the models in the Stan language using R Statistical Software (R Core Team, 2021; McElreath, 2020; Stan Development Team, 2021). We use independent, non-informative priors on the probability scale for all parameters, meaning that we impose no clear prior expectations and let the data inform the parameter estimates. We report posterior means and symmetric posterior 95% probability intervals. This yields “Bayesian hypothesis tests” at a two-sided significance level of 0.05 (Gelman et al., 2014). The results of these tests can be interpreted in a straightforward probabilistic manner: If the posterior 95% probability interval does not include the value 0, then, given the data and the priors, the posterior probability that the true parameter has the opposite sign from its estimate is less than 0.025. To test the hypotheses, we evaluate the posterior probability intervals of the relevant (transformations of) parameters. In our estimation, we implement 4 chains with random starting values for each model and run 4000 iterations. All models we report have converged, with R-hat values of at most 1.01 for each parameter. Data and R code are available at the Open Science Framework.
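The decision rule described above can be sketched in a few lines. This is not the authors' R or Stan code: it is a minimal Python illustration, assuming that the posterior draws for one parameter are available as a plain list of numbers; the helper names and the simulated draws are hypothetical.

```python
import random
import statistics


def percentile(sorted_draws, q):
    """Linear-interpolation percentile (q in [0, 1]) of pre-sorted draws."""
    position = q * (len(sorted_draws) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_draws) - 1)
    weight = position - lower
    return sorted_draws[lower] * (1 - weight) + sorted_draws[upper] * weight


def summarize(draws):
    """Posterior mean, equal-tailed 95% interval, and whether it excludes 0."""
    ordered = sorted(draws)
    low, high = percentile(ordered, 0.025), percentile(ordered, 0.975)
    return {
        "mean": statistics.fmean(draws),
        "interval": (low, high),
        "excludes_zero": low > 0 or high < 0,
    }


# Simulated draws standing in for real posterior samples (illustrative only).
rng = random.Random(1)
fake_draws = [rng.gauss(1.0, 0.3) for _ in range(4000)]
summary = summarize(fake_draws)
```

The same summary can be applied to a transformation of parameters, such as the difference between two coefficients, by computing the transformation draw by draw before summarizing.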
As mentioned before, we work with a secondary dataset that has not previously been subjected to analysis. As the researchers who collected the dataset had not conducted a power analysis prior to the study, we do not have quantifiable benchmarks for labeling effect sizes, and we need to be careful in interpreting observed effects and drawing conclusions from them. Bayesian analysis offers a way to paint a more nuanced picture with our hypothesis tests. By employing a Bayesian approach, we aim to stabilize and shrink estimates towards the (uninformative) prior so as to obtain more parsimonious results and a more nuanced view of the data (Baldwin and Fellingham, 2013).
SVO and investment over time
Bayesian Parameter Estimates From Multilevel Logistic Regressions; dependent variable is “invest” (0/1); independent, non-informative priors for all parameters; posterior means, with (2.5%, 97.5%)-percentiles in brackets.
Note: Multilevel logistic regression HMC fit in Stan with random intercepts for individuals and groups, with means of zero; 4000 iterations; 20 investment decisions, nested in 120 subjects, nested in 24 groups. Control variables were also included in Model 1; all their posterior 95% probability intervals cover the value 0, and they are reported in Table 9, labelled as Model 1.2a.

Regression effects of Prosocial SVO (compared to all other subjects) per period; posterior medians and 95% probability intervals.
Bayesian Parameter Estimates From Multilevel Logistic Regressions; dependent variable is “invest” (0/1); independent, non-informative priors for all parameters; posterior means, with (2.5%, 97.5%)-percentiles in brackets.
Note: Multilevel logistic regression HMC fit in Stan with random intercepts for individuals and groups, with means of zero; 4000 iterations; 20 investment decisions, nested in 120 subjects, nested in 24 groups; control variables are dummies for SVO-first, repeated-first, info, and dummies for impact per period.
Bayesian Parameter Estimates From Multilevel Logistic Regressions; dependent variable is “invest” (0/1); independent, non-informative priors for all parameters; posterior means, with (2.5%, 97.5%)-percentiles in brackets.
Note: Multilevel logistic regression HMC fit in Stan with random intercepts for individuals and groups, with means of zero; 4000 iterations; 20 investment decisions, nested in 120 subjects, nested in 24 groups; control variables are dummies for SVO-first, repeated-first, info, and dummies for impact per period.

Trends for each impact by period.
Prosocials’ investment probability and learning
To evaluate Hypothesis 3, we create four 1-period lagged variables indexing previous round history (one for each of the four possible histories in Table 1). The first round of play thus scores a 0 on all these history variables, and the estimate for this period is the intercept. We first zoom in on the responses of prosocials and non-prosocials separately.
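The construction of the lagged history variables can be sketched as follows. The mapping from (own choice, outcome) pairs to the four history labels is an assumption made for illustration; the authoritative definitions are those in Table 1. The function and variable names are likewise hypothetical.

```python
# Assumed labels for the four (own choice, outcome) combinations; the
# authoritative definitions are those in the paper's Table 1.
HISTORIES = {
    (1, 1): "reward",      # invested, good produced
    (1, 0): "sucker",      # invested, good not produced
    (0, 1): "free_rider",  # did not invest, good produced
    (0, 0): "punishment",  # did not invest, good not produced
}


def lagged_history_dummies(invested, produced):
    """Build one row of four 0/1 history dummies per round for one subject.

    `invested` and `produced` are equal-length lists of 0/1 values, ordered by
    round. Row t describes what happened in round t - 1, so the first row is
    all zeros.
    """
    if len(invested) != len(produced):
        raise ValueError("invested and produced must have the same length")
    rows = []
    for t in range(len(invested)):
        row = {name: 0 for name in HISTORIES.values()}
        if t > 0:
            row[HISTORIES[(invested[t - 1], produced[t - 1])]] = 1
        rows.append(row)
    return rows


rows = lagged_history_dummies([1, 1, 0, 0, 1, 0], [1, 0, 1, 0, 0, 1])
```

With this coding, exactly one dummy equals 1 in every round after the first, so the intercept captures first-round behavior and each history coefficient is a contrast against it.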
Figure 4 displays how the four possible histories yield response rates for the different SVO types: for prosocials on the left side and for non-prosocials on the right side. Figure 4 is based on the estimates reported in Table 10. We included the influence of control variables such as impact and order of tasks in Model 3 and report these in Table 11. Model 3 shows that there is no influence of the order of tasks (SPG, SVO) and no treatment effects (complete vs. incomplete information). We do observe differential effects for impact: Investment probability increases monotonically with impact. Furthermore, we performed additional analyses to test the robustness of investment behavior for prosocials versus other players, in which we test for the influence of period effects and interaction effects with histories of play. We found that all posterior 95% probability intervals cover the value 0, meaning that the influence of history of play is not affected by period. Thus, the effect that a certain history may have on the probability to invest does not interact with the round of play, neither for prosocials nor for non-prosocials. These tests are reported in Table 12 and (for control variables) in Table 13.
Regression effects of prosocials (left) and non-prosocials (right) per history; posterior means and 95% probability intervals.
Bayesian Parameter Estimates From Multilevel Logistic Regression; dependent variable is “invest” (0/1); independent, non-informative priors for all parameters; posterior means, with (2.5%, 97.5%)-percentiles in brackets.
Note: Multilevel logistic regression HMC fit in Stan with random intercepts for individuals and groups, with means of zero; 4000 iterations; 20 investment decisions, nested in 120 subjects, nested in 24 groups; control variables are dummies for SVO-first, repeated-first, info, and dummies for impact per period, and are reported in Table 11.
Bayesian Parameter Estimates From Multilevel Logistic Regressions; dependent variable is “invest” (0/1); independent, non-informative priors for all parameters; posterior means, with (2.5%, 97.5%)-percentiles in brackets.
Note: Multilevel logistic regression HMC fit in Stan with random intercepts for individuals and groups, with means of zero; 4000 iterations; 20 investment decisions, nested in 120 subjects, nested in 24 groups; control variables are dummies for SVO-first, repeated-first, info, and dummies for impact per period.
Bayesian Parameter Estimates From Multilevel Logistic Regressions; dependent variable is “invest” (0/1); independent, non-informative priors for all parameters; posterior means, with (2.5%, 97.5%)-percentiles in brackets.
Note: Multilevel logistic regression HMC fit in Stan with random intercepts for individuals and groups, with means of zero; 4000 iterations; 20 investment decisions, nested in 120 subjects, nested in 24 groups; control variables are dummies for SVO-first, repeated-first, info, and dummies for impact per period.
Bayesian Parameter Estimates From Multilevel Logistic Regressions; dependent variable is “invest” (0/1); independent, non-informative priors for all parameters; posterior means, with (2.5%, 97.5%)-percentiles in brackets.
Note: Multilevel logistic regression HMC fit in Stan with random intercepts for individuals and groups, with means of zero; 4000 iterations; 20 investment decisions, nested in 120 subjects, nested in 24 groups; control variables are dummies for SVO-first, repeated-first, info, and dummies for impact per period.
From Figure 4 it seems that prosocials respond differently to the histories compared to non-prosocials. Specifically, for prosocials, the punishment and reward histories seem to elicit higher investment probabilities while the free-rider history seems to depress the investment probability. The responses of non-prosocials seem to indicate that history of play hardly influences their response behavior in the next round. Note, however, that for both prosocials and non-prosocials the 95% probability intervals associated with all histories contain the value 0.
Figure 5 displays the estimated differential effects of the four possible histories for the SVO types, based on Model 3 in Table 10. It shows that the 95% probability intervals of the contrast between prosocials and others cover 0 for each history. Hence, different SVO types do not respond differently to different histories, and we are led to reject Hypothesis 3. Rather than referring to the subtle learning mechanism underlying Hypothesis 3, the data can more parsimoniously be summarized by saying that, compared to non-prosocials, prosocials start with a higher inclination to invest and quickly learn to behave like anyone else regardless of the actual history of play. Even though prosocials do indeed lower their investment probabilities after an occurrence of the free-rider (H3a) and sucker histories (H3b), the same is not true of non-prosocials (who appear wholly unaffected by the history of play). In this respect it is noteworthy that prosocials seem to respond particularly negatively following a free-rider history and particularly positively following a reward history, as reflected in Figure 5.
Regression effects of prosocials’ SVO (compared to all other subjects) per history; posterior means and 95% probability intervals.
Conclusion and discussion
In this paper, we extended research on the link between Social Value Orientations and cooperative behavior in repeated interactions. Previous research has shown that SVO is a valid predictor of cooperative behavior, specifically when individuals interact with each other in one-shot games across various experimental settings (Balliet et al., 2009; Bogaert et al., 2008; Pletzer et al., 2018). We asked whether the link between SVO and cooperative behavior is still as strong when groups interact in the same group repeatedly, in the context of a cooperation problem with a critical mass structure. Drawing on social learning theories and paying close attention to the incentive structure of the situations that groups face, we hypothesized that in “critical mass” or threshold incentive structures, SVO quickly loses predictive value in repeated interactions. That this is indeed the case in situations of payoff asymmetry is suggested by Przepiorka et al. (2021). We explored whether impact asymmetry, too, is a significant form of structural heterogeneity that influences the SVO–behavior link in repeated interactions. We theorized that when people interact repeatedly and are thus exposed to the actual behavior of their counterparts, they learn what behavior to expect and choose their own behavior based on these expectations. This was the basis of our counterfactual reinforcement-learning model. We thus took a closer look at how the mechanisms of learning and control in repeated play might override the effects of SVO in the SPG. That being said, we did not derive and test specific hypotheses about how SVO–behavior relations interact with specific forms of impact asymmetry. Based on a set of simple assumptions about the learning process, we derived three hypotheses about how the SVO–behavior link develops over time. To test our hypotheses, we employed a computerized experiment with a Step-Level Public Goods design (N = 120) that also included a measurement of SVO.
So, what has our study taught us about the link between SVO and cooperative behavior in SPG games when individuals interact in the same group repeatedly?
First, looking at the support for the first two hypotheses: We confirm that SVO is a strong predictor of cooperative behavior in social dilemmas at the onset of cooperation, but its predictive power quickly wanes as play proceeds. In particular, prosocials are found to be more likely to invest in the first encounter than non-prosocials, but the two types become statistically indistinguishable afterward. This phenomenon sets in right after period 1: After the first period, we witness a sharp drop in prosocials’ differential tendency to invest, although the point estimates suggest that they remain somewhat more prone to invest than non-prosocials in subsequent periods.
Second, and on a more abstract level, our results show how the impact of dispositional traits such as SVO can change over time as context and learning gain a more profound influence. Future research should investigate the SVO–behavior link in a variety of structural conditions to gain more insight into how (learning in different) structural conditions hamper or foster the behavioral manifestation of prosocial preferences. In particular, our study suggests (but most certainly does not prove) that impact asymmetry is a form of heterogeneity to be considered on its own. In our next study, we therefore aim to investigate the robustness of the SVO–behavior link under various degrees of impact asymmetry.
Third, our analysis of history effects suggests that prosocials (as compared to non-prosocials) respond slightly more negatively to a free-rider history and as a result start to free-ride themselves, while they seem to respond more positively after a reward history. That being said, while Hypothesis 3 predicts that prosocial players should react differently to various interaction histories, the data suggest that all players ultimately converge toward similar behavior. Thus, we found no evidence in favor of the specific learning hypotheses we derived (H3). The rejection of Hypothesis 3 indicates that the model could benefit from a more detailed analysis of how player expectations evolve over time. Overall, the theory underlying our hypotheses has found only modest support, perhaps also due to the rather small sample size.
While we do find that the effect of SVO is short-lived and limited to the beginning of the interaction, we find only very weak evidence for our claim that this pattern is attributable to differential responses of different SVO types to particular histories. An a posteriori explanation for why prosocials respond slightly more negatively to free-rider histories is that most prosocials hold relatively strong fairness principles which are violated after such histories (Ackermann and Murphy, 2019; Mischkowski et al., 2019; Nunner, 2022).
Our study has several limitations that future studies can address. To start with, the data set is relatively small, and no a priori power analysis is available to tell us how reliably our hypothesis tests can detect an effect. The literature is divided on the best course of action in such cases, but conducting a post-hoc power analysis is a controversial practice in statistics and modelling (Dziak et al., 2018), as it may not yield reliable and valid outcomes (Dziak et al., 2020; Zhang et al., 2019). To deal with the small sample size and make more cautious inferences concerning our parameters, we employ a Bayesian approach in which we use independent, non-informative priors and build baseline models that impose no clear expectations but rather inform parameters with the data at hand (Gelman et al., 2014; Gelman and Hill, 2007). For the results, we report posterior means and probability intervals for more precise estimates (Baldwin and Fellingham, 2013).
The second limitation is that our learning model only provides insights into short-term repeated decision making, and has a rather limited temporal horizon. In this experiment, groups interact only six times, and here even one round can set a precedent in repeated interactions for a repeated best-response tactic (Camerer et al., 2003; Camerer and Weber, 2012). However, in real life, many instances of collective action tend to take years of cooperation and several interactions with similar partners to be realized. Interaction partners consider lessons to learn from multiple interactions and make decisions not based on a few interactions, but based on a whole series of interactions. In such a situation, people have much better opportunities to form beliefs and expectations. Moreover, individuals’ expectations as to the investments of others are likely to impact the SVO–behavior link, opening up an avenue for learning effects. Theoretically, this creates an interesting link between learning and the so-called triangle hypothesis (Kelley and Stahelski, 1970; Van Lange, 1992) which states that individuals’ beliefs are systematically related to their SVO. Relatedly, because of the shorter time horizon in this model, it is possible that there is an underestimation of the effects of SVO. Previous research suggests that in longer-term settings or those involving reputation signals, SVO may reemerge as a significant factor (Ackermann and Murphy, 2019). Exploring these conditions could provide a more comprehensive understanding of the role of social value orientation in repeated cooperative interactions.
The third limitation of our study, which makes it difficult to isolate the explicit role of SVO over time, is that we lack measurements of players’ beliefs and expectations. Because of this, we cannot investigate the considerations underlying choices that could inform the development of theory about belief learning and the role of SVO. Relatedly, our study also did not include factors such as trust, reputation and principles of reciprocity, which are known to influence decision making in repeated play and are also closely related to SVO (Boone et al., 2010).
The fourth limitation is that we did not incentivize decisions in the SVO measure, whereas recent studies show that incentives underline the true consequences of decisions and thereby bring out social preferences more clearly. To appear more prosocial than you are, you must really show it and not just note it. The presence of consequences thus renders observations instances of “actual” behavior (Thielmann et al., 2020).
All in all, our results suggest that the SVO–behavior link might be weaker in repeated strategic settings with a critical mass structure than in one-shot settings or settings with a linear public good structure. This implies an important caveat for those who wish to use SVO to predict behavior in real-life contexts. Whenever these contexts involve repeated interaction in which individuals can learn about the behavior of specific others with whom they are strategically interdependent (“group members”, say), the effects of the history of interaction may override the effects of SVO, specifically if this strategic interdependence has a critical mass structure. Complicating matters, what counts as “repeated interaction in a fixed group of individuals” is much less clear-cut in real-life settings than in experimental ones. We speculate that the ability to learn and form beliefs about the behavior of a well-defined group of others, coupled with a significant degree of payoff interdependence with these others, are important factors determining whether individual decision makers define their situation as “repeated”. Thus, charitable giving (in which neither of these conditions is fulfilled to a meaningful degree) would hardly count as a repeated setting in this sense even if individuals repeatedly gave to the same cause. Therefore, SVO should be a good predictor of charitable giving, a fact borne out by research (Bekkers and Wiepking, 2011; Van Lange et al., 2007). Other forms of collective action, however, such as the founding of local community energy cooperatives (even if they appear rather unique on the face of it), are very likely construed by community members as part of a repeated interaction structure (Goedkoop and Devine-Wright, 2016; Goedkoop et al., 2022; Nientimp et al., 2024). In such a situation, we speculate that SVO would have little value for predicting individuals’ contribution levels. Of course, these are mere extrapolations based on a single experiment.
Future studies could improve on our study by elaborating and theorizing specific features of impact asymmetry directly in relation to repeated play and the SVO–behavior link. These studies could dive deeper into what drives or depresses the expression of prosocial behavior under various structural conditions by looking explicitly at belief learning. Moreover, additional motivational aspects such as inequality aversion and joint gain maximization could be included by adopting the SVO slider measure (Murphy and Ackermann, 2014). Relatedly, mechanisms inherent to repeated interactions, such as trust, principles of reciprocity and reputation, could be taken into account to create a more complete picture. Future research should identify the social conditions under which individuals define their cooperation problems as being embedded in a repeated interaction within the same group, and should investigate the extent to which this (perceived or actual) repetition and the perceived interdependence structure affect the SVO–behavior link.
Footnotes
Acknowledgements
We thank Dieko Bakker, Carlos de Matos Fernandes and NNC members for feedback and insightful comments during the development of this manuscript. We thank Andreas Flache, Editor-in-Chief of R&S and four anonymous reviewers for helpful comments and in particular reviewer 3 for their spot-on suggested formulations that we have adopted in our discussion.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The Department of Sociology of Groningen University funded the experiment.
Informed consent
Not applicable; however, participants were informed about the procedures and content of the study with both written instructions and verbal debriefing before conducting the study.
