Abstract
Studies examining the effects of gender on honesty, deceptive behavior, pro-sociality, and risk aversion often find significant differences between men and women. The present study contributes to the debate by exploiting one of the largest tax compliance experiments to date, conducted in a highly controlled environment in the United States, the United Kingdom, Sweden, and Italy. Our expectation was that the differences between men’s and women’s behavior would correlate broadly with the degree of gender equality in each country. Where social, political and cultural gender equality is greater, we expected behavioral differences between men and women to be smaller. In contrast, our evidence reveals that women are significantly more compliant than men in all countries. Furthermore, these patterns are quite consistent across countries in our study. In other words, the difference between men’s and women’s behavior is not significantly different in more gender-neutral countries than in more traditional societies.
Introduction
Gender equality is a top priority throughout the world and is often considered a basic feature of human rights. Gender inequality contributes to a large number of social ills, and the evidence is clear that improving gender equality improves standard of living, life expectancy, quality of life, amongst many other benefits. Improving the quality of women’s lives and improving the opportunities available to them improves everyone’s life.
Progress towards gender equality has certainly been achieved in many societies. But differences persist. Even in Scandinavian countries, where many of the structural and legal barriers against women have been reduced, we still find differences in pay, representation in private business boards, as well as the distribution of household and childcare work/effort. 1
Institutionalists suggest that citizens in different societies should have different preferences based on their experiences in that society. Institutions and history matter, in this sense, because they can shape preferences over time (Steinmo and Thelen, 1992). We have attempted to disentangle the effects of institutions from other structural and social variables by running a simple tax compliance experiment in a set of countries: Sweden, UK, Italy and the US. In this paper we report the results from what we believe to be the largest cross-national tax compliance experiment conducted anywhere in the world.
We believe that this design allows us to explore the issue of gender behavior and gender differences in a way that is relatively unique in both the experimental literature and the field of gender studies. Specifically, we are able to examine whether the differences between the behavior of men and women correspond to the extent of the structural barriers or limitations that women face in a given country. It is reasonable to assume that men’s and women’s behavior converges (i.e. men and women will behave in a similar way) in contexts where they are in fact treated more equally, compared with countries in which women still face large and significant discrimination. Sweden and Italy, for example, rank 4 and 16 on the gender inequality index, respectively, while the UK and US rank 28 and 43 respectively (Human Development Report, 2016).
Precisely because this experiment was conducted in four countries with over 1500 participants, we are able to look at how particular incentives affect individual decisions across and within individual countries. Our study examines how individuals respond to different types of redistribution, different tax rates, and different tax structures. We are thus able to compare how people in different countries, and of different genders, respond to these different incentives and/or disincentives (i.e. variations in the level of redistribution, tax rates or structure).
Our study was not initially, nor primarily, intended to study gender differences in tax compliance. Instead, this study was designed to explore how people in different societies would behave under similar conditions (aka institutions). We wanted to know, quite simply, if people in different societies were faced with exactly the same incentives and choices with respect to taxation, would they make different decisions? In this paper, however, we report the fact that men and women significantly differ in their willingness to comply with their taxes across countries and conditions. These differences are remarkably large and are consistent across a wide variety of institutional choices. Simply put, women appear to be much more tax compliant than men in every country and under every condition.
Literature review
There is an extensive body of literature addressing attitudinal and political differences between men and women. Virtually all of this work confirms that men and women have different policy attitudes and preferences. Women tend to be more left-leaning and more likely to support state intervention through the expansion of the welfare state (Inglehart and Norris, 2003; Morgan-Collins, 2013). They are also less likely to condone corruption (Barnes and Beaulieu, 2014; Dollar et al., 2001; Torgler and Valev, 2010), but less politically engaged (Atkeson, 2003; Burns et al., 2001; Lawless and Fox, 2010; Verba et al., 1997).
It is generally assumed that differences in attitudes will translate into differences in behavior. We know, for example, that women in the US are more likely to vote than men (Centre for American Women and Politics (CAWP), 2015). Women in advanced industrial democracies also increasingly tend to vote further to the Left than men (Inglehart and Norris, 2000). But, since behavior is structured by the social, political, and economic environment, it can be difficult to determine the extent to which individual behaviors are the product of individual preferences or of socially constrained (and constructed) choices. In recent years, experimental research has blossomed, partly in order to address this issue of identifying the main anchoring point of manifested behaviors.
The evidence regarding gender differences, however, is somewhat contradictory. Some studies have shown that men and women behave differently even when facing abstract choices. For example, women tend to be less competitive and less certain of the quality of their performance (Preece and Stoddard, 2015). Eckel and Grossman (2001) have demonstrated in a variety of experiments that women are more altruistic, but others have shown men to be more willing to contribute to the public good (1998, 2001, 2008; see also Bruner et al., 2017; Brown-Kruse and Hummels, 1993; Sell and Wilson, 1991; Solow and Kirkwood, 2002).
Men and women also appear to have different attitudes and behavior when it comes to taxation specifically. Surveys have shown that in contrast to men, women tend to think that the tax code is fairer, the likelihood of getting caught for evasion is greater, and they overestimate the penalties for evasion (Kinsey, 1992; Smith and Stalans, 1991). In terms of behavior, a number of tax compliance experiments have also shown women to be more compliant than men (Cadsby et al., 2004; Cadsby et al., 2006; Chung and Trivedi, 2003; Gërxhani, 2007; Hasseldine and Hite, 2002; Kastlunger et al., 2010; Lohse and Qari, 2014; Powell and Ansic, 1996; Spicer and Hero, 1985). However, most of these studies treat gender as a residual predictor.
We know that behavior is sensitive to context (Chermak and Krause, 2002; Seguino et al., 1996). As Seguino et al. pointed out, “our environment helps shape how we act and how we see others.” They therefore “suggest that social structures that shape our preferences may differ along gender lines” (pp. 14–15). Even in the lab, behavior is shaped by broader social norms. The inconsistent role gender plays in many of the existing experiments might be a product of differences in the experiments themselves, or a product of the differences in the social/political context in which the experiments were conducted. Our study attempts to address this issue of contextual or treatment variations that might confound gender differences. To that end, we examine the gender differences manifested by subjects in the same fiscal compliance experiment across different countries. We can therefore further the current research by controlling for the potential effects that broader social/political contexts might have on behavior.
Experimental overview
Our experiments were conducted at universities during the 2013/2014 and 2014/2015 academic years. 2 We took great care to assure that all subject pools were demographically very similar and that the selection mechanisms were nearly identical and unbiased (see Tables A1, A2, A3, and A4 in the online appendix). Each university uses an electronic database to which students, or past students, voluntarily submit their information for participation in experiments. The participants were then randomly selected and invited by email to participate in the experiment (for more details on the Online Recruitment System for Economic Experiments (ORSEE), see Greiner, 2004). Once the participants arrived at the laboratory they were given an anonymized identification number and assigned to a partitioned computer to limit the interaction between themselves and other participants. 3 We linked participant pay to ID number thus ensuring complete anonymity.
The experiment consisted of three stages with three income reporting rounds within each stage. 4 In total, subjects reported their income nine times for tax purposes. Subjects were given a different experimental parameter with each income reporting decision. In the first stage we varied the size of the general fund; in stage two we altered the tax rates; and in stage three we changed the tax structure. At the beginning of each stage, subjects were asked to perform a simple clerical task for which they received experimental currency units (ECU) that would be converted into real money at the end of the experiment. Subjects were paid 10 ECU for each correctly copied line of text, and their earnings were then exchanged for domestic currency at the rate of 0.01 per ECU.
At the beginning of each income reporting round, subjects were given specific examples similar to the decisions that they would make in that particular round (see Table 1). Specifically, in stage one, the tax rate was 30% of reported income, the audit probability was 5%, and we varied the amount of redistribution. In round one (Round 1: no redistribution) there was no general fund, and thus no redistribution. In round two (Round 2: redistribution) the tax revenue was placed in a general fund and redistributed equally to all participants. For round three (Round 3: redistribution x 2) we doubled the general fund and divided it equally amongst all subjects. In stage two, the redistribution of the general fund remained constant, there was still a 5% chance of being audited, but we varied the tax rates in each round. In round four (Round 4: 10% tax rate) there was a 10% flat tax; for round five (Round 5: 30% tax rate) there was a 30% flat tax; and in round six (Round 6: 50% tax rate) there was a 50% flat tax. Lastly, in stage three the redistribution and audit rate remained constant, while we adjusted the tax structure. In round seven (Round 7: progressive 1) the top 10% of declared incomes paid a 50% tax rate; the bottom 10% of declared incomes paid a 10% tax rate; and everyone else paid a 30% rate. In round eight (Round 8: progressive 2), all income over 100 ECU was taxed at a 50% rate; income between 50 and 100 ECU was taxed at a 30% rate; and all income below 50 ECU was taxed at a 10% rate. 5
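As an illustration only, the tax schedules described above can be sketched in Python. The function names are hypothetical, and the round 8 schedule is read here as a set of marginal brackets, which is an assumption: the text does not say whether the rates apply to the whole income or only to the slice within each band. Audit penalties are left out because they are not specified in this section.

```python
def flat_tax(reported_income, rate):
    """Tax owed under a flat rate (the rounds above use 10%, 30%, or 50%)."""
    return reported_income * rate


def bracket_tax(reported_income):
    """Tax owed under the round 8 schedule, read as marginal brackets.

    Assumption: each rate applies only to the slice of reported income
    that falls inside its band (0-50, 50-100, above 100 ECU).
    """
    brackets = [(50, 0.10), (100, 0.30), (float("inf"), 0.50)]
    tax, lower = 0.0, 0.0
    for upper, rate in brackets:
        if reported_income > lower:
            tax += (min(reported_income, upper) - lower) * rate
        lower = upper
    return tax


def equal_share(total_revenue, n_subjects, multiplier=1.0):
    """Per-subject payout from the general fund (multiplier=2 doubles the fund)."""
    return total_revenue * multiplier / n_subjects
```

Under this reading, a subject reporting 120 ECU in round 8 would owe 5 + 15 + 10 = 30 ECU, whereas the same report under the 30% flat tax would owe 36 ECU.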
Summary of tax reporting rounds.
ECU: experimental currency unit.
At the end of the tax compliance experiment, participants were asked to participate in a simple iterated dictator game designed by Ryan Murphy and Kurt Ackermann (2011) to assess one’s level of prosociality. Once all experiments were complete, we asked the subjects to complete a fifteen-minute survey regarding certain demographic and attitudinal characteristics.
Altogether there were a total of 1564 subjects: 311 (Italy), 360 (UK), 566 (US), and 327 (Sweden). In our pool 50% were female, 38% were employed, and 21% were economics majors. The vast majority, 72%, of our subjects had participated in experiments before (see Table 2).
Summary statistics.
SVO: social value orientation.
Methods and results
Given the literature and common expectations about the differences observed between men’s and women’s behavior, we draw the following hypotheses:
H1: Tax compliance among women will be higher than men across rounds.
H2: The differences observed between men’s and women’s responses should be smaller in countries that have achieved greater levels of legal and social equality.
First, we are interested in whether females comply more than males across countries and rounds. Figure 1 provides prima facie evidence that women are more compliant than men across countries. From Figure 2, we can observe that there are large gender differences across treatments. Although it is not the central component of this study, it should be noted that the gender gap does vary between decisions. The gender gap decreases, for example, when we increase the return on the public good. Similarly, the gender gap increases when we increase tax rates. In Italy and Sweden, these differences are mainly driven by changes in men’s behavior. More specifically, men are more responsive to the incentives in each decision in Sweden and Italy than in the US. Moreover, whereas women tend to respond less to the experimental treatment in Sweden and Italy, in the UK and US women are only slightly less responsive to the treatment. For a more thorough analysis of the effects of specific treatments on gender in each country see Bruner et al. (2017).

Average compliance rate by gender.

Average compliance rate by gender.
We now examine how gender differences affect tax compliance within and across countries. We treat tax reporting as a single variable with distinctive values for subject and experimental period. We perform a series of Ordinary Least Squares (OLS) analyses represented by the following equation:
OLS Regression
where:
In Table 3 we present the results of our OLS analyses. Economics is a dummy variable for economics majors, and risk is an individual risk assessment measure. Past-participation is a dummy variable for whether participants have participated in experiments in the past, and employed is a variable for whether the subjects are employed. Each of these variables was taken from our attitudinal survey, completed at the end of the experiment. Income (standardized) is participants’ income in the first eight rounds. Pro-redistribution, duty to pay, and trust are factors produced from an orthogonally rotated principal components factor analysis, also from the attitudinal survey (for more details about the factor analysis, see Pampel et al., 2017). Pro-redistribution is a factor with self-placement on a left-right scale as the key component. Duty represents participants’ sense of responsibility to the state, such as the extent to which cheating on one’s taxes, cheating on government benefits, and not paying taxes for a variety of reasons are justifiable. Trust characterizes participants’ level of confidence in government. SVO angle is measured from an iterated dictator game in which one person is asked to allocate their endowment to an unknown partner in the room as a test of altruism (for measurement of the SVO see Murphy and Ackermann, 2011). Native-born mother and father are variables which control for the birthplace of the subjects’ parents. Finally, we control for each individual reporting round.
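The regression equation itself did not survive in this copy of the text. As a sketch under stated assumptions, a specification consistent with the variable descriptions above could take the following form. The notation is ours, not the original's: $i$ indexes subjects, $r$ indexes reporting rounds, $\mathbf{x}_i$ collects the survey-based controls listed above, $\delta_r$ is a round effect with the first round as reference, and the compliance rate is assumed to be the share of earned income that is reported. The second line adds country dummies and female-by-country interactions with the US as baseline, as described for the interaction model.

```latex
\begin{align}
\text{Compliance}_{ir} &= \beta_0 + \beta_1\,\text{Female}_i
  + \mathbf{x}_i^{\top}\boldsymbol{\gamma} + \delta_r + \varepsilon_{ir} \\
\text{Compliance}_{ir} &= \beta_0 + \beta_1\,\text{Female}_i
  + \sum_{c \neq \mathrm{US}} \theta_c\,\text{Country}_{ic}
  + \sum_{c \neq \mathrm{US}} \phi_c\,(\text{Female}_i \times \text{Country}_{ic})
  + \mathbf{x}_i^{\top}\boldsymbol{\gamma} + \delta_r + \varepsilon_{ir}
\end{align}
```

In this notation $\beta_1$ is the gender gap in the baseline country, and each $\phi_c$ is the amount by which the gap in country $c$ differs from the gap in the US.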
Ordinary least squares regression for average compliance rate.
SVO: social value orientation.
Robust standard errors in parentheses.
***p<0.01, **p<0.05, *p<0.1.
In column 1 of Table 3, we estimate the effect of gender on tax compliance in our pooled-country dataset with a host of control variables. We find that being female is statistically significant and positively correlated with tax compliance, with a large effect. Economics majors, past-participation, and risk are all negatively correlated with tax compliance, whereas pro-redistribution, duty to pay, and SVO are all positive. In columns 2 to 5, we estimate the effect of gender on tax compliance in each individual country. The effects of our control variables do depend slightly on the country context, but the effect of being female is robust in each individual country. We find that the effect of being female on tax compliance ranges from an 11% increase in the US to a 20% increase in Sweden, all else being equal.
Amongst our subjects we find moderate support for the idea that women are slightly more risk averse than men, but being female remains highly correlated with tax compliance even when controlling for risk acceptance, meaning that risk acceptance is not the variable driving these gender differences. The effect of gender softens somewhat but remains large. This result is especially important because it establishes that women are profoundly more compliant even when their degree of risk acceptance is held equal to that of their male counterparts. Confirming previous literature, risk acceptance is also negatively correlated with tax compliance. Female remains highly significant and positive, holding all else constant. In column 6, we are mainly interested in the interaction terms, which let us examine the gender tax gap between countries, with the US as our baseline case. Here the results are unexpected: the gender tax gap in the US is significantly smaller than in Sweden at the .001 level. The gender tax gap is also significantly smaller in the US than in Italy and the UK, although the significance is weak.
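To make the reading of the interaction terms concrete, the following minimal sketch computes the difference in gender gaps between a country and the US baseline from cell means. In a regression that contains only the female dummy, country dummies, and their interactions, the interaction coefficient equals this difference; with additional controls, as in the models reported here, the two will generally differ. The function names and the toy values are hypothetical and are not data from the experiment.

```python
from statistics import mean


def gender_gap(rows, country):
    """Mean compliance of women minus mean compliance of men in one country."""
    women = [r["compliance"] for r in rows
             if r["country"] == country and r["female"]]
    men = [r["compliance"] for r in rows
           if r["country"] == country and not r["female"]]
    return mean(women) - mean(men)


def gap_difference(rows, country, baseline="US"):
    """Gender gap in `country` minus the gender gap in the baseline country."""
    return gender_gap(rows, country) - gender_gap(rows, baseline)


# Made-up illustrative values, NOT results from the study.
toy_rows = [
    {"country": "US", "female": True, "compliance": 0.6},
    {"country": "US", "female": False, "compliance": 0.5},
    {"country": "Sweden", "female": True, "compliance": 0.7},
    {"country": "Sweden", "female": False, "compliance": 0.5},
]

if __name__ == "__main__":
    print(gap_difference(toy_rows, "Sweden"))
```

A positive value means the gap between women and men is wider in the comparison country than in the baseline.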
Finally, in column 7, we examine our full model. Economics majors, income, willingness to accept risk, and past-participation are all statistically significant. Having participated in experiments in the past has a large effect, decreasing tax compliance by approximately 8% when holding all other variables at their mean. Moreover, our attitudinal variables such as support for the welfare state and duty to pay taxes are also statistically significant. Trust in authority is significant, but not in the expected direction. This could largely be due to the relatively small variation across countries in tax compliance, combined with significant variation in trust in government. Swedes, for example, demonstrate high trust, whereas Italians demonstrate low trust. Furthermore, having native-born parents has no effect in our model. 6
Against our expectations, we discover that Sweden, a country which has achieved one of the highest levels of gender equality in the world, demonstrates the largest tax compliance gap, and that gap is statistically greater than that in the US, the country with the highest level of gender inequality in our sample, according to the gender inequality index. 7 In fact, Sweden is the only country with a gender gap that is significantly larger than the gap in the US. Figure 3 displays the predicted probabilities from column 7 of Table 3 for the compliance rate by gender in each country. What stands out from the figure is the fact that the gender gap is quite large in all countries, but especially large in Sweden, the UK, and Italy.

Predicted probabilities for the compliance rate.
Robustness checks
In this section we examine the robustness of our results. Specifically, we are concerned with whether our results are robust to treatment order. We ran an additional six sessions in Italy varying the treatments. First, we ran a series of difference in means t-tests in each round to test if there are gender differences in compliance in each round for treatment order B (see Table 4). The difference in means t-test suggests that there are significant differences between men and women in each round, with the exception of the 10% tax rate round.
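For readers unfamiliar with the test, a difference in means t-test of this kind can be sketched with the Python standard library. This sketch uses Welch's unequal-variance form, which is an assumption on our part: the text does not state which variant was used. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom).

    Welch's t-test for a difference in means, which does not assume
    equal variances in the two groups.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df
```

Applied round by round, `sample_a` would hold women's compliance rates and `sample_b` men's; the resulting statistic is then compared with the t distribution at the returned degrees of freedom.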
Difference in means t-test for gender by round in treatment order B.
Notes: The t-test for equality of means is reported in the third, fifth, and seventh columns, with statistical significance indicated by asterisks: *** indicates the difference is significant at the 1% level.
Finally, we estimated an OLS with clustered standard errors and dummy variables for rounds 2 through 8 (see Table 5). The coefficients for the dummy variables show the increase or decrease in compliance relative to the first round, which serves as the reference. Indeed, we still uncover a large gender gap on average in treatment order B.
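The interpretation of the round dummies can be illustrated with a deliberately simplified sketch. When a regression contains only an intercept and dummies for every round except the reference, each OLS coefficient equals that round's mean compliance minus the reference round's mean. The model in Table 5 also includes other regressors and clustered standard errors, so this covers only the logic of the point estimates, not the reported model. The function name is hypothetical.

```python
from statistics import mean


def round_dummy_coefficients(observations, reference_round=1):
    """Map each non-reference round to its mean minus the reference mean.

    `observations` is an iterable of (round_number, compliance) pairs.
    """
    by_round = {}
    for round_number, compliance in observations:
        by_round.setdefault(round_number, []).append(compliance)
    reference_mean = mean(by_round[reference_round])
    return {
        r: mean(values) - reference_mean
        for r, values in sorted(by_round.items())
        if r != reference_round
    }
```

A negative entry indicates lower average compliance in that round than in the first round, and a positive entry indicates higher compliance.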
Regression analysis for compliance in treatment B.
Robust standard errors in parentheses.
***p<0.01, **p<0.05, *p<0.1.
Discussion and conclusion
The first and obvious result drawn from this experiment is that women behave differently from men in all conditions and in all countries studied here. Interestingly, the “gender gap” differs greatly between the US and Sweden, and in a direction that was completely unexpected when embarking on this research. Whereas we expected the gender gap to be substantially smaller in more gender egalitarian countries, we find instead that gender is a powerful and robust variable across societies.
Although gender differences in tax compliance have been reported in previous articles (see more recently in Brockman et al., 2016), these works generally treat gender as a residual predictor. The present study demonstrates clearly and systematically why this is a major omission: gender is strongly associated with an individual’s tax behavior, when controlling for institutional, cultural, and social factors. Our experiments show substantial gender variation in tax compliance across the four countries, given varying levels of social and legal gender equality.
Indeed, the original assumption that gender differences should be substantially minimized in more gender-neutral societies assumed a temporal connection between norms, institutions and behavior that may be unrealistic. Norms are sticky. This, we believe, can help explain why what Preece calls “gendered psyches” continue to have strong effects on behavior that may not correspond immediately to recently effected institutional changes. As Inglehart and Norris (2003: 79) noted, “structural developments lead to, and interact with, cultural shifts that tend to reshape political values.” Thus, we should not be so surprised to see that behavioral change lags behind institutional change.
Our findings thus call into question the liberal assumption that, given equal conditions, individuals behave in the same ways. Instead, our research demonstrates that even where structural reforms have actively targeted and effectively reduced legal and formal gender gaps, the behavioral differences between males and females persist. We find only weak evidence of intrinsic behavioral differences, such as attitudes towards risk, and as such our study strengthens the argument on variation in gendered perceptions (Fox and Lawless, 2011; Lawless and Fox, 2010).
In sum, our study uncovers significant gender variation in tax compliance across tax conditions and countries. Furthermore, this study makes the claim that despite greater social and legal gender equality, large behavioral differences between genders still persist, even in Sweden. The results do leave room for alternative interpretations and require further study. We welcome and encourage scholars to utilize our data in combination with other studies to further our understanding of these behavioral differences.
Footnotes
Acknowledgements
We would like to thank David Bruner, Nan Zhang, Georgina Waylen, Fred Pampel, and Young-Im Lee for their helpful comments. We also thank the two anonymous referees and the editors at Research and Politics for the valuable feedback.
Declaration of conflicting interest
The authors declare that there is no conflict of interest.
Funding
This work was supported by the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013) (grant number 295675).
Notes
Carnegie Corporation of New York Grant
This publication was made possible (in part) by a grant from Carnegie Corporation of New York. The statements made and views expressed are solely the responsibility of the author.
References