Abstract
This article seeks to make two contributions to the understanding of social insurance, a central policy tool of the modern welfare state. Focusing on Britain, it locates an important strand of theoretical support for early social insurance programs in antecedent developments in mathematical probability and statistics. It argues that these philosophical developments, while by no means the only source of support for social insurance, were among the preconditions for the emergence of welfare policies. In addition, understanding the influence of these developments on British public discourse and policy sheds light on the normative principles that have undergirded the welfare state since its inception. Specifically, it suggests that the best model, or normative reconstruction, of social insurance in this context is a value-pluralist one, which pursues efficiency and equality or solidarity, grounded in group-based perceptions of risk.
Social insurance—the provision of event-conditioned benefits through a publicly operated system of contributions and distribution—has long been a central policy tool of the modern welfare state. Scholars have advanced a number of normative explanations for the practice, including its ability to promote economic efficiency (Barr 1989; Heath 2011), its expression of relational or distributive equality (Anderson 2008; Dworkin 2000; Landes and Néron 2015), and its role in cultivating social solidarity (Lehtonen and Liukko 2015; Liukko 2010). While these aims are not mutually exclusive, prioritizing one or another can lead to different policy outcomes. For instance, a social insurance system that aims to promote efficiency by satisfying individual preferences for security will not necessarily lead to egalitarian results (Heath 2006, 346–48), while a system that aims for solidarity by providing uniform benefits to all may not satisfy the demands of wealthier citizens to insure themselves at desired levels of consumption (Ebbinghaus and Gronwald 2011; Korpi and Palme 1998). As Joseph Heath puts it, “even among the most enthusiastic supporters of the welfare state there are several different theoretical reconstructions of the normative commitments that are taken to underlie it, all of which are in tension with one another” (Heath 2011, 14).
Acknowledging the force of this observation, this article seeks to make two contributions to the understanding of social insurance. First, it argues that the best normative account of this practice is a value-pluralist one, which pursues efficiency as well as equality or solidarity, grounded in group-based perceptions of risk. Focusing on the emergence of social insurance in Britain, it argues that a plurality of principles found expression in public debates and policies at this time. Such a pluralist reconstruction better characterizes the emergence of the practice and its purposes than a model that focuses solely on any one principle.
Second, the article locates an important but little-remarked strand of theoretical support for early social insurance programs in antecedent developments in mathematical probability and statistics. This strand of thinking originated with frequentism, an interpretation of probability that emerged in the mid-nineteenth century and enjoyed particular prominence in Britain. Frequentism helped to justify a collective response to uncertainty, and was moreover linked with both utilitarianism and developments in statistical thinking. While by no means the only source of support for social insurance, these philosophical influences were among the preconditions for its emergence. Moreover, understanding the character of these influences sheds light on the normative principles that have undergirded key elements of the welfare state since its inception. While others have noted the plurality of values or aims served by welfare state institutions (see Goodin et al. 1999), this paper identifies an additional source of support for such pluralism in accounts of the interpretation and quantification of risk itself.
Focusing on the emergence of social insurance in Britain has two advantages for the project of normative reconstruction. First, it offers a valuable test case for a public goods or efficiency model of the welfare state, which contends that the primary purpose of a major set of welfare policies is to provide goods that citizens demand but that are undersupplied or imperfectly supplied by private markets (Heath 2011). Since today the United Kingdom is generally regarded as a liberal welfare regime, in which the market plays a dominant role and the decommodifying effects of social transfers are limited (Esping-Andersen 1990), it should be an exemplar of the market-failures rationale for government intervention. If, however, it turns out that the emergence of welfare state institutions there cannot be explained exclusively or even principally in terms of efficiency, this could provide evidence against a strict public goods view. 1 Second, if we are able to identify a plurality of normative principles in the practice of social insurance itself, this makes it more likely that normative pluralism characterizes the welfare state as a whole, given that the latter comprises a range of other programs whose aim is more explicitly egalitarian than social insurance.
The argument proceeds as follows: the first section offers an introduction to three seminal social insurance policies enacted in Britain around the turn of the twentieth century, as well as the variety of arguments invoked to support them. The three ensuing sections examine the history of probability theory, introducing frequentism and its implications for accounts of insurance, as well as echoes of its class-based approach to risk management in subsequent statistical thought. 2 The final section returns to the realm of practice, focusing on reverberations of these developments in British economic discourse and public policy. 3 This intellectual history reveals that even risk-pooling justifications for welfare reflect a plurality of normative aims. The conclusion draws lessons from this historical analysis for contemporary thinking about the welfare state, suggesting that a market-failures reconstruction may not fully account for the claims of equality and solidarity that have long been prominent in defenses of social insurance.
The Emergence of Social Insurance
Before turning to the development of social insurance, it is important to explain why examining the history of welfare policies is crucial to the contemporary project of normative reconstruction. As Heath puts it, the task of normative reconstruction is to offer an “account of the normative purposes that are already implicit in the practices of the welfare state.” Such an account should be informed by what the state currently does as well as why it does those things. Following Jürgen Habermas, Heath evaluates each of the contending models using a standard he calls “expressive adequacy,” which comprises three prongs: first, whether major welfare state activities can be described as serving a particular normative purpose; second, whether that normative purpose played a role in the emergence of the relevant policies or institutions; and third, whether the model enhances our “normative self-understanding,” thereby enabling us to better achieve our goals (Heath 2011, 14, 28). This standard makes clear that understanding the emergence of various institutional forms, and in particular their guiding principles and aims, is central to the effort to normatively reconstruct them. We therefore begin with a brief survey of some of the purposes that drove early British social insurance programs before turning to the philosophical and technical developments that helped to support them.
The idea of social insurance did not originate in the nineteenth century. A number of prominent proposals date to the time of the French Revolution (Condorcet [1795] 2012; Jones 2005, 34–36), and the idea of mutual provision against contingency goes back even further (Cordery 2003, 13–21; Ismay 2018, 23–46). In Britain, guilds had long been a source of aid for working people, to be succeeded by cooperative associations, trade unions, and friendly societies, which typically provided benefits in the event of poor health or death (Cordery 2003; Ismay 2018).
The earliest plans for social insurance proposed to extend the logic of mutual provision to larger groups or to society as a whole, employing the newly developed calculus of probabilities (Jones 2005, 17–35). 4 These proposals grew out of concern for the implications of new economic realities, in particular, their influence on the working poor, reliant on wage labor and consequently vulnerable to any number of disruptions. Probabilistic mutual insurance, with its premiums based on mathematical likelihoods, promised to reduce this vulnerability by pooling risks: provided enough similarly situated individuals join together, creating a large common fund from their many small contributions, they may equitably share the burdens of a misfortune that happens to strike any one of them (Laplace [1825] 1994, 89). 5
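The risk-pooling logic described above can be illustrated with a short numerical sketch. It is an idealization rather than a reconstruction of any historical scheme: the function name, the loss probability, and the loss size are hypothetical, and the members' risks are assumed to be independent and identical.

```python
import math

def pooled_share_stats(n_members: int, loss_prob: float, loss_size: float):
    """Mean and standard deviation of each member's share of total losses.

    Assumes independent, identically distributed risks (an idealization).
    """
    mean_share = loss_prob * loss_size
    # The spread of the per-member share shrinks with the square root of n.
    sd_share = loss_size * math.sqrt(loss_prob * (1.0 - loss_prob) / n_members)
    return mean_share, sd_share

for n in (1, 100, 10_000):
    mean, sd = pooled_share_stats(n, loss_prob=0.05, loss_size=100.0)
    print(f"{n:>6} members: expected share {mean:.2f}, spread {sd:.2f}")
```

Under these assumptions the expected share is the same for every pool size, while its spread falls tenfold with each hundredfold increase in membership, which is one way of stating why many small contributions can cover a misfortune that strikes any one member.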
The enactment of full-fledged social insurance policies would have to wait until the 1880s, however, when several European countries instituted national-level schemes and inaugurated a trend that would spread across industrialized states (Baldwin 1990, 55–106). As François Ewald has shown in the context of the French welfare state, many of these developments drew support from probabilistic and statistical thinking of the time (Ewald 1986). 6 As governments increasingly collected information about a variety of economic and social phenomena, and as mathematical developments allowed for sophisticated uses of those data, the notion that the state could insure its citizens against various forms of misfortune became increasingly prominent and plausible.
In Britain, these trends helped set the stage for three seminal pieces of legislation: the Workmen’s Compensation Act of 1897, the Old Age Pensions Act of 1908, and the National Insurance Act of 1911, which provided a significant expansion of health insurance coverage and (as will be the focus here) a more limited unemployment insurance scheme. Many factors precipitated the enactment of these laws, including greater awareness among the middle classes of the reality of working poverty; changing elite views about the role of the state in addressing insecurity; and the growing role of labor and working-class organizations, to which elites had new incentives to respond (Hay 1975, 25–29; Orloff 1993, 153–171). The philosophy of liberalism had also evolved, moving away from the ideal of self-help and the harsh deterrence of the poor law system toward a new fusion of individualist and collectivist values known as the New Liberalism (Freeden 1986; Vincent and Plant 1984; Weinstein 2007).
What is noteworthy for our purposes is that all three of the schemes in question invoked the logic of insurance in some respect. They also garnered support from both economic and egalitarian arguments. Workers’ compensation was at various points framed both as a form of recompense for service to national productivity, particularly in the face of foreign competition, and as a tool for a kind of redistribution, shifting “a fair portion of the loss sustained” from the injured to his employer, as one government report put it (Great Britain—Home Office 1904, 13; see also Bartrip and Burman 1983; Moses 2018, 129). Pensions were justified as promoting workers’ personal well-being, because the young may not save for old age, and as a means to protect the especially vulnerable from devastation (Churchill [1909] 1973b, 311–14; Harris 2004, 157; Macnicol 1998, 162–63). Finally, unemployment coverage was framed both as a means to help individual workers cope with unpredictable risks and as an expression of solidarity, a recognition that any individual’s own welfare cannot be promoted without regard for the larger collective of which he is a part.
The following three sections aim to show that thinking about probability and statistics in Britain around this time lent support to both sets of normative arguments. On one hand, such thinking justified social insurance as a means to reduce economic uncertainty for individuals, ensuring that they could sustain themselves in times both good and bad. On the other hand, it affirmed the distributive fairness of insurance and the importance of mutual support among groups of risk-prone workers. While developments in the interpretation and calculation of risk were not the only source of support for such arguments, they were one such source, justifying social insurance both as a reflection of personal prudence and as a means to fairly distribute the burdens of industry within and among various groups.
It is important to note that the argument presented here does not purport to satisfy a test of direct influence between developments in probability theory and social policy outcomes. One such test, proposed by Quentin Skinner, would in this case require showing that those responsible for policy decisions had studied frequentist views, could not have found the relevant doctrines anywhere but in frequentism, and could not have arrived at those doctrines independently (Skinner 2002, 75–76; see also Toye 2010, 162–63). It is true that many prominent economists and political actors involved in formulating welfare policy in Britain at this time were familiar with frequentist views or with the statistical advances made in their wake. Nevertheless, the more modest claim here is that one finds echoes of frequentism’s flexible class-based approach to risk management in the discourse surrounding social insurance, particularly among policymakers and politicians. Developments in probability and statistics ought therefore to be considered among the sources of support for welfare policy at this time.
The Collective View of Risk
Frequentism offered a novel interpretation of the relationship between personal judgments about risk and the probability values on which insurance arrangements are based. Prior to this time, thinkers had noted the possibility that personal or subjective probability estimates may not align with the averages calculated for groups. Many had set out to address this problem with the concept of moral expectation, influentially proposed in 1738 by Swiss mathematician Daniel Bernoulli. Bernoulli had argued that an individual with fewer initial assets will be warier of a loss than someone who starts out with a greater fortune, and that for any individual, the pain of losing money will exceed the pleasure of winning it (Bernoulli [1738] 1954, 24–25). In the context of insurance, moral expectation was thought to measure the utility that the individual derives from being insured, including whatever exceeds the strict monetary value of the contract itself (Laplace [1825] 1994, 88). This concept thus allowed thinkers to reconcile the inherently aggregative character of group insurance with the need to justify it to each individual participant: since the insured derives personal utility from the contract, it is not necessarily unfair for her to contribute more for her coverage than the strict cost of her own risk.
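A brief worked example may clarify how moral expectation can make it reasonable to pay more than the strict cost of one's own risk. The sketch below assumes the logarithmic utility of wealth commonly attributed to Bernoulli; the function name and all figures are hypothetical and chosen only for illustration.

```python
import math

def max_acceptable_premium(wealth: float, loss: float, loss_prob: float) -> float:
    """Largest premium a log-utility agent would pay for full coverage.

    Assumption: utility is the natural log of wealth, and loss < wealth.
    """
    expected_log = ((1.0 - loss_prob) * math.log(wealth)
                    + loss_prob * math.log(wealth - loss))
    # Certain wealth that the agent values as much as the uninsured gamble.
    certainty_equivalent = math.exp(expected_log)
    return wealth - certainty_equivalent

loss, loss_prob = 50.0, 0.1
strict_cost = loss * loss_prob  # the mathematical expectation of the loss
for wealth in (100.0, 1000.0):
    premium = max_acceptable_premium(wealth, loss, loss_prob)
    print(f"wealth {wealth:.0f}: strict cost {strict_cost:.2f}, "
          f"maximum acceptable premium {premium:.2f}")
```

On these assumptions both agents would accept a premium above the strict cost of the risk, and the agent with fewer assets would accept a higher one, which matches Bernoulli's claim that a person with less to start with is warier of a loss.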
With the emergence of the frequency theory in the 1840s, earlier accounts of probability and insurance faced a philosophical challenge. Frequentism was initially worked out during the late 1830s and early 1840s, amid a rising tide of philosophical empiricism in Britain, France, and elsewhere. In Britain, this period saw the creation of a number of influential statistical societies, including those in Manchester and Liverpool, as well as the Statistical Society of London (now the Royal Statistical Society) in 1834. The latter was founded by a group of Cambridge-affiliated scholars who, as Lawrence Goldman argues, sought to create a new, inductive style of social and economic analysis (Goldman 1983). It was before this society that Charles Booth presented his pioneering research on London poverty (Booth 1887). Hubert Llewellyn Smith, whose work on unemployment insurance will be taken up below, was among the researchers who helped carry out Booth’s study.
Frequentism both reflected and furthered this new inductive spirit. Its earliest expositors, including Robert Leslie Ellis and John Stuart Mill in Britain and Antoine Augustin Cournot in France, began with an apparently straightforward proposition: a probability value is the observed ratio of successes to the total number of trials in a series of relevantly similar occurrences (Porter 1986, 77–78). In contrast to many prior accounts, then, frequentists held that probabilities do not reflect the individual’s state of mind but rather measure occurrences in the world (see Laplace [1825] 1994, 3).
Several features stand out as broadly distinguishing the frequentist understanding of probability as it emerged and developed in the latter half of the nineteenth century. These include an emphasis on empirical observation over subjective belief, a reluctance or refusal to assign probability values to single instances, and an understanding of randomness as a property of events within a properly defined series. Ultimately, what makes frequentism significant for the history of social insurance is its acknowledgment that all probability estimates are conditional on the prior identification of a series or class. As a result, frequentism supported a class-based approach to insurance, prioritizing groups of citizens over individuals and demoting claims about personal responsibility for some misfortunes.
Those who adopted a frequentist interpretation at this time expressed a range of epistemological views. Ellis, for example, a Cambridge-educated mathematician and editor of the works of Francis Bacon, adopted an idealist interpretation of the foundations of probability; John Venn, perhaps the most prominent frequentist of the era, rejected claims of a priori truth (Verburgt 2014). Nevertheless, they shared an insistence that a probability value is a ratio or frequency derived from a series of occurrences, each of which is uncertain in isolation. In Ellis’s words, our judgments of probabilities depend not on the “fortuitous and varying circumstances of each trial” but rather on the natural fact that “on the long run, the action of fortuitous causes disappears” (Ellis [1842] 1849, 3; see also Cournot 1843, 185; Venn [1888] 2006, 3). Frequentism consequently called into question the relevance of probability values to individual events. As Venn put it, “for bearing in mind that the employment of probability postulates ignorance of the single event, it is not easy to see how we are to justify any other opinion or statement about the single event than a confession of such ignorance” (Venn [1888] 2006, 142).
In assigning a mathematical expectation to any individual, he explained, the frequentist intends “nothing more than to make a statement about the average of his class” (151). Although Venn denied the existence of fixed natural types, he did suggest that imperfect series exist in nature, and that these could be replaced with refined or idealized versions for the purposes of statistical analysis (Hájek 1996, 218–19; Verburgt 2014, 191–93).
Frequentism was also accompanied by a distinctive understanding of what it means for events to be random. Specifically, it supported a view of randomness as a uniform distribution of individual trials within a properly defined series. As Venn explained, randomness presumes some agent, human or other, operating within a set of limits such that it is as likely to generate any given outcome as any other (Venn [1888] 2006, 100). Charles Sanders Peirce, the American pragmatist philosopher, similarly defined a random sample as one “taken according to a precept or method which, being applied over and over again indefinitely, would in the long run result in the drawing of any one set of instances as often as any other set of the same number” (Peirce 1883, 152; see also Keynes 1921, 331–32).
Frequentism and Insurance
If mathematical probability values do not apply to singular events, we may wonder what justifies insurance from the perspective of the individual insured. That is, if there is not a probability value that applies to one’s particular case, how can one reasonably decide whether to insure oneself? Turning away from the notion of moral expectation, frequentism implied a different answer to this question, calling on a kind of prior identification with the class, only after which the individual can regard its probability value as her own. The claim here is not that frequentism as it was understood at this time entails a rejection of personal utility as a justification for decisions about insurance. Rather, it is that frequentism addressed the perceived need for such a value by first aligning subjective and objective expectations via the epistemic priority of the class. In addition, in Venn and Cournot, the rejection of moral expectation was linked to an aggregative approach to social welfare, which focused more on the distributive effects of insurance as an institution than on the fairness of any individual contract (Cournot 1843, 334–37; Venn [1888] 2006, 391–92).
Venn recognized that for insurance to remain personally reasonable on a frequentist view, it would require a different justification from the one offered by many previous accounts. One possible approach is for the individual to consider his own actions as a series, and to find that the “equalization of his gains and losses, for which he cannot hope in annuities, insurances, and lotteries taken separately, may yet be secured to him out of these events taken collectively” (Venn [1888] 2006, 148). This approach is problematic, however, with respect to events that the individual will experience infrequently or only once. More promising in such cases is Venn’s suggestion to “suppose the existence of an enlarged fellow-feeling,” or an identity with the other members of one’s group (149). On this account, the reasonableness of insurance hinges on each person’s ability to see himself first and foremost as a member of the series or class, and to enlarge his own interest to encompass the group.
Peirce, for his part, explained that because “probability essentially belongs to a kind of inference which is repeated indefinitely,” “there can be no sense in reasoning in an isolated case, at all” (Peirce [1878] 1955, 159–60). He gave the example of choosing a card from a pack of twenty-five red cards and one black, or twenty-five black and one red. If choosing red will bring lasting happiness and black everlasting sorrow, one will clearly opt to pick from the first pack. Yet on Peirce’s view, there is no valid inference that justifies that choice. Because the choice is made only once, there is no “real fact” whose existence gives truth to the statement that if one draws from a given pack, a particular color will likely appear. The example, he argues, therefore illustrates that only by having enlarged interests—by caring “equally for what was to happen in all possible cases of the sort”—is it possible to act logically in choosing from the mostly red pack (160–61).
Frequentism thus implied the epistemic priority of the class, variously defined. The truly reasonable person will ground her decisions in the “social principle,” caring for every other similar case in the same way that she cares for her own. Such class-based solidarity rests on a kind of interpersonal identity rather than an individualized risk. It is also flexible, based on an admission that the insured’s designated reference class can vary according to the insurer’s needs and available information, as well as over time (Venn [1888] 2006, 224–31). As we will see, this claim found resonance in statistical developments that followed in frequentism’s wake and helped support early welfare interventions.
There is also a historical connection of note between probability theory and the early welfare state. Utilitarian philosophers and political economists, particularly in Britain, took considerable interest in the foundations of probability. Like frequentism, their political economy treated individuals in the aggregate in the name of a common or collective good, and rested on an abstract assumption of equality that allowed for such aggregation. This is not to say that a frequentist view of probabilities necessarily lends itself to utilitarian economics. Rather, the point here is that these two families of ideas were worked out in close proximity, with important overlaps between them. 7
Thus, in his final edition of the Logic, Venn recommended utilitarianism as the successor to and fulfillment of the concept of moral expectation, in that it answers the question of which “distribution of wealth tends to secure the maximum of happiness” (392–93). For example, if the disutility of a losing gambler exceeds the utility of the winner, then overall happiness has decreased, and what is proved is that “inequality is bad, on the ground that two fortunes of £50 are better than one of £60 and one of £40” (390). The real problem with gambling, therefore, is “its tendency to the increase of the inequality in the distribution of wealth,” a conclusion that recommends the “Socialist’s ideal as being distinctly that which tends to increase happiness” (391, 392). Arthur Pigou, whose analysis of market failures would prove influential in justifying certain forms of welfare policy, also invoked diminishing marginal utility of income in a later work to argue that significant inequalities of wealth entail overall social losses (Pigou 1935, 121; see also Harris 2012, 87–88; Medema 2010).
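Venn's comparison of two fortunes of £50 with one of £60 and one of £40 can be checked numerically under any utility function that exhibits diminishing marginal utility. The sketch below is illustrative only: the logarithmic and square-root functions are assumed stand-ins, not forms that Venn or Pigou specify, and the function name is hypothetical.

```python
import math

def total_utility(fortunes, utility=math.log):
    """Sum of individual utilities for a list of fortunes (in pounds)."""
    return sum(utility(f) for f in fortunes)

equal, unequal = [50, 50], [60, 40]

# Two concave (diminishing marginal utility) functions, both assumptions.
for name, utility in (("log", math.log), ("sqrt", math.sqrt)):
    gap = total_utility(equal, utility) - total_utility(unequal, utility)
    print(f"{name}: equal split exceeds unequal split by {gap:.4f}")

# With a linear utility the two distributions are valued identically.
linear_gap = (total_utility(equal, lambda x: x)
              - total_utility(unequal, lambda x: x))
print(f"linear: difference {linear_gap}")
```

The equal division yields the higher total under both concave functions and the same total under a linear one, which suggests that the egalitarian conclusion rests on the assumption of diminishing marginal utility rather than on the arithmetic of the sums involved.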
In making his remarks, Venn credited political economist and statistician Francis Ysidro Edgeworth for having discovered the theoretical successor to moral expectation, a reference that is revealing of the intersection between probability theory and political economy in the latter decades of the nineteenth century (Mirowski 1994, 46–47). Edgeworth expressed some reservations about frequentism, but like Venn, he insisted that probabilities should “rest upon precise experience” if they are to be measurable (Edgeworth 1884, 235; see also Porter 1986, 97; Stigler 1986, 309–10). He also explicitly related the epistemology of probability to utilitarian ethics, explaining that both make commonsense assumptions about which cases or events can be considered equal for the purposes of calculation (Edgeworth 1887, 484; see also Sidgwick 1874, 387). This assumption of equality serves the practical needs of the utilitarian calculus in the same way that it serves those of scientific endeavor: it provides “an hypothesis which may serve as a starting point for further observation” and calculation (Edgeworth 1884, 233; Mirowski 1994, 25–27, 40).
In conjunction, Edgeworth also proposed an alternative psychological foundation for economic order. “Self-regarding self-interest, the gospel of Adam Smith, is not alone sufficient for industrial salvation: a leaf must be taken from his older and less familiar testament, of which the cardinal doctrine was sympathy.” In analogizing intellectual probability to the utilitarian assumption of equality, Edgeworth was explicit that both must be founded on experience and revised in light of continued observation. Such learning would ultimately promote the utilitarian ideal, wherein the individual recognizes that her own happiness carries equal weight to that of everyone else (Edgeworth 1904, 218). As he later put it, “A man, say, buys a life annuity, insures his life on a railway journey, puts into a lottery, and so on.” It may be expected, I think, that the class of actions which cannot be regarded as part of a “series” will diminish with the increase of providence and sympathy. (Edgeworth 1922, 260)
Insurance and Statistical Innovation
The argument of the preceding section focused on the frequentist justification for class-based risk pooling. This section argues that contemporary developments in statistical reasoning confirmed the frequentists’ flexible approach to defining population groups and the risks they face, and that these developments in turn found expression in thinking about insurance.
Probabilistically informed arguments for insurance had long held that there is safety in numbers, both for the insurer who spreads his risks and for the insured whose collective experience manifests a reliable average (Laplace [1825] 1994, 89). Yet the explosion of economic and social data over the first half of the nineteenth century had led thinkers to search for more precise ways to quantify these effects. As Stephen Stigler has shown, Edgeworth played a pivotal role in these developments, setting out to apply techniques that had been developed with regard to physical observations to the phenomena of the social world. Whereas “the mean of observations is a cause, as it were the source from which diverging errors emanate,” he explained, the “mean of statistics is a description, a representative quantity put for a whole group, the best representative of the group” (Edgeworth 1885b, 139–40; Stigler 1986, 309). If one must select a single quantity to represent many different outcomes, then, the statistical mean is the value that results in the least possible error in doing so.
Edgeworth’s first major advance, published in 1885, was to devise a basic significance test for determining whether an observed difference between two proposed population means is a product of chance or some other cause. Following a path laid out by Francis Galton, who had used similar methods to study heredity, Edgeworth developed a method for testing the differences between groups using estimates of the variability or dispersion within each. At the foundation of his discussion was what he called the “law of error, or probability-curve,” which represents the degree of divergence between each member of a set of observations or statistics and a central point or mean. The curve may be more or less spread out from its center in accordance with what Edgeworth called the “modulus” (the square of which, Stigler points out, amounts to twice the variance in modern terms). If a set of statistical numbers fulfills the law of error, then it is “exceedingly improbable” that any member of the set taken at random will deviate from the mean by twice the modulus (Edgeworth 1885a, 183–85; Stigler 1986, 308–13).
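The relation between the modulus and the modern variance, and the sense in which a deviation of twice the modulus is “exceedingly improbable,” can be made concrete with a short calculation. The sketch below is a modern restatement, not Edgeworth's own procedure: it assumes a normal law of error and independent samples, and the function names are hypothetical.

```python
import math

def modulus_from_variance(variance: float) -> float:
    """Edgeworth's modulus c, using the relation c**2 == 2 * variance."""
    return math.sqrt(2.0 * variance)

def prob_deviation_exceeds(k: float) -> float:
    """P(|X - mean| > k * modulus) when X follows the normal law of error."""
    # With c = sigma * sqrt(2), the two-sided normal tail reduces to erfc(k).
    return math.erfc(k)

def difference_in_moduli(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Observed difference of two sample means, in units of its modulus."""
    # Assumption: independent samples, so the variances of the means add.
    modulus_of_difference = math.sqrt(2.0 * (var_a / n_a + var_b / n_b))
    return abs(mean_a - mean_b) / modulus_of_difference

print(f"{prob_deviation_exceeds(2.0):.4f}")
```

On the normal assumption, a deviation beyond twice the modulus has a probability of a little under one half of one percent. Read this way, a difference between two group means that exceeds twice the modulus of that difference would be hard to attribute to chance alone.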
Edgeworth explained that the law of error is often reflected in nature, including in the errors of physical observations, such as measurements of the location of a star or samples of balls taken at random from an urn. Yet whereas many before him had mistakenly focused on whether a set of observations manifests a normal distribution, Edgeworth insisted that it is not necessary for statistical analysis that the “raw material of our observations should fulfill the law of error.” Rather, what is essential is “that they should be constant to any law.” Edgeworth went on to examine various cases in which, by manipulating the observations in some way—rearranging groups to increase the number in each, or dividing a larger set into subsets—“art” can facilitate the “elimination of chance” (Edgeworth 1885a, 187). Thus, even where the law of error is not fulfilled in nature, the statistician can compare different artfully created groups to distinguish between the effects of chance and the work of other forces.
Edgeworth’s approach to classifying and subdividing observations offers another justification for flexible risk management and found echoes in other contemporary analyses. For example, Arthur Bowley’s 1901 Elements of Statistics, which a contemporary reviewer praised as the “latest and best summary of [the] methods” of mathematical statistics at the time, explicitly followed Edgeworth’s writings (Bowley 1901, 262; Ford 1901, 444). The text went through six editions between its first publication and 1937, by which time Bowley had also authored statistical studies on the causes of economic insecurity (Boyer 2018). In the introduction to his account of the applications of probability to statistics, he noted that he would assume readers’ familiarity with Venn’s Logic and cited Mill in support of the claim that politicians and economists investigating a phenomenon “are as a rule concerned with its effect on the whole mass, not on the individuals in particular” (Bowley 1901, 262–63). In his brief remarks on insurance, Bowley echoed Venn in noting that “a thing happens by chance, when its occurrence is influenced by many independent causes whose separate effects we cannot trace, as when we draw a card from a thoroughly shuffled pack.” Thus, “if we consider a man’s death from the point of view of an insurance office,” we ignore the individual causes of each instance and speak instead about the overall frequency, or average result of those causes, for the group (267–68).
Such analyses reveal the influence of the currents we have been considering and provide a link to social policy developments of the time. Bowley testified before the Royal Commission on the Poor Laws in 1907, discussing the problem of unemployment. While his proposed solution focused on government relief works and the creation of labor exchanges, he employed statistical analysis to distinguish between unemployment caused by seasonal fluctuations and that caused by periodic change, and argued that public aid should be limited to the latter, secured by allocating funds in “fat years” to “provide for the lean” (Bowley 1907; Great Britain—Royal Commission on the Poor Laws and Relief of Distress 1910, 467). As we shall see in greater detail below, the discourse surrounding unemployment and the ultimate adoption of nationwide insurance against it constitute a powerful example of the resonance of such probabilistic and statistical arguments in the early welfare state.
Risk in the Early British Welfare State
The preceding sections highlighted the character of the frequentist interpretation of probability, its implications for thinking about risk-based solidarities, and the developments in statistical methods that extended many of those insights to social and economic investigations. This section returns to the realm of policy, arguing that the development of social insurance in Britain reflects, among other factors, these intellectual currents. Probabilistic arguments lent support to policymakers seeking to justify their interventions on broadly liberal grounds, and thus in a way that could appeal to wide constituencies while addressing working-class resistance to the harshness of poor relief (Harris 2004, 154–57; Orloff 1993, 153–54).
In Britain, workers’ compensation laws initially grew out of the common law of torts. Yet while judges had experimented with new legal remedies and some jurists had attempted to revise standards of care to protect individuals from harm, it was ultimately the legislature that made the most significant advances toward addressing the dangers of industrialization (Lobban 2010). The Workmen’s Compensation Act of 1897 provided compensation for accidents occurring in the course of employment. As Julia Moses notes, courts often chose to interpret the term “work” broadly, resulting in outcomes more favorable toward workers than a strict interpretation of the law would allow. Nevertheless, workers were still required to establish a direct link between their workplace and a particular harm (Moses 2018, 128–41). This made it difficult to address the problem of industrial illnesses. As a 1904 governmental report put it, citing the testimony of factory medical inspector Thomas Legge, the “question of how and when the disease was contracted . . . would in the great majority of cases be so obscure and uncertain that it would probably lead to much dispute and litigation.” Although there was considerable evidence of a connection between certain diseases and particular occupations, Legge concluded that a system of sickness insurance, “where the cause of the ailment would be immaterial,” was a more fitting response than accident liability (Great Britain—Home Office Departmental Committee on Workmen’s Compensation 1904, 45–46).
The prevailing approach changed in 1906, when a newly elected Liberal government pushed for the inclusion of industrial diseases within the compensation scheme, starting with six and eventually expanding the list to thirty. With this, according to Moses, Britain “signaled a move away from linear thinking about occupational risk,” toward a “more expansive and flexible” understanding of the concept (Moses 2018, 141). It also shifted further toward the principle of insurance, which conditions benefits on the occurrence of an event rather than the causal story behind it. We have seen that accounts of statistical insurance emphasized the predictability of the aggregate rather than causal knowledge of the individual case. The evolution of workers’ compensation shows how a statistical understanding of certain phenomena can militate in favor of social insurance for groups of citizens demonstrably affected by them.
Old-age pensions, enacted in 1908, also took a flexible approach to defining risk and managing it for the benefit of particular groups. The scheme was financed by general taxes and offered payments to every citizen over the age of seventy years with an income of less than a certain amount per year. As E. P. Hennock shows, the National Committee of Organized Labour for Promoting Old Age Pensions for All, which succeeded in electing several members of parliament in 1906, had advocated universal, noncontributory pensions, although in the end, financial concerns, among others, meant that benefits were granted only to the poorer segments of the population and those who passed certain character tests (Hennock 2007, 221–25; see also Thane 1984, 896). The choice of tax financing was partly based on the view that the poor could not be expected to save for themselves and partly a capitulation to friendly societies, who feared that a contributory system would deflect working-class savings and reduce their own ranks (Gilbert 1965; see also Baldwin 1990, 99–100; Harris 2004, 159; Heclo 1974, 175).
Despite its coverage limitations and design, however, the pension scheme can be seen as part of the broader turn away from poor relief and toward social insurance principles. A royal commission had examined the possibility of public pensions in the context of poor law reform, yet by this time, developments in Denmark, Germany, and elsewhere had inspired policymakers to turn to insurance-based solutions instead (Boyer 2018, 214–15; Hennock 1981). Labor groups had also rejected poor relief, which operated on a logic of deterrence that included the severe threat of the workhouse (Hennock 2007, 225; Macnicol 1998, 138–39; Orloff 1993, 7–8, 153). Finally, the argument for pensions received support from recent statistical findings, including those of Booth, which confirmed that large percentages of the elderly were compelled to seek poor relief and that most had not previously done so, belying the claim that poverty was somehow their fault (Booth 1891, 632; Booth 1899, 214).
It is also worth recalling in this context that social insurance programs have traditionally allowed for a variety of financing arrangements and do not require any particular relationship between individual contributions and benefits. Many rely on general tax financing to a degree, and some do not require worker contributions (see Baldwin 1990, 63–65; Burns 1949, 29–31). It is therefore difficult to draw a bright-line distinction between event-conditioned coverage that is tied to modest or nonexistent worker contributions and coverage that is financed by general tax revenues, particularly given that all citizens are in principle responsible for contributing at some level to the latter. In addition, a number of prominent arguments at this time presented entitlement as deriving from workers’ service to society, a form of contribution “in kind” (Orloff 1993, 157, 178). It is therefore more instructive, in analyzing the emergence of the welfare state during this period, to focus on the trend toward offering event-based provision to groups of vulnerable citizens and thereby removing them from the purview of the poor laws (Burns 1943, 518). Indeed, George Boyer points out that the authors of the pension scheme took pains to ensure that payments were not associated with the stigma of poor relief, including by distributing payments through the post office rather than poor law authorities (Boyer 2018, 195).
In addition to addressing the needs of vulnerable groups, the design of the pension scheme shows how citizens’ perceptions of a risk can influence the shape of policy. Winston Churchill, then president of the Board of Trade, noted that the possibility of attaining old age “seems so doubtful and remote to the ordinary man, when in the full strength of manhood, that it has been found in practice almost impossible to secure from any very great number of people the regular sacrifices” to provide for that eventuality. By contrast, “unemployment, accident, sickness, and the death of the breadwinner are catastrophes which may reach any household at any moment,” making employee contributions more palatable (Churchill [1909] 1973b, 311–12). One implication of this observation is that when citizens recognize their equal vulnerability to a risk—or in probabilistic terms, when they are able to understand themselves as members of the same “series”—they are more willing to regard the mutual protection of social insurance as serving their personal advantage alongside the demands of distributive fairness.
Unemployment insurance, enacted in 1911, offers another illustration of the features associated with probabilistic thinking of the time, as well as the plurality of principles supporting social insurance policies. The Royal Commission on the Poor Laws had considered unemployment insurance in depth, noting both market failures and solidarity as justifications for state provision. The commission report explained that insurance had long been the “only possible way of providing against the miseries of unemployment,” but that many workers, particularly the unskilled and unorganized, found themselves without protection (Great Britain—Royal Commission on the Poor Laws and Relief of Distress 1909, 332). Given the limited reach of trade unions in providing benefits, the report considered whether a state subsidy to poorer unions and friendly societies might enable greater coverage. Its findings were favorable toward subsidized union coverage, but doubtful that existing friendly societies, which lacked the “sense of solidarity amongst the men within a trade,” would undertake such insurance even with the help of a subsidy (416–19). The report concluded that unemployment insurance would only be possible within a new type of society, comprising workers who knew one another’s circumstances and could see themselves as equals with respect to the risk.
Nevertheless, the unemployment insurance scheme adopted in 1911 rested on a broader vision of solidarity than the one identified by the commission. Recent economic depressions had confirmed to the public that certain types of unemployment were the product of forces more powerful than any individual worker (Churchill [1908] 1973a, 196; De Swaan 1988, 183–84, 196). The resulting scheme focused on a number of industries and financed the claims of the unemployed through contributions from those currently working, together with employers and the state (Boyer 2018, 201–2; Gilbert 1966, 281). William Beveridge, one of the main proponents of the policy—and later an architect of the postwar welfare state—emphasized the interdependence and equal vulnerability of those covered. “The regular workman must admit a certain solidarity . . . with the irregular workman, since without the latter the industry by which the former lives could not be carried on” (Harris 1977, 173).
One might suspect that frequentism, which defines probability with reference to a series of similar occurrences, would not support risk pooling that extends beyond actuarial classes in this way. Yet a vision of solidarity beyond narrow groups is not incompatible with a frequentist view. Rather, as we have seen, frequentism acknowledged the relativity of reference classes and allowed for their flexible definition in light of available information. Thus, a reference class could incorporate a narrow group or an entire nation, depending on the risk in question and the state of knowledge about it (see Venn [1888] 2006, 45–46). In the case of unemployment, as many observed at the time, the lack of reliable statistical information made it difficult to rigorously calculate how much members of the different trades should pay for their own protection (Great Britain—Royal Commission on the Poor Laws and Relief of Distress 1909, 416). While this observation could be seen to militate in favor of restricting coverage to each trade, as the commission concluded, it could also support an enlarged reference class, in which those who cannot clearly establish their own odds are more likely to see themselves as equally vulnerable to the harm. Moreover, a collective understanding of probability need not preclude cooperation among distinct risk classes, either for the sake of some direct advantage to each or based on claims about the fairness of distributing the costs of the risk more broadly. The key point here is that frequentism and associated statistical developments supported a novel focus on such classes as the targets of social policy and, increasingly over time, as the subjects demanding them as well (see also Baldwin 1990; Lindert 2014).
The theoretical foundations of the unemployment scheme were articulated in a 1910 article by Hubert Llewellyn Smith, then commissioner of the newly formed Labour Department of the Board of Trade, who worked closely with Beveridge in preparing the scheme (Harris 1977, 169–85). Llewellyn Smith identified a range of causal factors at work in unemployment. Examining each in turn, he found that, at least in the major trades, the effects of personal characteristics were on average less significant than the effects of broader economic conditions over which the individual has no control. This statistical analysis, eschewing individualized causal inquiry, led to the conclusion that insurance is an appropriate solution for such risks. Even where a given worker’s personal inadequacies lead him to be selected for unemployment over another, “it does not necessarily follow that these defects are a principal or even a contributory cause of his unemployment” (Smith 1910, 525).
In making his case, Llewellyn Smith also invoked a crucial element of what is now known as the market-failures rationale for social insurance, namely the importance of the “distribution of income in respect of time” (516; see also Gruber 2016, 338–42). A regular income of a certain amount per month will have a different value to the individual than the same total sum received at irregular intervals or all at once. To the extent that markets or private initiative do not allow individuals to achieve such intertemporal security, there may be a case for government to provide it in the form of compulsory insurance. Yet Llewellyn Smith also suggested that economic reasoning could not on its own decide the question. “. . . [T]here is a noble as well as an ignoble ideal of security, and the great problem that lies before us in the future is to distinguish rightly between them and to direct our national policy accordingly” (Smith 1910, 517). Indeed, the essay concluded, the aim of promoting regularity in working-class incomes is but one policy objective among many, to be weighed alongside others that stake a claim to public resources. This article has suggested that a version of distributive fairness and solidarity, understood as the sharing of burdens among those who are vulnerable to a given risk, provided an important normative complement to this logic.
We have thus seen how a statistical understanding of risk allowed for observations about groups of individuals without regard for the causal lineage of each outcome. We have also seen how the frequentist interpretation of probability supported a perception of equal vulnerability and with it a form of solidarity within groups affected by specific hazards. While this article has not intended to provide definitive evidence of the influence of frequentist ideas on social policy developments, the analysis has shown that early social insurance rested on an awareness of statistical classes as the targets of state interventions and on a willingness to define those classes flexibly in light of available information and social needs. Moreover, explanations of these policies emphasized both their economic advantage for individuals and the fairness of redistributing the costs of development within and among groups. These observations suggest that efficiency-based and distributive arguments for social insurance are not mutually exclusive but have long coexisted, and that probabilistic arguments for risk pooling may lend support to both.
Contemporary Implications
In emphasizing the links between probabilistic and statistical developments and a number of early social insurance programs, the foregoing has intended to highlight an aspect of political discourse that has been neglected by many scholars of welfare policy development. While it is well known that economic and social policy took a more collectivist turn in a number of countries around this time, students of the welfare state have not sufficiently appreciated the role of changing conceptions of probability and risk in enabling and supporting that shift.
This argument is of more than historical interest, however. Recently, the notion that risk groups are responsible for generating social policy has gained prominence in the history and political economy of the welfare state (see Rehm 2016; Rehm, Hacker, and Schlesinger 2012). In a similar vein, Robert Goodin has argued that the virtue of understanding welfare as insurance is that “redistribution, of a sort, is thus justified without any appeal to old-style and increasingly unfashionable values” such as altruism or social citizenship (Goodin 2003, 216). The preceding analysis adds further evidence and detail to support this view, and also clarifies the distinct but compatible normative principles that underlie it. While a market-failures reconstruction of the welfare state, and social insurance specifically, has much to recommend it, it does not fully account for the explicitly egalitarian and solidaristic justifications also offered for such policies over the course of their history. Focusing on the philosophy of probability and statistics allows us to perceive the close conceptual connections between these arguments and the ways in which social insurance reflects and advances them.
Returning to the standard of “expressive adequacy,” this article has argued that a plurality of normative concerns better explains the emergence of social insurance than any single principle. Moreover, as Goodin’s remark attests, social insurance can still be plausibly described as serving both efficiency and equality goals, grounded in a flexible account of risk-based solidarities. In some cases, this shared risk pool has included the population as a whole, while in others, it has been constituted by one or more subgroups that face similar hazards and among whom pooling is seen to be mutually beneficial or fair.
What remains to be shown is how such a pluralist or “mixed” model of social insurance can further our normative self-understanding and the achievement of social or political goals. While a full demonstration of this point is not possible here, it is likely that the capacity of social insurance to accommodate distinct principles and aims is its greatest strength as an institution. While correcting some forms of market failure, it also expresses an understanding of mutual responsibility grounded in a perception of shared vulnerability to harm. The degree to which different social insurance programs further these normative purposes will differ from place to place and over time. Yet by appreciating how social insurance accommodates, and to some degree harmonizes, such principles, we will be better situated to understand its potential and its distinct role in the modern welfare state.
Footnotes
Acknowledgements
Special thanks to Peter Baldwin, Hanoch Dagan, David Enoch, Robert Goodin, Peter Hall, Joseph Heath, and Roy Kreitner, as well as three anonymous reviewers, for insightful questions and comments on the ideas presented in this article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors would like to thank the Edmond J. Safra Foundation for generous financial assistance that aided in the writing of this article.
